# Text classification

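This notebook shows how to:

1. Finetune a distilled BERT model (`microsoft/xtremedistil-l6-h256-uncased`, used here in place of [DistilBERT](https://huggingface.co/distilbert-base-uncased)) on a query-intent dataset to decide whether a query is relevant or irrelevant.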
2. Use your finetuned model for inference.

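## Load the dataset

Start by loading the dataset with the 🤗 Datasets library. The commented-out lines are Hub datasets (IMDb and two spam datasets) tried earlier; this notebook uses a local intent dataset saved on disk: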

In [2]:
from datasets import load_dataset, DatasetDict

# imdb = load_dataset("imdb")
# spam_data = load_dataset("TrainingDataPro/email-spam-classification")
# spam_data = load_dataset("ucirvine/sms_spam")

In [3]:
intent_data = DatasetDict.load_from_disk('./data/intent_data')
intent_data

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 80
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 20
    })
})

In [4]:
def map_labels(example):
    if example["label"] == "irrelevant":
        example["label"] = 0
    elif example["label"] == "relevant":
        example["label"] = 1
    return example

# spam_data = spam_data.rename_column("type", "label")
# spam_data = spam_data.remove_columns(["title"])
# spam_data = spam_data.map(map_labels)
intent_data = intent_data.map(map_labels)

In [5]:
intent_data

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 80
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 20
    })
})

Then take a look at an example:

In [6]:
intent_data["train"][0]

{'text': 'Find the pattern of total network cost for 2023 and percentage change up to 2 decimal place',
 'label': 1}

There are two fields in this dataset:

- `text`: the query text.
- `label`: a value that is either `0` for an irrelevant query or `1` for a relevant query.
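
The string-to-integer mapping applied by `map_labels` can also be written as a dictionary lookup. The sketch below is a hypothetical variant (`LABEL2ID` and `encode_label` are names introduced here, not part of the notebook) that works on plain dicts and, unlike the `if`/`elif` version, raises on an unexpected label instead of passing it through unchanged.

```python
# Hypothetical helper: the notebook's label names mapped to integer ids.
LABEL2ID = {"irrelevant": 0, "relevant": 1}

def encode_label(example: dict) -> dict:
    """Return a copy of `example` with its string label replaced by an integer id."""
    label = example["label"]
    if label not in LABEL2ID:
        raise ValueError(f"unexpected label: {label!r}")
    return {**example, "label": LABEL2ID[label]}

row = {"text": "What is the per month overall cost per subs for FY 2023", "label": "relevant"}
encoded = encode_label(row)
print(encoded["label"])
```

Because it returns a new dict rather than mutating its argument, a function of this shape could be passed to `DatasetDict.map` in the same way as `map_labels`.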

## Preprocess

The next step is to load the config and tokenizer of the chosen checkpoint to preprocess the `text` field:

In [7]:
from transformers import AutoTokenizer, AutoConfig

# mapping
id2label = {0: "irrelevant", 1: "relevant"}
label2id = {"irrelevant": 0, "relevant": 1}

model_name = 'microsoft/xtremedistil-l6-h256-uncased'

config = AutoConfig.from_pretrained(
    pretrained_model_name_or_path=model_name,
    num_labels=2,
    finetuning_task="text-classification",
    force_download=False,
    trust_remote_code=False,
    return_unused_kwargs=False,
    id2label=id2label, label2id=label2id
)

tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path=model_name,
    use_fast=True,
    trust_remote_code=False,
)



Create a preprocessing function to tokenize `text` and truncate sequences to be no longer than the model's maximum input length (512 tokens here):

In [8]:
def preprocess_function(examples):
    return tokenizer(examples["text"], padding=True, max_length=512, truncation=True)

tokenized_intent_data = intent_data.map(preprocess_function, batched=True)
# tokenized_spam_data = spam_data.map(preprocess_function, batched=True)
# tokenized_imdb = imdb.map(preprocess_function, batched=True)

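Batches are built later with [DataCollatorWithPadding](https://huggingface.co/docs/transformers/main/en/main_classes/data_collator#transformers.DataCollatorWithPadding), which *dynamically pads* each batch to its longest sequence during collation. This is more efficient than padding the whole dataset to the maximum length. Note that `padding=True` in `preprocess_function` above already pads each `map` batch to its longest sequence, so on a dataset this small the collator has little left to pad. First, check the tokenized dataset: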

In [9]:
tokenized_intent_data

DatasetDict({
    train: Dataset({
        features: ['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask'],
        num_rows: 80
    })
    test: Dataset({
        features: ['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask'],
        num_rows: 20
    })
})
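
The dynamic padding described above can be sketched in plain Python. This is only an illustration with made-up token ids: `pad_batch` is a hypothetical helper, and the real collator also pads the attention mask and converts the result to tensors.

```python
from typing import List, Optional

def pad_batch(batch: List[List[int]], pad_id: int = 0,
              pad_to_multiple_of: Optional[int] = None) -> List[List[int]]:
    """Pad every sequence in `batch` to the longest one (optionally rounded up to a multiple)."""
    target = max(len(seq) for seq in batch)
    if pad_to_multiple_of:
        # Round the target length up to the next multiple.
        target = -(-target // pad_to_multiple_of) * pad_to_multiple_of
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]

# Toy token-id sequences of different lengths (made-up values).
sequences = [[101, 7, 102], [101, 7, 8, 9, 102], [101, 102],
             [101, 5, 6, 7, 8, 9, 10, 11, 102]]

# Dynamic padding: pad each batch of two to its own longest sequence.
batches = [sequences[i:i + 2] for i in range(0, len(sequences), 2)]
dynamic = [pad_batch(b, pad_to_multiple_of=2) for b in batches]
dynamic_tokens = sum(len(seq) for b in dynamic for seq in b)

# Global padding: pad everything to the longest sequence in the dataset.
global_tokens = sum(len(seq) for seq in pad_batch(sequences, pad_to_multiple_of=2))

print(dynamic_tokens, global_tokens)  # 32 40
```

In this toy case dynamic padding processes 32 token positions instead of 40; the gap grows when sequence lengths vary more.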

## Evaluate

Then create a function that passes your predictions and labels to [compute](https://huggingface.co/docs/evaluate/main/en/package_reference/main_classes#evaluate.EvaluationModule.compute) to calculate accuracy, along with weighted precision, recall, and F1:

In [10]:
import numpy as np
from transformers import EvalPrediction
import evaluate

accuracy_metric = evaluate.load("accuracy")
precision_metric = evaluate.load("precision")
recall_metric = evaluate.load("recall")
f1_metric = evaluate.load("f1")

# def compute_metrics(eval_pred):
#     predictions, labels = eval_pred
#     predictions = np.argmax(predictions, axis=1)
#     return metric.compute(predictions=predictions, references=labels)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    accuracy = accuracy_metric.compute(predictions=preds, references=labels)
    precision = precision_metric.compute(predictions=preds, references=labels, average='weighted')
    recall = recall_metric.compute(predictions=preds, references=labels, average='weighted')
    f1 = f1_metric.compute(predictions=preds, references=labels, average='weighted')
    
    return {
        'accuracy': accuracy['accuracy'],
        'precision': precision['precision'],
        'recall': recall['recall'],
        'f1': f1['f1'],
    }

Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
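
To make `average='weighted'` concrete, here is a small pure-Python sketch of support-weighted precision, recall, and F1. It is an illustration of the averaging rule, not the `evaluate` implementation; `weighted_scores` and the toy predictions are made up for this example.

```python
from collections import Counter

def weighted_scores(preds, labels):
    """Support-weighted precision, recall and F1 over the classes present in `labels`."""
    support = Counter(labels)
    total = len(labels)
    precision = recall = f1 = 0.0
    for cls, n in support.items():
        tp = sum(p == cls and y == cls for p, y in zip(preds, labels))
        predicted = sum(p == cls for p in preds)
        p_cls = tp / predicted if predicted else 0.0
        r_cls = tp / n
        f_cls = 2 * p_cls * r_cls / (p_cls + r_cls) if p_cls + r_cls else 0.0
        weight = n / total  # each class counts in proportion to its support
        precision += weight * p_cls
        recall += weight * r_cls
        f1 += weight * f_cls
    return {"precision": precision, "recall": recall, "f1": f1}

# Toy example: one "irrelevant" (0) query is misclassified as "relevant" (1).
scores = weighted_scores(preds=[1, 1, 0, 1], labels=[1, 0, 0, 1])
print(scores)
```

Weighted recall always equals plain accuracy (here 0.75), so the `recall` and `accuracy` columns reported during training carry the same information.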

## Train

In [11]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer, pad_to_multiple_of=2)

In [12]:
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer, TrainerCallback

# model = AutoModelForSequenceClassification.from_pretrained(
#     "distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
# )
model = AutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path=model_name,
    # from_tf=,
    config=config,
    trust_remote_code=False,
)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at microsoft/xtremedistil-l6-h256-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [13]:
# # freeze layers
# for param in model.base_model.embeddings.parameters():
#     param.requires_grad = False
# for param in model.base_model.transformer.layer[:3].parameters():
#     param.requires_grad = False

# trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
# trainable_params #total: 66955010

In [17]:
class LogCallback(TrainerCallback):
    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs is not None:
            # Remove the 'total_flos' log as it is usually not needed
            logs.pop("total_flos", None) 
            if state.is_local_process_zero:
                print(logs)

training_args = TrainingArguments(
    output_dir="saved_model/query_intent_model",
    overwrite_output_dir=True,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=25,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=False,
    # save_steps=1000,
    save_total_limit=1,  # Keep only the best model and the latest checkpoint

)

# Initialize our Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_intent_data["train"],
    eval_dataset=tokenized_intent_data["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[LogCallback()]  # Add the custom callback here
)

train_result = trainer.train()


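| Epoch | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| 1 | No log | 0.497756 | 1.0 | 1.0 | 1.0 | 1.0 |
| 2 | No log | 0.449155 | 1.0 | 1.0 | 1.0 | 1.0 |
| 3 | No log | 0.407346 | 1.0 | 1.0 | 1.0 | 1.0 |
| 4 | No log | 0.370888 | 1.0 | 1.0 | 1.0 | 1.0 |
| 5 | No log | 0.34197 | 1.0 | 1.0 | 1.0 | 1.0 |
| 6 | No log | 0.314892 | 1.0 | 1.0 | 1.0 | 1.0 |
| 7 | No log | 0.29248 | 1.0 | 1.0 | 1.0 | 1.0 |
| 8 | No log | 0.273308 | 1.0 | 1.0 | 1.0 | 1.0 |
| 9 | No log | 0.255825 | 1.0 | 1.0 | 1.0 | 1.0 |
| 10 | No log | 0.240962 | 1.0 | 1.0 | 1.0 | 1.0 |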


{'eval_loss': 0.4977564811706543, 'eval_accuracy': 1.0, 'eval_precision': 1.0, 'eval_recall': 1.0, 'eval_f1': 1.0, 'eval_runtime': 0.0673, 'eval_samples_per_second': 297.241, 'eval_steps_per_second': 44.586, 'epoch': 1.0}
{'eval_loss': 0.4491547644138336, 'eval_accuracy': 1.0, 'eval_precision': 1.0, 'eval_recall': 1.0, 'eval_f1': 1.0, 'eval_runtime': 0.0673, 'eval_samples_per_second': 297.15, 'eval_steps_per_second': 44.573, 'epoch': 2.0}
{'eval_loss': 0.40734606981277466, 'eval_accuracy': 1.0, 'eval_precision': 1.0, 'eval_recall': 1.0, 'eval_f1': 1.0, 'eval_runtime': 0.0772, 'eval_samples_per_second': 259.202, 'eval_steps_per_second': 38.88, 'epoch': 3.0}
{'eval_loss': 0.37088775634765625, 'eval_accuracy': 1.0, 'eval_precision': 1.0, 'eval_recall': 1.0, 'eval_f1': 1.0, 'eval_runtime': 0.0835, 'eval_samples_per_second': 239.5, 'eval_steps_per_second': 35.925, 'epoch': 4.0}
{'eval_loss': 0.34196987748146057, 'eval_accuracy': 1.0, 'eval_precision': 1.0, 'eval_recall': 1.0, 'eval_f1': 1.0

In [18]:
# save best model
best_model_dir = "./saved_model/query_intent_model/best_model"
trainer.model.save_pretrained(best_model_dir, )
tokenizer.save_pretrained(best_model_dir)

('./saved_model/query_intent_model/best_model/tokenizer_config.json',
 './saved_model/query_intent_model/best_model/special_tokens_map.json',
 './saved_model/query_intent_model/best_model/vocab.txt',
 './saved_model/query_intent_model/best_model/added_tokens.json',
 './saved_model/query_intent_model/best_model/tokenizer.json')

In [20]:
# Access token redacted; read it from the environment instead of hard-coding it.
# trainer.model.push_to_hub("apurvasf/query_intent_model", token=os.environ["HF_TOKEN"])  # needs `import os`

model.safetensors: 100%|██████████| 51.0M/51.0M [00:13<00:00, 3.81MB/s]


CommitInfo(commit_url='https://huggingface.co/apurvasf/query_intent_model/commit/ee088d9ae948061a7d5ed0b496c2ae58b7b2e96e', commit_message='Upload BertForSequenceClassification', commit_description='', oid='ee088d9ae948061a7d5ed0b496c2ae58b7b2e96e', pr_url=None, pr_revision=None, pr_num=None)

In [19]:
train_result.metrics

{'train_runtime': 55.8571,
 'train_samples_per_second': 35.806,
 'train_steps_per_second': 4.476,
 'train_loss': 0.26376043701171875,
 'epoch': 25.0}
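
As a sanity check, the throughput figures above can be reproduced from the training arguments. This is a back-of-the-envelope sketch that assumes a single device and no gradient accumulation.

```python
import math

num_examples = 80          # size of the train split
batch_size = 8             # per_device_train_batch_size
num_epochs = 25            # num_train_epochs
train_runtime = 55.8571    # seconds, from train_result.metrics

# Assumption: one device, gradient_accumulation_steps=1.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

steps_per_second = round(total_steps / train_runtime, 3)
samples_per_second = round(num_examples * num_epochs / train_runtime, 3)

print(total_steps)         # 250
print(steps_per_second)    # 4.476
print(samples_per_second)  # 35.806
```

With only 250 optimizer steps in total, the run never reaches the Trainer's default logging interval of 500 steps (`logging_steps`), which is the likely reason the Training Loss column reads `No log`.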

In [21]:
metrics = trainer.evaluate(eval_dataset=tokenized_intent_data["test"])
metrics

{'eval_loss': 0.1533588469028473,
 'eval_accuracy': 1.0,
 'eval_precision': 1.0,
 'eval_recall': 1.0,
 'eval_f1': 1.0,
 'eval_runtime': 0.068,
 'eval_samples_per_second': 294.3,
 'eval_steps_per_second': 44.145,
 'epoch': 25.0}

## Inference

Great, now that you've finetuned a model, you can use it for inference!

Start by predicting on the test split with the labels removed, taking the argmax over the logits to get a class id per example:

In [22]:
predict_dataset = tokenized_intent_data["test"].remove_columns("label")
predictions = trainer.predict(predict_dataset, metric_key_prefix="predict").predictions
predictions = np.argmax(predictions, axis=1)
predictions

array([1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0])
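
The class ids above can be turned back into label names with the `id2label` mapping. A minimal sketch, with the predictions copied from the output above into a plain list:

```python
from collections import Counter

id2label = {0: "irrelevant", 1: "relevant"}  # same mapping as in the notebook
predictions = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]

predicted_labels = [id2label[p] for p in predictions]
counts = Counter(predicted_labels)
print(counts["relevant"], counts["irrelevant"])  # 11 9
```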

In [23]:
text1 = "How is Rupee values against Dollar right now?"
text2 = "What is the per month overall cost per subs for FY 2023"

In [24]:
from transformers import pipeline

classifier = pipeline("text-classification", model="./saved_model/query_intent_model/best_model")

print(classifier(text1), classifier(text2))

[{'label': 'irrelevant', 'score': 0.6782032251358032}] [{'label': 'relevant', 'score': 0.6907285451889038}]


In [23]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./saved_model/query_intent_model/best_model")
inputs = tokenizer(text1, return_tensors="pt")

# Pass your inputs to the model and return the `logits`
from transformers import AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("./saved_model/query_intent_model/best_model")
with torch.no_grad():
    logits = model(**inputs).logits

# Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label
import torch.nn.functional as F

probabilities = F.softmax(logits, dim=1)
predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id], probabilities.max().item()

('relevant', 0.5066925287246704)
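
The score in the cell above is the largest softmax probability over the two logits. The sketch below shows that computation in plain Python; the logits are made-up values, not outputs of the finetuned model.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

id2label = {0: "irrelevant", 1: "relevant"}  # same mapping as in the notebook
logits = [-0.4, 0.4]                         # made-up logits for a single input

probs = softmax(logits)
predicted_class_id = max(range(len(probs)), key=probs.__getitem__)
print(id2label[predicted_class_id], probs[predicted_class_id])
```

A score close to 0.5, as in the output above, means the two logits are nearly equal, so the model is barely favouring one class over the other.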

## TensorFlow

In [24]:
from transformers import AutoTokenizer, BertTokenizer, AutoConfig

# mapping
id2label = {0: "irrelevant", 1: "relevant"}
label2id = {"irrelevant": 0, "relevant": 1}

config = AutoConfig.from_pretrained(
    pretrained_model_name_or_path="saved_model/query_intent_model/best_model",
    num_labels=2,
    finetuning_task="text-classification",
    force_download=False,
    trust_remote_code=False,
    return_unused_kwargs=False,
    id2label=id2label, label2id=label2id
)

tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path="saved_model/query_intent_model/best_model",
    use_fast=True,
    trust_remote_code=False,
)

In [45]:
def preprocess_function(examples):
    return tokenizer(examples["text"], padding=True, max_length=512, truncation=True)

tokenized_intent_data = intent_data.map(preprocess_function, batched=True)

In [46]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")

In [27]:
from transformers import create_optimizer
import tensorflow as tf

batch_size = 4
num_epochs = 1
batches_per_epoch = len(tokenized_intent_data["train"]) // batch_size
total_train_steps = int(batches_per_epoch * num_epochs)
optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
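
For intuition about the schedule, the sketch below assumes that `create_optimizer` with its default arguments and `num_warmup_steps=0` decays the learning rate linearly from `init_lr` to zero over `num_train_steps`. `linear_decay_lr` is a hypothetical stand-in for illustration, not the library's schedule object.

```python
num_train_examples = 80   # size of the train split
batch_size = 4
num_epochs = 1
init_lr = 2e-5

total_train_steps = (num_train_examples // batch_size) * num_epochs

def linear_decay_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps, assuming linear decay to zero and no warmup."""
    step = min(step, total_train_steps)
    return init_lr * (1 - step / total_train_steps)

print(total_train_steps)  # 20
print(linear_decay_lr(0), linear_decay_lr(10), linear_decay_lr(20))
```

With a single epoch there are only 20 optimizer steps, so the learning rate has already halved after 10 batches.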

In [25]:
from transformers import TFAutoModelForSequenceClassification, TrainingArguments, Trainer, TrainerCallback
from transformers import TFBertForSequenceClassification, BertTokenizer, TextClassificationPipeline

model = TFAutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path="saved_model/query_intent_model/best_model",
    # from_tf=,
    config=config,
    trust_remote_code=False,
)

tf_train_set = model.prepare_tf_dataset(
    tokenized_intent_data["train"],
    shuffle=True,
    batch_size=4,
    collate_fn=data_collator,
)

tf_validation_set = model.prepare_tf_dataset(
    tokenized_intent_data["test"],
    shuffle=False,
    batch_size=4,
    collate_fn=data_collator,
)

All PyTorch model weights were used when initializing TFBertForSequenceClassification.

All the weights of TFBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForSequenceClassification for predictions without further training.


In [16]:
from transformers.keras_callbacks import KerasMetricCallback

model.compile(optimizer=optimizer)  # No loss argument!

metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
callbacks = [metric_callback]

model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=num_epochs, callbacks=callbacks)

Cause: for/else statement not yet supported
Cause: for/else statement not yet supported


2024-07-12 12:28:59.384346: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence


<tf_keras.src.callbacks.History at 0x7fa4cc2923e0>

In [26]:
model.summary()

Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bert (TFBertMainLayer)      multiple                  12750080  
                                                                 
 dropout_19 (Dropout)        multiple                  0 (unused)
                                                                 
 classifier (Dense)          multiple                  514       
                                                                 
Total params: 12750594 (48.64 MB)
Trainable params: 12750594 (48.64 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


### Inference

In [29]:
import tensorflow as tf

tokenizer = AutoTokenizer.from_pretrained("./saved_model/query_intent_model/best_model")
inputs = tokenizer(text2, return_tensors="tf")

logits = model(**inputs).logits

probabilities = tf.nn.softmax(logits)
predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
model.config.id2label[predicted_class_id], probabilities.numpy().max()

('relevant', 0.5289333)

In [30]:
model.save("saved_model/tf_query_intent_model", overwrite=True, save_format="tf")

INFO:tensorflow:Assets written to: saved_model/tf_query_intent_model/assets


In [31]:
# from tensorflow.keras.utils import custom_object_scope

loaded_model = tf.keras.models.load_model('saved_model/tf_query_intent_model')

# Show the model architecture
loaded_model.summary()

Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bert (TFBertMainLayer)      multiple                  12750080  
                                                                 
 dropout_19 (Dropout)        multiple                  0         
                                                                 
 classifier (Dense)          multiple                  514       
                                                                 
Total params: 12750594 (48.64 MB)
Trainable params: 12750594 (48.64 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


### TensorFlow.js format

In [33]:
del loaded_model

In [42]:
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer, TFDistilBertForSequenceClassification

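# Convert the PyTorch checkpoint to TensorFlow.
# The checkpoint is a BERT model, so loading it with TFDistilBertForSequenceClassification
# leaves the BERT weights unused (see the warning in the output below, produced by that call).
# TFAutoModelForSequenceClassification picks the matching TFBert class from the saved config.
model = TFAutoModelForSequenceClassification.from_pretrained("saved_model/query_intent_model/best_model")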

You are using a model of type bert to instantiate a model of type distilbert. This is not supported for all configurations of models and can yield errors.
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFDistilBertForSequenceClassification: ['bert.encoder.layer.5.attention.self.value.weight', 'bert.encoder.layer.3.intermediate.dense.bias', 'bert.encoder.layer.0.output.dense.weight', 'bert.encoder.layer.1.output.dense.weight', 'bert.encoder.layer.5.attention.self.value.bias', 'bert.encoder.layer.2.attention.self.value.bias', 'bert.encoder.layer.2.attention.output.LayerNorm.weight', 'bert.encoder.layer.0.attention.self.value.bias', 'bert.encoder.layer.4.output.dense.bias', 'bert.encoder.layer.2.attention.output.dense.weight', 'bert.encoder.layer.0.attention.self.value.weight', 'bert.pooler.dense.bias', 'bert.encoder.layer.3.intermediate.dense.weight', 'bert.encoder.layer.3.attention.self.key.bias', 'bert.encoder.layer.2.output.dense.weight', 'bert.enco

In [38]:
text1 = "How is Rupee values against Dollar right now?"
text2 = "What is the per month overall cost per subs for FY 2023"

tokenizer = AutoTokenizer.from_pretrained("saved_model/query_intent_model/best_model")
inputs = tokenizer(text2, return_tensors="tf")

logits = model(**inputs).logits

probabilities = tf.nn.softmax(logits)
predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
model.config.id2label[predicted_class_id], probabilities.numpy().max()

In [21]:
# Save the TensorFlow model
model.save("saved_model/tf_query_intent_model", overwrite=True)
# tokenizer.save_pretrained("saved_model/tf_query_intent_model")

INFO:tensorflow:Assets written to: saved_model/tf_query_intent_model/assets


In [64]:
!tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_format=tfjs_graph_model \
    saved_model/tf_query_intent_model \
    saved_model/tfjs_model


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Traceback (most recent call last):
  File "/home/apurva/anaconda3/envs/clm_ja/bin/tensorflowjs_converter", line 8, in <module>
    sys.exit(pip_main())
  File "/home/apurva/anaconda3/envs/clm_ja/lib/python3.10/site-packages/tensorflowjs/converters/converter.py", line 959, in pip_main
    main([' '.join(sys.argv[1:])])
  File "/home/apurva/anaconda3/envs/clm_ja/lib/python3.10/site-packages/tensorflowjs/converters/converter.py", line 963, in main
    convert(argv[0].split(' '))
  File "/home/apurva/anaconda3/envs/clm_ja/lib/python3.10/site-packages/tensorflowjs/converters/converter.py", line 949, in convert
    _dispatch_converter(input_format, output_format, args, quantization_dtype_map,
  File "/home/apurva/anaconda3/envs/clm_ja/lib/python3.10/site-packages/tensorflowjs/converters/converter.py", line 655, in _dispatch_converter
    tf_saved_model_conversion_v2.convert_tf_saved_model(
  File "/home/apurva/anaconda3/envs/clm_ja/lib/python3.10/site-packages/tensorflowjs/converters/tf_save

In [64]:
!tensorflowjs_converter --help

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


usage: TensorFlow.js model converters. [-h]
                                       [--input_format {tf_hub,keras,tf_frozen_model,tfjs_layers_model,keras_keras,keras_saved_model,tf_saved_model}]
                                       [--output_format {keras,tfjs_layers_model,keras_keras,keras_saved_model,tfjs_graph_model}]
                                       [--signature_name SIGNATURE_NAME]
                                       [--saved_model_tags SAVED_MODEL_TAGS]
                                       [--quantize_float16 [QUANTIZE_FLOAT16]]
                                       [--quantize_uint8 [QUANTIZE_UINT8]]
                                       [--quantize_uint16 [QUANTIZE_UINT16]]
                                       [--quantization_bytes {1,2}]
                                       [--split_weights_by_layer] [--version]
                                       [--skip_op_check]
                                       [--strip_debug_ops STRIP_DEBUG_OPS]
                 

## ONNX Runtime

In [29]:
from transformers import pipeline

classifier_pipeline = pipeline("text-classification", model="./saved_model/query_intent_model/best_model")

print(classifier_pipeline(text1), classifier_pipeline(text2))

[{'label': 'irrelevant', 'score': 0.9627820253372192}] [{'label': 'relevant', 'score': 0.9815863966941833}]


In [26]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./saved_model/query_intent_model/best_model")
inputs = tokenizer(text1, return_tensors="pt")

# Pass your inputs to the model and return the `logits`
from transformers import AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("./saved_model/query_intent_model/best_model")
with torch.no_grad():
    logits = model(**inputs).logits

# Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label
import torch.nn.functional as F

probabilities = F.softmax(logits, dim=1)
predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id], probabilities.max().item()

('irrelevant', 0.9627820253372192)

In [33]:
import transformers
import transformers.convert_graph_to_onnx as onnx_convert
from pathlib import Path

model = model.to("cpu")

onnx_convert.convert_pytorch(classifier_pipeline, opset=11, output=Path("saved_model/query_intent_model.onnx"), use_external_format=False)

Using framework PyTorch: 2.3.1+cu121
Found input input_ids with shape: {0: 'batch', 1: 'sequence'}
Found input attention_mask with shape: {0: 'batch', 1: 'sequence'}
Found output output_0 with shape: {0: 'batch'}
Ensuring inputs are in correct order
head_mask is not present in the generated input list.
Generated inputs order: ['input_ids', 'attention_mask']



In [1]:
import tensorflow as tf
from transformers import TFDistilBertForQuestionAnswering

distilbert = TFDistilBertForQuestionAnswering.from_pretrained('distilbert-base-cased-distilled-squad')
callable = tf.function(distilbert.call)

2024-07-15 15:45:32.342334: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-15 15:45:32.967857: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
  from .autonotebook import tqdm as notebook_tqdm
All PyTorch model weights were used when initializing TFDistilBertForQuestionAnswering.

All the weights of TFDistilBertForQuestionAnswering were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForQuestionAnswering for predi

In [2]:
concrete_function = callable.get_concrete_function([tf.TensorSpec([None, 384], tf.int32, name="input_ids"), tf.TensorSpec([None, 384], tf.int32, name="attention_mask")])

In [5]:
tf.saved_model.save(distilbert, 'saved_model/distilbert_cased_savedmodel', signatures=concrete_function)

INFO:tensorflow:Assets written to: saved_model/distilbert_cased_savedmodel/assets


In [6]:
!saved_model_cli show --dir saved_model/distilbert_cased_savedmodel --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['attention_mask'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 384)
      name: serving_default_attention_mask:0
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 384)
      name: serving_default_input_ids:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['end_logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 384)
      name: StatefulPartitionedCall:0
  outputs['start_logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 384)
      name: StatefulPartitionedCall:1
Method name is: tensorflow/serving/predict
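
The signature above fixes the sequence length at 384, so every input must be truncated or padded to exactly that length before it is sent to the SavedModel. A minimal sketch of that step follows; `to_fixed_length` is a hypothetical helper, the token ids are made up, and a padding id of 0 is an assumption.

```python
from typing import List, Tuple

SEQ_LEN = 384  # fixed sequence length from the serving signature

def to_fixed_length(token_ids: List[int], pad_id: int = 0,
                    seq_len: int = SEQ_LEN) -> Tuple[List[int], List[int]]:
    """Truncate or pad `token_ids` to `seq_len` and build the matching attention mask."""
    # Naive truncation: this simply cuts the tail, including any final special token.
    ids = token_ids[:seq_len]
    mask = [1] * len(ids)
    padding = seq_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding

input_ids, attention_mask = to_fixed_length([101, 2054, 2003, 102])  # made-up ids
print(len(input_ids), len(attention_mask), sum(attention_mask))  # 384 384 4
```

In practice the tokenizer can do this directly, for example with `padding="max_length"`, `max_length=384`, and `truncation=True`, which also keeps the special tokens intact when truncating.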
