## Hands-On Example: Classifying Customer Feedback by Satisfaction Level
This hands-on example walks through text classification with DistilBERT, a distilled (smaller and faster) version of BERT, using the Hugging Face Transformers library.

### Import Required Libraries

In [98]:
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
import torch

- DistilBertTokenizer: Converts raw text into the token IDs that the DistilBERT model expects.
- DistilBertForSequenceClassification: The DistilBERT model with a classification head on top, designed for text classification.
- torch: The PyTorch library, which provides the tensor operations used in the steps below.

### Initialize the Tokenizer and Model

In [99]:
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
# num_labels=3 matches the three-entry label mapping used later (the default is 2)
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=3)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.weight', 'classifier.bias', 'pre_classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


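- from_pretrained('distilbert-base-uncased'): This loads the pre-trained DistilBERT weights and the matching tokenizer. 'uncased' means the tokenizer lowercases the text before tokenizing it.
- The warning above is expected: the checkpoint contains only the pre-trained encoder, so the classification head (pre_classifier and classifier) is randomly initialized. Until the model is fine-tuned on labeled feedback, its predictions are essentially arbitrary.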

### Tokenize the Text

In [100]:
text = "I am not satisfied with the product. It is not working as expected."
inputs = tokenizer(text, return_tensors="pt")

- text: The sample text you want to classify.
- tokenizer(text, return_tensors="pt"): This splits the text into sub-word tokens and returns their IDs (input_ids) together with an attention_mask. The return_tensors="pt" argument requests the output as PyTorch tensors.
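To make the tokenization step more concrete, here is a toy sketch of greedy, longest-match-first sub-word splitting in the style of WordPiece. It is not the library's implementation: the ten-entry vocabulary and the `wordpiece` and `encode` helpers are hypothetical and exist only for illustration, whereas the real `distilbert-base-uncased` vocabulary has 30,522 entries.

```python
# Toy vocabulary (hypothetical); the position in the list is the token ID.
VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "not", "work", "##ing", "satisf", "##ied", "."]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def wordpiece(word):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            # Pieces that continue a word carry the "##" prefix.
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in TOKEN_TO_ID:
                break
            end -= 1
        if end == start:  # nothing matched: the whole word becomes [UNK]
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def encode(text):
    """Lowercase, split off periods, add [CLS]/[SEP], and map tokens to IDs."""
    words = text.lower().replace(".", " . ").split()
    tokens = ["[CLS]"] + [p for w in words for p in wordpiece(w)] + ["[SEP]"]
    input_ids = [TOKEN_TO_ID[t] for t in tokens]
    attention_mask = [1] * len(input_ids)  # 1 = real token, 0 would mark padding
    return tokens, input_ids, attention_mask


tokens, input_ids, attention_mask = encode("Not working. Not satisfied.")
print(tokens)
print(input_ids)
print(attention_mask)
```

The real tokenizer follows the same overall shape (lowercasing, sub-word splitting, special tokens at both ends, one mask entry per token) but handles punctuation, accents, and unknown characters far more carefully.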

### Run the Model

In [101]:
with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**inputs)
logits = outputs.logits

- model(**inputs): This runs the tokenized text through the DistilBERT model.

- outputs.logits: The model returns logits, which are raw, unnormalized scores for each class in the classification task.
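The shape of the logits is worth checking: one row per input text and one column per class. The sketch below builds a deliberately tiny, randomly initialized DistilBERT from a config, so it runs without downloading anything. The configuration values, the `tiny_model` name, and the random token IDs are assumptions chosen for speed, not settings taken from this example.

```python
import torch
from transformers import DistilBertConfig, DistilBertForSequenceClassification

# Hypothetical, very small configuration: builds instantly with random weights.
config = DistilBertConfig(
    vocab_size=100, dim=32, hidden_dim=64, n_layers=1, n_heads=2, num_labels=3
)
tiny_model = DistilBertForSequenceClassification(config)
tiny_model.eval()  # disable dropout for inference

# Two fake "sentences" of five token IDs each: batch_size=2, seq_len=5.
input_ids = torch.randint(0, config.vocab_size, (2, 5))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = tiny_model(input_ids=input_ids, attention_mask=attention_mask)

print(outputs.logits.shape)  # torch.Size([2, 3]): (batch_size, num_labels)
```

The number of columns comes from `num_labels` in the model's configuration, which is why it has to agree with the number of entries in any label mapping applied afterwards.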

### Make Predictions

In [102]:
import torch.nn.functional as F

probs = F.softmax(logits, dim=1)
prediction = torch.argmax(probs, dim=1)

- F.softmax(logits, dim=1): This applies the Softmax function to convert the logits into probabilities.

- torch.argmax(probs, dim=1): This finds the index of the maximum value in the probabilities tensor, effectively giving you the predicted class.
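To see what these two calls compute, the following sketch spells out softmax and argmax in plain Python. The logits are made-up values for three classes, and the `softmax` helper is written here for illustration only; in practice `F.softmax` and `torch.argmax` do the same work on tensors.

```python
import math


def softmax(scores):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.2, 1.5, -0.3]  # hypothetical logits for three classes
probs = softmax(logits)

# argmax: index of the largest probability
predicted = max(range(len(probs)), key=probs.__getitem__)

print([round(p, 3) for p in probs])
print(predicted)  # 1
```

Softmax preserves the ordering of the scores, so the argmax of the logits is the same as the argmax of the probabilities. The softmax step matters only when you want the probabilities themselves, for example to report a confidence value.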

### Print the Prediction

In [103]:
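# Sample label mapping with descriptive sentences.
# The index order is an assumption: it must match the label IDs used when the model is fine-tuned.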
label_mapping = {
    0: "The review indicates customer satisfaction and positive feedback.",
    1: "The review indicates customer dissatisfaction and negative feedback.",
    2: "The review is neutral and does not indicate either satisfaction or dissatisfaction."
}

# Convert the single-element prediction tensor to a Python int
predicted_label_index = prediction.item()

# Map the prediction to the label
predicted_label = label_mapping[predicted_label_index]

print(f"Predicted Label Description: {predicted_label}")


Predicted Label Description: The review indicates customer dissatisfaction and negative feedback.


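This prints the description mapped to the predicted class, that is, the class with the highest probability. Because the classification head has not been fine-tuned, the label shown above comes from randomly initialized weights rather than a learned judgment, and it may differ from run to run.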
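Instead of keeping a separate dictionary, the label names can also be stored in the model configuration through its `id2label` and `label2id` fields. The sketch below shows this on a bare `DistilBertConfig`, so nothing is downloaded. The short label names are hypothetical, and their order is an assumption that would have to match the fine-tuning data.

```python
from transformers import DistilBertConfig

# Hypothetical label names; the order must match the IDs used during fine-tuning.
id2label = {0: "satisfied", 1: "dissatisfied", 2: "neutral"}
label2id = {name: idx for idx, name in id2label.items()}

config = DistilBertConfig(id2label=id2label, label2id=label2id)

print(config.num_labels)   # 3, derived from the size of id2label
print(config.id2label[1])  # dissatisfied
```

The same keyword arguments can typically be passed to `from_pretrained`, after which a predicted index can be looked up with `model.config.id2label[index]`, keeping the mapping and the model together when the model is saved.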

That's the entire inference workflow for text classification with DistilBERT. This example is basic and serves educational purposes. In a real-world application, you would also need data preprocessing, fine-tuning on labeled feedback, and evaluation.
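As a rough idea of what the training step looks like, here is a minimal sketch of a single gradient update. It again uses a tiny, randomly initialized model and random token IDs so that it runs offline. The configuration, batch, labels, and learning rate are all placeholder assumptions; a real run would use tokenized feedback, the pre-trained checkpoint, many batches, and a held-out evaluation set.

```python
import torch
from transformers import DistilBertConfig, DistilBertForSequenceClassification

torch.manual_seed(0)

# Hypothetical tiny model; in practice you would start from the pre-trained checkpoint.
config = DistilBertConfig(
    vocab_size=100, dim=32, hidden_dim=64, n_layers=1, n_heads=2, num_labels=3
)
tiny_model = DistilBertForSequenceClassification(config)
tiny_model.train()  # enable dropout for training

# Placeholder batch: 4 "reviews" of 6 token IDs each, with one class ID per review.
input_ids = torch.randint(0, config.vocab_size, (4, 6))
attention_mask = torch.ones_like(input_ids)
labels = torch.tensor([0, 1, 2, 1])

optimizer = torch.optim.AdamW(tiny_model.parameters(), lr=1e-3)

# Passing labels makes the model compute a cross-entropy loss alongside the logits.
outputs = tiny_model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
loss = outputs.loss

loss.backward()        # compute gradients
optimizer.step()       # update the weights
optimizer.zero_grad()  # reset gradients for the next batch

print(f"loss: {loss.item():.4f}")
```

Repeating this step over many labeled batches is what turns the randomly initialized classification head into one whose predictions are meaningful.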