This notebook demonstrates how to evaluate a pre-trained text classification model on a domain-specific categorization task. Using Hugging Face's transformers library, we load a pre-trained model and measure its accuracy on a small custom dataset of enterprise-style texts. No fine-tuning is performed here; the evaluation serves as a baseline that shows where further adaptation would be needed. Each section of the code is explained below.


### Install the Required Packages




In [1]:
! pip install transformers
! pip install pandas
! pip install torch



### 1. Import Required Libraries
- transformers: Provides pre-trained models, tokenizers, and pipelines for various NLP tasks.
- pandas: Handles data in a structured format, making it easy to work with tabular data like our custom dataset.

torch is installed above because transformers uses it as the backend to run the model; it is not imported directly.

In [2]:
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
import pandas as pd

### 2. Load a Pre-trained Model for Sequence Classification
We load a pre-trained DistilBERT model (distilbert-base-uncased-finetuned-sst-2-english) that is fine-tuned for sentiment analysis. It predicts only two labels, POSITIVE and NEGATIVE. Although it was built for sentiment classification, it can serve as a starting point for adaptation to other classification tasks. Here:

-  model: Loads the sequence classification model.
-  tokenizer: Loads the tokenizer, which encodes input text into a format compatible with the model.

In [3]:
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)




### 3. Define a Custom Dataset
Here, we create a custom dataset using pandas to represent enterprise-specific categories. The dataset includes:

-  text: A sample text input for each task.
-  expected_output: The manually assigned label that represents the correct category for each text.

This dataset simulates real-world text classification tasks, such as categorizing customer complaints or technical issues. Because the model can only output POSITIVE or NEGATIVE, the one row labeled NEUTRAL can never be scored as correct.

In [4]:
data = pd.DataFrame({
    "text": [
        "Customer complaint about billing",       # Likely to be classified as NEGATIVE
        "Legal clause on data privacy",           # No clear sentiment; the model must still pick POSITIVE or NEGATIVE
        "Technical issue with software",          # Likely to be classified as NEGATIVE
        "User feedback praising the interface",   # Likely to be classified as POSITIVE
        "Inquiry about account balance",          # No clear sentiment; the model must still pick POSITIVE or NEGATIVE
        "Successful resolution of ticket"         # Likely to be classified as POSITIVE
    ],
    "expected_output": [
        "NEGATIVE",  # Complaints carry negative sentiment
        "NEGATIVE",  # Manually mapped to the model's NEGATIVE label
        "NEGATIVE",  # Technical issues often relate to negative experiences
        "POSITIVE",  # Praising feedback aligns with POSITIVE sentiment
        "NEUTRAL",   # Not a label this model can output, so this row always counts as incorrect
        "POSITIVE"   # Success story should align with POSITIVE sentiment
    ]
})
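
Before running an evaluation, it can help to check whether every expected label is one the model is able to predict. The snippet below is a minimal sketch of that check. The helper name `unreachable_labels` is hypothetical, and the label set is hard-coded as an assumption so the example runs without downloading the model; in the notebook it could be read from `model.config.id2label.values()` instead.

```python
# Labels the SST-2 DistilBERT checkpoint can emit. Hard-coded here as an
# assumption so the sketch runs offline.
MODEL_LABELS = {"NEGATIVE", "POSITIVE"}

# The expected_output column from the dataset, copied as a plain list.
expected_output = ["NEGATIVE", "NEGATIVE", "NEGATIVE", "POSITIVE", "NEUTRAL", "POSITIVE"]


def unreachable_labels(expected, model_labels):
    """Return the expected labels (in first-seen order) the model cannot predict."""
    missing = []
    for label in expected:
        if label not in model_labels and label not in missing:
            missing.append(label)
    return missing


missing = unreachable_labels(expected_output, MODEL_LABELS)
print(missing)  # prints ['NEUTRAL']
```

Any label reported here will always be counted as a miss by an exact-match comparison, which puts a ceiling on the achievable accuracy.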


### 4. Define the Custom Evaluation Function
This function, custom_evaluation, evaluates the model against the custom dataset:

-  nlp_pipeline: A Hugging Face pipeline set up for text classification using the specified model and tokenizer, simplifying the inference process.
-  correct: A counter initialized to track the number of correct predictions by comparing model outputs to expected outputs.



### 5. Loop Through Each Row and Predict Labels
We loop through each entry in the dataset, where:

-  nlp_pipeline(row["text"]): Generates a prediction for the text. The pipeline returns a list with one dictionary per input, each holding the keys label and score, so we take the first element and read its label value.
-  print(): Displays the text, predicted label, and expected label for easy comparison.
-  correct += 1: Increments the correct counter if the model's prediction matches the expected output.
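
The indexing step in the loop described above can be illustrated without loading the model. The sketch below uses `fake_pipeline`, a hypothetical stand-in that only mimics the return shape of a text-classification pipeline called on a single string; its keyword rule and its score value are placeholders, not model behavior.

```python
def fake_pipeline(text):
    """Hypothetical stand-in for a text-classification pipeline.

    Returns a list holding one dict with "label" and "score" keys, which is
    the shape the real pipeline returns for a single input string.
    """
    label = "NEGATIVE" if "complaint" in text.lower() else "POSITIVE"
    return [{"label": label, "score": 0.5}]  # placeholder score


# (text, expected label) pairs taken from the custom dataset
rows = [
    ("Customer complaint about billing", "NEGATIVE"),
    ("User feedback praising the interface", "POSITIVE"),
    ("Inquiry about account balance", "NEUTRAL"),
]

correct = 0
for text, expected in rows:
    result = fake_pipeline(text)      # list with one dict
    predicted = result[0]["label"]    # first element, then its label
    if predicted == expected:
        correct += 1

print(f"{correct} of {len(rows)} correct")  # prints 2 of 3 correct
```

Forgetting the `[0]` would compare a list against a string, so no row would ever match.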

### 6. Calculate and Print Accuracy
Finally, we calculate and display the model's accuracy on the custom dataset:

-  accuracy: Calculated as the ratio of correct predictions to the total number of examples.
-  print(): Outputs the accuracy as a percentage, giving insight into the model’s performance on this enterprise-specific task.

In [7]:
# Define a function to evaluate the model against the custom dataset.
def custom_evaluation(data, model, tokenizer):
    # Create a text classification pipeline for inference
    nlp_pipeline = pipeline("text-classification", model=model, tokenizer=tokenizer)
    correct = 0

    # Loop through each row in the dataset and get predictions
    for _, row in data.iterrows():
        # The pipeline returns a list with one dict per input; take the first dict's label
        output = nlp_pipeline(row["text"])[0]["label"]
        print(f"Text: {row['text']} | Predicted: {output} | Expected: {row['expected_output']}")

        # Count correct predictions by matching model output with expected output
        if output == row["expected_output"]:
            correct += 1

    # Calculate accuracy as a fraction, then print it as a percentage
    accuracy = correct / len(data)
    print(f"Custom Evaluation Accuracy: {accuracy * 100:.2f}%")

### 7. Run the Custom Evaluation
This line calls custom_evaluation, which runs the model on each text sample and prints the prediction, the expected label, and the overall accuracy.



In [16]:
custom_evaluation(data, model, tokenizer)

Text: Customer complaint about billing | Predicted: NEGATIVE | Expected: NEGATIVE
Text: Legal clause on data privacy | Predicted: POSITIVE | Expected: NEGATIVE
Text: Technical issue with software | Predicted: NEGATIVE | Expected: NEGATIVE
Text: User feedback praising the interface | Predicted: POSITIVE | Expected: POSITIVE
Text: Inquiry about account balance | Predicted: POSITIVE | Expected: NEUTRAL
Text: Successful resolution of ticket | Predicted: POSITIVE | Expected: POSITIVE
Custom Evaluation Accuracy: 66.67%
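
The reported figure can be reproduced by hand from the printed rows: 4 of the 6 predictions match, and 4 / 6 ≈ 66.67%. The sketch below redoes that arithmetic and, as an additional illustration that is not part of the original evaluation, also computes accuracy over only the rows whose expected label the model is able to output. The helper name `accuracy` and the hard-coded label set are assumptions made for this example.

```python
# (predicted, expected) pairs copied from the printed output
results = [
    ("NEGATIVE", "NEGATIVE"),
    ("POSITIVE", "NEGATIVE"),
    ("NEGATIVE", "NEGATIVE"),
    ("POSITIVE", "POSITIVE"),
    ("POSITIVE", "NEUTRAL"),
    ("POSITIVE", "POSITIVE"),
]


def accuracy(pairs):
    """Fraction of pairs where the prediction equals the expected label."""
    if not pairs:
        return 0.0
    return sum(pred == exp for pred, exp in pairs) / len(pairs)


overall = accuracy(results)  # 4 matches out of 6 rows

# Assumed label set of the sentiment model; NEUTRAL is not among them.
model_labels = {"NEGATIVE", "POSITIVE"}
scorable = [(pred, exp) for pred, exp in results if exp in model_labels]
restricted = accuracy(scorable)  # 4 matches out of 5 rows

print(f"Overall accuracy: {overall * 100:.2f}%")            # prints Overall accuracy: 66.67%
print(f"Accuracy on scorable rows: {restricted * 100:.2f}%")  # prints Accuracy on scorable rows: 80.00%
```

With only six examples, each row shifts the overall figure by about 17 percentage points, so these numbers are best read as a rough baseline rather than a stable estimate.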
