# Intent Classification

In this notebook, we fine-tune a pre-trained BERT model (`bert-base-uncased`) on a dataset of questions labeled with their intents, then use the fine-tuned model to classify new questions into one of those intents.

The goal is an NLP-based framework that parses an input question and assigns its intent to one of the question types listed here.

Question Types:
1. Why is action A not used in the plan, rather than being used?
2. Why is action A used in the plan, rather than not being used?
3. Why is action A used in state S, rather than action B?
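
The three question types can be kept in a small lookup table so that numeric predictions stay readable. This is a minimal sketch: the names `INTENT_DESCRIPTIONS` and `describe_intent` are hypothetical, and it assumes the label ids in the dataset follow the numbering of the list above.

```python
# Hypothetical lookup table; assumes dataset label ids match the list numbering above.
INTENT_DESCRIPTIONS = {
    1: "Why is action A not used in the plan, rather than being used?",
    2: "Why is action A used in the plan, rather than not being used?",
    3: "Why is action A used in state S, rather than action B?",
}


def describe_intent(label: int) -> str:
    """Return the question type for a label id, or raise on an unknown id."""
    try:
        return INTENT_DESCRIPTIONS[label]
    except KeyError:
        raise ValueError(f"Unknown intent label: {label}") from None


print(describe_intent(3))
```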

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split
from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments
from datasets import Dataset
import torch
from sklearn.metrics import classification_report, accuracy_score

In [16]:
# Load the CSV file into a DataFrame
df = pd.read_csv('./data/intent_classification_dataset.csv')

# Report the dataset size and display the first few rows of the DataFrame
print(f"Number of rows in the dataset: {df.shape[0]}")
df.head()

Number of rows in the dataset: 107


Unnamed: 0,text,label
0,Why is action A not included in the project ro...,1
1,What are the reasons for excluding action A fr...,1
2,Why was action A omitted from the strategy?,1
3,Why didn't we consider action A for the projec...,1
4,Why was action A left out of the final plan?,1


In [4]:
# Train/test split
train_df, test_df = train_test_split(df, test_size=0.2, random_state=13, stratify=df['label'])

# Convert DataFrame to Hugging Face Dataset
train_dataset = Dataset.from_pandas(train_df)
test_dataset = Dataset.from_pandas(test_df)
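
Because the split uses `stratify=df['label']`, the class proportions in the train and test sets should be roughly equal. One way to check this is to compare label fractions, sketched here with a hypothetical helper `label_proportions` and toy labels rather than the real data.

```python
from collections import Counter


def label_proportions(labels):
    """Return {label: fraction of examples} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}


# Toy labels for illustration only; in the notebook one would pass
# train_df['label'] and test_df['label'] instead.
toy_train = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
toy_test = [1, 2, 3]
print(label_proportions(toy_train))
print(label_proportions(toy_test))
```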

In [5]:
# Tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenization function
def tokenize_function(examples):
    return tokenizer(examples['text'], padding="max_length", truncation=True, max_length=128)

# Tokenize datasets
tokenized_train = train_dataset.map(tokenize_function, batched=True)
tokenized_test = test_dataset.map(tokenize_function, batched=True)

Map:   0%|          | 0/85 [00:00<?, ? examples/s]

Map:   0%|          | 0/22 [00:00<?, ? examples/s]
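
With `padding="max_length"` and `truncation=True`, every example is brought to exactly `max_length` token ids, and an attention mask marks which positions hold real tokens. The toy function below illustrates only that idea. It is not the tokenizer's implementation: it ignores special tokens (the real tokenizer keeps them when truncating), the ids are arbitrary placeholders, and `pad_or_truncate` is a hypothetical name. The default `pad_id=0` is chosen to match the `[PAD]` id in `bert-base-uncased`.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Toy illustration: force a list of token ids to exactly max_length.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding positions.
    """
    kept = token_ids[:max_length]      # truncation: drop ids past max_length
    n_pad = max_length - len(kept)     # padding: positions still to fill
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad
    return input_ids, attention_mask


ids, mask = pad_or_truncate([5, 6, 7], max_length=6)
print(ids)   # [5, 6, 7, 0, 0, 0]
print(mask)  # [1, 1, 1, 0, 0, 0]
```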

In [6]:
def get_best_available_device():
    if torch.cuda.is_available():
        return torch.device("cuda")
    elif torch.backends.mps.is_available():
        return torch.device("mps")
    else:
        return torch.device("cpu")

device = get_best_available_device()
print(f"Using device: {device}")

Using device: mps


In [7]:
# Load the pre-trained BERT model with a new 3-way classification head
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)
model.to(device);

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [8]:
# Training arguments
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    eval_strategy="epoch",
    save_steps=10_000,
    save_total_limit=2,
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_test,
)

# Train the model
trainer.train()

Epoch,Training Loss,Validation Loss
1,No log,0.501453
2,No log,0.477852
3,No log,0.474032


TrainOutput(global_step=33, training_loss=0.478452104510683, metrics={'train_runtime': 26.2188, 'train_samples_per_second': 9.726, 'train_steps_per_second': 1.259, 'total_flos': 16773480380160.0, 'train_loss': 0.478452104510683, 'epoch': 3.0})
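
The `global_step=33` in the output follows from the split size, batch size, and epoch count: 85 training examples in batches of 8 give 11 steps per epoch (the last batch is partial), and 3 epochs give 33 steps. The sketch below reproduces that arithmetic, assuming a single device and no gradient accumulation; `total_training_steps` is a hypothetical helper. The same number also explains the `No log` entries: `Trainer` logs the training loss every `logging_steps` steps, which defaults to 500, so a 33-step run never reaches a logging point.

```python
import math


def total_training_steps(num_examples, batch_size, num_epochs):
    """Optimizer steps for a single device with no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)  # partial last batch counts
    return steps_per_epoch * num_epochs


print(total_training_steps(num_examples=85, batch_size=8, num_epochs=3))  # 33
```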

In [9]:
# Evaluate the model on the held-out test set
results = trainer.evaluate()
print(results)

{'eval_loss': 0.4740319550037384, 'eval_runtime': 0.4699, 'eval_samples_per_second': 46.819, 'eval_steps_per_second': 6.384, 'epoch': 3.0}


In [11]:
# Function to classify new questions
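def classify_question(question):
    model.eval()  # disable dropout so predictions are deterministic
    inputs = tokenizer(question, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)
    with torch.no_grad():  # inference only; no gradients needed
        outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=1)
    return prediction.item()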
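
`classify_question` returns only the index of the largest logit. If a confidence score is also wanted, the logits can be passed through a softmax. The sketch below does this in plain Python on made-up logits; it is an illustration, not part of the notebook's pipeline.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)                          # subtract the max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for illustration; in the notebook these would come from
# outputs.logits[0].tolist().
probs = softmax([0.2, 1.5, -0.3])
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print(best, round(probs[best], 3))
```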

In [12]:
# Predict and compare
test_df['predicted_label'] = test_df['text'].apply(classify_question)
y_true = test_df['label']
y_pred = test_df['predicted_label']

In [13]:
# Print the classification report
print(classification_report(y_true, y_pred, zero_division=0.0))
print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")

              precision    recall  f1-score   support

           1       0.60      0.38      0.46         8
           2       0.35      0.86      0.50         7
           3       0.00      0.00      0.00         7

    accuracy                           0.41        22
   macro avg       0.32      0.41      0.32        22
weighted avg       0.33      0.41      0.33        22

Accuracy: 0.41
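
In the report above, class 3 is never predicted. One possible contributor, offered as an assumption rather than a confirmed diagnosis: the dataset labels appear to be 1, 2, 3, while a classification head created with `num_labels=3` produces logits for class ids 0, 1, 2. In that case `argmax` can never return 3, and the loss expects targets in the range 0 to 2. A minimal sketch of remapping labels to zero-based ids before training and back after prediction (the helper name `build_label_maps` is hypothetical):

```python
def build_label_maps(labels):
    """Map arbitrary label values to contiguous ids 0..K-1 and back."""
    unique = sorted(set(labels))
    label2id = {label: idx for idx, label in enumerate(unique)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


label2id, id2label = build_label_maps([1, 2, 3, 1, 2])
print(label2id)  # {1: 0, 2: 1, 3: 2}
print(id2label)  # {0: 1, 1: 2, 2: 3}
```

In the notebook, this would mean applying `df['label'].map(label2id)` before the train/test split and translating each prediction with `id2label` inside `classify_question`. The recorded results above were produced without this remapping.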
