# **Title:** Auto-Tagging Support Tickets Using LLMs & ML

## **Overview**
This project addresses the problem of **automated ticket categorization** using both:

- **Large Language Models (LLMs)** such as **Gemini** and **BART** for zero-shot and few-shot classification, and
- **Traditional machine learning models** like **Logistic Regression** with **TF-IDF** features.

Using a dataset of real-world support tickets (with fields like `title`, `body`, `category`, etc.), we build an end-to-end tagging pipeline that can output **top categories** for each incoming request.

### **Techniques Used**
- Prompt engineering (zero-shot & few-shot)
- Gemini API for generative classification
- TF-IDF + Logistic Regression pipeline
- Hugging Face BART (facebook/bart-large-mnli)
- Top-3 class probability extraction
- LLM vs ML performance comparison

### **Load and Explore the Dataset**
* This cell loads the support ticket dataset from `CSV`, displays the shape (`rows × columns`), and prints the `first 5 rows` and `column names`. It's useful for `initial inspection`.


In [26]:
import pandas as pd

# Step 1: Load the CSV file
df = pd.read_csv("support-ticket.csv")

# Step 2: Show the shape and first few rows
print("Shape:", df.shape)
print("First 5 rows:")
print(df.head())

# Step 3: Print the column names
print("\nColumn names:")
print(df.columns.tolist())

Shape: (48549, 9)
First 5 rows:
                                   title  \
0                                    NaN   
1                   connection with icon   
2                   work experience user   
3                 requesting for meeting   
4  reset passwords for external accounts   

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               

### **Data Cleaning and Preparation**
* Here, the `title` and body `columns` are merged into a single `text` column. Any rows with missing `text` or `category` values are dropped. The `category` column is cast to integer type. This ensures clean, usable data for downstream models.

In [27]:
# Step 1.1: Combine 'title' and 'body' into one text column
df['text'] = df['title'].fillna('') + ' ' + df['body'].fillna('')

# Step 1.2: Drop rows where 'text' or 'category' is missing
df = df.dropna(subset=['text', 'category'])

# Step 1.3: Ensure 'category' is integer
df['category'] = df['category'].astype(int)

# Step 1.4: Print basic info
print("Cleaned dataset shape:", df.shape)
print("\nExample text:\n", df['text'].iloc[0])
print("\nUnique categories:", df['category'].nunique())
print(df['category'].value_counts().head())


Cleaned dataset shape: (48549, 10)

Example text:
  hi since recruiter lead permission approve requisitions makes please make thanks recruiter

Unique categories: 13
category
4     34061
5      9634
6      2628
7       921
11      612
Name: count, dtype: int64


### **Zero-Shot Classification with Hugging Face BART**
* This cell uses `facebook/bart-large-mnli` to perform zero-shot classification on a single ticket. Candidate labels are derived from unique categories in the dataset.

In [28]:
from transformers import pipeline

# Initialize zero-shot classifier
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Example input
sample_text = df['text'].iloc[0]

# Define candidate labels
candidate_labels = [str(c) for c in sorted(df['category'].unique())]

# Run zero-shot prediction
result = classifier(sample_text, candidate_labels, multi_label=False)

# Show result
print("🔍 Top prediction:", result['labels'][0])
print("🔢 All scores:", dict(zip(result['labels'], result['scores'])))

Device set to use cuda:0


🔍 Top prediction: 1
🔢 All scores: {'1': 0.1324264407157898, '2': 0.09119339287281036, '0': 0.08559601753950119, '4': 0.08088956028223038, '8': 0.07654865831136703, '6': 0.0763733834028244, '3': 0.07570978254079819, '9': 0.07054754346609116, '7': 0.0695902407169342, '12': 0.06742415577173233, '5': 0.06341762095689774, '10': 0.0553167499601841, '11': 0.05496637895703316}


### **Install and Configure Gemini API**
* Installs the `google-generativeai` SDK and sets up Gemini with your API key.

In [29]:
!pip install -q -U google-generativeai

In [30]:
import google.generativeai as genai

# Replace with your Gemini API key
genai.configure(api_key="ENTER_YOUR_GEMINI_API_KEY")

##### **(Optional) Check List Available Gemini Models**
* This (commented-out) block lets you inspect all available Gemini models using `genai.list_models()`.

In [31]:
# for m in genai.list_models():
#     print(m.name)

### **Create Zero-Shot Prompt for Gemini**
* This prompt is formatted to instruct Gemini to classify a single text into one of several ticket categories. It’s a plain zero-shot example with no training examples included.

In [32]:
# ✉️ Zero-shot prompt (no examples)
zero_shot_prompt = """
You are an AI support ticket classifier. Classify the following text into one of the categories:
1, 3, 4, 5, 6, 7, 8, 9, 11, 12

Text: "connection with icon"
Category:
"""

### **Generate Zero-Shot Prediction from Gemini**
* Here, Gemini (`models/gemini-1.5-flash`) is called using the prompt defined earlier. The predicted category is printed.

In [33]:
model = genai.GenerativeModel("models/gemini-1.5-flash")

response = model.generate_content(zero_shot_prompt)

print("🔮 Predicted Category:", response.text.strip())

🔮 Predicted Category: The provided text "connection with icon" is too vague for accurate classification without more context.  It could relate to several categories (e.g., network connectivity issues, software display problems, etc.).  Therefore, I cannot assign it to a specific category from the list provided (1, 3, 4, 5, 6, 7, 8, 9, 11, 12).  More information is needed.


### **Few-Shot Prompt Example for Gemini**
* This cell defines a few-shot prompt that includes three labeled examples before presenting a new ticket for classification. This typically improves accuracy over zero-shot.

In [34]:
few_shot_prompt = """
You are a support ticket classifier. Classify the following text into one of the following categories: 1, 3, 4, 5, 6, 7, 8, 9, 11, 12.
Only return the category number. Do not explain.

Examples:
Text: "reset passwords for external accounts"
Category: 4

Text: "connection with icon"
Category: 6

Text: "requesting for meeting"
Category: 5

Now classify:
Text: "work experience user hi work experience student uploaded document..."
Category:
"""


### **Generate Few-Shot Prediction from Gemini**
* Runs the few-shot prompt through the Gemini model and prints its predicted category.

In [35]:
model = genai.GenerativeModel("models/gemini-1.5-flash")

response = model.generate_content(few_shot_prompt)

print("🔮 Predicted Category:", response.text.strip())

🔮 Predicted Category: 12


### **Ask Gemini for Top-3 Category Predictions**
* This cell asks Gemini to return the **top 3 most likely categories** for a given support ticket. This satisfies the "`multi-class ranking`" requirement in Task 5.

In [36]:
prompt = """
You are an AI support ticket classifier. For each text, return the top 3 most probable category numbers from the following: 1, 3, 4, 5, 6, 7, 8, 9, 11, 12.

Text: "reset passwords for external accounts"

Return format:
Top 3 Categories: [4, 1, 5]
"""

response = model.generate_content(prompt)
print(response.text)


Top 3 Categories: [4, 1, 5]



### **Install Scikit-learn (for ML baseline)**
* Installs `scikit-learn`, which is used to train a traditional ML model (`Logistic Regression`) as a baseline for comparison.

In [37]:
!pip install -U scikit-learn



### **Train a Logistic Regression Pipeline**
* This block trains a pipeline using TF-IDF vectorization and Logistic Regression on the ticket text. The trained model (`model_pipeline`) will be used later for comparison with Gemini. It also prints the model’s test accuracy.

In [38]:
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import TfidfVectorizer

# Combine title and body into single text column
df_clean = df.copy()
df_clean['text'] = df_clean['title'].fillna('') + ' ' + df_clean['body'].fillna('')
df_clean = df_clean.dropna(subset=["category"])  # Ensure target exists

# Features and labels
X = df_clean['text']
y = df_clean['category'].astype(int)

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.2, random_state=42)

# Define and train pipeline
model_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(max_features=5000, stop_words="english")),
    ('clf', LogisticRegression(max_iter=1000, random_state=42))
])
model_pipeline.fit(X_train, y_train)

# Evaluate
print("📊 Model accuracy on test set:", model_pipeline.score(X_test, y_test))

📊 Model accuracy on test set: 0.8634397528321318


### **Compare Gemini vs Logistic Regression (Side-by-Side)**
This cell randomly samples 5 tickets and shows predictions from both:

- Gemini (LLM)
- Logistic Regression (TF-IDF + sklearn)

A comparison table is printed to help evaluate strengths and weaknesses of each approach.

In [39]:
import pandas as pd

# Select sample tickets to compare
num_samples = 5
sample_df = df_clean.sample(num_samples, random_state=42)

# Prepare text input (combine title + body)
sample_texts = (sample_df["title"].fillna("") + " " + sample_df["body"].fillna("")).tolist()

# Gemini Predictions (LLM)
gemini_predictions = []
for text in sample_texts:
    prompt = f"""
    You are an AI support ticket classifier. Classify this text into one of the categories:
    1, 3, 4, 5, 6, 7, 8, 9, 11, 12

    Text: "{text}"
    Category:
    """
    response = model.generate_content(prompt)
    # Extract category number (handle Gemini's verbose answer)
    import re
    match = re.search(r"\bCategory:\s*(\d+)", response.text)
    gemini_predictions.append(int(match.group(1)) if match else "N/A")

# Logistic Regression Predictions (ML)
ml_predictions = model_pipeline.predict(sample_texts)

# Combine results in a DataFrame
comparison_df = pd.DataFrame({
    "Text": sample_texts,
    "LLM_Predicted_Category": gemini_predictions,
    "ML_Predicted_Category": ml_predictions
})

# Print side-by-side predictions
import pandas as pd
pd.set_option('display.max_colwidth', None)
display(comparison_df)

Unnamed: 0,Text,LLM_Predicted_Category,ML_Predicted_Category
0,expense report friday march expense report hello expense report submitted waiting please let how find approve thank,,4
1,can connect with oracle can connect with dear have problem with connecting with entering password open please why can connect urgent thank you regards manager,,4
2,access request for a list or library friday pm library requesting library links library manage setting library please provide regards note site site site provides central storage collaboration information ideas site tool collaboration tool communication meeting tool decision making site helps groups work teams social groups share information work together example site help coordinate calendars schedules discuss ideas review proposals share information keep touch other sites dynamic interactive members site contribute own ideas content well comment contribute other,,4
3,hi please help employees attached thank officer floor blvd district,,4
4,reset domain password tuesday october pm hello please colleague thank tester registered under number whose registered old broad street kingdom provide clients each subsidiaries separate entity has liability,,4


In [42]:
print("Alhumdulillah ✔ Done")

Alhumdulillah ✔ Done


## **Conclusion**

This notebook demonstrates how both **LLMs** and **traditional ML pipelines** can be applied to **support ticket classification**.

---

#### **Summary Comparison**

| Method | Strengths | Weaknesses |
|--------|-----------|------------|
| **Gemini / BART (LLM)** | No training needed, generalizes well, can explain decisions | Requires API key, may be slower, can hallucinate |
| **Logistic Regression + TF-IDF** | Fast, interpretable, high accuracy if trained | Needs labeled data, less flexible than LLMs |

---

#### **When to Use What?**

- Use **LLMs (Gemini, BART)** when:
  - You don’t have labeled training data
  - You need quick prototyping
  - You want to generate **top 3 tags** and **natural explanations**

- Use **traditional ML (Logistic Regression)** when:
  - You have a labeled dataset
  - Need **speed** and **control**
  - Want to run models offline without relying on APIs

---

#### **Status: All Task 5 Requirements**

- Prompt Engineering
- LLM-based Text Classification
- Zero-Shot & Few-Shot Learning
- Multi-Class Prediction & Ranking
- Traditional ML Comparison
