
# Zero-Shot Text Classification with Evaluation Metrics

This notebook uses a transformer-based **zero-shot classification** model (`facebook/bart-large-mnli`) to assign health-related categories to text content. It then evaluates the predictions against a manually labeled column and visualizes the results with a confusion matrix.



## 📦 Import Libraries

We use:
- `pandas` for data handling
- `transformers` for zero-shot classification
- `sklearn.metrics` for evaluation
- `seaborn` and `matplotlib` for visualizations
- `ctypes` to prevent the system from sleeping during long runs (Windows-only)


In [None]:

import ctypes
import pandas as pd
from transformers import pipeline
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt



## 💤 Prevent System from Sleeping (Optional)

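These functions call the Windows `SetThreadExecutionState` API to keep the system awake during long classification runs. On other platforms the call fails, the error is caught and printed, and the notebook continues.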


In [None]:

def prevent_sleep():
    try:
        ctypes.windll.kernel32.SetThreadExecutionState(0x80000001)  # ES_CONTINUOUS | ES_SYSTEM_REQUIRED
        print("Sleep prevention activated.")
    except Exception as e:
        print(f"Failed to activate sleep prevention: {e}")

def restore_sleep():
    try:
        ctypes.windll.kernel32.SetThreadExecutionState(0x80000000)  # ES_CONTINUOUS
        print("Sleep prevention deactivated.")
    except Exception as e:
        print(f"Failed to deactivate sleep prevention: {e}")
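
One way to make sure sleep prevention is always undone is to wrap the two calls in a context manager. The sketch below is an assumption layered on the functions above, not part of the original workflow: the name `keep_awake` is hypothetical, and the flag values are the documented `ES_CONTINUOUS` (`0x80000000`) and `ES_SYSTEM_REQUIRED` (`0x00000001`) constants. On non-Windows platforms it does nothing.

```python
import contextlib
import ctypes
import sys

ES_CONTINUOUS = 0x80000000
ES_SYSTEM_REQUIRED = 0x00000001

@contextlib.contextmanager
def keep_awake():
    """Hypothetical helper: keep Windows awake inside a `with` block."""
    is_windows = sys.platform == "win32"
    if is_windows:
        ctypes.windll.kernel32.SetThreadExecutionState(ES_CONTINUOUS | ES_SYSTEM_REQUIRED)
    try:
        yield is_windows  # True only if sleep prevention was requested
    finally:
        if is_windows:
            # Clear the requirement so normal power management resumes
            ctypes.windll.kernel32.SetThreadExecutionState(ES_CONTINUOUS)
```

A long-running loop could then sit inside `with keep_awake():`, and the normal power settings would be restored even if the loop raises.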



## 📄 Load and Prepare Data

We load the dataset and replace missing values in the text columns with empty strings, so the columns can later be joined into a single string per row.


In [None]:

df = pd.read_csv('NLS_classification_new manual.csv')
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
print(df.shape)

text_columns = ['description', 'title', 'keywords', 'summary', 'full_text']
for col in text_columns:
    df[col] = df[col].fillna('').astype(str)



## 🏷️ Define Candidate Labels

These are the category descriptions the model chooses among for each text.


In [None]:

candidate_labels = [
    'Medicine not part of mainstream medicine, fringe debates, and treatments aimed at aesthetics', 
    'Includes issues connected to pregnancy, childbirth, and abortion', 
    'All topics related to cancer', 
    'Cardiovascular health issues',
    'Health issues related to children, parenting, and teens',
    'Includes COVID-19 and related vaccination debates',
    'Topics related to dementia',
    'Dental health issues',
    'All topics related to diabetes',
    'Includes chronic diseases, specific acute conditions, and disabilities',
    'Health issues related to the LGBTQ+ community, including transgender issues',
    'All topics related to mental health',
    'Includes neurodivergent conditions such as autism, ADHD, Down syndrome, learning difficulties, etc.',
    'Topics on nutrition and physical fitness',
    'Research and general health information resources not fitting into other categories',
    'Health services provided for profit, not fitting into other categories',
    'Includes sexual health and sexually transmitted infections',
    'Social activism and charity work related to health not fitting into other categories',
    'Includes topics on substance use and addiction',
    'All topics related to surgical procedures',
    'Health issues related to women, including menstruation, menopause, PCOS, and endometriosis'
]
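
The evaluation later in this notebook compares `predicted_category` with `manual_label` as exact strings, so the manual labels need to use the same wording as `candidate_labels`. A minimal sketch of a consistency check follows; the function name `find_unknown_labels` and the sample values are hypothetical and are not taken from the dataset.

```python
def find_unknown_labels(manual_labels, candidate_labels):
    """Return manual labels (sorted, de-duplicated) that match no candidate label exactly."""
    known = set(candidate_labels)
    return sorted({label for label in manual_labels if label not in known})

# Illustrative values only: one exact match, one case mismatch, one trailing space
sample_candidates = ['Dental health issues', 'Topics related to dementia']
sample_manual = ['Dental health issues', 'dental health issues', 'Topics related to dementia ']

print(find_unknown_labels(sample_manual, sample_candidates))
```

In this notebook the check could be run as `find_unknown_labels(df['manual_label'].dropna(), candidate_labels)`. Any value it returns would be scored as a misclassification regardless of what the model predicts.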



## 🤖 Load Zero-Shot Classification Pipeline


In [None]:

classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')



## 🧠 Classify Text in Batches

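We define two helpers: `classify_batch` returns the top-scoring label and its score for each text, and `process_and_save_chunk` classifies a chunk batch by batch and writes the result to its own CSV file. Saving per chunk is useful for large datasets because completed work is kept on disk.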


In [None]:

def classify_batch(texts, labels):
    if not texts or not labels:
        raise ValueError("Empty batch or labels")
    results = classifier(texts, labels)
    # Some transformers versions may return a bare dict for a single-item batch
    if isinstance(results, dict):
        results = [results]
    # Labels come back sorted by descending score, so index 0 is the top prediction
    return [(result['labels'][0], result['scores'][0]) for result in results]

def process_and_save_chunk(chunk, chunk_index, batch_size=16):
    chunk = chunk.copy()  # work on a copy so the slice of df is not modified
    predicted_categories = []
    confidence_scores = []

    for i in range(0, len(chunk), batch_size):
        # Join all text columns of each row into a single string
        batch_texts = chunk[text_columns].iloc[i:i+batch_size].apply(lambda row: ' '.join(row.values.astype(str)), axis=1).tolist()

        try:
            batch_results = classify_batch(batch_texts, candidate_labels)
            if len(batch_results) != len(batch_texts):
                raise ValueError("Result count does not match batch size")
            categories, scores = zip(*batch_results)
            predicted_categories.extend(categories)
            confidence_scores.extend(scores)
        except Exception as e:
            # Keep row alignment: one placeholder per text in the failed batch
            print(f"Batch starting at row {i} of chunk {chunk_index} failed: {e}")
            predicted_categories.extend([None] * len(batch_texts))
            confidence_scores.extend([None] * len(batch_texts))

    chunk['predicted_category'] = predicted_categories
    chunk['confidence_score'] = confidence_scores
    chunk.to_csv(f'chunk_{chunk_index}_classified.csv', index=False)
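
For reference, the zero-shot pipeline returns, for each input text, a dictionary with `sequence`, `labels`, and `scores`, where `labels` is ordered by descending score. The sketch below mimics that shape with made-up numbers to show what `classify_batch` extracts. The helper `top_prediction` and the values in `fake_result` are illustrative assumptions, and no model is loaded.

```python
def top_prediction(result):
    """Return (label, score) for the highest-scoring label in one pipeline result."""
    # Pick the maximum score explicitly rather than relying on ordering
    best = max(range(len(result['scores'])), key=lambda i: result['scores'][i])
    return result['labels'][best], result['scores'][best]

# Made-up result in the shape the pipeline produces; scores are not real model output
fake_result = {
    'sequence': 'Tips for brushing and flossing',
    'labels': ['Dental health issues', 'Topics on nutrition and physical fitness', 'Topics related to dementia'],
    'scores': [0.81, 0.12, 0.07],
}

label, score = top_prediction(fake_result)
print(label, score)
```

Because the candidate labels are scored against each other, the top score is a relative confidence among the listed categories rather than a calibrated probability that the text belongs to that category.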



## 🔄 Process Data in Chunks

This keeps each classification pass small and writes intermediate results to disk, so completed chunks are preserved if a long run is interrupted.


In [None]:

prevent_sleep()

chunk_size = 200
chunks = [df.iloc[i:i+chunk_size] for i in range(0, len(df), chunk_size)]
try:
    for idx, chunk in enumerate(chunks):
        process_and_save_chunk(chunk, idx)
finally:
    # Restore normal sleep behavior even if classification raises
    restore_sleep()
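
Because each chunk is written to its own file, an interrupted run could in principle be resumed by skipping chunks whose output already exists. The following is a sketch of that idea, not something this notebook does: `chunk_bounds` and `pending_chunks` are hypothetical helpers, and they assume the `chunk_{i}_classified.csv` naming used above.

```python
import os

def chunk_bounds(n_rows, chunk_size):
    """Return (start, stop) row ranges covering n_rows in steps of chunk_size."""
    return [(start, min(start + chunk_size, n_rows)) for start in range(0, n_rows, chunk_size)]

def pending_chunks(n_rows, chunk_size, out_dir='.'):
    """Return (index, start, stop) for chunks that have no output file yet."""
    pending = []
    for idx, (start, stop) in enumerate(chunk_bounds(n_rows, chunk_size)):
        path = os.path.join(out_dir, f'chunk_{idx}_classified.csv')
        if not os.path.exists(path):
            pending.append((idx, start, stop))
    return pending

print(chunk_bounds(450, 200))  # [(0, 200), (200, 400), (400, 450)]
```

This relies on a chunk file being written only after the whole chunk has been classified, which holds here because `to_csv` is the last step of `process_and_save_chunk`.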



## 🧩 Merge and Save Final Results

We recombine all the classified chunks and save a final CSV file.


In [None]:

classified_dfs = [pd.read_csv(f'chunk_{i}_classified.csv') for i in range(len(chunks))]
final_df = pd.concat(classified_dfs, ignore_index=True)
final_df.to_csv('NLS_classified_new.csv', index=False)
print("Final DataFrame with classifications saved.")
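
If the `chunks` list is no longer in memory (for example after a kernel restart), the chunk files can be discovered on disk instead. The sketch below assumes the `chunk_{i}_classified.csv` naming used above; `sort_chunk_files` is a hypothetical helper. Sorting by the numeric index matters because plain string sorting would place `chunk_10` before `chunk_2`.

```python
import re

CHUNK_PATTERN = re.compile(r'chunk_(\d+)_classified\.csv$')

def sort_chunk_files(paths):
    """Keep only chunk files and order them by their numeric index."""
    indexed = []
    for path in paths:
        match = CHUNK_PATTERN.search(path)
        if match:
            indexed.append((int(match.group(1)), path))
    return [path for _, path in sorted(indexed)]

names = ['chunk_10_classified.csv', 'chunk_2_classified.csv', 'chunk_0_classified.csv', 'notes.txt']
print(sort_chunk_files(names))
```

The ordered list could then be built with `sort_chunk_files(glob.glob('chunk_*_classified.csv'))` and each path passed to `pd.read_csv` before concatenating.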



## 📏 Evaluate Classifier Performance

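We compare model predictions to the manually assigned labels using accuracy and support-weighted precision, recall, and F1 score. Rows with a missing manual label or a failed prediction are assigned the placeholder label `unknown` before scoring.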


In [None]:

final_df['manual_label'] = final_df['manual_label'].fillna('unknown')
final_df['predicted_category'] = final_df['predicted_category'].fillna('unknown')

accuracy = accuracy_score(final_df['manual_label'], final_df['predicted_category'])
precision, recall, f1, _ = precision_recall_fscore_support(
    final_df['manual_label'], final_df['predicted_category'], average='weighted', zero_division=0
)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
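
With `average='weighted'`, scikit-learn computes each metric per label and then averages the per-label values weighted by support, that is, the number of true instances of each label. The pure-Python sketch below reproduces that calculation for precision on a toy example; the labels and the helper name `weighted_precision` are made up for illustration.

```python
from collections import Counter

def weighted_precision(y_true, y_pred):
    """Support-weighted mean of per-label precision (0 when a label is never predicted)."""
    support = Counter(y_true)      # true instances per label
    predicted = Counter(y_pred)    # predictions per label
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for label, n_true in support.items():
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        total += precision * n_true
    return total / len(y_true)

y_true = ['a', 'a', 'a', 'b']
y_pred = ['a', 'a', 'b', 'b']
# Label 'a': precision 2/2 with support 3; label 'b': precision 1/2 with support 1
print(weighted_precision(y_true, y_pred))  # (3 * 1.0 + 1 * 0.5) / 4 = 0.875
```

In a single-label setting like this one, support-weighted recall reduces to the fraction of correctly classified rows, so the printed recall is expected to match the accuracy.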



## 📋 Classification Report and Confusion Matrix

We show a detailed report of performance per label and a heatmap for the confusion matrix.


In [None]:

print("\nClassification Report:")
print(classification_report(final_df['manual_label'], final_df['predicted_category'], zero_division=0))

# Fix the label order explicitly so the heatmap ticks match the matrix rows and columns
matrix_labels = sorted(set(final_df['manual_label']) | set(final_df['predicted_category']))
conf_matrix = confusion_matrix(final_df['manual_label'], final_df['predicted_category'], labels=matrix_labels)
plt.figure(figsize=(12, 8))
sns.heatmap(conf_matrix, annot=False, cmap='Blues', xticklabels=matrix_labels, yticklabels=matrix_labels)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()
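
When no `labels` argument is given, `confusion_matrix` orders rows and columns by the sorted set of labels that appear in either input, so axis tick labels need to follow that same order. The sketch below builds a small confusion matrix by hand to make the ordering explicit; `build_confusion` and the toy labels are illustrative and not from the dataset.

```python
def build_confusion(y_true, y_pred):
    """Return (labels, matrix) with rows = actual and columns = predicted."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return labels, matrix

y_true = ['dental', 'cancer', 'cancer', 'unknown']
y_pred = ['dental', 'cancer', 'dental', 'cancer']
labels, matrix = build_confusion(y_true, y_pred)
print(labels)
for row in matrix:
    print(row)
```

Here the `unknown` placeholder becomes its own row and column, which is why the number of matrix rows can differ from the number of candidate labels.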
