In [1]:
# Install required libraries
!pip install transformers datasets pandas spacy
!python -m spacy download en_core_web_sm
!pip install -U datasets

Collecting en-core-web-sm==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.8.0/en_core_web_sm-3.8.0-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Collecting datasets
  Downloading datasets-4.0.0-py3-none-any.whl.metadata (19 kB)
Collecting fsspec<=2025.3.0,>=2023.1.0 (from fsspec[http]<=2025.3.0,>=2023.1.0->datasets)
  Downloading fsspec-2025.3.0-py3-none-any.whl.metadata (11 kB)
Downloading datasets-4.0.0-py3-none-any.whl (494 kB)

# **Part 1: Setting Up the Environment and Loading a Mental Health Dataset**

In [2]:
import pandas as pd
import spacy
import datasets
import warnings
warnings.filterwarnings("ignore")

In [3]:
from datasets import load_dataset

ds = load_dataset("nbertagnolli/counsel-chat")
df = pd.DataFrame(ds['train'])  # Convert train split to Pandas DataFrame
df.head()

Unnamed: 0,questionID,questionTitle,questionText,questionLink,topic,therapistInfo,therapistURL,answerText,upvotes,views
0,0,Do I have too many issues for counseling?,I have so many issues to address. I have a his...,https://counselchat.com/questions/do-i-have-to...,depression,Jennifer MolinariHypnotherapist & Licensed Cou...,https://counselchat.com/therapists/jennifer-mo...,It is very common for people to have multiple ...,3,1971
1,0,Do I have too many issues for counseling?,I have so many issues to address. I have a his...,https://counselchat.com/questions/do-i-have-to...,depression,"Jason Lynch, MS, LMHC, LCAC, ADSIndividual & C...",https://counselchat.com/therapists/jason-lynch...,"I've never heard of someone having ""too many i...",2,386
2,0,Do I have too many issues for counseling?,I have so many issues to address. I have a his...,https://counselchat.com/questions/do-i-have-to...,depression,Shakeeta TorresFaith Based Mental Health Couns...,https://counselchat.com/therapists/shakeeta-to...,Absolutely not. I strongly recommending worki...,2,3071
3,0,Do I have too many issues for counseling?,I have so many issues to address. I have a his...,https://counselchat.com/questions/do-i-have-to...,depression,"Noorayne ChevalierMA, RP, CCC, CCAC, LLP (Mich...",https://counselchat.com/therapists/noorayne-ch...,Let me start by saying there are never too man...,2,2643
4,0,Do I have too many issues for counseling?,I have so many issues to address. I have a his...,https://counselchat.com/questions/do-i-have-to...,depression,"Toni Teixeira, LCSWYour road to healing begins...",https://counselchat.com/therapists/toni-teixei...,I just want to acknowledge you for the courage...,1,256


In [4]:
# Basic exploration
print("Dataset Shape:", df.shape)
print("\nFirst 5 Rows:")
print(df.head())
print("\nColumns:", df.columns.tolist())

Dataset Shape: (2775, 10)

First 5 Rows:
   questionID                              questionTitle  \
0           0  Do I have too many issues for counseling?   
1           0  Do I have too many issues for counseling?   
2           0  Do I have too many issues for counseling?   
3           0  Do I have too many issues for counseling?   
4           0  Do I have too many issues for counseling?   

                                        questionText  \
0  I have so many issues to address. I have a his...   
1  I have so many issues to address. I have a his...   
2  I have so many issues to address. I have a his...   
3  I have so many issues to address. I have a his...   
4  I have so many issues to address. I have a his...   

                                        questionLink       topic  \
0  https://counselchat.com/questions/do-i-have-to...  depression   
1  https://counselchat.com/questions/do-i-have-to...  depression   
2  https://counselchat.com/questions/do-i-have-to...  dep

In [5]:
# Basic preprocessing
nlp = spacy.load("en_core_web_sm")

def preprocess_text(text):
    if pd.isna(text) or not isinstance(text, str):
        return ""
    # Convert to lowercase
    text = text.lower()
    # Tokenize and remove stopwords
    doc = nlp(text)
    tokens = [token.text for token in doc if not token.is_stop and token.is_alpha]
    return " ".join(tokens)

# Apply preprocessing to questionText and answerText columns
df['question_clean'] = df['questionText'].apply(preprocess_text)
df['answer_clean'] = df['answerText'].apply(preprocess_text)

# Remove rows with missing or empty text
df = df.dropna(subset=['questionText', 'answerText'])
df = df[df['question_clean'] != ""]
df = df[df['answer_clean'] != ""]

In [6]:
# Save preprocessed data
df.to_csv('/content/counselchat_preprocessed.csv', index=False)
print("\nPreprocessed Data Saved. Shape:", df.shape)
print("\nSample Preprocessed Data:")
print(df[['question_clean', 'answer_clean']].head())


Preprocessed Data Saved. Shape: (2609, 12)

Sample Preprocessed Data:
                                      question_clean  \
0  issues address history sexual abuse breast can...   
1  issues address history sexual abuse breast can...   
2  issues address history sexual abuse breast can...   
3  issues address history sexual abuse breast can...   
4  issues address history sexual abuse breast can...   

                                        answer_clean  
0  common people multiple issues want need addres...  
1  heard having issues therapy effective competen...  
2  absolutely strongly recommending working issue...  
3  let start saying concerns bring counselling fa...  
4  want acknowledge courage step support overwhel...  
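
The sample above shows the same cleaned question in every row because the dataset stores one row per answer, so a question with five answers appears five times. A minimal sketch of keeping one row per `questionID` before questions are sampled in Parts 2 and 3, using plain Python and made-up rows (the `rows` list and the `dedupe_by_question` helper are illustrative and not part of the notebook):

```python
def dedupe_by_question(rows):
    """Keep the first row seen for each questionID, preserving order."""
    first_seen = {}
    for row in rows:
        # setdefault only stores the row if this questionID is new
        first_seen.setdefault(row["questionID"], row)
    return list(first_seen.values())

# Hypothetical rows shaped like the DataFrame above (one row per answer)
rows = [
    {"questionID": 0, "question_clean": "issues address history", "answer_clean": "common people multiple issues"},
    {"questionID": 0, "question_clean": "issues address history", "answer_clean": "heard having issues therapy"},
    {"questionID": 1, "question_clean": "anxiety work", "answer_clean": "try breathing"},
]

unique_rows = dedupe_by_question(rows)
print(len(rows), "rows ->", len(unique_rows), "unique questions")
```

With pandas, the equivalent would likely be `df.drop_duplicates(subset="questionID")`.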


# **Part 2: Implementing Sentiment Analysis for Emotion Detection**

---



In [7]:
import pandas as pd
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score
import transformers
import warnings
warnings.filterwarnings("ignore")

In [8]:
# Set device for GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device:", device)

# Load preprocessed data from Part 1
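try:
    df = pd.read_csv('/content/counselchat_preprocessed.csv')
except FileNotFoundError as err:
    # Raise instead of calling exit(), which can shut down the notebook kernel
    raise FileNotFoundError("'counselchat_preprocessed.csv' not found. Run Part 1 first.") from err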

Device: cuda


In [9]:
# Step 1: Pseudo-label with twitter-roberta for better sentiment labels
classifier = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment", device=0 if torch.cuda.is_available() else -1)

def get_sentiment(text):
    if not text or not isinstance(text, str):
        return "NEUTRAL"
    # Truncate long inputs so they fit within the model's 512-token limit
    result = classifier(text, truncation=True, max_length=512)[0]
    label = result['label']  # LABEL_0 (NEGATIVE), LABEL_1 (NEUTRAL), LABEL_2 (POSITIVE)
    return {"LABEL_0": "NEGATIVE", "LABEL_1": "NEUTRAL", "LABEL_2": "POSITIVE"}[label]

# Sample 600 questions so that each class has enough examples to balance
df_subset = df[['question_clean']].dropna().sample(600, random_state=42)
df_subset['sentiment'] = df_subset['question_clean'].apply(get_sentiment)


Device set to use cuda:0
You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


In [10]:
# Step 2: Balance the dataset
print("Initial Label Distribution:")
print(df_subset['sentiment'].value_counts())

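# Draw 200 examples per class: NEGATIVE is downsampled, NEUTRAL is drawn with replacement
# (so it may contain duplicates), and POSITIVE is oversampled with replacement.
# Note: this happens before the train/test split in Step 3, so copies of one text can land in both splits.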
neutral_samples = df_subset[df_subset['sentiment'] == "NEUTRAL"].sample(200, replace=True, random_state=42)
positive_samples = df_subset[df_subset['sentiment'] == "POSITIVE"].sample(200, replace=True, random_state=42)
negative_samples = df_subset[df_subset['sentiment'] == "NEGATIVE"].sample(200, random_state=42)
df_balanced = pd.concat([negative_samples, neutral_samples, positive_samples])

print("Balanced Label Distribution:")
print(df_balanced['sentiment'].value_counts())

# Map sentiments to numeric labels
sentiment_map = {"POSITIVE": 2, "NEUTRAL": 1, "NEGATIVE": 0}
df_balanced['label'] = df_balanced['sentiment'].map(sentiment_map)

Initial Label Distribution:
sentiment
NEGATIVE    311
NEUTRAL     255
POSITIVE     34
Name: count, dtype: int64
Balanced Label Distribution:
sentiment
NEGATIVE    200
NEUTRAL     200
POSITIVE    200
Name: count, dtype: int64
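
Because the 34 POSITIVE examples are drawn 200 times with replacement and `train_test_split` is applied afterwards in Step 3, copies of the same text can end up in both the training and the evaluation set, which may make the scores reported later look better than they are. One common alternative is to split first and oversample only the training portion. The sketch below illustrates that ordering with the standard library only; `split_then_oversample` and the toy `examples` list are hypothetical and are not part of the notebook.

```python
import random

def split_then_oversample(examples, test_fraction=0.2, per_class=200, seed=42):
    """Split unique examples first, then oversample only the training part."""
    rng = random.Random(seed)
    unique = list(dict.fromkeys(examples))  # drop exact duplicates, keep order
    rng.shuffle(unique)
    n_test = max(1, int(len(unique) * test_fraction))
    test, train = unique[:n_test], unique[n_test:]

    # Group the training examples by label
    by_label = {}
    for text, label in train:
        by_label.setdefault(label, []).append((text, label))

    # Draw per_class examples per label, with replacement (oversampling)
    balanced_train = []
    for label, items in by_label.items():
        balanced_train.extend(rng.choices(items, k=per_class))
    return balanced_train, test

# Hypothetical pseudo-labelled data: (text, label) tuples
examples = [(f"negative text {i}", "NEGATIVE") for i in range(30)]
examples += [(f"neutral text {i}", "NEUTRAL") for i in range(25)]
examples += [(f"positive text {i}", "POSITIVE") for i in range(5)]

train, test = split_then_oversample(examples, per_class=20)
overlap = {t for t, _ in train} & {t for t, _ in test}
print("train size:", len(train), "test size:", len(test), "overlap:", len(overlap))
```

The evaluation set stays unbalanced under this scheme, which is usually what one wants when estimating real-world performance.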


In [11]:
# Step 3: Prepare data for fine-tuning
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
def tokenize_function(text):
    return tokenizer(text, padding="max_length", truncation=True, max_length=128)

df_balanced['input_ids'] = df_balanced['question_clean'].apply(lambda x: tokenize_function(x)['input_ids'])
df_balanced['attention_mask'] = df_balanced['question_clean'].apply(lambda x: tokenize_function(x)['attention_mask'])

# Convert to torch dataset
class SentimentDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {
            'input_ids': torch.tensor(self.encodings['input_ids'][idx]),
            'attention_mask': torch.tensor(self.encodings['attention_mask'][idx]),
            'labels': torch.tensor(self.labels[idx])
        }
        return item

    def __len__(self):
        return len(self.labels)

# Split into train and test sets
train_df, test_df = train_test_split(df_balanced, test_size=0.2, random_state=42)
train_dataset = SentimentDataset(
    {'input_ids': train_df['input_ids'].tolist(), 'attention_mask': train_df['attention_mask'].tolist()},
    train_df['label'].tolist()
)
test_dataset = SentimentDataset(
    {'input_ids': test_df['input_ids'].tolist(), 'attention_mask': test_df['attention_mask'].tolist()},
    test_df['label'].tolist()
)

In [12]:
# Step 4: Fine-tune DistilBERT
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=3).to(device)
training_args = TrainingArguments(
    output_dir="/content/sentiment_model_improved",
    num_train_epochs=10,  # Increased epochs
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir="/content/logs_sentiment",
    logging_steps=10,
    eval_strategy="steps",  # Updated to eval_strategy
    eval_steps=200,
    save_steps=200,
    save_total_limit=2,
    load_best_model_at_end=True
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=lambda pred: {
        "accuracy": accuracy_score(pred.label_ids, pred.predictions.argmax(-1)),
        "f1": f1_score(pred.label_ids, pred.predictions.argmax(-1), average='weighted')
    }
)
trainer.train()
eval_results = trainer.evaluate()
print("Evaluation Results:", eval_results)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Step,Training Loss,Validation Loss,Accuracy,F1
200,0.0658,0.132766,0.958333,0.957398
400,0.0031,0.142896,0.958333,0.957398
600,0.0022,0.108164,0.966667,0.966074


Evaluation Results: {'eval_loss': 0.10816408693790436, 'eval_accuracy': 0.9666666666666667, 'eval_f1': 0.9660740811476105, 'eval_runtime': 0.5239, 'eval_samples_per_second': 229.052, 'eval_steps_per_second': 28.632, 'epoch': 10.0}
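
The run above reports only overall accuracy and a weighted F1. Since the POSITIVE class was built from very few original examples, a per-class breakdown may be more informative. The following is a minimal standard-library sketch of how per-class precision, recall, and F1 combine into the support-weighted F1; `per_class_report` and the toy label lists are illustrative assumptions, not values from this notebook.

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} plus the weighted F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    report = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = (precision, recall, f1, support[label])
    # Weighted F1: per-class F1 averaged by the number of true examples per class
    weighted_f1 = sum(f1 * n for _, _, f1, n in report.values()) / len(y_true)
    return report, weighted_f1

# Hypothetical labels: 0 = NEGATIVE, 1 = NEUTRAL, 2 = POSITIVE
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
report, weighted_f1 = per_class_report(y_true, y_pred)
for label, (p, r, f1, n) in report.items():
    print(f"label {label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f} support={n}")
print(f"weighted F1: {weighted_f1:.3f}")
```

In the notebook itself, `sklearn.metrics.classification_report` would give a similar table directly from `pred.label_ids` and the argmax of `pred.predictions`.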


In [13]:
# Step 5: Save the fine-tuned model
model.save_pretrained("/content/sentiment_model_improved")
tokenizer.save_pretrained("/content/sentiment_model_improved")
print("Fine-tuned model saved to /content/sentiment_model_improved")

Fine-tuned model saved to /content/sentiment_model_improved


In [14]:
# Step 6: Test sentiment prediction
def predict_sentiment(text):
    inputs = tokenizer(
        text,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=128
    )
    # Move all inputs to device
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
    pred = torch.argmax(outputs.logits, dim=1).item()
    return {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}[pred]

test_texts = [
    "I’m really struggling with anxiety",
    "I just got promoted and I’m thrilled",
    "What are some ways to cope with sadness?",
    "Life feels meaningless sometimes",
    "I’m okay, just navigating some challenges"
]

for text in test_texts:
    sentiment = predict_sentiment(text)
    print(f"Input: {text}\nPredicted Sentiment: {sentiment}\n")

Input: I’m really struggling with anxiety
Predicted Sentiment: NEGATIVE

Input: I just got promoted and I’m thrilled
Predicted Sentiment: POSITIVE

Input: What are some ways to cope with sadness?
Predicted Sentiment: NEUTRAL

Input: Life feels meaningless sometimes
Predicted Sentiment: NEGATIVE

Input: I’m okay, just navigating some challenges
Predicted Sentiment: NEUTRAL
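
`predict_sentiment` returns only the argmax label, so a borderline prediction looks the same as a confident one. A small sketch, assuming one wanted to expose a confidence score and fall back to an "UNCERTAIN" label below a threshold (the threshold value, the `label_with_confidence` helper, and the example logits are all hypothetical):

```python
import math

LABELS = {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_with_confidence(logits, threshold=0.6):
    """Return (label, probability); use 'UNCERTAIN' when below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = LABELS[best] if probs[best] >= threshold else "UNCERTAIN"
    return label, probs[best]

# Hypothetical logits for two inputs, in the notebook's label order
print(label_with_confidence([2.1, 0.3, -1.0]))
print(label_with_confidence([0.2, 0.1, 0.0]))
```

Inside the notebook, the probabilities could be obtained with `torch.softmax(outputs.logits, dim=1)` instead of the hand-written `softmax`.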




# **Part 3: Crisis Detection**

In [15]:
# Set device for GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device:", device)

# Load preprocessed data from Part 1
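try:
    df = pd.read_csv('/content/counselchat_preprocessed.csv')
except FileNotFoundError as err:
    # Raise instead of calling exit(), which can shut down the notebook kernel
    raise FileNotFoundError("'counselchat_preprocessed.csv' not found. Run Part 1 first.") from err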


Device: cuda


In [16]:
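# Step 1: Create crisis dataset with keyword-based weak labels
crisis_keywords = ["suicide", "kill myself", "don’t want to live", "hopeless", "end my life", "worthless", "giving up"]
def label_crisis(text):
    return 1 if any(keyword in text.lower() for keyword in crisis_keywords) else 0

# NOTE: labels are computed on question_clean, where Part 1 already removed stop words
# such as "myself", "my", "to" and "up". Multi-word phrases like "kill myself" or
# "end my life" therefore cannot match here; only the single-word keywords can.
# Matching against the raw questionText column would avoid this.
crisis_data = df[['question_clean']].dropna().sample(500, random_state=42)
crisis_data['crisis_label'] = crisis_data['question_clean'].apply(label_crisis)

# Balance dataset (250 non-crisis, 250 crisis); the crisis class is sampled with replacement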
crisis_data = pd.concat([
    crisis_data[crisis_data['crisis_label'] == 0].sample(250, random_state=42),
    crisis_data[crisis_data['crisis_label'] == 1].sample(250, replace=True, random_state=42)
])
print("Crisis Label Distribution:")
print(crisis_data['crisis_label'].value_counts())

Crisis Label Distribution:
crisis_label
0    250
1    250
Name: count, dtype: int64
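
The keyword labels are computed on `question_clean`, which has already had stop words and non-alphabetic tokens removed. The sketch below illustrates how that can prevent multi-word phrases from matching. It uses a small stand-in stop-word set and a whitespace tokenizer rather than spaCy, so `STOP_WORDS` and `clean` are rough approximations for illustration only.

```python
crisis_keywords = ["suicide", "kill myself", "don’t want to live", "hopeless",
                   "end my life", "worthless", "giving up"]

# Assumption: a tiny stand-in for spaCy's stop-word list, for illustration only
STOP_WORDS = {"i", "to", "my", "myself", "up", "and", "am", "on"}

def clean(text):
    """Rough imitation of preprocess_text: lowercase, alphabetic non-stop words only."""
    tokens = [t for t in text.lower().split() if t.isalpha() and t not in STOP_WORDS]
    return " ".join(tokens)

def label_crisis(text):
    return 1 if any(keyword in text.lower() for keyword in crisis_keywords) else 0

samples = [
    "I want to kill myself",
    "I am giving up on everything",
    "I feel hopeless",
]
for raw in samples:
    cleaned = clean(raw)
    print(f"{raw!r}: raw label={label_crisis(raw)}, "
          f"cleaned={cleaned!r}, cleaned label={label_crisis(cleaned)}")
```

Only the single-word keyword survives cleaning in this toy example, which suggests labelling on the raw `questionText` column may be the safer choice.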


In [17]:
# Step 2: Tokenize data
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
def tokenize_function(text):
    return tokenizer(text, padding="max_length", truncation=True, max_length=128)

crisis_data['input_ids'] = crisis_data['question_clean'].apply(lambda x: tokenize_function(x)['input_ids'])
crisis_data['attention_mask'] = crisis_data['question_clean'].apply(lambda x: tokenize_function(x)['attention_mask'])


In [18]:
# Step 3: Create torch dataset
class CrisisDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels
    def __getitem__(self, idx):
        item = {
            'input_ids': torch.tensor(self.encodings['input_ids'][idx]),
            'attention_mask': torch.tensor(self.encodings['attention_mask'][idx]),
            'labels': torch.tensor(self.labels[idx])
        }
        return item
    def __len__(self):
        return len(self.labels)

# Split data
train_df, test_df = train_test_split(crisis_data, test_size=0.2, random_state=42)
train_dataset = CrisisDataset(
    {'input_ids': train_df['input_ids'].tolist(), 'attention_mask': train_df['attention_mask'].tolist()},
    train_df['crisis_label'].tolist()
)
test_dataset = CrisisDataset(
    {'input_ids': test_df['input_ids'].tolist(), 'attention_mask': test_df['attention_mask'].tolist()},
    test_df['crisis_label'].tolist()
)

In [19]:
# Step 4: Fine-tune DistilBERT
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2).to(device)
training_args = TrainingArguments(
    output_dir="/content/crisis_model",
    num_train_epochs=10,  # Increased epochs
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,  # Added for stable training
    warmup_steps=100,  # Added for gradual learning rate increase
    weight_decay=0.01,  # Added to prevent overfitting
    logging_dir="/content/logs_crisis",
    logging_steps=10,
    eval_strategy="steps",  # Updated to eval_strategy
    eval_steps=200,
    save_steps=200,
    save_total_limit=2,
    load_best_model_at_end=True
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=lambda pred: {
        "accuracy": accuracy_score(pred.label_ids, pred.predictions.argmax(-1)),
        "f1": f1_score(pred.label_ids, pred.predictions.argmax(-1), average='weighted')
    }
)
trainer.train()
eval_results = trainer.evaluate()
print("Crisis Classifier Evaluation Results:", eval_results)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Step,Training Loss,Validation Loss,Accuracy,F1
200,0.0019,0.00141,1.0,1.0
400,0.0008,0.000588,1.0,1.0


Crisis Classifier Evaluation Results: {'eval_loss': 0.0005876069772057235, 'eval_accuracy': 1.0, 'eval_f1': 1.0, 'eval_runtime': 0.4063, 'eval_samples_per_second': 246.135, 'eval_steps_per_second': 31.998, 'epoch': 10.0}


In [20]:
# Step 5: Save the model
model.save_pretrained("/content/crisis_model")
tokenizer.save_pretrained("/content/crisis_model")
print("Fine-tuned crisis model saved to /content/crisis_model")


Fine-tuned crisis model saved to /content/crisis_model


In [21]:
# Step 6: Test the model
def predict_crisis(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    pred = torch.argmax(outputs.logits, dim=1).item()
    return "CRISIS" if pred == 1 else "NON-CRISIS"

test_texts = [
    "I feel worthless and don’t want to live",
    "I’m okay, just feeling a bit down",
    "I’m thinking about suicide",
    "What are some ways to manage stress?",
    "Life feels hopeless sometimes"
]
for text in test_texts:
    crisis_pred = predict_crisis(text)
    print(f"Input: {text}\nPredicted: {crisis_pred}\n")

Input: I feel worthless and don’t want to live
Predicted: NON-CRISIS

Input: I’m okay, just feeling a bit down
Predicted: NON-CRISIS

Input: I’m thinking about suicide
Predicted: NON-CRISIS

Input: What are some ways to manage stress?
Predicted: NON-CRISIS

Input: Life feels hopeless sometimes
Predicted: NON-CRISIS
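
Every sample above is predicted NON-CRISIS, including the inputs that contain the training keywords, even though the evaluation reported an accuracy of 1.0. One cautious option, sketched here under the assumption that missing a crisis is worse than a false alarm, is to combine the classifier with a plain keyword rule applied to the raw text. `keyword_hit`, `detect_crisis`, and the `always_non_crisis` stub are hypothetical names; the stub simply stands in for `predict_crisis`.

```python
CRISIS_KEYWORDS = ("suicide", "kill myself", "don't want to live", "hopeless",
                   "end my life", "worthless", "giving up")

def keyword_hit(text):
    """True if the raw (uncleaned) text contains any crisis phrase."""
    # Treat curly and straight apostrophes alike before matching
    normalized = text.lower().replace("’", "'")
    return any(keyword in normalized for keyword in CRISIS_KEYWORDS)

def detect_crisis(text, model_predict):
    """Flag a crisis if either the keyword rule or the classifier fires."""
    return keyword_hit(text) or model_predict(text) == "CRISIS"

# Stand-in for predict_crisis that mimics the all-NON-CRISIS behaviour shown above
def always_non_crisis(text):
    return "NON-CRISIS"

for text in ["I feel worthless and don’t want to live",
             "What are some ways to manage stress?"]:
    flagged = detect_crisis(text, always_non_crisis)
    print(text, "->", "CRISIS" if flagged else "NON-CRISIS")
```

A rule like this does not fix the classifier; it only ensures that explicit phrases are not missed while the model is being improved.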



# **Part 4: Response Generation**

In [22]:
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings("ignore")

# Set device for GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device:", device)

# Load preprocessed data from Part 1
try:
    df = pd.read_csv('/content/counselchat_preprocessed.csv')
except FileNotFoundError as err:
    # Raise instead of calling exit(), which can shut down the notebook kernel
    raise FileNotFoundError("'counselchat_preprocessed.csv' not found. Run Part 1 first.") from err

Device: cuda


In [23]:
# Step 1: Prepare dialogue data
dialogpt_tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")  # Use medium model for better performance
dialogpt_model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium").to(device)
dialogpt_tokenizer.pad_token = dialogpt_tokenizer.eos_token

# Use raw answerText and questionText to preserve natural language
dialogue_data = df[['questionText', 'answerText']].dropna().sample(1000, random_state=42)  # Increase sample size for better training

def prepare_dialogue_data(row):
    input_text = row['questionText']
    target_text = row['answerText'][:500]  # Limit length for stability
    # Combine input and target for conversational context
    input_encodings = dialogpt_tokenizer(
        input_text + dialogpt_tokenizer.eos_token + target_text,
        max_length=128,
        truncation=True,
        padding="max_length",
        return_tensors="pt"
    )
    return {
        'input_ids': input_encodings['input_ids'].squeeze(),
        'attention_mask': input_encodings['attention_mask'].squeeze(),
        'labels': input_encodings['input_ids'].squeeze()  # Labels mirror input_ids for causal LM; padded positions are not masked, so they also count toward the loss
    }

dialogue_dataset = [prepare_dialogue_data(row) for _, row in dialogue_data.iterrows()]
train_data, eval_data = train_test_split(dialogue_dataset, test_size=0.2, random_state=42)
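
In `prepare_dialogue_data` the labels are a copy of `input_ids`, and the pad token is the EOS token, so every padded position is treated as a target. A common remedy is to set the labels at padded positions to -100, the value that PyTorch's cross-entropy loss ignores by default. The sketch below shows the idea on plain lists; `mask_padding` and the example token ids are illustrative assumptions.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default

def mask_padding(input_ids, attention_mask):
    """Copy input_ids as labels, replacing padded positions with IGNORE_INDEX."""
    return [tok if keep == 1 else IGNORE_INDEX
            for tok, keep in zip(input_ids, attention_mask)]

# Hypothetical encoding: 50256 is the GPT-2/DialoGPT EOS id, reused here as padding
input_ids      = [15496, 11, 50256, 40, 1101, 50256, 50256, 50256]
attention_mask = [1,     1,  1,     1,  1,    1,     0,     0]

labels = mask_padding(input_ids, attention_mask)
print(labels)
```

With tensors, the same effect could be obtained by cloning `input_ids` and assigning -100 wherever `attention_mask == 0`.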

In [24]:
# Step 2: Create torch dataset
class DialogueDataset(torch.utils.data.Dataset):
    def __init__(self, data):
        self.data = data
    def __getitem__(self, idx):
        return {
            'input_ids': self.data[idx]['input_ids'],
            'attention_mask': self.data[idx]['attention_mask'],
            'labels': self.data[idx]['labels']
        }
    def __len__(self):
        return len(self.data)

train_dataset = DialogueDataset(train_data)
eval_dataset = DialogueDataset(eval_data)

In [25]:
# Step 3: Fine-tune DialoGPT with improved parameters
training_args = TrainingArguments(
    output_dir="/content/dialogpt_finetuned",
    num_train_epochs=6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir="/content/logs_dialogpt",
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=200,
    save_steps=200,
    save_total_limit=2,
    load_best_model_at_end=True
)
trainer = Trainer(
    model=dialogpt_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset
)
trainer.train()
eval_results = trainer.evaluate()
print("DialoGPT Evaluation Results:", eval_results)

`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.


Step,Training Loss,Validation Loss
200,2.8657,2.850259
400,2.4989,2.703285
600,2.3362,2.682057


There were missing keys in the checkpoint model loaded: ['lm_head.weight'].


DialoGPT Evaluation Results: {'eval_loss': 2.6820573806762695, 'eval_runtime': 5.7274, 'eval_samples_per_second': 34.92, 'eval_steps_per_second': 4.365, 'epoch': 6.0}
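
The evaluation loss is a mean cross-entropy in nats, so it can be converted to perplexity by exponentiation, which some readers may find easier to interpret. A minimal sketch using the `eval_loss` printed above:

```python
import math

eval_loss = 2.6820573806762695  # eval_loss reported in the results above
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")
```

Because the labels in this run include padded positions, this figure is probably not directly comparable with perplexities reported elsewhere.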


In [26]:
# Step 4: Save the model
dialogpt_model.save_pretrained("/content/dialogpt_finetuned")
dialogpt_tokenizer.save_pretrained("/content/dialogpt_finetuned")
print("Fine-tuned DialoGPT saved to /content/dialogpt_finetuned")

Fine-tuned DialoGPT saved to /content/dialogpt_finetuned


In [27]:
# Step 5: Test the model
def generate_response(text):
    prompt = (
        "You are a compassionate mental health assistant. "
        "Respond in a supportive and helpful way. "
        f"User: {text}\n"
        "Assistant:"
    )

    inputs = dialogpt_tokenizer.encode(
        prompt,
        return_tensors="pt",
        max_length=256,
        truncation=True
    ).to(device)

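    outputs = dialogpt_model.generate(
        inputs,
        max_new_tokens=100,  # budget for the reply only; max_length would also count the prompt tokens
        pad_token_id=dialogpt_tokenizer.eos_token_id,
        num_beams=5,
        no_repeat_ngram_size=2,
        early_stopping=True
    )
    # generate() returns prompt + continuation; decode only the newly generated tokens
    new_tokens = outputs[0][inputs.shape[-1]:]
    response = dialogpt_tokenizer.decode(new_tokens, skip_special_tokens=True).strip()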

    if not response or len(response.split()) < 3:
        response = "I'm here to help. Can you tell me more about how you're feeling?"

    return response

# **Part 5: Chatbot Interface**

In [28]:
# Import libraries
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoModelForCausalLM
import gradio as gr
import warnings
warnings.filterwarnings("ignore")

# Set device for GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device:", device)

Device: cuda


In [29]:
# Load models and tokenizers
try:
    sentiment_tokenizer = AutoTokenizer.from_pretrained("/content/sentiment_model_improved")
    sentiment_model = AutoModelForSequenceClassification.from_pretrained("/content/sentiment_model_improved").to(device)
    crisis_tokenizer = AutoTokenizer.from_pretrained("/content/crisis_model")
    crisis_model = AutoModelForSequenceClassification.from_pretrained("/content/crisis_model").to(device)
    dialogpt_tokenizer = AutoTokenizer.from_pretrained("/content/dialogpt_finetuned")
    dialogpt_model = AutoModelForCausalLM.from_pretrained("/content/dialogpt_finetuned").to(device)
    dialogpt_tokenizer.pad_token = dialogpt_tokenizer.eos_token
except OSError as e:
    # from_pretrained typically raises OSError when a local model directory is missing or incomplete
    raise RuntimeError(f"Model files not found. Ensure /content/sentiment_model_improved, /content/crisis_model, and /content/dialogpt_finetuned exist. {e}") from e

In [30]:
# Step 1: Define prediction functions
def predict_sentiment(text):
    try:
        inputs = sentiment_tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)
        with torch.no_grad():
            outputs = sentiment_model(**inputs)
        pred = torch.argmax(outputs.logits, dim=1).item()
        return {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}[pred]
    except Exception as e:
        print(f"Error in sentiment prediction: {e}")
        return "NEUTRAL"

def predict_crisis(text):
    try:
        inputs = crisis_tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)
        with torch.no_grad():
            outputs = crisis_model(**inputs)
        pred = torch.argmax(outputs.logits, dim=1).item()
        return pred == 1
    except Exception as e:
        print(f"Error in crisis prediction: {e}")
        return False

def generate_response(text):
    try:
        # Format input with a clear delimiter
        prompt = f"User: {text} Assistant: " + dialogpt_tokenizer.eos_token
        inputs = dialogpt_tokenizer.encode(prompt, return_tensors="pt", max_length=128, truncation=True).to(device)
        outputs = dialogpt_model.generate(
            inputs,
            max_length=150,
            pad_token_id=dialogpt_tokenizer.eos_token_id,
            num_beams=5,
            no_repeat_ngram_size=2,
            early_stopping=True
        )
        response = dialogpt_tokenizer.decode(outputs[0], skip_special_tokens=True)
        # Extract only the assistant's response
        response = response.split("Assistant: ")[-1].strip() if "Assistant: " in response else response.replace(f"User: {text}", "").strip()
        return response if response else "I'm here to help. Could you share a bit more?"
    except Exception as e:
        print(f"Error in response generation: {e}")
        return "I'm here to help. Could you share a bit more?"

In [32]:
# Step 2: Combine logic for chatbot
def chatbot_logic(user_input):
    if not user_input or not isinstance(user_input, str):
        return "Please enter a valid message."

    # Check for crisis
    if predict_crisis(user_input):
        return ("I'm really concerned about how you're feeling. Please reach out to a trusted person or contact a helpline like the 988 Suicide & Crisis Lifeline (call or text 988 in the US). I'm here to listen, too.")

    # Get sentiment and generate response
    sentiment = predict_sentiment(user_input)
    response = generate_response(user_input)

    # Tailor response based on sentiment
    if sentiment == "NEGATIVE":
        response += " I’m here for you. Would you like some coping strategies, like deep breathing or journaling?"
    elif sentiment == "POSITIVE":
        response += " That’s wonderful to hear! Want to tell me more?"
    else:
        response += " Thanks for sharing. How can I help you further, perhaps with some tips or resources?"

    return response
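
The branching in `chatbot_logic` can be checked without loading any model if the three predictors are passed in as arguments. The sketch below mirrors the same order of checks (input validation, crisis, then sentiment-tailored reply) with stub callables. `route_message`, the shortened messages, and the stubs are hypothetical and only illustrate the control flow.

```python
CRISIS_MESSAGE = ("I'm really concerned about how you're feeling. "
                  "Please reach out to a trusted person or a helpline.")
FOLLOW_UPS = {
    "NEGATIVE": " I'm here for you.",
    "POSITIVE": " That's wonderful to hear!",
}
DEFAULT_FOLLOW_UP = " Thanks for sharing."

def route_message(user_input, is_crisis, get_sentiment, generate):
    """Same branching as chatbot_logic, with the three models passed in as callables."""
    if not user_input or not isinstance(user_input, str):
        return "Please enter a valid message."
    if is_crisis(user_input):
        return CRISIS_MESSAGE  # the crisis path never calls the generator
    sentiment = get_sentiment(user_input)
    return generate(user_input) + FOLLOW_UPS.get(sentiment, DEFAULT_FOLLOW_UP)

# Stub models (hypothetical) so the routing can be exercised without a GPU
reply = route_message(
    "I just got promoted",
    is_crisis=lambda text: False,
    get_sentiment=lambda text: "POSITIVE",
    generate=lambda text: "Congratulations!",
)
print(reply)
```

Passing the real `predict_crisis`, `predict_sentiment`, and `generate_response` functions as the three arguments would reproduce the notebook's behaviour.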

In [39]:
# Step 3: CLI Interface
def run_chatbot():
    print("Mental Health Support Chatbot")
    print("Share your thoughts or feelings, and I'll try to help. Type 'exit' to quit.")
    print("If you're in crisis, I'll suggest resources.")
    print("Made by Hardik\n")

    while True:
        user_input = input("Your Message: ").strip()
        if user_input.lower() == 'exit':
            print("Goodbye. Take care!")
            break
        if not user_input:
            print("Please enter a message.")
            continue
        response = chatbot_logic(user_input)
        print(f"Response: {response}\n")

In [44]:
# Step 4: Test chatbot programmatically
def test_chatbot():
    print("\nTesting chatbot with sample inputs:")
    test_texts = [
        "I’m really struggling with anxiety",
        "I just got promoted and I’m thrilled",
        "What are some ways to cope with sadness?",
        "Life feels meaningless sometimes"
    ]
    for text in test_texts:
        response = chatbot_logic(text)
        print(f"Input: {text}\nResponse: {response}\n")

In [48]:
# Step 5: Run the chatbot
if __name__ == "__main__":
    test_chatbot()  # Run tests first
    run_chatbot()   # Then start CLI


Testing chatbot with sample inputs:
Input: I’m really struggling with anxiety
Response: Hello, and thank you for your question. I am so sorry that you are having such a difficult time dealing with your anxiety.  I hope you can find the support you need. I’m here for you. Would you like some coping strategies, like deep breathing or journaling?

Input: I just got promoted and I’m thrilled
Response: “ Congratulations on your promotion ” That’s wonderful to hear! Want to tell me more?

Input: What are some ways to cope with sadness?
Response: I'm sorry to hear about your loss of your mother. I'm sure you can find some comfort in the fact that you lost her, but I'd like to remind you that there are a lot of things that can be done to help you cope.  You can also talk to a counselor about how you are coping with your sadness. Thanks for sharing. How can I help you further, perhaps with some tips or resources?

Input: Life feels meaningless sometimes
Response: I know that feeling all too we