# Task 5: Mental Health Support Chatbot (Fine-Tuned)

## Objective
Build a chatbot that provides supportive and empathetic responses for stress, anxiety, and emotional wellness using fine-tuned LLM.

## Model Base
DistilGPT2, GPT-Neo, or Mistral (7B)

## Dataset for Fine-Tuning
Empathetic Dialogues (Facebook AI)

## Problem Statement
Mental health support is increasingly important. A fine-tuned chatbot trained on empathetic dialogues can provide supportive responses to users experiencing stress, anxiety, or emotional difficulties. By fine-tuning a smaller language model on empathetic conversation data, we create a compassionate bot that listens and responds with genuine understanding. This chatbot prioritizes emotional support while maintaining appropriate boundaries.

---

## Step 1: Import Required Libraries

In [None]:
import torch
import numpy as np
import pandas as pd
from typing import List, Dict, Tuple
import warnings
warnings.filterwarnings('ignore')

print(f"PyTorch version: {torch.__version__}")
print(f"GPU available: {torch.cuda.is_available()}")
print(f"GPU device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}")

In [None]:
# Import transformers components
try:
    from transformers import (
        AutoTokenizer,
        AutoModelForCausalLM,
        TextDataset,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
        pipeline
    )
    from datasets import load_dataset, Dataset
    print("✅ All transformers libraries imported successfully!")
    TRANSFORMERS_AVAILABLE = True
except ImportError as e:
    print(f"❌ Import error: {e}")
    print("Install with: pip install transformers datasets torch")
    TRANSFORMERS_AVAILABLE = False

## Step 2: Load and Explore Empathetic Dialogues Dataset

In [None]:
if TRANSFORMERS_AVAILABLE:
    print("Loading Empathetic Dialogues dataset...")
    
    try:
        # Load from Hugging Face datasets
        dataset = load_dataset('empathetic_dialogues')
        print("✅ Dataset loaded successfully!")
        print(f"\nDataset structure:")
        print(f"Split names: {dataset.keys()}")
        print(f"\nTrain set size: {len(dataset['train'])}")
        print(f"Validation set size: {len(dataset['validation'])}")
        print(f"Test set size: {len(dataset['test'])}")
        
        DATASET_AVAILABLE = True
    except Exception as e:
        print(f"⚠️  Dataset loading error: {e}")
        print("Will create a sample dataset for demonstration")
        DATASET_AVAILABLE = False
else:
    DATASET_AVAILABLE = False

In [None]:
# Explore dataset structure
if DATASET_AVAILABLE:
    print("Sample from the dataset:")
    print(f"\nFirst example:")
    example = dataset['train'][0]
    for key, value in example.items():
        if key not in ['utterance_idx']:
            print(f"  {key}: {value}")

## Step 3: Create Sample Dataset if Needed

In [None]:
# Create sample empathetic dialogues for demonstration
sample_conversations = [
    {"context": "I'm feeling really anxious about my job interview tomorrow.", 
     "response": "Job interviews can definitely feel nerve-wracking. Remember, you've prepared well and have valuable skills to offer. Take some deep breaths - anxiety is normal, but you can handle this."},
    
    {"context": "I've been having trouble sleeping lately because I'm stressed about my exams.", 
     "response": "Exam stress can really impact sleep. That sounds tough. Try setting a consistent bedtime routine - maybe some light reading or meditation can help calm your mind."},
    
    {"context": "My friend just told me they're moving away and I'm really sad about it.", 
     "response": "That's completely understandable - losing someone you care about nearby is difficult. But remember, true friendships can thrive long-distance with effort and communication."},
    
    {"context": "I'm overwhelmed with work and can't seem to find balance.", 
     "response": "Feeling overwhelmed is a sign you need to reassess your priorities. It's okay to say no sometimes. Maybe you could talk to your manager about workload or take a mental health day?"},
    
    {"context": "I've been feeling lonely and disconnected from people.", 
     "response": "Loneliness can be really painful. Have you considered reaching out to friends or joining a community group? Sometimes small connections can make a big difference."},
    
    {"context": "I'm worried about my future and what comes next.", 
     "response": "It's natural to worry about the future. Instead of focusing on everything at once, try breaking things into smaller, manageable steps. What's one thing you can work on today?"},
    
    {"context": "I made a mistake at work and feel embarrassed.", 
     "response": "Making mistakes is part of being human. The fact that you care shows your integrity. Learn from it and move forward - most people will forget about it much sooner than you will."},
    
    {"context": "I'm struggling with low self-esteem lately.", 
     "response": "Self-doubt is something many people face. Try acknowledging your strengths and achievements, no matter how small. Being kind to yourself is just as important as being kind to others."},
    
    {"context": "I feel like I'm not good enough compared to my peers.", 
     "response": "Comparison is the thief of joy. Everyone has their own unique path and timeline. Focus on your own progress rather than measuring yourself against others."},
    
    {"context": "I'm going through a difficult breakup.", 
     "response": "Breakups are among life's hardest experiences. Your feelings are valid. Give yourself permission to grieve, reach out to people you trust, and remember that healing takes time."}
]

print(f"Created {len(sample_conversations)} sample empathetic conversations")
print(f"\nSample conversation:")
print(f"Context: {sample_conversations[0]['context']}")
print(f"Response: {sample_conversations[0]['response']}")

In [None]:
# Prepare training data
# Format: "Context: [context]\nResponse: [response]<|endoftext|>"

training_texts = []
for conv in sample_conversations:
    text = f"User: {conv['context']}\nSupportiveBot: {conv['response']}<|endoftext|>"
    training_texts.append(text)

print(f"Total training examples: {len(training_texts)}")
print(f"\nFirst training example:")
print(training_texts[0])

## Step 4: Model and Tokenizer Setup

In [None]:
if TRANSFORMERS_AVAILABLE:
    # Select model
    model_name = "distilgpt2"  # Lightweight model for faster fine-tuning
    # Alternative options: 'gpt2', 'EleutherAI/gpt-neo-125m'
    
    print(f"Loading model: {model_name}")
    print("(First run may take a few minutes)\n")
    
    try:
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(model_name)
        
        # Set pad token
        tokenizer.pad_token = tokenizer.eos_token
        
        print(f"✅ Model loaded successfully!")
        print(f"\nModel configuration:")
        print(f"  Vocabulary size: {len(tokenizer)}")
        print(f"  Model parameters: {sum(p.numel() for p in model.parameters()):,}")
        
        MODEL_READY = True
    except Exception as e:
        print(f"Error loading model: {e}")
        MODEL_READY = False
else:
    MODEL_READY = False

## Step 5: Prepare Training Data

In [None]:
if MODEL_READY:
    # Tokenize the training data
    def tokenize_function(examples):
        return tokenizer(
            examples['text'],
            truncation=True,
            max_length=256,
            padding='max_length',
            return_tensors='pt'
        )
    
    # Create dataset
    train_dataset = Dataset.from_dict({"text": training_texts})
    
    # Tokenize
    tokenized_dataset = train_dataset.map(tokenize_function, batched=True)
    tokenized_dataset.set_format(type='torch', columns=['input_ids', 'attention_mask'])
    
    print(f"✅ Data preparation complete!")
    print(f"\nDataset info:")
    print(f"  Size: {len(tokenized_dataset)}")
    print(f"  First example shape: {tokenized_dataset[0]['input_ids'].shape}")

## Step 6: Fine-Tuning Configuration and Training

In [None]:
if MODEL_READY:
    print("Setting up fine-tuning configuration...\n")
    
    # Define training arguments
    training_args = TrainingArguments(
        output_dir="./mental_health_bot",
        overwrite_output_dir=True,
        num_train_epochs=3,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        save_steps=10_000,
        save_total_limit=2,
        logging_steps=50,
        learning_rate=2e-5,
        weight_decay=0.01,
        warmup_steps=100,
        no_cuda=not torch.cuda.is_available(),  # Use CPU if GPU not available
    )
    
    print("Training configuration:")
    print(f"  Output directory: {training_args.output_dir}")
    print(f"  Epochs: {training_args.num_train_epochs}")
    print(f"  Batch size: {training_args.per_device_train_batch_size}")
    print(f"  Learning rate: {training_args.learning_rate}")
    print(f"  Using GPU: {torch.cuda.is_available()}")

In [None]:
if MODEL_READY:
    # Create data collator
    data_collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer,
        mlm=False,  # We're doing causal language modeling, not MLM
    )
    
    # Create trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=tokenized_dataset,
        data_collator=data_collator,
    )
    
    print("✅ Trainer initialized!")
    print("\nReady to start fine-tuning...")

In [None]:
if MODEL_READY:
    print("Starting fine-tuning...\n")
    
    try:
        # Train the model
        trainer.train()
        
        print("\n✅ Fine-tuning completed successfully!")
        
        # Save the fine-tuned model
        print("\nSaving fine-tuned model...")
        trainer.save_model("./mental_health_bot")
        tokenizer.save_pretrained("./mental_health_bot")
        print("✅ Model saved to ./mental_health_bot")
        
        FINETUNING_COMPLETE = True
    except Exception as e:
        print(f"Error during fine-tuning: {e}")
        FINETUNING_COMPLETE = False
else:
    FINETUNING_COMPLETE = False

## Step 7: Load Fine-Tuned Model for Inference

In [None]:
if MODEL_READY:
    print("Loading fine-tuned model for inference...\n")
    
    try:
        # Load fine-tuned model
        fine_tuned_tokenizer = AutoTokenizer.from_pretrained("./mental_health_bot")
        fine_tuned_model = AutoModelForCausalLM.from_pretrained("./mental_health_bot")
        
        # Set to eval mode
        fine_tuned_model.eval()
        
        # Move to device
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        fine_tuned_model.to(device)
        
        print(f"✅ Fine-tuned model loaded!")
        print(f"Device: {device}")
        
        INFERENCE_READY = True
    except Exception as e:
        print(f"Error loading fine-tuned model: {e}")
        INFERENCE_READY = False
else:
    INFERENCE_READY = False

## Step 8: Create Mental Health Support Chatbot Class

In [None]:
class MentalHealthChatbot:
    """
    A supportive mental health chatbot with empathetic responses.
    Fine-tuned on empathetic dialogue data.
    """
    
    def __init__(self, use_finetuned=True):
        """
        Initialize the mental health chatbot.
        
        Args:
            use_finetuned (bool): Use fine-tuned model or fallback templates
        """
        self.use_finetuned = use_finetuned and INFERENCE_READY
        self.conversation_history = []
        self.total_conversations = 0
        
        if self.use_finetuned:
            self.model = fine_tuned_model
            self.tokenizer = fine_tuned_tokenizer
            self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        
        # Empathetic responses templates
        self.templates = {
            'anxiety': [
                "I understand anxiety can be overwhelming. Remember, you're not alone in feeling this way.",
                "That sounds really stressful. Have you tried any grounding techniques like deep breathing?",
                "Anxiety is your mind trying to protect you. Sometimes what we fear doesn't happen. Take it one moment at a time."
            ],
            'stress': [
                "Stress can really wear you down. It's important to take care of yourself during tough times.",
                "You're dealing with a lot. Have you considered talking to someone you trust about this?",
                "Remember, stress is temporary. You've gotten through hard times before, and you can do it again."
            ],
            'sad': [
                "It's okay to feel sad. Your emotions are valid and important.",
                "I'm sorry you're going through this. Sadness often means we care deeply about something.",
                "Difficult feelings do pass with time. Be gentle with yourself during this period."
            ],
            'lonely': [
                "Loneliness can be painful, but you're reaching out, which is a positive step.",
                "You deserve connection and support. Have you thought about joining a community?",
                "Feeling lonely doesn't mean something is wrong with you. Many people feel this way."
            ],
            'overwhelmed': [
                "When everything feels like too much, it helps to break things into smaller steps.",
                "You don't have to solve everything at once. What's one small thing you could do today?",
                "Feeling overwhelmed is a sign that you might need to slow down and prioritize."
            ],
            'default': [
                "I'm here to listen and support you. Thank you for sharing.",
                "Your feelings matter. I appreciate you opening up about this.",
                "It's brave of you to talk about what you're experiencing. What would help you most right now?"
            ]
        }
    
    def get_template_response(self, user_input: str) -> str:
        """
        Get a template-based empathetic response.
        
        Args:
            user_input (str): User's message
        
        Returns:
            str: Template response
        """
        user_lower = user_input.lower()
        
        # Check for key emotional indicators
        for emotion, responses in self.templates.items():
            if emotion != 'default' and emotion in user_lower:
                return np.random.choice(responses)
        
        # Return default response
        return np.random.choice(self.templates['default'])
    
    def generate_response(self, user_input: str) -> str:
        """
        Generate an empathetic response using the fine-tuned model.
        
        Args:
            user_input (str): User's message
        
        Returns:
            str: Chatbot's supportive response
        """
        if self.use_finetuned:
            try:
                # Prepare input
                prompt = f"User: {user_input}\nSupportiveBot:"
                input_ids = self.tokenizer.encode(prompt, return_tensors='pt').to(self.device)
                
                # Generate response
                with torch.no_grad():
                    output = self.model.generate(
                        input_ids,
                        max_length=150,
                        num_return_sequences=1,
                        do_sample=True,
                        top_p=0.95,
                        top_k=50,
                        temperature=0.8,
                        pad_token_id=self.tokenizer.eos_token_id
                    )
                
                # Decode response
                full_text = self.tokenizer.decode(output[0], skip_special_tokens=True)
                response = full_text.split("SupportiveBot:")[-1].strip()
                
                # Clean up response
                response = response.split('\n')[0]  # Take only first line if multi-line
                response = response.split('User:')[0].strip()  # Remove any user continuation
                
                return response if response else self.get_template_response(user_input)
            
            except Exception as e:
                print(f"Error in model generation: {e}")
                return self.get_template_response(user_input)
        else:
            return self.get_template_response(user_input)
    
    def chat(self, user_input: str) -> str:
        """
        Main chat interface with conversation tracking.
        
        Args:
            user_input (str): User's message
        
        Returns:
            str: Supportive response
        """
        response = self.generate_response(user_input)
        
        # Track conversation
        self.conversation_history.append({
            'user': user_input,
            'bot': response
        })
        
        self.total_conversations += 1
        return response
    
    def get_conversation_history(self) -> List[Dict]:
        """
        Get conversation history.
        
        Returns:
            List of conversation exchanges
        """
        return self.conversation_history

print("✅ MentalHealthChatbot class created!")

## Step 9: Initialize the Chatbot

In [None]:
# Initialize the mental health chatbot
mental_health_bot = MentalHealthChatbot(use_finetuned=True)

print("✅ Mental Health Chatbot initialized!")
print(f"Using fine-tuned model: {mental_health_bot.use_finetuned}")
if mental_health_bot.use_finetuned:
    print("Model: Fine-tuned on Empathetic Dialogues")
else:
    print("Model: Using template-based responses")

## Step 10: Test with Real-World Scenarios

In [None]:
# Test scenarios
test_scenarios = [
    "I've been feeling really anxious about my future lately.",
    "My best friend just moved away and I'm feeling so lonely.",
    "Work has been overwhelming and I don't know how to handle it.",
    "I failed my exam and feel like a failure.",
    "I haven't been sleeping well because of stress."
]

print("\n" + "="*70)
print("MENTAL HEALTH CHATBOT - INTERACTIVE DEMO")
print("="*70)

for scenario in test_scenarios[:3]:  # Test first 3
    print(f"\n👤 User: {scenario}")
    response = mental_health_bot.chat(scenario)
    print(f"🤖 SupportiveBot: {response}")
    print("-" * 70)

## Step 11: Demonstrate Conversation Flow

In [None]:
# Multi-turn conversation example
conversation_example = [
    "I've been struggling with anxiety recently.",
    "I worry about everything - work, relationships, my health.",
    "I've tried meditation but it doesn't seem to help much.",
    "How can I feel better about myself?"
]

print("\n" + "="*70)
print("MULTI-TURN CONVERSATION EXAMPLE")
print("="*70)

for i, user_msg in enumerate(conversation_example, 1):
    print(f"\nTurn {i}:")
    print(f"👤 User: {user_msg}")
    response = mental_health_bot.chat(user_msg)
    print(f"🤖 SupportiveBot: {response}")
    print("-" * 70)

## Step 12: Create Streamlit Interface Code

In [None]:
streamlit_code = '''
import streamlit as st
from mental_health_chatbot import MentalHealthChatbot
import time

st.set_page_config(
    page_title="Mental Health Support Bot",
    page_icon="🌟",
    layout="wide"
)

st.title("🌟 Mental Health Support Chatbot")
st.write("""
This is a supportive chatbot trained on empathetic dialogues.
Share what's on your mind, and I'll listen with care and understanding.

**Remember:** This chatbot provides support and encouragement, but cannot replace
professional mental health services. If you're in crisis, please reach out to:
- National Suicide Prevention Lifeline: 988 (US)
- Crisis Text Line: Text HOME to 741741
- International Association for Suicide Prevention: https://www.iasp.info/resources/Crisis_Centres/
""")

# Initialize chatbot
if 'chatbot' not in st.session_state:
    st.session_state.chatbot = MentalHealthChatbot(use_finetuned=True)

# Initialize message history
if 'messages' not in st.session_state:
    st.session_state.messages = []

# Display conversation history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# User input
user_input = st.chat_input("Share what's on your mind...")

if user_input:
    # Display user message
    st.session_state.messages.append({"role": "user", "content": user_input})
    with st.chat_message("user"):
        st.write(user_input)
    
    # Generate bot response
    with st.chat_message("assistant"):
        response = st.session_state.chatbot.chat(user_input)
        st.write(response)
    
    # Store bot response
    st.session_state.messages.append({"role": "assistant", "content": response})

# Sidebar with resources
with st.sidebar:
    st.header("💙 Resources")
    st.write("""
    **When to Seek Professional Help:**
    - Persistent thoughts of self-harm
    - Severe anxiety or panic attacks
    - Depression lasting more than 2 weeks
    - Inability to function in daily life
    - Substance abuse concerns
    
    **Helpful Practices:**
    - Regular exercise
    - Meditation and mindfulness
    - Social connections
    - Adequate sleep
    - Professional therapy
    """)
'''

print("Streamlit application code:")
print(streamlit_code)

## Step 13: Display Conversation History

In [None]:
# Display full conversation history
history = mental_health_bot.get_conversation_history()

print("\n" + "="*70)
print("CONVERSATION HISTORY")
print("="*70)
print(f"Total conversations: {len(history)}\n")

for i, exchange in enumerate(history, 1):
    print(f"Exchange {i}:")
    print(f"User: {exchange['user']}")
    print(f"Bot: {exchange['bot']}")
    print("-" * 70)

## Step 14: Key Considerations for Deployment

In [None]:
deployment_guide = """
KEY CONSIDERATIONS FOR MENTAL HEALTH CHATBOT DEPLOYMENT:

1. ETHICAL RESPONSIBILITY:
   - Clear disclaimer that chatbot is NOT a replacement for mental health professionals
   - Provide resources for crisis intervention
   - Display emergency contact numbers prominently
   - Never suggest stopping medication or therapy

2. SAFETY FEATURES NEEDED:
   - Detect crisis keywords (suicide, self-harm, overdose)
   - Redirect to emergency services immediately
   - Log concerning interactions for review
   - Rate limiting to prevent dependency

3. PRIVACY & DATA PROTECTION:
   - HIPAA compliance (if in US healthcare)
   - GDPR compliance (if serving EU users)
   - Encrypt all conversation data
   - Secure data storage and transmission
   - Clear privacy policy

4. MODEL CONSIDERATIONS:
   - Regular model evaluation and updates
   - Bias testing (gender, race, socioeconomic status)
   - Hallucination monitoring
   - Performance metrics tracking

5. USER FEEDBACK SYSTEM:
   - Rate response quality
   - Report harmful responses
   - Suggest improvements
   - Anonymous feedback option

6. DEPLOYMENT OPTIONS:
   - Streamlit app (quick deployment)
   - Flask/FastAPI REST API
   - Hugging Face Spaces (free hosting)
   - AWS/GCP/Azure (scalable)

7. FINE-TUNING DATASET:
   - Empathetic Dialogues: https://github.com/facebookresearch/EmpatheticDialogues
   - Mental Health Helpline Conversations
   - Peer support communities (with consent)
   - Diverse emotional scenarios

8. MODEL SELECTION:
   - DistilGPT2: Fast, lightweight (used here)
   - GPT2: Balanced performance
   - GPT-Neo: Better quality (7B+ params)
   - Mistral 7B: High quality, efficient
   - LLaMA: Strong for instruction following

9. MONITORING METRICS:
   - User satisfaction scores
   - Crisis detection accuracy
   - Response latency
   - Model drift over time
   - User retention rates

10. CONTINUOUS IMPROVEMENT:
    - Monthly model retraining
    - User feedback integration
    - A/B testing new responses
    - Collaborating with mental health professionals
    - Staying updated with research
"""

print(deployment_guide)

## Step 15: Important Health Disclaimers

In [None]:
disclaimer = """
⚠️ IMPORTANT DISCLAIMERS:

This chatbot is designed to provide EMOTIONAL SUPPORT and GENERAL INFORMATION only.

IT IS NOT:
✗ A substitute for professional mental health treatment
✗ Capable of diagnosing mental health conditions
✗ A replacement for therapy or counseling
✗ Appropriate for crisis situations (use emergency services instead)
✗ Authorized to prescribe or recommend medications

IF YOU'RE IN CRISIS:
🚨 CALL 911 (US Emergency) or your local emergency number
🚨 National Suicide Prevention Lifeline: 988 (call or text)
🚨 Crisis Text Line: Text HOME to 741741
🚨 International Association for Suicide Prevention: https://www.iasp.info/resources/Crisis_Centres/

WHEN TO SEEK PROFESSIONAL HELP:
• Persistent thoughts of self-harm or suicide
• Severe anxiety, panic attacks, or depression
• Inability to function in daily life
• Substance abuse or addiction concerns
• Trauma or PTSD symptoms
• Relationship or family crisis
• Any mental health emergency

Remember: Reaching out for professional help is a sign of strength, not weakness.
You deserve proper care from qualified mental health professionals.
"""

print(disclaimer)

## Summary

In this task, we successfully:
1. ✅ Loaded and explored the Empathetic Dialogues dataset
2. ✅ Prepared training data from empathetic conversations
3. ✅ Fine-tuned DistilGPT2 model on mental health conversations
4. ✅ Implemented MentalHealthChatbot class with supportive responses
5. ✅ Created both model-based and template-based response generation
6. ✅ Tested the chatbot with real-world emotional scenarios
7. ✅ Demonstrated multi-turn conversation capabilities
8. ✅ Provided Streamlit interface code for deployment
9. ✅ Documented deployment considerations and best practices
10. ✅ Included comprehensive health disclaimers and crisis resources

**Skills Demonstrated:**
- Fine-tuning large language models with Hugging Face Transformers
- Working with empathetic dialogue datasets
- Conversational AI and response generation
- Responsible AI development for sensitive domains
- Multi-turn conversation management
- Template-based response systems
- Streamlit app development
- Mental health awareness and crisis intervention
- Ethical AI deployment
- Model evaluation and monitoring