1. Started with: Pre-trained LLM
2. Applied: Fine-tuning on mental health datasets
3. Built: RAG architecture with mental health knowledge base
4. Used: Prompt engineering to guide empathetic responses
5. Included: System prompts for safety and behavior guidelines

Data resources

1. Beyond Blue: https://www.beyondblue.org.au/mental-health/resource-library
2. Reachout: https://au.reachout.com/challenges-and-coping/abuse-and-violence/sexual-assault-support-services
3. First aid: https://www.mhfa.com.au/resources-support
4. CCI: https://www.cci.health.wa.gov.au/Resources/Looking-After-Others4


Metrics

Technical Metrics (Computer-Based)
Most common metrics:

1. Perplexity: How well the model predicts held-out text (lower = better)
2. ROUGE-L: Longest-common-subsequence overlap between a response and its reference answer
3. BLEU: N-gram precision of a response against its reference answer
4. Distinct-1/2/3: Ratio of unique 1-, 2-, and 3-grams to all n-grams (response variety: can it say things differently?)
5. BERTScore: Semantic similarity to the reference answer, computed from contextual embeddings
6. Empathy%: What percentage of responses show compassion?
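
As a rough illustration of the Distinct-n metric listed above, the sketch below computes the ratio of unique n-grams to total n-grams over a set of responses. The function name `distinct_n` and the whitespace tokenization are assumptions made for this example; an evaluation library may tokenize differently.

```python
def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams across all responses."""
    ngrams = []
    for text in responses:
        tokens = text.lower().split()  # naive whitespace tokenization (assumption)
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


responses = [
    "take a deep breath",
    "take a short walk",
]
print(distinct_n(responses, 1))  # 6 unique unigrams out of 8 -> 0.75
print(distinct_n(responses, 2))  # 5 unique bigrams out of 6
```

Higher values indicate more varied wording; a chatbot that repeats the same phrases across responses scores lower.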

Human Evaluation Metrics (Expert/User Assessment)
General metrics:

1. Helpfulness, Fluency, Relevance, Logic
2. Informativeness, Understanding, Consistency, Coherence, Empathy, Expertise, Engagement

Counseling-specific metrics:

1. Direct Guidance, Approval and Reassurance, Restatement, Reflection, Listening, Interpretation, Self-disclosure
(These mirror what real therapists do)

Reliability check:

Krippendorff's Alpha - measures how consistently different evaluators agree (1.0 = perfect agreement)
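
A minimal sketch of Krippendorff's Alpha for nominal (categorical) labels is shown below, built from the standard coincidence-matrix definition. The notes above do not say which measurement level is used for the human ratings, so the nominal variant and the function name `krippendorff_alpha_nominal` are assumptions for illustration only.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """units: list of lists; each inner list holds the labels raters gave one item."""
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # items rated by fewer than two raters carry no agreement information
        for c, k in permutations(ratings, 2):  # ordered pairs from different raters
            coincidences[(c, k)] += 1.0 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())
    if n < 2:
        raise ValueError("Need at least one item with two or more ratings.")

    observed = sum(w for (c, k), w in coincidences.items() if c != k) / n
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("Alpha is undefined when only one label is ever used.")
    return 1.0 - observed / expected


# Two raters label four responses as empathetic ("yes") or not ("no")
ratings = [["yes", "yes"], ["yes", "no"], ["no", "no"], ["no", "no"]]
alpha = krippendorff_alpha_nominal(ratings)
print(f"alpha = {alpha:.3f}")
```

Values near 1 indicate strong agreement, and values near 0 indicate agreement no better than chance.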



In [8]:
import platform
import psutil

# Install psutil if not already: pip install psutil

# Get OS details
print("Operating System:", platform.system())
print("OS Version:", platform.version())
print("OS Release:", platform.release())

# Get CPU details
print("\nCPU Info:")
print("Processor:", platform.processor())
print("Number of CPU Cores:", psutil.cpu_count(logical=True))

# Get RAM details
print("\nRAM Info:")
ram = psutil.virtual_memory()
print("Total RAM:", f"{ram.total / (1024 ** 3):.2f} GB")
print("Available RAM:", f"{ram.available / (1024 ** 3):.2f} GB")

# Get Disk details (for C: drive on Windows)
print("\nDisk Info (C: drive):")
disk = psutil.disk_usage('C:\\')
print("Total Disk Space:", f"{disk.total / (1024 ** 3):.2f} GB")
print("Used Disk Space:", f"{disk.used / (1024 ** 3):.2f} GB")
print("Free Disk Space:", f"{disk.free / (1024 ** 3):.2f} GB")

# Get GPU details (requires NVIDIA GPU and CUDA installed)
try:
    import torch
    if torch.cuda.is_available():
        print("\nGPU Info:")
        print("GPU Name:", torch.cuda.get_device_name(0))
        print("GPU VRAM:", f"{torch.cuda.get_device_properties(0).total_memory / (1024 ** 3):.2f} GB")
    else:
        print("\nNo GPU detected or CUDA not available.")
except ImportError:
    print("\nTorch not installed; skipping GPU check.")


import faiss
print(faiss.get_num_gpus())

Operating System: Windows
OS Version: 10.0.26100
OS Release: 10

CPU Info:
Processor: Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
Number of CPU Cores: 12

RAM Info:
Total RAM: 15.84 GB
Available RAM: 1.84 GB

Disk Info (C: drive):
Total Disk Space: 441.52 GB
Used Disk Space: 232.55 GB
Free Disk Space: 208.97 GB

GPU Info:
GPU Name: NVIDIA GeForce GTX 1650 Ti
GPU VRAM: 4.00 GB
1


# import the pre-trained model

In [9]:
# # Install main packages (-U upgrades versions that are already installed)
# %pip install -U transformers bitsandbytes accelerate peft datasets

# # Install LangChain and FAISS (use faiss-gpu-cu12 for GPU; fallback to faiss-cpu if issues)
# %pip install langchain-community faiss-gpu-cu12 sentence-transformers

# Optional: Downgrade PyArrow if conflicts arise (e.g., if cudf errors pop up during imports)
# !pip install pyarrow==18.0.0 --force-reinstall


In [1]:
import os
import numpy as np
import pandas as pd
import re
from sentence_transformers import SentenceTransformer
import faiss
from faiss import read_index, write_index
import torch
import logging
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import nltk
from openai import OpenAI  # Add this import for API

# Download VADER lexicon if not already downloaded
# nltk.download('vader_lexicon', quiet=True)

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Detect device (still needed for embedder, but not for LLM)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
logger.info(f"Using device: {device}")

# Set up OpenAI client for Hugging Face API router
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)
model_name = "meta-llama/Llama-3.3-70B-Instruct:fireworks-ai"  # Use a larger model via API

INFO:__main__:Using device: cuda


# RAG DATA PREP

In [2]:
import os
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer
import faiss
# from PyPDF2 import PdfReader  # Or import pdfplumber for better extraction
import pdfplumber  # Preferred over PyPDF2 for better text extraction
import logging
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download VADER lexicon if not already downloaded
nltk.download('vader_lexicon', quiet=True)

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Detect device
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
logger.info(f"Using device: {device}")

# Configurable paths (change category as needed)
category = 'general'  # One of: 'anxiety', 'depression', 'ptsd', 'suicide', 'general'
pdf_dir = f'./rag_system/{category}/'
rag_chunks_csv_path = f'./rag_system/{category}/chunks.csv'
rag_embeddings_path = f'./rag_system/{category}/chunk_embeddings.npy'
rag_index_path = f'./rag_system/{category}/faiss_index.index'

# Ensure directories exist
os.makedirs(os.path.dirname(rag_chunks_csv_path), exist_ok=True)

# Function to extract and chunk text from a single PDF
def extract_and_chunk_pdf(pdf_path, chunk_size=400, overlap=50):
    chunks = []
    try:
        with pdfplumber.open(pdf_path) as pdf:
            words, word_pages = [], []  # Parallel lists: each word and the page it came from
            for page_num, page in enumerate(pdf.pages, start=1):
                text = page.extract_text() or ""  # Use layout=True if needed: page.extract_text(layout=True)
                page_words = f"[Page {page_num}] {text}".split()
                words.extend(page_words)
                word_pages.extend([page_num] * len(page_words))

        # Simple chunking: split by words with overlap
        for i in range(0, len(words), chunk_size - overlap):
            chunk_words = words[i:i + chunk_size]
            chunk_text = " ".join(chunk_words)
            chunks.append({
                'content': chunk_text,
                'source_pdf': os.path.basename(pdf_path),
                'page_start': word_pages[i],  # Page on which the chunk's first word appears
                'category': category
            })

        logger.info(f"Extracted {len(chunks)} chunks from {pdf_path}")

    except Exception as e:
        logger.error(f"Error processing {pdf_path}: {e}")
    return chunks

# ==================================================================

# Step 1: Process all PDFs and save chunks to CSV
if os.path.exists(rag_chunks_csv_path):
    logger.info(f"Loading existing chunks from {rag_chunks_csv_path}")
    chunks_df = pd.read_csv(rag_chunks_csv_path)
else:
    all_chunks = []
    for filename in os.listdir(pdf_dir):
        if filename.endswith('.pdf'):
            pdf_path = os.path.join(pdf_dir, filename)
            pdf_chunks = extract_and_chunk_pdf(pdf_path)
            all_chunks.extend(pdf_chunks)

    if not all_chunks:
        raise ValueError(f"No chunks extracted from PDFs in {pdf_dir}")
    
    chunks_df = pd.DataFrame(all_chunks)
    chunks_df['chunk_id'] = range(len(chunks_df))  # Add unique ID
    chunks_df.to_csv(rag_chunks_csv_path, index=False)
    logger.info(f"Saved {len(chunks_df)} chunks to {rag_chunks_csv_path}")

# Add sentiment if not present
if 'sentiment' not in chunks_df.columns:
    logger.info("Computing sentiment for chunks...")
    sia = SentimentIntensityAnalyzer()
    
    def get_sentiment(text):
        score = sia.polarity_scores(text)['compound']
        if score > 0.05:
            return 'positive'
        elif score < -0.05:
            return 'negative'
        else:
            return 'neutral'
    
    chunks_df['sentiment'] = chunks_df['content'].apply(get_sentiment)
    chunks_df.to_csv(rag_chunks_csv_path, index=False)  # Resave with sentiment
    logger.info("Sentiment added and CSV updated.")

# Load embedder
embedder = SentenceTransformer('all-MiniLM-L6-v2', device=device)

# Step 2: Load or compute embeddings
if os.path.exists(rag_embeddings_path):
    logger.info("Loading precomputed embeddings...")
    chunk_embeddings_np = np.load(rag_embeddings_path)
else:
    logger.info("Computing embeddings...")
    chunk_contents = chunks_df['content'].tolist()
    chunk_embeddings = embedder.encode(
        chunk_contents,
        batch_size=128,
        show_progress_bar=True,
        convert_to_tensor=True,
        device=device,
        normalize_embeddings=True  # Built-in normalization
    )
    chunk_embeddings_np = chunk_embeddings.cpu().numpy()
    np.save(rag_embeddings_path, chunk_embeddings_np)
    logger.info(f"Embeddings saved to {rag_embeddings_path}")

# Step 3: Build/Load FAISS Index
d = chunk_embeddings_np.shape[1]  # Embedding dimension
if os.path.exists(rag_index_path):
    logger.info("Loading existing FAISS index...")
    faiss_index = faiss.read_index(rag_index_path)
else:
    logger.info("Building FAISS index...")
    faiss_index = faiss.IndexFlatIP(d)  # Inner product for normalized vectors
    faiss_index.add(chunk_embeddings_np)
    faiss.write_index(faiss_index, rag_index_path)  # Same variable name as the load branch
    logger.info(f"FAISS index saved to {rag_index_path}")




INFO:__main__:Using device: cuda
INFO:__main__:Loading existing chunks from ./rag_system/general/chunks.csv
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
INFO:__main__:Loading precomputed embeddings...
INFO:__main__:Loading existing FAISS index...
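
The word-window chunking used in the cell above can be sketched in isolation as follows. This is an illustrative reduction, not the notebook's function: `chunk_words` is a hypothetical helper that takes an already-split word list, and the 1000-word input is synthetic.

```python
def chunk_words(words, chunk_size=400, overlap=50):
    """Split a word list into windows of chunk_size that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # each window starts `step` words after the previous one
    return [words[i:i + chunk_size] for i in range(0, len(words), step)]


words = [f"w{i}" for i in range(1000)]  # synthetic document of 1000 words
chunks = chunk_words(words)
print([len(c) for c in chunks])
print(chunks[0][-50:] == chunks[1][:50])  # consecutive chunks share the overlap region
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.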


In [3]:
# RAG paths for Beyond Blue posts (configurable)
bblue_data_path = './rag_system/beyondblue/beyond_df.csv'
bblue_embeddings_path = './rag_system/beyondblue/post_embeddings.npy'
bblue_index_path = './rag_system/beyondblue/faiss_index.index'

# Load the Beyond Blue DataFrame
try:
    beyondblue_df = pd.read_csv(bblue_data_path)
    logger.info(f"Loaded {len(beyondblue_df)} posts from {bblue_data_path}")
except FileNotFoundError:
    raise FileNotFoundError(f"CSV file not found at {bblue_data_path}. Please check the path.")

# Load embedder (for embeddings)
embedder = SentenceTransformer('all-MiniLM-L6-v2', device=device)

# Load or compute embeddings for posts
if os.path.exists(bblue_embeddings_path):
    logger.info("Loading precomputed embeddings...")
    post_embeddings_np = np.load(bblue_embeddings_path)
else:
    logger.info("Computing embeddings...")
    post_contents = beyondblue_df['clean_title_content_comments'].tolist()
    post_embeddings = embedder.encode(
        post_contents,
        convert_to_tensor=True,
        device=device,
        batch_size=128,
        show_progress_bar=True,
        normalize_embeddings=True  # Use built-in normalization
    )
    post_embeddings_np = post_embeddings.cpu().numpy()
    np.save(bblue_embeddings_path, post_embeddings_np)
    logger.info(f"Embeddings saved to {bblue_embeddings_path}")

# Build/Load FAISS Index for posts
if os.path.exists(bblue_index_path):
    logger.info("Loading precomputed FAISS index...")
    post_index = read_index(bblue_index_path)
else:
    logger.info("Building FAISS index...")
    d = post_embeddings_np.shape[1]  # Embedding dimension
    post_index = faiss.IndexFlatIP(d)  # Inner Product for normalized vectors
    post_index.add(post_embeddings_np)
    write_index(post_index, bblue_index_path)
    logger.info(f"FAISS index saved to {bblue_index_path}")





# Load resources for all categories


# Categories (assuming lowercase for paths, but adjust if needed)
categories = ['anxiety', 'depression', 'ptsd', 'suicide', 'general']  # Include 'general' as fallback

# Load category-specific resources into a dictionary
category_resources = {}
for cat in categories:
    chunks_csv = f'./rag_system/{cat}/chunks.csv'
    emb_path = f'./rag_system/{cat}/chunk_embeddings.npy'
    idx_path = f'./rag_system/{cat}/faiss_index.index'
    
    if os.path.exists(chunks_csv) and os.path.exists(emb_path) and os.path.exists(idx_path):
        chunks_df = pd.read_csv(chunks_csv)
        # Ensure sentiment is present (fallback compute if missing)
        if 'sentiment' not in chunks_df.columns:
            logger.info(f"Computing sentiment for {cat} chunks...")
            sia = SentimentIntensityAnalyzer()
            def get_sentiment(text):
                score = sia.polarity_scores(text)['compound']
                if score > 0.05: return 'positive'
                elif score < -0.05: return 'negative'
                else: return 'neutral'
            chunks_df['sentiment'] = chunks_df['content'].apply(get_sentiment)
            chunks_df.to_csv(chunks_csv, index=False)
            logger.info(f"Sentiment added for {cat} and CSV updated.")
        
        chunk_emb_np = np.load(emb_path)
        cat_index = read_index(idx_path)
        category_resources[cat] = {
            'df': chunks_df,
            'embeddings': chunk_emb_np,
            'index': cat_index
        }
        logger.info(f"Loaded resources for category: {cat}")
    else:
        logger.warning(f"Missing resources for category: {cat}. Skipping.")

INFO:__main__:Loaded 11573 posts from ./rag_system/beyondblue/beyond_df.csv
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
INFO:__main__:Loading precomputed embeddings...
INFO:__main__:Loading precomputed FAISS index...
INFO:__main__:Loaded resources for category: anxiety
INFO:__main__:Loaded resources for category: depression
INFO:__main__:Loaded resources for category: ptsd
INFO:__main__:Loaded resources for category: suicide
INFO:__main__:Loaded resources for category: general
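
The cells above pair `normalize_embeddings=True` with `IndexFlatIP`. The reason is that the inner product of unit-length vectors equals their cosine similarity. The pure-Python sketch below illustrates this with a brute-force search; `normalize`, `dot`, `cosine`, and `top_k` are hypothetical helpers written for this example and do not use FAISS.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def top_k(query, vectors, k):
    """Brute-force inner-product search over normalized vectors."""
    q = normalize(query)
    scored = [(dot(q, normalize(v)), i) for i, v in enumerate(vectors)]
    return sorted(scored, reverse=True)[:k]  # list of (score, index), best first


a, b = [3.0, 4.0], [4.0, 3.0]
print(dot(normalize(a), normalize(b)), cosine(a, b))  # the two values agree

vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.9], vectors, k=2))
```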


# building the RAG system

In [5]:
# Function for RAG Query on posts (to get category) - refined for single top match
def rag_search(query: str, k: int = 1) -> dict:
    """Embed query and search FAISS index for the most similar post to determine category."""
    query_embedding = embedder.encode(query, convert_to_tensor=True, device=device, normalize_embeddings=True)
    query_embedding_np = query_embedding.cpu().numpy()
    query_embedding_np = query_embedding_np.reshape(1, -1)  # Reshape for FAISS

    distances, indices = post_index.search(query_embedding_np, 1)  # Always top 1 for category detection
    idx = indices[0][0]
    if idx == -1:  # FAISS pads missing results with -1 (e.g., when the index is empty)
        raise ValueError("No similar posts found.")
    
    score = distances[0][0]
    post = beyondblue_df.iloc[idx]['clean_title_content_comments']
    category = beyondblue_df.iloc[idx]['Post Category'].lower()  # Normalize to lowercase
    logger.info(f"Most similar post (score {score:.4f}): {post[:200]}... Category: {category}")
    return {
        'post': post,
        'category': category,
        'score': score
    }

# API-based generation for RAG (single prompt)
def generate_with_llm_rag(prompt: str, max_tokens: int = 100):
    messages = [{"role": "user", "content": prompt}]
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        max_tokens=max_tokens,
        temperature=0.8,
        top_p=0.95,
        frequency_penalty=1.2  # Approximate repetition_penalty with frequency_penalty
    )
    return completion.choices[0].message.content.strip()

# Category retrieval with method - refined to use detected category
def category_retrieve(category: str, query: str, method: str = 'original', k: int = 20) -> list[dict]:  # Retrieve more for rerank
    if category not in category_resources:
        logger.warning(f"Category '{category}' not found. Falling back to 'general'.")
        category = 'general'
        if category not in category_resources:
            raise ValueError("No 'general' resources available as fallback.")
    
    res = category_resources[category]
    chunks_df = res['df']
    cat_index = res['index']
    
    queries = [query]
    
    if method == 'multi':
        prompt = f"""
            Strictly follow: Output EXACTLY 5 unique variant queries similar to '{query}', one per line.
            Preserve key elements (events, causes, feelings).
            No reasoning, no steps, no examples, no introductions, no extra text at all.
            Directly start with the first query.

            Example output format (do not include this in output):
            Variant1
            Variant2
            Variant3
            Variant4
            Variant5
            """
        variants_response = generate_with_llm_rag(prompt)
        # Drop instruction echoes; \b keeps variants such as "Nothing helps..." or "Today I..."
        skip_prefixes = r'^(#|(Step|Example|For reference|The|To|Output|Preserve|No|Directly)\b)'
        variants = [
            line.strip() for line in variants_response.split('\n')
            if line.strip() and len(line.split()) > 2 and not re.match(skip_prefixes, line.strip(), re.I)
        ]
        # Order-preserving de-duplication; skip variants identical to the original query
        queries = list(dict.fromkeys(v for v in variants if v.lower() != query.lower()))[:5]
        if len(queries) < 3:  # Fallback if poor generation
            queries = [query, f"{query} coping strategies", f"{query} emotional support"]
        logger.info(f"Generated multi-queries: {queries}")

    elif method == 'hyde':
        prompt = f"""
            Generate a concise hypothetical answer document for the query '{query}'.
            Output only the document text, without any introductions, explanations, numbering,
            or extra formatting.
        """
        hyde_response = generate_with_llm_rag(prompt)
        # Clean up: Join lines into a single document string, remove leading/trailing whitespace, and strip common prefixes like "Document:"
        hyde_doc = ' '.join(line.strip() for line in hyde_response.split('\n') if line.strip()).replace('Document:', '').strip()
        queries = [hyde_doc]  # Use the cleaned doc as "query" for embedding
        logger.info(f"Generated HyDE document: {hyde_doc[:200]}...")
    
    # Embed all queries
    query_embs = embedder.encode(queries, convert_to_tensor=True, device=device, normalize_embeddings=True).cpu().numpy()
    
    # Search for each, collect unique results with max score
    all_results = {}
    for q_emb in query_embs:
        q_emb = q_emb.reshape(1, -1)
        distances, indices = cat_index.search(q_emb, k=k)
        for i in range(len(distances[0])):
            idx = indices[0][i]
            if idx == -1: continue
            score = distances[0][i]
            if idx not in all_results or score > all_results[idx]['score']:
                row = chunks_df.iloc[idx]
                all_results[idx] = {
                    'content': row['content'],
                    'source_pdf': row['source_pdf'],
                    'page_start': row['page_start'],
                    'sentiment': row['sentiment'],
                    'score': score
                }
    
    # Get top k by score first (before rerank)
    sorted_results = sorted(all_results.values(), key=lambda x: x['score'], reverse=True)[:k]
    return sorted_results

# Rerank function
def rerank_results(results: list[dict]) -> list[dict]:
    sentiment_order = {'positive': 0, 'neutral': 1, 'negative': 2}
    # Sort by sentiment order, then by score descending
    sorted_results = sorted(results, key=lambda x: (sentiment_order.get(x['sentiment'], 3), -x['score']))
    return sorted_results[:5]  # Top 5 after rerank

# Full workflow - refined to always use top similar post for category
def full_rag_workflow(query: str, method: str = 'original') -> tuple[list[dict], dict]:
    # Step 1: Get category from the most similar Beyond Blue post
    post_result = rag_search(query)  # k=1 by default
    category = post_result['category']

    category = 'general'  # Comment out if you want dynamic category

    # Step 2: Retrieve from category-specific database with method
    retrieved = category_retrieve(category, query, method=method)
    
    # Step 3: Rerank
    reranked = rerank_results(retrieved)
    return reranked, post_result

# API-based generation for chatbot (messages list)
def generate_with_llm(messages: list[dict], max_tokens: int = 150) -> str:
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        max_tokens=max_tokens,
        temperature=0.8,
        top_p=0.95,
        frequency_penalty=1.2  # Approximate repetition_penalty
    )
    return completion.choices[0].message.content.strip()








example_query = "my dog pass away"

# Test with different methods: 'original', 'multi', 'hyde'
results, bpost_result = full_rag_workflow(example_query, method='hyde')  # Change method as needed
for i, res in enumerate(results):
    print(f"Top {i+1}: Score: {res['score']:.4f}, Sentiment: {res['sentiment']}, Source: {res['source_pdf']}")
    print(f"Content snippet: {res['content'][:300]}...\n")





INFO:__main__:Most similar post (score 0.5467): dog pass may seem silly unreasonable someone look outside dear alizerath look finish post could take guess say want say dog world family deeply deeply sadden sher pass lose pet pet arehave cldren neve... Category: depression
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: Losing a pet can be a devastating experience and it's normal to feel overwhelmed with grief. Take time to process your emotions and consider reaching out to friends, family, or a support group for hel...



Top 1: Score: 0.6218, Sentiment: positive, Source: bl0390-grief-loss-and-depression-fact-sheet_acc.pdf
Content snippet: of grief, however, you begin to create new irritable or numb. Many of these reactions are experiences and habits that work around your not constant but instead can come in waves; loss. You slowly begin to experience a greater often triggered by memories or occasions. The sense of hope; focusing more...

Top 2: Score: 0.4716, Sentiment: positive, Source: bl0390-grief-loss-and-depression-fact-sheet_acc.pdf
Content snippet: life. • Look after yourself. Helping a grieving person Australian Centre for Grief can be a heavy burden. Take care of your and Bereavement own physical and emotional health, and talk about your feelings with someone during this www.grief.org.au stressful time. Information about grief and support fo...

Top 3: Score: 0.4343, Sentiment: positive, Source: What works to promote emotional wellbeing.pdf
Content snippet: There is not enough evidence to show
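
The merge and rerank steps in `category_retrieve` and `rerank_results` can be illustrated on toy data. In this sketch, `merge_max_score` and `rerank` are simplified stand-ins written for the example, and the hit dictionaries (with a hypothetical `chunk_id` key) are made up rather than taken from the real index.

```python
SENTIMENT_ORDER = {"positive": 0, "neutral": 1, "negative": 2}


def merge_max_score(result_lists):
    """Combine hits from several queries, keeping each chunk's highest score."""
    best = {}
    for results in result_lists:
        for hit in results:
            cid = hit["chunk_id"]
            if cid not in best or hit["score"] > best[cid]["score"]:
                best[cid] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


def rerank(hits, top_n=5):
    """Order by sentiment first (positive before negative), then by score."""
    key = lambda h: (SENTIMENT_ORDER.get(h["sentiment"], 3), -h["score"])
    return sorted(hits, key=key)[:top_n]


query_a_hits = [
    {"chunk_id": 1, "score": 0.62, "sentiment": "negative"},
    {"chunk_id": 2, "score": 0.47, "sentiment": "positive"},
]
query_b_hits = [
    {"chunk_id": 2, "score": 0.55, "sentiment": "positive"},
    {"chunk_id": 3, "score": 0.43, "sentiment": "neutral"},
]

merged = merge_max_score([query_a_hits, query_b_hits])
reranked = rerank(merged)
print([h["chunk_id"] for h in merged])
print([h["chunk_id"] for h in reranked])
```

Because sentiment is the primary sort key, a lower-scoring positive chunk outranks a higher-scoring negative one. That favors encouraging material but can push the most relevant passage down the list.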

In [14]:
# Refined chatbot loop with history and token management
def run_chatbot(method: str = 'original', max_context_tokens: int = 8192, max_history_pairs: int = 10):
    
    system_prompt = """
        You are a compassionate and empathetic mental health assistant. Your responses should:
        - Analyze the user's input with care and understanding.
        - Respond directly in a comforting manner, tailoring to the specific query, history, and provided guidelines.
        - Incorporate relevant details from guidelines naturally (e.g., coping strategies).
        - Offer gentle, practical coping suggestions where relevant, varying them based on context.
        - Keep the tone warm, supportive, and conversational—vary phrasing to avoid repetition.
        - Do not diagnose, give medical advice, or promise cures. Always suggest professional help if needed (e.g., helplines like beyondblue: 1300 22 4636).
        - Responses should be 60-100 words.
        """
    
    history = []  # List of dicts: [{'role': 'user', 'content': ...}, {'role': 'assistant', 'content': ...}]
    
    print("Welcome to the Mental Health Chatbot. Type 'quit' to exit.")
    
    while True:
        try:
            query = input("You: ").strip()
            if query.lower() in ['quit', 'exit']:
                print("Chatbot: Goodbye! Take care.")
                break
            
            if not query:
                print("Chatbot: Please enter a message.")
                continue
            
            # Get RAG results
            retrieved_docs, post_result = full_rag_workflow(query, method=method)
            
            # Format retrieved docs as context string
            doc_context = "\n".join([f"Doc {i+1} (Sentiment: {doc['sentiment']}, Score: {doc['score']:.4f}): {doc['content']}" for i, doc in enumerate(retrieved_docs)])
            
            
            # Build user prompt from the query and the retrieved guidelines
            user_prompt = f"""
                    User's message: {query}

                    Additional relevant guidelines (use these to inform suggestions if they fit the query):
                    {doc_context}

                    Respond empathetically, drawing from guidelines for specific ideas. If the query involves loss or grief, suggest personalized memorials or support resources. Vary your language from previous responses.
                """
            
            # Build messages
            messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": user_prompt}]
            
            # Truncate history proactively if the estimated token count exceeds the limit.
            # Without a local tokenizer, this relies on a rough word-based estimate;
            # the API enforces its own context limit (e.g., 128k for Llama-3.3).
            def estimate_tokens(text: str) -> int:
                return int(len(text.split()) * 1.3) + 100  # Rough estimate + per-message buffer

            while sum(estimate_tokens(msg['content']) for msg in messages) > max_context_tokens:
                if not history:
                    logger.warning("Context exceeds limit even without history; proceeding.")
                    break
                history = history[2:]  # Remove oldest pair
                messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": user_prompt}]
                logger.info(f"Truncated history to {len(history)} entries.")
            
            # Generate response
            response = generate_with_llm(messages, max_tokens=150)
            
            print("Chatbot:", response)
            
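            # Append to history, keeping at most max_history_pairs user/assistant pairs
            history.append({"role": "user", "content": query})
            history.append({"role": "assistant", "content": response})
            history = history[max(0, len(history) - 2 * max_history_pairs):]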
        
        except Exception as e:
            logger.error(f"Error in chatbot loop: {e}")
            print("Chatbot: Sorry, something went wrong. Please try again.")



run_chatbot(method='multi')  # Change method as needed: 'original', 'multi', 'hyde'

Welcome to the Mental Health Chatbot. Type 'quit' to exit.



INFO:__main__:Most similar post (score 0.5369): stress get stress headache assume need talk someone hello dear guest warm care welcome forum sorry you re get stress headache you re stress that s cause please you re talk kind thought dear guest gran... Category: anxiety
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated multi-queries: ['I feel completely burnt out and exhausted', 'I am extremely frustrated and tense', "I'm feeling overwhelmed and anxious", 'I am totally drained and feeling hopeless', 'My stress levels are really high right now']



INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'm so sorry to hear you're feeling stressed. It's completely normal, and many people experience it. Try taking a few deep breaths and identifying what's triggering your stress. You might consider exercise, like a short walk, or practicing relaxation techniques like progressive muscle relaxation to help calm down. If needed, reach out to a professional for support, like calling beyondblue at 1300 22 4636.



INFO:__main__:Most similar post (score 0.5075): extremely overwhelmed uni nearly end uni panic attack nearly everyday dream uni have not do try break hard try find tutor can not work full day week also everyone pressure keep together leave much de ... Category: anxiety
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated multi-queries: ["i'm finding it increasingly difficult to keep up with my studies and it's leaving me with a sense of inadequacy", 'i am struggling more and more with my studies and it feels like everyone else is ahead of me', 'my academic progress is slowing down and i feel overwhelmed by the fear of being left behind my peers', "my studying is getting harder every day and i always feel like i'm falling behind"]



INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: Feeling left behind in your studies can be really overwhelming. Remember that everyone learns at their own pace, and it's okay to take things one step at a time. Try breaking down your study material into smaller, manageable chunks, and focus on making progress rather than comparing yourself to others. Take breaks, practice self-care, and reach out to friends or a tutor for support when you need it. You got this!
Chatbot: Goodbye! Take care.
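
The history truncation inside `run_chatbot` can be sketched as a standalone helper. `trim_history` is a hypothetical function for illustration; the word-count heuristic mirrors the rough estimate used in the loop and is not a real tokenizer.

```python
def estimate_tokens(text):
    """Rough token estimate: about 1.3 tokens per word plus a per-message buffer."""
    return int(len(text.split()) * 1.3) + 100


def trim_history(system_prompt, history, user_prompt, max_context_tokens):
    """Drop the oldest user/assistant pair until the estimate fits the budget."""
    history = list(history)  # work on a copy; the caller's list is left untouched

    def total_tokens():
        texts = [system_prompt, user_prompt] + [m["content"] for m in history]
        return sum(estimate_tokens(t) for t in texts)

    while history and total_tokens() > max_context_tokens:
        history = history[2:]  # remove the oldest pair
    return history


history = []
for turn in range(2):
    history.append({"role": "user", "content": f"user message {turn} " + "word " * 8})
    history.append({"role": "assistant", "content": f"assistant reply {turn} " + "word " * 8})

trimmed = trim_history("Be kind.", history, "How do I cope with stress?", max_context_tokens=500)
print(len(history), "->", len(trimmed))
```

Removing whole pairs keeps the user and assistant roles alternating, which chat-completion APIs generally expect.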


# Prompt engineering

SYSTEM PROMPT (Behavior Guidelines):
You are a compassionate, emotionally intelligent mental health assistant.
Your role is to support users with warmth, honesty, and practical insight.

INSTRUCTIONS:
- Begin by validating the user's feelings
- Speak with empathy and clarity
- Keep responses concise (2-4 sentences)
- Offer one clear, actionable technique
- Never fabricate facts or provide medical diagnoses

RETRIEVED CONTEXT (From RAG):
"Work stress is incredibly common. Try the '3-breath reset' - take three deep breaths, then write down just three things you need to accomplish today. This helps create manageable chunks instead of an endless to-do list."

USER INPUT:
I'm feeling overwhelmed with work stress

RESPONSE FORMAT:
[Validation] + [Brief explanation] + [Actionable technique] + [Reassurance]

Now respond as the mental health assistant:
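
One way the template above might be assembled into chat messages is sketched below. The function `build_messages` and the split between the system and user messages are assumptions for illustration; the notebook's own `run_chatbot` builds its prompts differently.

```python
SYSTEM_PROMPT = (
    "You are a compassionate, emotionally intelligent mental health assistant.\n"
    "Your role is to support users with warmth, honesty, and practical insight."
)

INSTRUCTIONS = [
    "Begin by validating the user's feelings",
    "Speak with empathy and clarity",
    "Keep responses concise (2-4 sentences)",
    "Offer one clear, actionable technique",
    "Never fabricate facts or provide medical diagnoses",
]

RESPONSE_FORMAT = "[Validation] + [Brief explanation] + [Actionable technique] + [Reassurance]"


def build_messages(retrieved_context, user_input):
    """Return a chat-style message list: fixed guidance first, per-turn content second."""
    system = (
        f"{SYSTEM_PROMPT}\n\nINSTRUCTIONS:\n"
        + "\n".join(f"- {item}" for item in INSTRUCTIONS)
        + f"\n\nRESPONSE FORMAT:\n{RESPONSE_FORMAT}"
    )
    user = (
        f'RETRIEVED CONTEXT (From RAG):\n"{retrieved_context}"\n\n'
        f"USER INPUT:\n{user_input}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


messages = build_messages(
    "Work stress is incredibly common. Try the '3-breath reset'.",
    "I'm feeling overwhelmed with work stress",
)
for message in messages:
    print(f"--- {message['role']} ---\n{message['content']}\n")
```

Keeping the behavior guidelines in the system message and the retrieved context in the user message means only the second message changes from turn to turn.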