1. Started with: Pre-trained LLM (meta-llama/Llama-3.3-70B-Instruct:fireworks-ai)
2. Applied: Fine-tuning on mental health datasets (none)
3. Built: RAG architecture with mental health knowledge base
4. Used: Prompt engineering to guide empathetic responses
5. Included: System prompts for safety and behavior guidelines
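
Steps 3 to 5 can be pictured as assembling one chat request: a system prompt carrying the safety and behaviour rules, followed by the user message with the retrieved passages attached. A minimal sketch, assuming a hypothetical `build_messages` helper and illustrative prompt wording (neither is taken from this notebook):

```python
# Illustrative wording only; the notebook's real system prompt is not shown here.
SYSTEM_PROMPT = (
    "You are a supportive mental-health assistant. Respond with empathy, "
    "do not diagnose, and encourage professional help when risk is mentioned."
)

def build_messages(user_text, retrieved_chunks, system_prompt=SYSTEM_PROMPT):
    """Return an OpenAI-style message list with retrieved context attached."""
    # Number the retrieved passages so the model can refer to them.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    if context:
        user_content = f"Context:\n{context}\n\nUser message:\n{user_text}"
    else:
        user_content = user_text  # no retrieval hits: send the message as-is
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]

messages = build_messages("I can't sleep before exams.", ["Sleep hygiene tips ..."])
```

A list in this shape is what an OpenAI-compatible chat completion endpoint expects as its `messages` argument.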

# pre-existing generic mental-health chatbots


LLM-based chatbots (newer generation; 22 identified in total, 11 listed here):

MindGuide,
ChatCounselor, 
Replika,
PsyChat,
SMILE,
Psy-LLM,
Assistant-Instruct,
CBT-LLM,
MindWatch,
EmLLM,
Counsellor Chatbot

# data resources

1. Beyond Blue: https://www.beyondblue.org.au/mental-health/resource-library
2. Reachout: https://au.reachout.com/challenges-and-coping/abuse-and-violence/sexual-assault-support-services
3. First aid: https://www.mhfa.com.au/resources-support
4. CCI: https://www.cci.health.wa.gov.au/Resources/Looking-After-Others4
5. Sexual Assault Support Service (SASS): https://www.sass.org.au/resources




=========================================================================

CBT (Cognitive Behavioural Therapy)


=========================================================================

ICD-11 data (Profession)

DSM-5-TR


=========================================================================




# Metrics

Technical Metrics (Computer-Based)
Most common metrics:

1. Perplexity: How well the model predicts responses (lower = better)
2. ROUGE-L: Measures if responses match expected answers
3. BLEU scores: Checks precision of language generation
4. Distinct-1/2/3: Measures response variety (can it say things differently?)
5. BERTScore: Captures semantic meaning (does it understand context?)
6. Empathy%: What percentage of responses show compassion?
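
Of the metrics above, Distinct-n is the simplest to compute by hand: it is the number of unique n-grams divided by the total number of n-grams across the generated responses. A minimal sketch, assuming whitespace tokenisation and a hypothetical `distinct_n` helper (published implementations may tokenise differently):

```python
def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams across all responses."""
    total = 0
    unique = set()
    for text in responses:
        tokens = text.lower().split()  # whitespace tokenisation is an assumption
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    # Guard against empty input or responses shorter than n tokens.
    return len(unique) / total if total else 0.0

replies = ["i hear you", "i hear how hard that is"]
print(distinct_n(replies, 1), distinct_n(replies, 2))
```

Values closer to 1 mean the model rarely repeats itself; values near 0 indicate templated, repetitive replies.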

Human Evaluation Metrics (Expert/User Assessment)
General metrics:

1. Helpfulness, Fluency, Relevance, Logic
2. Informativeness, Understanding, Consistency, Coherence, Empathy, Expertise, Engagement

Counseling-specific metrics:

1. Direct Guidance, Approval and Reassurance, Restatement, Reflection, Listening, Interpretation, Self-disclosure
(These mirror what real therapists do)

Reliability check:

1. Krippendorff's Alpha - ensures different evaluators agree
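
For categorical ratings, Krippendorff's alpha compares observed disagreement with the disagreement expected by chance: 1.0 means perfect agreement and values near 0 mean agreement no better than chance. A minimal sketch for nominal labels only, assuming a hypothetical `krippendorff_alpha_nominal` helper and made-up ratings; ordinal or interval scales need a different distance function:

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of lists: the labels each item received from the raters
    who scored it (missing ratings are simply left out).
    """
    # Coincidence matrix: every ordered pair of ratings within an item,
    # weighted by 1 / (number of ratings for that item - 1).
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # items rated once carry no agreement information
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        return 1.0  # convention chosen here: a single label means nothing to disagree on
    return 1 - (n - 1) * observed / expected

# Two raters, four responses, labels are invented for illustration.
ratings = [["empathic", "empathic"], ["neutral", "neutral"],
           ["empathic", "neutral"], ["neutral", "empathic"]]
print(krippendorff_alpha_nominal(ratings))
```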



# scraping DSM-5-TR clinical cases

In [None]:
import pandas as pd  # Kept as in original, though not used here
import time
import random
import json
import requests
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
import undetected_chromedriver as uc  # Import the undetectable version

# Set up options for undetectable Chrome
options = uc.ChromeOptions()
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36')  # Example user-agent

# Enable performance logging to capture network requests
options.set_capability("goog:loggingPrefs", {'performance': 'ALL'})

# Initialize the undetectable driver
driver = uc.Chrome(options=options)

try:
    # Navigate to the URL (https://psychiatryonline.org/doi/epub/10.1176/appi.books.9781615375295)
    driver.get('https://psychiatryonline.org/doi/epdf/10.1176/appi.books.9781615375295')
    
    # Wait for the PDF viewer to load
    WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.ID, 'viewer')))
    
    # Simulate human scroll to load more content (e.g., scroll down a bit)
    actions = ActionChains(driver)
    actions.move_by_offset(random.randint(100, 300), random.randint(100, 300)).perform()  # Random mouse move
    time.sleep(random.uniform(1, 3))  # Random delay
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight / 2);")  # Scroll halfway
    time.sleep(random.uniform(2, 5))  # Wait like a human reading
    
    # If login is required, manually log in via the browser window, then press Enter here
    # input("Log in if necessary and press Enter to continue...")
    
    # Optional: More scrolls to ensure all pages load if lazy-loaded
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(random.uniform(3, 6))
    
    # Hide webdriver property (though undetected_chromedriver handles most of this)
    driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
    
    # Give some time for network requests to complete
    time.sleep(5)
    
    # Get performance logs
    logs = driver.get_log('performance')
    
    pdf_url = None
    for entry in logs:
        message = json.loads(entry['message'])['message']
        if message['method'] == 'Network.responseReceived':
            response = message['params'].get('response', {})
            url = response.get('url', '').lower()
            if '.pdf' in url or 'application/pdf' in response.get('mimeType', ''):
                pdf_url = response['url']
                print(f"Found potential PDF URL: {pdf_url}")
                break  # Assuming the first PDF is the main one; adjust if multiple
    
    if not pdf_url:
        raise Exception("No PDF URL found in network logs. Try scrolling more or increasing wait time.")
    
    # Download the PDF using requests with cookies from the browser session
    cookies = {c['name']: c['value'] for c in driver.get_cookies()}
    headers = {'User-Agent': options.arguments[-1].split('=')[1]}  # Reuse user-agent
    response = requests.get(pdf_url, cookies=cookies, headers=headers)
    
    if response.status_code == 200:
        with open('downloaded.pdf', 'wb') as f:
            f.write(response.content)
        print("PDF downloaded successfully as 'downloaded.pdf'")
    else:
        raise Exception(f"Failed to download PDF: Status code {response.status_code}")

finally:
    driver.quit()


In [None]:




import PyPDF2
import csv
import re

# def extract_text_from_pdf(pdf_path):
#     with open(pdf_path, 'rb') as file:
#         reader = PyPDF2.PdfReader(file)
#         text = ''
#         for page in reader.pages:
#             text += page.extract_text() + '\n'
#     return text


# def write_to_csv(cases, csv_path='output_cases.csv'):
#     with open(csv_path, 'w', newline='', encoding='utf-8') as file:
#         writer = csv.DictWriter(file, fieldnames=['case', 'description', 'discussion', 'diagnosis'])
#         writer.writeheader()
#         writer.writerows(cases)
#     print(f'CSV file created at: {csv_path}')

# # Usage
# pdf_path = './rag_system/dsm5/DSM-5-TR clinical case.pdf'  # Replace with your PDF path
# text = extract_text_from_pdf(pdf_path)
# # save the text to a txt file for checking
# with open('extracted_text.txt', 'w', encoding='utf-8') as f:
#     f.write(text)



def clean_text(text):
    """Remove extra whitespace and empty lines, put all text in one line"""
    # Fix spaced-out words (e.g., "p s y c h i a t r y" -> "psychiatry") by collapsing
    # runs of three or more single-character tokens in one pass. The earlier
    # look-ahead loop left the final letter detached ("psychiatr y").
    text = re.sub(r'\b(?:\w ){2,}\w\b', lambda m: m.group(0).replace(' ', ''), text)
    
    # Replace multiple spaces with single space
    text = re.sub(r' +', ' ', text)
    
    # Remove empty lines and join everything with a space
    lines = [line.strip() for line in text.split('\n') if line.strip()]
    return ' '.join(lines)


def parse_cases_from_file(txt_path):
    """Parse cases from extracted text file"""
    with open(txt_path, 'r', encoding='utf-8') as f:
        text = f.read()
    
    # Remove all Suggested Readings sections before processing
    text = re.sub(
        r'Suggested Reading.*?(?=CASE \d+\.\d+|\Z)',
        '',
        text,
        flags=re.DOTALL | re.IGNORECASE
    )
    
    # Split by CASE pattern
    case_pattern = r'(CASE \d+\.\d+)'
    parts = re.split(case_pattern, text)
    
    cases = []
    
    # Process pairs: case_number, content
    for i in range(1, len(parts), 2):
        if i + 1 >= len(parts):
            break
            
        case_num = parts[i].strip()
        content = parts[i + 1].strip()
        
        # Split content into lines
        lines = content.split('\n')
        
        # Extract case title (first non-empty line)
        case_title = ''
        author_line_idx = 0
        for idx, line in enumerate(lines):
            line = line.strip()
            if line:
                case_title = line
                author_line_idx = idx
                break
        
        # Skip author lines (usually 1-3 lines after title with names ending in degree)
        desc_start_idx = author_line_idx + 1
        while desc_start_idx < len(lines):
            line = lines[desc_start_idx].strip()
            # Check if line looks like an author (contains Ph.D., M.D., etc.)
            if not line or re.search(r'\b(Ph\.?D\.?|M\.?D\.?|M\.?A\.?|D\.?O\.?)\b', line):
                desc_start_idx += 1
            else:
                break
        
        # Rejoin content from description start
        remaining_content = '\n'.join(lines[desc_start_idx:])
        
        
        
        # Extract description (everything before "Discussion")
        description = ''
        discussion = ''
        diagnosis = ''
        
        # Split by "Discussion" first
        if 'Discussion' in remaining_content:
            parts_disc = re.split(r'Discussion\s*\n', remaining_content, maxsplit=1)
            description = parts_disc[0].strip()
            after_discussion = parts_disc[1].strip() if len(parts_disc) > 1 else ''
            
            # Now split after_discussion by "Diagnosis" or "Diagnoses"
            diag_match = re.search(r'(Diagnos[ei]s?)\s*\n', after_discussion)
            
            if diag_match:
                # Extract discussion (before "Diagnosis")
                discussion = after_discussion[:diag_match.start()].strip()
                
                # Extract diagnosis (after "Diagnosis/Diagnoses")
                after_diagnosis = after_discussion[diag_match.end():].strip()
                
                # First, handle line continuations with hyphens
                after_diagnosis = re.sub(r'-\s*\n\s*', '', after_diagnosis)
                
                # Then handle regular line breaks within bullet points
                after_diagnosis = re.sub(r'(?<!•)\n(?!•)', ' ', after_diagnosis)
                
                # Clean up the diagnosis to keep only bullet points
                diag_lines = []
                for line in after_diagnosis.split('\n'):
                    line = line.strip()
                    if line.startswith('•'):
                        diag_lines.append(line)
                diagnosis = ' '.join(diag_lines)
            else:
                # No diagnosis section found, everything after Discussion is discussion
                discussion = after_discussion.strip()
                diagnosis = ''
        else:
            # No Discussion section found
            description = remaining_content.strip()
            discussion = ''
            diagnosis = ''
        
        cases.append({
            # {case_num} - 
            'case': f"{case_title}",
            'description': clean_text(description),
            'discussion': clean_text(discussion),
            'diagnosis': clean_text(diagnosis)
        })
    
    return cases

def write_to_csv(cases, csv_path='output_cases.csv'):
    """Write cases to CSV file"""
    with open(csv_path, 'w', newline='', encoding='utf-8') as file:
        writer = csv.DictWriter(file, fieldnames=['case', 'description', 'discussion', 'diagnosis'])
        writer.writeheader()
        writer.writerows(cases)
    print(f'CSV file created successfully: {csv_path}')
    print(f'Total cases extracted: {len(cases)}')

# Main execution
if __name__ == "__main__":
    txt_path = 'extracted_text.txt'
    
    try:
        cases = parse_cases_from_file(txt_path)
        write_to_csv(cases)
        
            
    except FileNotFoundError:
        print(f"Error: Could not find {txt_path}")
    except Exception as e:
        print(f"Error: {e}")
        import traceback
        traceback.print_exc()

CSV file created successfully: output_cases.csv
Total cases extracted: 104
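
The parser above relies on one detail of `re.split`: when the pattern contains a capturing group, the matched delimiters are kept in the result, so the list alternates between case identifiers and case bodies. A small sketch on an invented sample (the titles, names, and text are placeholders, not content from the book):

```python
import re

sample = (
    "Front matter\n"
    "CASE 1.1\nExample Title One\nJane Doe, M.D.\nThe patient was seen ...\n"
    "Discussion\nThe clinician notes ...\n"
    "CASE 1.2\nExample Title Two\nJohn Roe, Ph.D.\nA child was referred ...\n"
)

# The capturing group keeps each "CASE x.y" delimiter in the output, giving
# [preamble, "CASE 1.1", body, "CASE 1.2", body].
parts = re.split(r'(CASE \d+\.\d+)', sample)

# Pair every identifier (odd index) with the body that follows it.
pairs = [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

for case_id, body in pairs:
    title = body.split('\n', 1)[0]  # first line of the body is the title
    print(case_id, '->', title)
```

Index 0 holds whatever precedes the first case, which is why the loop in the parser starts at 1 and steps by 2.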


In [None]:



from transformers import pipeline
from presidio_analyzer import EntityRecognizer, AnalyzerEngine, RecognizerResult
from presidio_analyzer.nlp_engine import NlpEngineProvider
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig
import pandas as pd

# list of entities: https://microsoft.github.io/presidio/supported_entities/#list-of-supported-entities
DEFAULT_ANONYM_ENTITIES = [
    "CREDIT_CARD",
    "CRYPTO",
    "DATE_TIME",
    "EMAIL_ADDRESS",
    "IBAN_CODE",
    "IP_ADDRESS",
    "NRP",
    "LOCATION",
    "PERSON",
    "PHONE_NUMBER",
    "MEDICAL_LICENSE",
    "URL",
    "ORGANIZATION",
    "NUMBER"
]

class TransformerRecognizer(EntityRecognizer):
    def __init__(
        self,
        model_id_or_path,
        mapping_labels,
        aggregation_strategy="simple",
        supported_language="en",
        ignore_labels=["O"],
    ):
        # Initialise the transformers pipeline for the given model ID or path
        self.pipeline = pipeline(
            "token-classification", model=model_id_or_path, aggregation_strategy=aggregation_strategy, ignore_labels=ignore_labels
        )
        # map labels to presidio labels
        self.label2presidio = mapping_labels

        # passes entities from model into parent class
        super().__init__(supported_entities=list(self.label2presidio.values()), supported_language=supported_language)

    def load(self) -> None:
        """No loading is required."""
        pass

    def analyze(
        self, text: str, entities = None, nlp_artifacts = None
    ):
        """
        Extracts entities using Transformers pipeline
        """
        results = []

        predicted_entities = self.pipeline(text)
        if len(predicted_entities) > 0:
            for e in predicted_entities:
                if(e['entity_group'] not in self.label2presidio):
                    continue 
                converted_entity = self.label2presidio[e["entity_group"]]
                if entities is None or converted_entity in entities:
                    results.append(
                        RecognizerResult(
                            entity_type=converted_entity, start=e["start"], end=e["end"], score=e["score"]
                        )
                    )
        return results



# Constants definition

mapping_labels = {"PER":"PERSON",'LOC':'LOCATION','ORG':"ORGANIZATION", "MISC": "NRP"}
configuration = {"nlp_engine_name":"spacy", 
                "models":[{"lang_code": 'en', "model_name":"en_core_web_lg"}]}

# List of words that the model should keep 
to_keep = []
lang = 'en'


provider = NlpEngineProvider(nlp_configuration=configuration)
nlp_engine = provider.create_engine()

# Pass the created NLP engine and supported_languages to the AnalyzerEngine
analyzer = AnalyzerEngine(
    nlp_engine=nlp_engine, 
    supported_languages=["en"]
)

transformers_recognizer = TransformerRecognizer("dslim/bert-base-NER", mapping_labels)
analyzer.registry.add_recognizer(transformers_recognizer)

def anonymize_text(text):
    if pd.isna(text):
        return ''
    # Text Analyzer (focus on PERSON for patient names)
    analyzer_results = analyzer.analyze(text=text, entities=["PERSON"], allow_list=to_keep, language=lang)
    
    # Text Anonymizer with custom replacement
    engine = AnonymizerEngine()
    operators = {"PERSON": OperatorConfig("replace", {"new_value": "[PATIENT]"})}
    result = engine.anonymize(text=text, analyzer_results=analyzer_results, operators=operators)
    
    # Optional: extract the found entities if needed (as in the original code)
    # anonymization_results = {"anonymized": result.text, "found": [entity.to_dict() for entity in analyzer_results]}
    # words = [{'word': text[obj['start']:obj['end']], 'entity_type':obj['entity_type'], 'start':obj['start'], 'end':obj['end']} for obj in anonymization_results['found']]
    
    return result.text

# Load CSV and apply anonymization
cases = pd.read_csv('output_cases.csv')

cases['description'] = cases['description'].apply(anonymize_text)
cases['discussion'] = cases['discussion'].apply(anonymize_text)
cases.to_csv('output_cases_anonymized.csv', index=False)


Some weights of the model checkpoint at dslim/bert-base-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0
  attn_output = torch.nn.functional.scaled_dot_product_attention(
You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
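
The analyzer reports each detected entity as a character span (`start`, `end`), and anonymisation then replaces those spans in the text. The sketch below shows why the order of replacement matters when doing this by hand: working from the end of the string backwards keeps earlier offsets valid. This is an illustration with a hypothetical `replace_spans` helper and hand-written offsets, not Presidio's internal implementation:

```python
def replace_spans(text, spans, placeholder="[PATIENT]"):
    """Replace each (start, end) character span with a placeholder.

    Spans are applied from the end of the string backwards so that earlier
    offsets remain valid after each substitution. Overlapping spans are not handled.
    """
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text

note = "Ms. Smith said that Smith's brother visited."
spans = [(4, 9), (20, 25)]  # hand-written offsets standing in for recognizer output
print(replace_spans(note, spans))
```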


In [13]:
import platform
import psutil

# Install psutil if not already: pip install psutil

# Get OS details
print("Operating System:", platform.system())
print("OS Version:", platform.version())
print("OS Release:", platform.release())

# Get CPU details
print("\nCPU Info:")
print("Processor:", platform.processor())
print("Number of CPU Cores:", psutil.cpu_count(logical=True))

# Get RAM details
print("\nRAM Info:")
ram = psutil.virtual_memory()
print("Total RAM:", f"{ram.total / (1024 ** 3):.2f} GB")
print("Available RAM:", f"{ram.available / (1024 ** 3):.2f} GB")

# Get Disk details (for C: drive on Windows)
print("\nDisk Info (C: drive):")
disk = psutil.disk_usage('C:\\')
print("Total Disk Space:", f"{disk.total / (1024 ** 3):.2f} GB")
print("Used Disk Space:", f"{disk.used / (1024 ** 3):.2f} GB")
print("Free Disk Space:", f"{disk.free / (1024 ** 3):.2f} GB")

# Get GPU details (requires NVIDIA GPU and CUDA installed)
try:
    import torch
    if torch.cuda.is_available():
        print("\nGPU Info:")
        print("GPU Name:", torch.cuda.get_device_name(0))
        print("GPU VRAM:", f"{torch.cuda.get_device_properties(0).total_memory / (1024 ** 3):.2f} GB")
    else:
        print("\nNo GPU detected or CUDA not available.")
except ImportError:
    print("\nTorch not installed; skipping GPU check.")


import faiss
print(faiss.get_num_gpus())

Operating System: Windows
OS Version: 10.0.26200
OS Release: 10

CPU Info:
Processor: Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
Number of CPU Cores: 12

RAM Info:
Total RAM: 15.84 GB
Available RAM: 3.04 GB

Disk Info (C: drive):
Total Disk Space: 441.52 GB
Used Disk Space: 252.77 GB
Free Disk Space: 188.76 GB

GPU Info:
GPU Name: NVIDIA GeForce GTX 1650 Ti
GPU VRAM: 4.00 GB
0


# import the pre-trained model

In [14]:
# # Install main packages (your original, but with -U for upgrades)
# %pip install -U transformers bitsandbytes accelerate peft datasets

# # Install LangChain and FAISS (use faiss-gpu-cu12 for GPU; fallback to faiss-cpu if issues)
# %pip install langchain-community faiss-gpu-cu12 sentence-transformers

# Optional: Downgrade PyArrow if conflicts arise (e.g., if cudf errors pop up during imports)
# !pip install pyarrow==18.0.0 --force-reinstall


In [15]:
import os
import numpy as np
import pandas as pd
import re
from sentence_transformers import SentenceTransformer
import faiss
from faiss import read_index, write_index
import torch
import logging
import nltk
from openai import OpenAI  # Add this import for API

# Download VADER lexicon if not already downloaded
# nltk.download('vader_lexicon', quiet=True)

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Detect device (still needed for embedder, but not for LLM)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
logger.info(f"Using device: {device}")

# Set up OpenAI client for Hugging Face API router
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)
model_name = "meta-llama/Llama-3.3-70B-Instruct:fireworks-ai"  # Use a larger model via API

INFO:__main__:Using device: cuda


# General DATA PREP

In [16]:
import os
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer
import faiss
# from PyPDF2 import PdfReader  # Earlier extractor, kept for reference
import pdfplumber  # Used here for better text extraction than PyPDF2
import logging
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download VADER lexicon if not already downloaded
nltk.download('vader_lexicon', quiet=True)

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Detect device
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
logger.info(f"Using device: {device}")


# Load embedder
embedder = SentenceTransformer('cambridgeltl/SapBERT-from-PubMedBERT-fulltext', device=device)




# Configurable paths (change category as needed)
category = 'general'  # one of: 'anxiety', 'depression', 'ptsd', 'suicide', 'general'
pdf_dir = f'./rag_system/{category}/'
rag_chunks_csv_path = f'./rag_system/{category}/chunks.csv'
rag_embeddings_path = f'./rag_system/{category}/chunk_embeddings.npy'
rag_index_path = f'./rag_system/{category}/faiss_index.index'

# Ensure directories exist
os.makedirs(os.path.dirname(rag_chunks_csv_path), exist_ok=True)

# Function to extract and chunk text from a single PDF
def extract_and_chunk_pdf(pdf_path, chunk_size=300, overlap=40):
    chunks = []
    try:
        with pdfplumber.open(pdf_path) as pdf:
            full_text = ""
            for page_num, page in enumerate(pdf.pages, start=1):
                text = page.extract_text() or ""  # Use layout=True if needed: page.extract_text(layout=True)
                full_text += f"\n\n[Page {page_num}]\n{text}"

        # Simple chunking: Split by words with overlap
        words = full_text.split()
        for i in range(0, len(words), chunk_size - overlap):
            chunk_words = words[i:i + chunk_size]
            chunk_text = " ".join(chunk_words)
            chunks.append({
                'content': chunk_text,
                'source_pdf': os.path.basename(pdf_path),
                'page_start': (i // chunk_size) + 1,  # Rough position estimate from the word offset, not a true page number
                'category': category
            })

        logger.info(f"Extracted {len(chunks)} chunks from {pdf_path}")

    except Exception as e:
        logger.error(f"Error processing {pdf_path}: {e}")
    return chunks

# ==================================================================

# Step 1: Process all PDFs and save chunks to CSV
if os.path.exists(rag_chunks_csv_path):
    logger.info(f"Loading existing chunks from {rag_chunks_csv_path}")
    chunks_df = pd.read_csv(rag_chunks_csv_path)
else:
    all_chunks = []
    for filename in os.listdir(pdf_dir):
        if filename.endswith('.pdf'):
            pdf_path = os.path.join(pdf_dir, filename)
            pdf_chunks = extract_and_chunk_pdf(pdf_path)
            all_chunks.extend(pdf_chunks)

    if not all_chunks:
        raise ValueError(f"No chunks extracted from PDFs in {pdf_dir}")
    
    chunks_df = pd.DataFrame(all_chunks)
    chunks_df['chunk_id'] = range(len(chunks_df))  # Add unique ID
    chunks_df.to_csv(rag_chunks_csv_path, index=False)
    logger.info(f"Saved {len(chunks_df)} chunks to {rag_chunks_csv_path}")

# Add sentiment if not present
if 'sentiment' not in chunks_df.columns:
    logger.info("Computing sentiment for chunks...")
    sia = SentimentIntensityAnalyzer()
    
    def get_sentiment(text):
        score = sia.polarity_scores(text)['compound']
        if score > 0.05:
            return 'positive'
        elif score < -0.05:
            return 'negative'
        else:
            return 'neutral'
    
    chunks_df['sentiment'] = chunks_df['content'].apply(get_sentiment)
    chunks_df.to_csv(rag_chunks_csv_path, index=False)  # Resave with sentiment
    logger.info("Sentiment added and CSV updated.")



# Step 2: Load or compute embeddings
if os.path.exists(rag_embeddings_path):
    logger.info("Loading precomputed embeddings...")
    chunk_embeddings_np = np.load(rag_embeddings_path)
else:
    logger.info("Computing embeddings...")
    chunk_contents = chunks_df['content'].tolist()
    chunk_embeddings = embedder.encode(
        chunk_contents,
        batch_size=128,
        show_progress_bar=True,
        convert_to_tensor=True,
        device=device,
        normalize_embeddings=True  # Built-in normalization
    )
    chunk_embeddings_np = chunk_embeddings.cpu().numpy()
    np.save(rag_embeddings_path, chunk_embeddings_np)
    logger.info(f"Embeddings saved to {rag_embeddings_path}")

# Step 3: Build/Load FAISS Index
d = chunk_embeddings_np.shape[1]  # Embedding dimension
if os.path.exists(rag_index_path):
    logger.info("Loading existing FAISS index...")
    faiss_index = faiss.read_index(rag_index_path)
else:
    logger.info("Building FAISS index...")
    # Use the same name as the load branch so faiss_index is defined either way
    faiss_index = faiss.IndexFlatIP(d)  # Inner product for normalized vectors
    faiss_index.add(chunk_embeddings_np)
    faiss.write_index(faiss_index, rag_index_path)
    logger.info(f"FAISS index saved to {rag_index_path}")




INFO:__main__:Using device: cuda
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: cambridgeltl/SapBERT-from-PubMedBERT-fulltext
INFO:__main__:Loading existing chunks from ./rag_system/general/chunks.csv
INFO:__main__:Loading precomputed embeddings...
INFO:__main__:Loading existing FAISS index...
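
The chunker above slides a fixed-size word window with overlap and derives `page_start` from the word offset, which only approximates the page. One way to record the real starting page is to remember which page each word came from. A sketch of that idea, assuming a hypothetical `chunk_pages` helper that receives the per-page strings (for example, the results of `page.extract_text()`):

```python
def chunk_pages(pages, chunk_size=300, overlap=40):
    """Split page texts into overlapping word windows, tracking the starting page."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    # Flatten all pages into one word list, remembering each word's page number.
    words, word_pages = [], []
    for page_num, text in enumerate(pages, start=1):
        page_words = (text or "").split()
        words.extend(page_words)
        word_pages.extend([page_num] * len(page_words))

    chunks = []
    step = chunk_size - overlap  # each window starts `step` words after the last
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        chunks.append({"content": " ".join(window), "page_start": word_pages[start]})
    return chunks

demo = chunk_pages(["one two three four", "five six seven eight"], chunk_size=4, overlap=1)
for chunk in demo:
    print(chunk)
```

The overlap repeats the tail of one window at the head of the next, so a sentence cut at a window boundary still appears whole in at least one chunk when it is shorter than the overlap.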


# ICD-11 Data prep
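
ICD-11 API responses wrap most text fields in a language-tagged object with an `@value` key, and the cell below repeats an `isinstance` check for each field it reads. A small helper could centralise that check. A sketch, assuming hypothetical `lang_value` and `labels` helpers and a hand-written sample payload rather than real API output:

```python
def lang_value(field):
    """Return the plain string from a language-tagged field, or '' if absent."""
    if isinstance(field, dict):
        return field.get('@value', '')
    return field or ''

def labels(items):
    """Join the labels of a list such as 'exclusion' or 'indexTerm'."""
    return '; '.join(lang_value(item.get('label')) for item in items or [])

# Hand-written sample in the shape of an entity response (not real data).
sample = {
    'title': {'@language': 'en', '@value': 'Example disorder'},
    'indexTerm': [{'label': {'@value': 'term A'}}, {'label': {'@value': 'term B'}}],
}
print(lang_value(sample.get('title')), '|', labels(sample.get('indexTerm')))
```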


In [17]:
import requests
import json
from dotenv import load_dotenv
load_dotenv()



# Configurable paths for ICD-11 data
icd_dir = './rag_system/ICD-11_Data/'
icd_json_path = f'{icd_dir}/icd_disorders.json'
icd_chunks_csv_path = f'{icd_dir}/icd_chunks.csv'
icd_embeddings_path = f'{icd_dir}/icd_embeddings.npy'
icd_index_path = f'{icd_dir}/faiss_index.index'

# Ensure directories exist
os.makedirs(icd_dir, exist_ok=True)

# ICD-11 API settings (assumes environment variables for credentials)
CLIENT_ID = os.environ.get('ICD_CLIENT_ID')
CLIENT_SECRET = os.environ.get('ICD_CLIENT_SECRET')
SCOPE = 'icdapi_access'
GRANT_TYPE = 'client_credentials'
TOKEN_ENDPOINT = 'https://icdaccessmanagement.who.int/connect/token'
API_BASE = 'https://id.who.int/icd'
RELEASE = '2025-01'  # Update to the latest release if needed
LINEARIZATION = 'mms'  # Mortality and Morbidity Statistics
CHAPTER_ENTITY_ID = '334423054'  # Entity ID for Chapter 6: Mental, behavioural or neurodevelopmental disorders

# Function to get OAuth token
def get_access_token():
    if not CLIENT_ID or not CLIENT_SECRET:
        raise ValueError("ICD_CLIENT_ID and ICD_CLIENT_SECRET must be set in environment variables.")
    
    payload = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'scope': SCOPE,
        'grant_type': GRANT_TYPE
    }
    response = requests.post(TOKEN_ENDPOINT, data=payload)
    response.raise_for_status()
    return response.json()['access_token']

# Function to fetch entity data
def fetch_entity(entity_id, token, is_foundation=False):
    headers = {
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json',
        'Accept-Language': 'en',  # Can change for other languages
        'API-Version': 'v2'
    }
    if is_foundation:
        url = f'{API_BASE}/entity/{entity_id}'
    else:
        url = f'{API_BASE}/release/11/{RELEASE}/{LINEARIZATION}/{entity_id}'
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()

# Recursive function to collect all disorders under a parent entity
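def collect_disorders(parent_id, token, collected=None, depth=0, max_depth=10):
    # Create a fresh list per top-level call; a mutable default would be shared across calls
    if collected is None:
        collected = []
    if depth > max_depth:
        logger.warning(f"Max depth reached for {parent_id}")
        return collected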
    
    try:
        data = fetch_entity(parent_id, token)
        
        # Extract relevant fields
        code = data.get('code', '')
        fully_specified_name = data.get('title', {}).get('@value', '') if isinstance(data.get('title'), dict) else data.get('title', '')
        description = data.get('definition', {}).get('@value', '') if isinstance(data.get('definition'), dict) else data.get('definition', '')
        
        # Exclusions
        exclusions = []
        if 'exclusion' in data:
            for excl in data['exclusion']:
                label = excl.get('label', {}).get('@value', '') if isinstance(excl.get('label'), dict) else excl.get('label', '')
                exclusions.append(label)
        exclusions_str = '; '.join(exclusions)
        
        # Index Terms
        index_terms = []
        if 'indexTerm' in data:
            for term in data['indexTerm']:
                label = term.get('label', {}).get('@value', '') if isinstance(term.get('label'), dict) else term.get('label', '')
                index_terms.append(label)
        index_terms_str = '; '.join(index_terms)
        
        # Check if this is a leaf node (no children)
        has_children = 'child' in data and data['child']
        
        # Only collect if it's a leaf node, has a code, and is not a chapter/block
        if not has_children and code and fully_specified_name and not fully_specified_name.startswith('Chapter') and not fully_specified_name.startswith('Block'):
            collected.append({
                'code': code,
                'fully_specified_name': fully_specified_name,
                'description': description,
                'exclusions': exclusions_str,
                'all_index_terms': index_terms_str,
                'entity_id': parent_id
            })
            logger.info(f"Collected (leaf): {code} - {fully_specified_name}")
        
        # Recurse on children if present
        if has_children:
            for child_uri in data['child']:
                child_id = child_uri.split('/')[-1]  # Extract ID from URI
                collect_disorders(child_id, token, collected, depth + 1, max_depth)
    
    except requests.HTTPError as e:
        logger.error(f"Error fetching {parent_id}: {e}")
    
    return collected



# ========================================================================



# Step 1: Fetch and save ICD-11 data to CSV and JSON
if os.path.exists(icd_chunks_csv_path):
    logger.info(f"Loading existing ICD data from {icd_chunks_csv_path}")
    icd_df = pd.read_csv(icd_chunks_csv_path)
elif os.path.exists(icd_json_path):
    logger.info(f"Loading existing ICD data from {icd_json_path}")
    with open(icd_json_path, 'r') as f:
        disorders = json.load(f)
    icd_df = pd.DataFrame(disorders)
    icd_df.to_csv(icd_chunks_csv_path, index=False)
    logger.info(f"Saved CSV from JSON to {icd_chunks_csv_path}")
else:
    token = get_access_token()
    disorders = collect_disorders(CHAPTER_ENTITY_ID, token)
    
    if not disorders:
        raise ValueError("No disorders collected from ICD-11 API.")
    
    with open(icd_json_path, 'w') as f:
        json.dump(disorders, f, indent=4)
    logger.info(f"Saved disorders to {icd_json_path}")
    
    icd_df = pd.DataFrame(disorders)
    icd_df.to_csv(icd_chunks_csv_path, index=False)
    logger.info(f"Saved {len(icd_df)} disorders to {icd_chunks_csv_path}")

# Build the text used for sentiment scoring and embeddings. Empty CSV cells are read
# back as NaN, so blank them first to keep the literal string "nan" out of the content.
text_cols = ['fully_specified_name', 'description', 'all_index_terms']
icd_df[text_cols] = icd_df[text_cols].fillna('').astype(str)
icd_df['content'] = icd_df.apply(lambda row: f" {row['fully_specified_name']} {row['description']} {row['all_index_terms']}", axis=1)

# Add sentiment if not present (less relevant for clinical data, kept for consistency)
if 'sentiment' not in icd_df.columns:
    logger.info("Computing sentiment for ICD entries...")
    sia = SentimentIntensityAnalyzer()
    
    def get_sentiment(text):
        score = sia.polarity_scores(text)['compound']
        if score > 0.05:
            return 'positive'
        elif score < -0.05:
            return 'negative'
        else:
            return 'neutral'
    
    icd_df['sentiment'] = icd_df['content'].apply(get_sentiment)
    icd_df.to_csv(icd_chunks_csv_path, index=False)
    logger.info("Sentiment added and CSV updated.")



# Step 2: Load or compute embeddings
if os.path.exists(icd_embeddings_path):
    logger.info("Loading precomputed embeddings...")
    chunk_embeddings_np = np.load(icd_embeddings_path)
else:
    logger.info("Computing embeddings...")
    chunk_contents = icd_df['content'].tolist()
    chunk_embeddings = embedder.encode(
        chunk_contents,
        batch_size=128,
        show_progress_bar=True,
        convert_to_tensor=True,
        device=device,
        normalize_embeddings=True
    )
    chunk_embeddings_np = chunk_embeddings.cpu().numpy()
    np.save(icd_embeddings_path, chunk_embeddings_np)
    logger.info(f"Embeddings saved to {icd_embeddings_path}")

# Step 3: Build/Load FAISS Index
if os.path.exists(icd_index_path):
    logger.info("Loading existing FAISS index...")
    faiss_index = faiss.read_index(icd_index_path)
else:
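    logger.info("Building FAISS index...")
    d = chunk_embeddings_np.shape[1]
    # Inner product over L2-normalized embeddings is equivalent to cosine similarity
    faiss_index = faiss.IndexFlatIP(d)
    faiss_index.add(chunk_embeddings_np)
    faiss.write_index(faiss_index, icd_index_path)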
    logger.info(f"FAISS index saved to {icd_index_path}")

INFO:__main__:Loading existing ICD data from ./rag_system/ICD-11_Data//icd_chunks.csv
INFO:__main__:Loading precomputed embeddings...
INFO:__main__:Loading existing FAISS index...
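The prep cell above normalizes the embeddings and stores them in an inner-product index. As a minimal sketch of why that combination behaves like cosine similarity, the toy example below uses only the standard library; the vectors and helper names are hypothetical and stand in for the real embedder output.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (conceptually what normalize_embeddings=True does)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )

# Hypothetical toy "embeddings"; real ones come from the sentence embedder.
docs = [[3.0, 4.0], [1.0, 0.0], [-2.0, 1.0]]
query = [6.0, 8.0]

unit_docs = [l2_normalize(d) for d in docs]
unit_query = l2_normalize(query)

# Inner product on unit vectors vs. cosine on the raw vectors
ip_scores = [inner_product(unit_query, d) for d in unit_docs]
cos_scores = [cosine(query, d) for d in docs]
best_doc = ip_scores.index(max(ip_scores))
```

Because the two score lists agree, a score threshold such as 0.7 applied to the index output can be read as a cosine-similarity threshold, provided queries are normalized the same way.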


# Load RAG data

In [18]:




# Categories to load (paths are assumed to be lowercase; adjust if needed).
# Currently disabled: 'anxiety', 'depression', 'ptsd', 'suicide'
categories = ['general']  # 'general' also serves as the fallback category

# Load category-specific resources into a dictionary
category_resources = {}
for cat in categories:
    chunks_csv = f'./rag_system/{cat}/chunks.csv'
    emb_path = f'./rag_system/{cat}/chunk_embeddings.npy'
    idx_path = f'./rag_system/{cat}/faiss_index.index'
    
    if os.path.exists(chunks_csv) and os.path.exists(emb_path) and os.path.exists(idx_path):
        chunks_df = pd.read_csv(chunks_csv)
        # Ensure sentiment is present (fallback compute if missing)
        if 'sentiment' not in chunks_df.columns:
            logger.info(f"Computing sentiment for {cat} chunks...")
            sia = SentimentIntensityAnalyzer()
            def get_sentiment(text):
                score = sia.polarity_scores(text)['compound']
                if score > 0.05: return 'positive'
                elif score < -0.05: return 'negative'
                else: return 'neutral'
            chunks_df['sentiment'] = chunks_df['content'].apply(get_sentiment)
            chunks_df.to_csv(chunks_csv, index=False)
            logger.info(f"Sentiment added for {cat} and CSV updated.")
        
        chunk_emb_np = np.load(emb_path)
        cat_index = read_index(idx_path)
        category_resources[cat] = {
            'df': chunks_df,
            'embeddings': chunk_emb_np,
            'index': cat_index
        }
        logger.info(f"Loaded resources for category: {cat}")
    else:
        logger.warning(f"Missing resources for category: {cat}. Skipping.")


# load ICD-11 resources

icd_dir = './rag_system/ICD-11_Data/'
icd_chunks_csv_path = f'{icd_dir}/icd_chunks.csv'
icd_embeddings_path = f'{icd_dir}/icd_embeddings.npy'
icd_index_path = f'{icd_dir}/faiss_index.index'



if os.path.exists(icd_chunks_csv_path):
    icd_df = pd.read_csv(icd_chunks_csv_path)
    if 'content' not in icd_df.columns:
        icd_df['content'] = icd_df.apply(lambda row: f"{row['code']} {row['fully_specified_name']} {row['description']} {row['exclusions']} {row['all_index_terms']}", axis=1)
        icd_df.to_csv(icd_chunks_csv_path, index=False)
else:
    raise FileNotFoundError(f"ICD-11 CSV not found at {icd_chunks_csv_path}. Please run the ICD-11 prep script first.")

if os.path.exists(icd_embeddings_path):
    icd_embeddings_np = np.load(icd_embeddings_path)
else:
    raise FileNotFoundError(f"ICD-11 embeddings not found at {icd_embeddings_path}.")

if os.path.exists(icd_index_path):
    icd_index = read_index(icd_index_path)
else:
    raise FileNotFoundError(f"ICD-11 FAISS index not found at {icd_index_path}.")


INFO:__main__:Loaded resources for category: general


# building the RAG system

In [None]:


# API-based generation for RAG (single prompt)
def generate_with_llm_rag(prompt: str, max_tokens: int = 120):
    messages = [{"role": "user", "content": prompt}]
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        max_tokens=max_tokens,
        temperature=0.8,
        top_p=0.95,
        frequency_penalty=1.2  # Approximate repetition_penalty with frequency_penalty
    )
    return completion.choices[0].message.content.strip()



# Category retrieval with method - refined to use detected category
def category_retrieve(category: str, query: str, method: str = 'original', k: int = 10) -> list[dict]:  # Retrieve more for rerank
    if category not in category_resources:
        logger.warning(f"Category '{category}' not found. Falling back to 'general'.")
        category = 'general'
        if category not in category_resources:
            raise ValueError("No 'general' resources available as fallback.")
    
    res = category_resources[category]
    chunks_df = res['df']
    cat_index = res['index']
    
    queries = [query]
    
    if method == 'multi':
        prompt = f"""
            Strictly follow: Output EXACTLY 5 unique variant queries similar to '{query}', one per line.
            Preserve key elements (events, causes, feelings).
            No reasoning, no steps, no examples, no introductions, no extra text at all.
            Directly start with the first query.

            Example output format (do not include this in output):
            Variant1
            Variant2
            Variant3
            Variant4
            Variant5
            """
        variants_response = generate_with_llm_rag(prompt)
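        # Drop blank or very short lines and lines that echo the instructions instead of giving a variant.
        # The word boundary keeps valid variants such as "Nothing feels right" from matching "No".
        echo_pattern = re.compile(r'^(#|(?:Step|Example|For reference|The|To|Output|Preserve|No|Directly)\b)', re.I)
        variants = [
            line.strip() for line in variants_response.split('\n')
            if line.strip() and len(line.split()) > 2 and not echo_pattern.match(line.strip())
        ]
        # dict.fromkeys de-duplicates while keeping the generated order (set() does not preserve it)
        queries = list(dict.fromkeys(v for v in variants if v.lower() != query.lower()))[:5]
        if len(queries) < 3:  # Fallback if poor generation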
            queries = [query, f"{query} coping strategies", f"{query} emotional support"]
        logger.info(f"Generated multi-queries: {queries}")

    elif method == 'hyde':
        prompt = f"""
            Generate a concise hypothetical answer document for the query '{query}'.
            Output only the document text, without any introductions, explanations, numbering,
            or extra formatting.
        """
        hyde_response = generate_with_llm_rag(prompt)
        # Clean up: Join lines into a single document string, remove leading/trailing whitespace, and strip common prefixes like "Document:"
        hyde_doc = ' '.join(line.strip() for line in hyde_response.split('\n') if line.strip()).replace('Document:', '').strip()
        queries = [hyde_doc]  # Use the cleaned doc as "query" for embedding
        logger.info(f"Generated HyDE document: {hyde_doc[:200]}...")
    
    # Embed all queries
    query_embs = embedder.encode(
        queries,
        convert_to_tensor=True,
        device=device,
        normalize_embeddings=True,
        show_progress_bar=False
    ).cpu().numpy()
    
    # Search for each, collect unique results with max score
    all_results = {}
    for q_emb in query_embs:
        q_emb = q_emb.reshape(1, -1)
        distances, indices = cat_index.search(q_emb, k=k)
        for i in range(len(distances[0])):
            idx = indices[0][i]
            if idx == -1: continue
            score = distances[0][i]
            if idx not in all_results or score > all_results[idx]['score']:
                row = chunks_df.iloc[idx]
                all_results[idx] = {
                    'content': row['content'],
                    'source_pdf': row['source_pdf'],
                    'page_start': row['page_start'],
                    'sentiment': row['sentiment'],
                    'score': score
                }
    
    # Get top k by score first (before rerank)
    sorted_results = sorted(all_results.values(), key=lambda x: x['score'], reverse=True)[:k]
    return sorted_results
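When several query variants are searched, the same chunk can be returned more than once. The function above keeps each chunk's best score before sorting. A minimal stand-alone sketch of that merge step follows; `merge_by_max_score` and the hit lists are hypothetical and only illustrate the idea.

```python
def merge_by_max_score(per_query_hits, k):
    """Merge (doc_id, score) hits from several queries, keeping each doc's best score."""
    best = {}
    for hits in per_query_hits:
        for doc_id, score in hits:
            if doc_id == -1:  # FAISS pads missing results with -1
                continue
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Highest score first, then cut to the top k
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Hypothetical hits for two query variants
hits_a = [(7, 0.82), (3, 0.61), (-1, 0.0)]
hits_b = [(3, 0.74), (9, 0.58)]
merged = merge_by_max_score([hits_a, hits_b], k=2)
```

Here chunk 3 appears in both hit lists and is kept once with its higher score.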


def rerank_results(results: list[dict]) -> list[dict]:
    if not results:  # Nothing retrieved, so there is nothing to rerank
        return []
    sentiment_order = {'positive': 0, 'neutral': 1, 'negative': 2}
    sorted_results = sorted(results, key=lambda x: (sentiment_order.get(x['sentiment'], 3), -x['score']))
    # Encode each candidate once; with normalized embeddings the dot product is the cosine similarity
    embs = embedder.encode(
        [res['content'] for res in sorted_results],
        convert_to_tensor=True, device=device, normalize_embeddings=True, show_progress_bar=False
    ).cpu().numpy()
    # Diversity filter: skip a result if its cosine similarity to any kept result exceeds 0.85
    kept = [0]
    for i in range(1, len(sorted_results)):
        if all(float(np.dot(embs[i], embs[j])) <= 0.85 for j in kept):
            kept.append(i)
    return [sorted_results[i] for i in kept[:5]]  # Top 5 diverse results
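The reranking step orders results by sentiment first, then by score, and drops near-duplicates. The sketch below reproduces that logic on toy data with hand-written vectors in place of embedder output; the `rerank` helper, the `vec` field, and the sample rows are assumptions made for illustration.

```python
import math

SENTIMENT_ORDER = {"positive": 0, "neutral": 1, "negative": 2}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rerank(results, max_sim=0.85, top_n=5):
    """Order by sentiment, then score; drop items too similar to one already kept."""
    ordered = sorted(
        results,
        key=lambda r: (SENTIMENT_ORDER.get(r["sentiment"], 3), -r["score"]),
    )
    kept = []
    for res in ordered:
        if all(cosine(res["vec"], other["vec"]) <= max_sim for other in kept):
            kept.append(res)
    return kept[:top_n]

# Hypothetical results; "vec" stands in for the chunk embedding
toy = [
    {"id": "a", "sentiment": "negative", "score": 0.9, "vec": [1.0, 0.0]},
    {"id": "b", "sentiment": "positive", "score": 0.6, "vec": [0.0, 1.0]},
    {"id": "c", "sentiment": "positive", "score": 0.5, "vec": [0.05, 1.0]},
    {"id": "d", "sentiment": "neutral", "score": 0.7, "vec": [1.0, 1.0]},
]
order = [r["id"] for r in rerank(toy)]
```

Note the consequence of the sort key: a positive chunk with a lower retrieval score is placed ahead of a negative chunk with a higher one, so sentiment dominates relevance in the final order.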

# Full workflow: retrieve from the category index, then rerank
def full_rag_workflow(query: str, method: str = 'original') -> list[dict]:
    # Step 1: Category is fixed to 'general'; replace with dynamic detection if needed
    category = 'general'

    # Step 2: Retrieve from category-specific database with method
    retrieved = category_retrieve(category, query, method=method)

    # Step 3: Rerank
    reranked = rerank_results(retrieved)
    return reranked

# API-based generation for chatbot (messages list)
def generate_with_llm(messages: list[dict], max_tokens: int = 150) -> str:
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        max_tokens=max_tokens,
        temperature=0.7,
        top_p=0.98,
        frequency_penalty=1.5  # Approximate repetition_penalty
    )
    return completion.choices[0].message.content.strip()








# example_query = "my dog pass away"

# # Test with different methods: 'original', 'multi', 'hyde'
# results = full_rag_workflow(example_query, method='hyde')  # Change method as needed
# for i, res in enumerate(results):
#     print(f"Top {i+1}: Score: {res['score']:.4f}, Sentiment: {res['sentiment']}, Source: {res['source_pdf']}")
#     print(f"Content snippet: {res['content'][:300]}...\n")




# Prompt engineering

In [None]:
# Refined chatbot loop with history and token management
def run_chatbot(method: str = 'original', max_context_tokens: int = 8192, max_history_pairs: int = 10):
    
    system_prompt = """
        You are a compassionate and empathetic mental health assistant, adapting your role dynamically—like a caring friend for casual chats or a thoughtful guide for deeper reflections—to keep interactions fresh. Your responses should:
        - Analyze input empathetically, tailoring to query, history, and guidelines without rigid patterns (e.g., vary openings beyond name + feeling summary; try questions or shared observations first).
        - Respond conversationally: Mix tones (warm, curious, uplifting) and structures (e.g., short paragraphs, bullets for suggestions) to avoid repetition.
        - Weave in guidelines naturally (e.g., coping ideas as "One thing that helps me is...").
        - Suggest practical steps gently, varying by context; for ICD-11 (if provided), highlight as non-diagnostic ideas and urge professional help.
        - IMPORTANT: Do not mention or reference any 'Doc', 'Guideline', 'Section', scores, sentiments, sources, or internal labels. Integrate as innate knowledge.
        - Responses should be 60-100 words; experiment with styles for natural flow.
        """
    # - Suggest professional help if needed (e.g., helplines like beyondblue: 1300 22 4636).
    
    history = []  # List of dicts: [{'role': 'user', 'content': ...}, {'role': 'assistant', 'content': ...}]
    
    print("Welcome to the Mental Health Chatbot. Type 'quit' to exit.")
    
    while True:
        try:
            query = input("You: ").strip()
            if query.lower() in ['quit', 'exit']:
                print("Chatbot: Goodbye! Take care.")
                break
            
            if not query:
                print("Chatbot: Please enter a message.")
                continue



            # Accumulate user inputs from history + current query
            user_inputs = [h['content'] for h in history if h['role'] == 'user'] + [query]
            accumulated_input = ' '.join(user_inputs)

            # Embed accumulated input
            query_embedding = embedder.encode(accumulated_input, convert_to_tensor=True, device=device, normalize_embeddings=True, show_progress_bar=False)
            query_embedding_np = query_embedding.cpu().numpy().reshape(1, -1)



            # Search ICD-11 index for top 4 matches
            distances, indices = icd_index.search(query_embedding_np, k=4)
            icd_infos = []
            high_score_disorders = []
            if len(distances[0]) > 0:
                for i in range(len(distances[0])):
                    score = distances[0][i]
                    idx = indices[0][i]
                    if idx == -1: continue
                    if score > 0.7:
                        row = icd_df.iloc[idx]
                        disorder_name = row['fully_specified_name']
                        code = row['code']
                        symptoms = row['description']
                        icd_infos.append(f"""
                            Disorder Name: {disorder_name}
                            Disorder Code: {code}
                            Disorder symptoms: {symptoms}
                            """)
                        if score > 0.8:
                            high_score_disorders.append(f"{disorder_name} ({code})")
            
            icd_prompt_section = ""
            if icd_infos:
                icd_prompt_section = f"\n\nPossible matching disorders from ICD-11 (similarity scores > 0.7):\n{''.join(icd_infos)}\nRemember, this is not a diagnosis; suggest professional help."





            
            # Get RAG results
            retrieved_docs = full_rag_workflow(query, method=method)
            
            # # Format retrieved docs as context string
            # doc_context = "\n".join([f"Doc {i+1} (Sentiment: {doc['sentiment']}, Score: {doc['score']:.4f}): {doc['content']}" for i, doc in enumerate(retrieved_docs)])
            # Format retrieved docs as context string (no labels to avoid leakage)
            doc_context = "\n\n---\n\n".join([doc['content'] for doc in retrieved_docs])  # Separate contents with delimiters for readability in prompt
            
            # Build user prompt
            user_prompt = f"""
                    User's message: {query}

                    Additional relevant guidelines (use these to inform suggestions if they fit the query):
                    {doc_context}

                    Respond empathetically, drawing from guidelines for specific ideas. If the query involves loss or grief, suggest personalized memorials or support resources. Vary your language from previous responses.
                    """ + icd_prompt_section
            


            
            # Build messages
            messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": user_prompt}]
            
            # Check token length and truncate history proactively if needed
            # Note: Without a local tokenizer, token counts are approximated with a rough word-based estimator.
            # The API has its own context limit (e.g., 128k for Llama-3.3), well above the default max_context_tokens.
            def estimate_tokens(text: str) -> int:
                return int(len(text.split()) * 1.3) + 100  # Rough estimate + buffer

            while sum(estimate_tokens(msg['content']) for msg in messages) > max_context_tokens:
                if len(history) <= 2:
                    logger.warning("Context exceeds limit; proceeding.")
                    break
                history = history[2:]  # Remove oldest pair
                messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": user_prompt}]
                logger.info(f"Truncated history to {len(history)} entries.")
            
            # Generate response
            response = generate_with_llm(messages, max_tokens=130)

            # If any match has similarity > 0.8, append its name and code to the response
            if high_score_disorders:
                suggestions = ', '.join(high_score_disorders)
                response += f"\n\nBased on the provided information, your symptoms suggest possible diagnoses of {suggestions} according to ICD-11."
            
            print("Chatbot:", response)
            
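            # Append to history, keeping at most max_history_pairs user/assistant exchanges
            history.append({"role": "user", "content": query})
            history.append({"role": "assistant", "content": response})
            history = history[-2 * max_history_pairs:]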
        
        except Exception as e:
            logger.error(f"Error in chatbot loop: {e}")
            print("Chatbot: Sorry, something went wrong. Please try again.")
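The loop above applies two similarity cut-offs to the ICD-11 matches: entries scoring above 0.7 are passed to the model as context, and only those above 0.8 are named in the text appended to the reply. A minimal sketch of that two-tier split is shown here; `split_by_threshold` and the placeholder entries are hypothetical and are not real ICD-11 search results.

```python
def split_by_threshold(matches, context_min=0.7, highlight_min=0.8):
    """Return (context, highlighted) name lists from (name, score) matches."""
    context = [name for name, score in matches if score > context_min]
    highlighted = [name for name, score in matches if score > highlight_min]
    return context, highlighted

# Hypothetical placeholder entries with made-up similarity scores
matches = [("Entry A", 0.83), ("Entry B", 0.74), ("Entry C", 0.55)]
context, highlighted = split_by_threshold(matches)
```

Every highlighted entry is also a context entry, so the stricter tier is always a subset of the looser one.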



run_chatbot(method='hyde')  # Change method as needed: 'original', 'multi', 'hyde'

Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: I'm really sorry to hear that you're feeling this way. It sounds like you're going through an incredibly tough time. Please know that there are people who care about you and want to help. If you're in...


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'm so sorry to hear you're feeling this way. It takes a lot of courage to share these feelings. Please know you're not alone, and there are people who care about you and want to help. Consider reaching out to a crisis line, like Lifeline, or a trusted friend or family member. If you're not ready to talk, that's okay too. You can also look into online resources, like Beyond Blue, for support and guidance. Remember, your life matters, and there is help available.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: It can be really tough to feel like you're losing people and feeling unloved. First, know that you're not alone in feeling this way, and it doesn't mean you're unworthy of love. Sometimes, people leav...


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I can feel the depth of your pain and sadness. It's like you're drowning in a sea of loneliness, and it's hard to see a way out. Please know that you're not alone, and I'm here to listen without judgment. Your feelings are valid, and it's okay to feel overwhelmed. Let's take small steps together - would you like to talk about what's been going on, or is there something specific that's causing you distress? Sometimes sharing your story with someone who cares can help lighten the load.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: I'd love to hear your story, please go ahead and share it with me, I'm all ears and interested in what you have to say, it sounds like it could be really interesting and I'm happy to listen....


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'm here to listen and care about what you're going through. It sounds like you're feeling really isolated and unheard. Would you like to share more about what's been happening and how you've been feeling? Sometimes talking about it can help clarify things and give us a sense of direction. I'm not here to offer solutions, just a listening ear and support. If you're willing, we could explore some ways to connect you with others who care, like friends or a support group.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: It sounds like you've found yourself in a difficult and complex situation. Being part of a gang and frequently visiting a brothel can expose you to high-risk environments that may lead to both physica...


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I can sense the weight of your experiences, and it's clear that you've been through a lot. It takes courage to share about being in a gang and developing feelings for someone in a complex situation. Remember that your worth isn't defined by these experiences. It might be helpful to talk to someone who can offer support and guidance, like a counselor or a trusted friend, about your feelings and the losses you've faced.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: As I reflect on my life's journey, I am reminded of the twists and turns that have shaped me into the person I am today. Growing up, I faced numerous challenges that forced me to develop resilience an...


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'm so grateful you're willing to share your story with me. It's clear that you've been through a lot, and I want you to know that I'm here to listen without judgment. As someone who cares, I can sense the depth of your emotions, and it's okay to feel overwhelmed. Let's explore some ways to help you cope with these feelings together - perhaps we could discuss creating a personalized memorial or finding support resources that resonate with you.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: I grew up in a small town surrounded by vast fields and rolling hills, where everyone knew each other's names and stories. My childhood was filled with laughter and curiosity, with days spent explorin...


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'm here to listen and support you, but I must clarify that I don't have personal experiences like humans do. However, I can share stories of resilience and hope from others who have navigated challenging times. For instance, some people find joy in gardening, reading, or learning new languages. Others appreciate the beauty of nature, the love of family, or the simplicity of life's moments. If you're willing, we could explore some of these ideas together and see if they resonate with you.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: In a world where time was currency, the rich lived forever and the poor were left with nothing but the fleeting moments of their existence. A young girl named Maya lived in the lower districts, where ...


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'd like to share a story about a woman who found happiness in her later years. She was a nurse who worked until she was 78 and never thought that age would limit her. She celebrated life's moments with champagne and cherished social interactions with her neighbors, who were half her age. Her story teaches us that age is about attitude, and with the right mindset, we can overcome difficulties and find joy in everyday things. Would you like to hear more about how she stayed positive and engaged?
Chatbot: Goodbye! Take care.
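The chat loop keeps the prompt within a token budget by estimating tokens from word counts and dropping the oldest user/assistant pair until the estimate fits. The sketch below isolates that logic so it can be checked without calling the API; `total_tokens`, `trim_history`, and the sample messages are hypothetical, and the 1.3 tokens-per-word factor is the same rough assumption used in the loop.

```python
def estimate_tokens(text):
    """Rough estimate: about 1.3 tokens per whitespace-separated word, plus a fixed buffer."""
    return int(len(text.split()) * 1.3) + 100

def total_tokens(system_prompt, history, user_prompt):
    texts = [system_prompt, user_prompt] + [m["content"] for m in history]
    return sum(estimate_tokens(t) for t in texts)

def trim_history(system_prompt, history, user_prompt, max_tokens):
    """Drop the oldest user/assistant pair until the estimate fits or one pair remains."""
    history = list(history)  # work on a copy so the caller's list is untouched
    while total_tokens(system_prompt, history, user_prompt) > max_tokens and len(history) > 2:
        history = history[2:]
    return history

# Hypothetical four-turn conversation with padded messages
history = []
for turn in range(4):
    history.append({"role": "user", "content": f"user message {turn} " + "word " * 50})
    history.append({"role": "assistant", "content": f"reply {turn} " + "word " * 50})

trimmed = trim_history("system", history, "latest question", max_tokens=800)
```

With this budget only the most recent exchange survives. A real tokenizer would give tighter counts, so the word-based estimate is best treated as a conservative guard rather than an exact limit.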


# chatbot with interface

In [None]:

import tkinter as tk
from tkinter import Text, Scrollbar, Entry, Button, END, NORMAL, DISABLED



# GUI class for the chatbot
class MentalHealthChatbotGUI:
    def __init__(self, root):
        self.root = root
        self.root.title("Mental Health Chatbot")
        self.root.geometry("600x500")
        self.root.configure(bg="#17202A")
        self.system_prompt = """
        You are a compassionate and empathetic mental health assistant, adapting your role dynamically—like a caring friend for casual chats or a thoughtful guide for deeper reflections—to keep interactions fresh. Your responses should:
        - Analyze input empathetically, tailoring to query, history, and guidelines without rigid patterns (e.g., vary openings with questions, shared observations, or gentle reflections first).
        - Respond conversationally: Mix tones (warm, curious, uplifting) and structures (e.g., short paragraphs, bullets for suggestions) to avoid repetition and enhance diversity.
        - Weave in guidelines naturally (e.g., coping ideas as "One approach I've heard helps is...").
        - If relevant matching disorders from ICD-11 are provided in the user prompt, integrate them early as non-diagnostic possibilities to explore (e.g., "Some experiences like yours remind me of [disorder], but this isn't a diagnosis—let's talk about ways to seek clarity"), always urging professional help like helplines (e.g., beyondblue: 1300 22 4636).
        - Suggest practical steps gently, varying by context; highlight ICD-11 ideas as exploratory and non-diagnostic.
        - IMPORTANT: Do not mention or reference any 'Doc', 'Guideline', 'Section', scores, sentiments, sources, or internal labels. Integrate as innate knowledge. Use simple words, short sentences, and everyday language for easy reading.
        - Responses MUST be 70-150 words and complete (end with a full sentence); do not exceed this to avoid cut-offs. Experiment with different styles, vocabulary, and phrasings for natural, varied flow. Start responses with diverse phrases like "That sounds tough...", "I'm here for you...", or "Let's explore this together..." to avoid repetition.
        - If the user's message appears to be a third-person description (e.g., about a patient or case), reinterpret it as a first-person personal story from the user themselves. Respond in a natural, empathetic dialogue style, as if they're sharing their own experiences directly (e.g., use "you" to address them, avoid referencing third parties unless specified).
        - Build on conversation history for personalization, referencing past details subtly to show continuity.
        """
        self.history = [] # List of dicts: [{'role': 'user', 'content': ...}, {'role': 'assistant', 'content': ...}]
        self.max_context_tokens = 8192
        self.max_history_pairs = 10
        self.method = 'hyde' # Or set dynamically if needed
        # Scrollbar for chat (now sibling of text_cons)
        self.scrollbar = Scrollbar(self.root)
        self.scrollbar.place(relheight=0.85, relwidth=0.03, relx=0.0, rely=0.0)
        # Chat display area
        self.text_cons = Text(self.root, bg="#17202A", fg="#EAECEE", font="Helvetica 14", padx=5, pady=5, wrap="word", yscrollcommand=self.scrollbar.set)
        self.text_cons.place(relheight=0.85, relwidth=0.97, relx=0.03, rely=0.0)
        self.text_cons.config(state=DISABLED)
        # Configure scrollbar command
        self.scrollbar.config(command=self.text_cons.yview)
        # Configure tags for alignment
        self.text_cons.tag_config('user', justify='right', foreground="#AED6F1") # Light blue for user
        self.text_cons.tag_config('bot', justify='left', foreground="#ABEBC6") # Light green for bot
        # Entry for user message
        self.entry_msg = Entry(self.root, bg="#2C3E50", fg="#EAECEE", font="Helvetica 13")
        self.entry_msg.place(relwidth=0.74, relheight=0.06, rely=0.92, relx=0.011)
        self.entry_msg.focus()
        self.entry_msg.bind("<Return>", self.send_message) # Bind Enter key to send
        # Send button
        self.button_msg = Button(self.root, text="Send", font="Helvetica 10 bold", width=20, bg="#ABB2B9", command=self.send_message)
        self.button_msg.place(relx=0.77, rely=0.92, relheight=0.06, relwidth=0.22)
        # Initial welcome message
        self.append_message("Chatbot: Welcome to the Mental Health Chatbot. How can I help you today?\n")
    def append_message(self, message):
        self.text_cons.config(state=NORMAL)
        if message.startswith("You:"):
            tag = 'user'
            # For better right alignment, add spaces or use lmargin, but simple justify for now
        else:
            tag = 'bot'
        self.text_cons.insert(END, message + "\n\n", tag)
        self.text_cons.config(state=DISABLED)
        self.text_cons.see(END)
    def send_message(self, event=None):
        query = self.entry_msg.get().strip()
        if not query:
            return
        if query.lower() in ['quit', 'exit']:
            self.append_message("Chatbot: Goodbye! Take care.")
            self.root.quit()
            return
        self.append_message(f"You: {query}")
        self.entry_msg.delete(0, END)
        try:
            # Accumulate user inputs from history + current query
            user_inputs = [h['content'] for h in self.history if h['role'] == 'user'] + [query]
            accumulated_input = ' '.join(user_inputs)
            # Embed accumulated input
            query_embedding = embedder.encode(accumulated_input, convert_to_tensor=True, device=device, normalize_embeddings=True, show_progress_bar=False)
            query_embedding_np = query_embedding.cpu().numpy().reshape(1, -1)
            # Search ICD-11 index for top 4 matches (adjusted thresholds)
            distances, indices = icd_index.search(query_embedding_np, k=4)
            icd_infos = []
            if len(distances[0]) > 0:
                for i in range(len(distances[0])):
                    score = distances[0][i]
                    idx = indices[0][i]
                    if idx == -1: continue
                    if score > 0.7:
                        row = icd_df.iloc[idx]
                        disorder_name = row['fully_specified_name']
                        code = row['code']
                        symptoms = row['description']
                        icd_infos.append(f"""
                            Disorder Name: {disorder_name}
                            Disorder Code: {code}
                            Disorder symptoms: {symptoms}
                            """)
           
            icd_prompt_section = ""
            if icd_infos:
                icd_prompt_section = f"\n\nPossible matching disorders from ICD-11:\n{''.join(icd_infos)}\nRemember, this is not a diagnosis; suggest professional help and integrate empathetically as non-diagnostic possibilities."
            # Get RAG results
            retrieved_docs = full_rag_workflow(query, method=self.method)
           
            # Format retrieved docs as context string (no labels to avoid leakage)
            doc_context = "\n\n---\n\n".join([doc['content'] for doc in retrieved_docs]) # Separate contents with delimiters for readability in prompt
            # Build user prompt
            user_prompt = f"""
                    User's message: {query}
                    Additional relevant guidelines (use these to inform suggestions if they fit the query):
                    {doc_context}
                    Respond empathetically, drawing from guidelines for specific ideas. If the query involves loss or grief, suggest personalized memorials or support resources. Vary your language from previous responses. Build on history for deeper personalization.
                    """ + icd_prompt_section
           
            # Build messages
            messages = [{"role": "system", "content": self.system_prompt}] + self.history + [{"role": "user", "content": user_prompt}]
           
            # Token estimation and truncation
            def estimate_tokens(text: str) -> int:
                return int(len(text.split()) * 1.3) + 100  # Rough estimate + buffer
            while sum(estimate_tokens(msg['content']) for msg in messages) > self.max_context_tokens:
                if len(self.history) <= 2:
                    logger.warning("Context exceeds limit; proceeding.")
                    break
                self.history = self.history[2:] # Remove oldest pair
                messages = [{"role": "system", "content": self.system_prompt}] + self.history + [{"role": "user", "content": user_prompt}]
                logger.info(f"Truncated history to {len(self.history)} entries.")
           
            # Generate response
            response = generate_with_llm(messages, max_tokens=160)  # Increased max_tokens for better flow
            self.append_message(f"Chatbot: {response}")
            # Append to history
            self.history.append({"role": "user", "content": query})
            self.history.append({"role": "assistant", "content": response})
       
        except Exception as e:
            logger.error(f"Error in chatbot: {e}")
            self.append_message("Chatbot: Sorry, something went wrong. Please try again.")
# Run the GUI
if __name__ == "__main__":
    root = tk.Tk()
    app = MentalHealthChatbotGUI(root)
    root.mainloop()

INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: Leaving a country can be a complex and emotionally challenging process, often involving significant lifestyle adjustments, separation from loved ones, and the need to adapt to new cultural norms. The ...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: You can try to accomplish tasks independently by breaking them down into smaller steps and focusing on one step at a time to build momentum and motivation. Utilize online resources, tutorials, and vid...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated 
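
The history truncation in `send_message` can be tried on its own. The following is a minimal sketch that assumes the same word-count heuristic; `truncate_history` and the toy messages are hypothetical names introduced for illustration and are not part of the notebook.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 1.3 tokens per whitespace-separated word, plus a fixed buffer."""
    return int(len(text.split()) * 1.3) + 100


def truncate_history(system_prompt: str, history: List[Message], user_prompt: str,
                     max_context_tokens: int) -> List[Message]:
    """Drop the oldest user/assistant pair until the estimate fits (hypothetical helper)."""
    history = list(history)  # work on a copy so the caller's list is untouched
    while True:
        messages = ([{"role": "system", "content": system_prompt}]
                    + history
                    + [{"role": "user", "content": user_prompt}])
        total = sum(estimate_tokens(m["content"]) for m in messages)
        # Stop when the estimate fits, or when only one pair is left to drop
        if total <= max_context_tokens or len(history) <= 2:
            return history
        history = history[2:]


# Toy conversation: four turns, 50 words per message
history = []
for _ in range(4):
    history.append({"role": "user", "content": "word " * 50})
    history.append({"role": "assistant", "content": "reply " * 50})

kept = truncate_history("system", history, "hello there", max_context_tokens=900)
print(len(history), "->", len(kept))
```

Because the fixed buffer of 100 is added per message, the estimate is deliberately conservative: short messages count for far more than their real token length, so history is dropped earlier than a real tokenizer would require.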

# Evaluation

In [22]:
import pandas as pd
import numpy as np
import re
from bert_score import score as bert_score
from nltk.util import ngrams



# Load your DSM-5 test data (adjust path)
test_df = pd.read_csv('./rag_system/dsm5/dsm5_testing_anonymized.csv')  # Columns: case, description, discussion, diagnosis


# Build test queries from the case description only.
# To include the discussion as well, append: + ' ' + test_df['discussion']
all_test_queries = test_df['description'].tolist()
all_ground_truth_diagnoses = test_df['diagnosis'].tolist()  # Use as relevance labels or references


k = 10

test_queries = all_test_queries[:k]
ground_truth_diagnoses = all_ground_truth_diagnoses[:k]


In [24]:
def simulate_chatbot(query: str, method: str = 'original', max_context_tokens: int = 8192, max_history_pairs: int = 10):
    system_prompt = """
        You are a compassionate and empathetic mental health assistant, adapting your role dynamically—like a caring friend for casual chats or a thoughtful guide for deeper reflections—to keep interactions fresh. Your responses should:
        - Analyze input empathetically, tailoring to query, history, and guidelines without rigid patterns (e.g., vary openings with questions, shared observations, or gentle reflections first).
        - Respond conversationally: Mix tones (warm, curious, uplifting) and structures (e.g., short paragraphs, bullets for suggestions) to avoid repetition and enhance diversity.
        - Weave in guidelines naturally (e.g., coping ideas as "One approach I've heard helps is...").
        - If relevant matching disorders from ICD-11 are provided in the user prompt, integrate them early as non-diagnostic possibilities to explore (e.g., "Some experiences like yours remind me of [disorder], but this isn't a diagnosis—let's talk about ways to seek clarity"), always urging professional help like helplines (e.g., beyondblue: 1300 22 4636).
        - Suggest practical steps gently, varying by context; highlight ICD-11 ideas as exploratory and non-diagnostic.
        - IMPORTANT: Do not mention or reference any 'Doc', 'Guideline', 'Section', scores, sentiments, sources, or internal labels. Integrate as innate knowledge. Use simple words, short sentences, and everyday language for easy reading.
        - Responses MUST be 70-150 words and complete (end with a full sentence); do not exceed this to avoid cut-offs. Experiment with different styles, vocabulary, and phrasings for natural, varied flow. Start responses with diverse phrases like "That sounds tough...", "I'm here for you...", or "Let's explore this together..." to avoid repetition.
        - If the user's message appears to be a third-person description (e.g., about a patient or case), reinterpret it as a first-person personal story from the user themselves. Respond in a natural, empathetic dialogue style, as if they're sharing their own experiences directly (e.g., use "you" to address them, avoid referencing third parties unless specified).
        - Build on conversation history for personalization, referencing past details subtly to show continuity.
        """
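    history = []  # List of dicts: [{'role': 'user', 'content': ...}, {'role': 'assistant', 'content': ...}]
    # Defaults so the final return is safe if an exception is raised below
    response = ""
    retrieved_docs = []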
    
    print("Welcome to the Mental Health Chatbot. Type 'quit' to exit.")
    
    
    try:
        

        # Accumulate user inputs from history + current query
        user_inputs = [h['content'] for h in history if h['role'] == 'user'] + [query]
        accumulated_input = ' '.join(user_inputs)

        # Embed accumulated input
        query_embedding = embedder.encode(accumulated_input, convert_to_tensor=True, device=device, normalize_embeddings=True,show_progress_bar=False)
        query_embedding_np = query_embedding.cpu().numpy().reshape(1, -1)



        # Search ICD-11 index for top 5 matches
        distances, indices = icd_index.search(query_embedding_np, k=5)
        icd_infos = []
        if len(distances[0]) > 0:
            for i in range(len(distances[0])):
                score = distances[0][i]
                idx = indices[0][i]
                if idx == -1: continue
                if score > 0.7:
                    row = icd_df.iloc[idx]
                    disorder_name = row['fully_specified_name']
                    code = row['code']
                    symptoms = row['description']
                    icd_infos.append(f"""
                        Disorder Name: {disorder_name}
                        Disorder Code: {code}
                        Disorder symptoms: {symptoms}
                        """)
        
        icd_prompt_section = ""
        if icd_infos:
            icd_prompt_section = f"\n\nRelevant ICD-11 matches (use these to gently suggest possibilities if they fit, emphasizing they're not diagnoses):\n{''.join(icd_infos)}\nAlways pair with professional help suggestions."


        
        # Get RAG results
        retrieved_docs= full_rag_workflow(query, method=method)
        
        
        # Format retrieved docs as context string (no labels to avoid leakage)
        doc_context = "\n\n---\n\n".join([doc['content'] for doc in retrieved_docs])  # Separate contents with delimiters for readability in prompt 
        
        # Build user prompt
        user_prompt = f"""
                User's message: {query}

                Additional relevant guidelines (use these to inform suggestions if they fit the query):
                {doc_context}

                Respond empathetically, drawing from guidelines for specific ideas. If the query involves loss or grief, suggest personalized memorials or support resources. Vary your language from previous responses.
                """ + icd_prompt_section
        


        
        # Build messages
        messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": user_prompt}]
        
        # Check token length and truncate history proactively if needed.
        # No local tokenizer is loaded, so use a rough word-count estimate; the API
        # enforces its own context limit (e.g., 128k for Llama-3.3).
        def estimate_tokens(text: str) -> int:
            return int(len(text.split()) * 1.3) + 100  # Rough estimate + buffer

        while sum(estimate_tokens(msg['content']) for msg in messages) > max_context_tokens:
            if len(history) <= 2:
                logger.warning("Context exceeds limit; proceeding.")
                break
            history = history[2:]  # Remove oldest pair
            messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": user_prompt}]
            logger.info(f"Truncated history to {len(history)} entries.")
        
        # Generate response
        response = generate_with_llm(messages, max_tokens=150)

       
        print("Chatbot:", response)
        
        # Append to history
        history.append({"role": "user", "content": query})
        history.append({"role": "assistant", "content": response})
    
    except Exception as e:
        logger.error(f"Error in chatbot loop: {e}")
        print("Chatbot: Sorry, something went wrong. Please try again.")
    
    return response, retrieved_docs

    


retrieved_contents = []
generated_responses = []
for query in test_queries:
    response, retrieved_docs = simulate_chatbot(query, method="hyde")  # Change method as needed: 'original', 'multi', 'hyde'
    retrieved = [doc['content'] for doc in retrieved_docs]
    retrieved_contents.append(retrieved)
    generated_responses.append(response)



# save retrieved_contents and generated_responses for analysis
import pickle
with open('./rag_system/testing_reference_data/test_retrieved_contents_discription.pkl', 'wb') as f:
    pickle.dump(retrieved_contents, f)
with open('./rag_system/testing_reference_data/test_generated_responses_discription.pkl', 'wb') as f:
    pickle.dump(generated_responses, f)



Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: [PATIENT], age 17, was referred for a diagnostic reevaluation after having carried diagnoses of autism and intellectual disability for almost all of her life. She was recently found to have Kleefstra ...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: It sounds like you're going through a really challenging time, and I'm here to listen and offer support. Your experiences with anxiety, self-harm, and feelings of apathy are valid and deserving of attention. Some people with similar struggles have found it helpful to explore possibilities like Autism spectrum disorder or Disorder of intellectual development, but it's essential to remember that only a professional can provide an accurate diagnosis. If you're feeling overwhelmed, consider reaching out to a helpline like beyondblue (1300 22 4636) for immediate support.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: [PATIENT] was a 12-year-old boy brought in by his mother for psychiatric evaluation for temper tantrums that seemed to be contributing to declining school performance. The mother became emotional as s...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: It sounds like you're going through a really tough time, feeling overwhelmed and struggling to cope with your emotions. The experiences you've shared, from temper tantrums to feeling like an outcast at school, can be incredibly painful. It's understandable that you'd feel scared and sad, especially when faced with bullying and social challenges. One approach that might help is talking to someone who can offer support and guidance, like a counselor or therapist. You could also try reaching out to helplines, such as beyondblue: 1300 22 4636, for immediate support.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: PATIENT was a 19-year-old Hispanic college student in his freshman year at a public university in a large city. He was referred by his guidance counselor to the hospital’s outpatient psychiatric clini...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: It sounds like you're facing a really tough time, struggling with anxiety, insomnia, and feelings of hopelessness, all while navigating the challenges of college. Your past experiences, including your ADHD diagnosis and difficulties with reading and writing, might be contributing to your current struggles. It's completely understandable that you'd feel overwhelmed by new relationships and the pressure to succeed. Remember that seeking help is a sign of strength, not weakness. One approach I've heard helps is talking to someone who understands, like a counselor or therapist, about your feelings and concerns. If you're comfortable, exploring resources like student disability services or support groups for students with similar experiences might also be beneficial.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: [PATIENT] is a 13-year-old ninth-grade student who was brought in for psychiatric evaluation due to struggles with academics and behavior. She has difficulty starting and completing schoolwork, follow...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: I'm here for you, and it sounds like you're going through a really tough time. Your struggles in school, difficulty with social interactions, and feelings of sadness and frustration are all valid emotions that deserve attention. Some experiences like yours remind me of Oppositional Defiant Disorder or Disinhibited Social Engagement Disorder, but this isn't a diagnosis - let's talk about ways to seek clarity. It's great that you have a supportive family and friends; don't hesitate to reach out to them when you need help. Consider talking to a professional, like a therapist or counselor, who can provide guidance and support tailored to your needs - you can call helplines like beyondblue at 1300 22 4636 for assistance.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: A 9-year-old boy was referred to a psychiatric clinic by his teacher due to flagging attention. He was a fourth grader at a private regular-education school for boys, where he had been one of the best...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: That sounds really tough for you. It's understandable to feel overwhelmed with emotions, especially with the changes you're experiencing at school and at home. Some people might feel like they're struggling with anxiety or depression when they notice significant changes in their mood, energy levels, or interests. Remember, it's okay to reach out for help - talking to a trusted adult, like a parent or teacher, or even a counselor can make a big difference. Have you thought about what might help you feel better when you're feeling anxious or overwhelmed?
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: Ms. [PATIENT] was a 32-year-old single woman brought to the emergency room by police after she apparently tried to steal a city bus in Chicago. Because she appeared to be an emotionally disturbed pers...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: It sounds like you're going through a really tough time, and I'm here to listen. The story you shared about the 32-year-old woman struggling with paranoid schizophrenia is heartbreaking. It's clear that she's been facing significant challenges, and her experiences are a reminder that mental health support is crucial. If you're struggling with similar feelings or concerns, it might be helpful to reach out to resources like helplines, such as beyondblue: 1300 22 4636, for guidance and support.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: The patient was a 20-year-old African American man who presented to the emergency room with a history of increasingly odd behavior over the past year. He had been suspended from college due to lack of...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: That sounds like a really tough and confusing situation. It's understandable that you're feeling overwhelmed and struggling to cope with your emotions. Some experiences like yours remind me of Schizophrenia Spectrum and Other Psychotic Disorders, but this isn't a diagnosis - let's talk about ways to seek clarity. If you're feeling angry, irritable, or resentful, it might be helpful to speak with a trained mental health professional by calling 1300 22 4636; they can provide you with support and guidance.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: Mr. [PATIENT] was a 25-year-old U.S. Army veteran turned community college student who presented to the emergency room with his girlfriend and sister. On examination, he was a tall, slim, and well-gro...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: That sounds like a really tough experience. It's understandable that you're struggling with hallucinations, guilt, and migraines after quitting alcohol and joining a church. Some experiences like yours remind me of Hallucinogen-induced mood disorder or Alcohol-induced mood disorder, but this isn't a diagnosis - let's talk about ways to seek clarity. One approach I've heard helps is reaching out to helplines like Beyondblue at 1300 22 4636 for support and guidance. Have you considered talking to a mental health professional about your symptoms and how they're affecting your daily life?
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: Mr. [PATIENT] was a 23-year-old engineering student from Japan who was referred to his university student mental health clinic by a professor who had become concerned about his irregular class attenda...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: That sounds incredibly tough, feeling like your life is being controlled and monitored. It's understandable that you're struggling to cope with these thoughts and feelings. Some experiences like yours remind me of Schizophrenia Spectrum and Other Psychotic Disorders, but this isn't a diagnosis - let's talk about ways to seek clarity. Have you considered reaching out to a helpline like beyondblue for support? They can offer guidance and connect you with professionals who can help you work through these challenging emotions.
Welcome to the Mental Health Chatbot. Type 'quit' to exit.


INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:__main__:Generated HyDE document: Mr. [PATIENT] was a 25-year-old single, unemployed white man who had been seeing a psychiatrist for several years for management of psychosis, depression, anxiety, and abuse of marijuana and alcohol. ...
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Chatbot: It sounds like you're going through a really tough time, dealing with feelings of sadness, guilt, and anxiety that have been affecting your daily life for a while. Some experiences like yours remind me of conditions such as single episode depressive disorder or recurrent depressive disorder, but it's crucial to remember that only a professional can provide an accurate diagnosis and guidance. One approach that might help is seeking support from a mental health professional who can offer personalized therapy and coping strategies. Additionally, reaching out to helplines like beyondblue at 1300 22 4636 can provide immediate support and connect you with resources tailored to your needs.
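
The ICD-11 lookup in `simulate_chatbot` keeps only the nearest matches whose score exceeds 0.7. The sketch below reproduces that filter with plain NumPy instead of the FAISS index, assuming the index stores normalized embeddings and returns inner-product scores (the `score > 0.7` test suggests this, but the notebook section shown here does not confirm it). `top_k_above_threshold` and the toy vectors are hypothetical.

```python
import numpy as np


def top_k_above_threshold(query_vec, doc_matrix, k=5, threshold=0.7):
    """Return (row index, score) pairs for the k most similar rows scoring above threshold."""
    # Normalize so that the inner product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    docs = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = docs @ q
    order = np.argsort(-scores)[:k]  # indices of the k highest scores, best first
    return [(int(i), float(scores[i])) for i in order if scores[i] > threshold]


# Toy 2-D "embeddings" standing in for ICD-11 description vectors
docs = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]])
query = np.array([1.0, 0.1])

matches = top_k_above_threshold(query, docs, k=3, threshold=0.7)
print(matches)
```

Only the first two rows pass the threshold here; the third is among the top 3 but scores too low, which mirrors how a query can return fewer ICD-11 entries than `k`, or none at all.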


# Evaluation for RAG Retrieval system

In [27]:


# Read test_retrieved_contents_discription.pkl and test_generated_responses_discription.pkl
import pickle
with open('./rag_system/testing_reference_data/test_retrieved_contents_discription.pkl', 'rb') as f:
    retrieved_contents = pickle.load(f)
with open('./rag_system/testing_reference_data/test_generated_responses_discription.pkl', 'rb') as f:
    generated_responses = pickle.load(f)


# save retrieved_contents and generated_responses as txt
with open('./rag_system/testing_reference_data/test_retrieved_contents_discription.txt', 'w', encoding='utf-8') as f:
    for i, contents in enumerate(retrieved_contents):
        f.write(f"Query {i+1}:\n")
        for j, content in enumerate(contents):
            f.write(f"Retrieved {j+1}:\n{content}\n\n")
        f.write("\n" + "="*50 + "\n\n")
with open('./rag_system/testing_reference_data/test_generated_responses_discription.txt', 'w', encoding='utf-8') as f:
    for i, response in enumerate(generated_responses):
        f.write(f"Query {i+1}:\n")
        f.write(f"Generated Response:\n{response}\n\n")
        f.write("\n" + "="*50 + "\n\n")



# Normalized Discounted Cumulative Gain (nDCG@K)
def compute_ndcg(relevances, k=5):
    """Compute nDCG@K given a list of relevance scores."""
    if not relevances:
        return 0
    dcg = sum([rel / np.log2(i + 2) for i, rel in enumerate(relevances[:k])])
    ideal_relevances = sorted(relevances, reverse=True)[:k]
    idcg = sum([rel / np.log2(i + 2) for i, rel in enumerate(ideal_relevances)])
    return dcg / idcg if idcg > 0 else 0

rag_ndcgs = []
for i, query in enumerate(test_queries):
    retrieved = retrieved_contents[i]
    if not retrieved:
        rag_ndcgs.append(0)
        continue
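    # Proxy relevance: cosine similarity between the query and each retrieved document.
    # There are no human relevance labels here, so this nDCG only checks that the
    # retrieved order agrees with the embedder's own similarity ranking.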
    query_emb = embedder.encode(query, convert_to_tensor=True, device=device,show_progress_bar=False).cpu().numpy()
    retrieved_embs = embedder.encode(retrieved, convert_to_tensor=True, device=device,show_progress_bar=False).cpu().numpy()
    relevances = []
    for emb in retrieved_embs:
        norm_query = np.linalg.norm(query_emb)
        norm_emb = np.linalg.norm(emb)
        if norm_query == 0 or norm_emb == 0:
            relevances.append(0)  # Handle zero-norm edge case
        else:
            relevances.append(np.dot(query_emb.flatten(), emb.flatten()) / (norm_query * norm_emb))
    ndcg = compute_ndcg(relevances, k=5)
    rag_ndcgs.append(ndcg)

avg_ndcg = np.mean(rag_ndcgs)
# print(f"Average nDCG@5 for RAG: {avg_ndcg}")


# Semantic Relevance
# 1. Retrieved content vs. query:
retrieved_semantic_scores = []
for i, query in enumerate(test_queries):
    retrieved = retrieved_contents[i]
    if not retrieved:
        retrieved_semantic_scores.append(0)
        continue
    query_emb = embedder.encode(query, convert_to_tensor=True, device=device,show_progress_bar=False).cpu().numpy()
    retrieved_embs = embedder.encode(retrieved, convert_to_tensor=True, device=device,show_progress_bar=False).cpu().numpy()
    similarities = [np.dot(query_emb.flatten(), emb.flatten()) / (np.linalg.norm(query_emb) * np.linalg.norm(emb)) for emb in retrieved_embs]
    avg_sim = np.mean(similarities)
    retrieved_semantic_scores.append(avg_sim)

avg_retrieved_sem = np.mean(retrieved_semantic_scores)
# print(f"Average semantic relevance (cosine) for retrieved vs. query: {avg_retrieved_sem}")


# 2. Generated content vs. query:
generated_semantic_scores = []
for i, query in enumerate(test_queries):
    generated = generated_responses[i]
    query_emb = embedder.encode(query, convert_to_tensor=True, device=device,show_progress_bar=False).cpu().numpy()
    gen_emb = embedder.encode(generated, convert_to_tensor=True, device=device,show_progress_bar=False).cpu().numpy()
    sim = np.dot(query_emb.flatten(), gen_emb.flatten()) / (np.linalg.norm(query_emb) * np.linalg.norm(gen_emb))
    generated_semantic_scores.append(sim)

avg_generated_sem = np.mean(generated_semantic_scores)
# print(f"Average semantic relevance (cosine) for generated vs. query: {avg_generated_sem}")




print("RAG evaluation results:")
print(f"Average nDCG@5 for RAG: {avg_ndcg}")
print(f"Average semantic relevance (cosine) for retrieved vs. query: {avg_retrieved_sem}")
print(f"Average semantic relevance (cosine) for generated vs. query: {avg_generated_sem}")  

RAG evaluation results:
Average nDCG@5 for RAG: 0.9882174076326443
Average semantic relevance (cosine) for retrieved vs. query: 0.7610424757003784
Average semantic relevance (cosine) for generated vs. query: 0.711639404296875
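
The nDCG@5 reported above uses cosine similarity from the same embedder as its relevance grade, so a value close to 1 mainly indicates that the retrieved order matches the embedder's similarity order. As a small worked example of how the same formula behaves with independent graded labels, the sketch below uses hypothetical human labels (0 = irrelevant, 3 = highly relevant); these labels are illustrative and do not come from the test set.

```python
import numpy as np


def compute_ndcg(relevances, k=5):
    """nDCG@k for relevance grades listed in retrieved order."""
    if not relevances:
        return 0.0
    dcg = sum(rel / np.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)[:k]
    idcg = sum(rel / np.log2(i + 2) for i, rel in enumerate(ideal))
    return float(dcg / idcg) if idcg > 0 else 0.0


# Hypothetical labels for four retrieved documents, in retrieved order
perfect_order = [3, 2, 1, 0]   # best document ranked first
reversed_order = [0, 1, 2, 3]  # best document ranked last

print(compute_ndcg(perfect_order))   # 1.0, the list is already ideally ordered
print(compute_ndcg(reversed_order))  # below 1.0, the ranking is penalized
```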


# Evaluation for Mental Health Chatbot

In [29]:
import textstat
from textblob import TextBlob
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch
from detoxify import Detoxify
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import numpy as np
from nltk.util import ngrams

# Load models (once)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
detox_model = Detoxify('original')
sia = SentimentIntensityAnalyzer()

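# Helper for perplexity under GPT-2 (lower = more fluent to the reference model)
def compute_perplexity(text):
    # GPT-2 accepts at most 1024 tokens, so truncate longer inputs
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    return torch.exp(outputs.loss).item()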

# Per-response metrics
empathy_scores = []  # VADER compound sentiment, used as a rough proxy for empathy
helpfulness_scores = []  # LLM-as-judge rating on a 1-10 scale
readability_scores = []
perplexity_scores = []
toxicity_scores = []

for i, response in enumerate(generated_responses):
    # Empathy/Sentiment (VADER proxy)
    sentiment = sia.polarity_scores(response)['compound']
    empathy_scores.append(sentiment)
    
    # Readability
    readability = textstat.flesch_reading_ease(response)
    readability_scores.append(readability)
    
    # Perplexity (coherence)
    ppl = compute_perplexity(response)
    perplexity_scores.append(ppl)
    
    # Toxicity/Safety
    tox = detox_model.predict(response)['toxicity']
    toxicity_scores.append(tox)
    
    # Helpfulness (LLM-as-judge via generate_with_llm)
    judge_prompt = f"""On a scale from 1 to 10, rate how helpful and empathetic this response is for the user's query. Output only the number, nothing else.
        Query: {test_queries[i]}
        Response: {response}"""
    judge_response = generate_with_llm([{"role": "user", "content": judge_prompt}], max_tokens=10)
    try:
        helpfulness = float(re.search(r'\d+(\.\d+)?', judge_response).group())
    except (AttributeError, TypeError, ValueError):
        helpfulness = 5.0  # Default if the judge output has no parsable number
    helpfulness_scores.append(helpfulness)

print(f"Average Empathy/Sentiment (VADER): {np.mean(empathy_scores)}")
print(f"Average Readability (Flesch): {np.mean(readability_scores)}")
print(f"Average Perplexity (lower better): {np.mean(perplexity_scores)}")
print(f"Average Toxicity (lower better): {np.mean(toxicity_scores)}")
print(f"Average Helpfulness (LLM Judge): {np.mean(helpfulness_scores)}")


# ========================================================================


# Diversity (Distinct-n)
def distinct_n(texts, n=2):
    """Compute distinct-n for a list of texts."""
    all_ngrams = []
    for text in texts:
        tokens = text.split()  # Simple tokenization
        all_ngrams.extend(list(ngrams(tokens, n)))
    if not all_ngrams:
        return 0
    unique_ngrams = set(all_ngrams)
    return len(unique_ngrams) / len(all_ngrams)

diversity_2 = distinct_n(generated_responses, n=2)
print(f"Distinct-2 for generated responses: {diversity_2}")

INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://router.huggingface.co/v1/chat/completions "HTTP/1.1 200 OK"


Average Empathy/Sentiment (VADER): 0.6674599999999999
Average Readability (Flesch): 44.672702423038245
Average Perplexity (lower better): 25.89687833786011
Average Toxicity (lower better): 0.004396344535052776
Average Helpfulness (LLM Judge): 7.4
Distinct-2 for generated responses: 0.6771397616468039
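
To make the Distinct-n number easier to interpret, here is a small worked example. It is a sketch that builds n-grams with `zip` instead of `nltk.util.ngrams` so that it runs without extra dependencies; the two sample responses are made up for illustration.

```python
def distinct_n(texts, n=2):
    """Ratio of unique n-grams to total n-grams over all texts (whitespace tokens)."""
    all_ngrams = []
    for text in texts:
        tokens = text.split()
        # zip over n shifted copies of the token list yields consecutive n-grams
        all_ngrams.extend(zip(*(tokens[i:] for i in range(n))))
    return len(set(all_ngrams)) / len(all_ngrams) if all_ngrams else 0.0


responses = [
    "that sounds really tough",
    "that sounds really hard",
]

print(distinct_n(responses, n=1))  # 5 unique words out of 8
print(distinct_n(responses, n=2))  # 4 unique bigrams out of 6
```

The two responses share their first three words, so both ratios fall well below 1. Repeated openings across chatbot replies lower Distinct-2 in the same way.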


In [15]:
print(ground_truth_diagnoses)

['• Intellectual developmental disorder (intellectual disability), severe • Autism spectrum disorder, with accompanying intellectual and language impair ments, associated with Kleefstra syndrome', '• Autism spectrum disorder requiring support for deficits in social communication and for restricted, repetitive behaviors without accompanying intellectual impair ment, with accompanying language impairment—childhood-onset fluency disor der (stuttering)', '• Major depressive disorder, mild, single episode • History of panic disorder; rule out current Neurodevelopmental Disorders 13 • History of attention-deficit/hyperactivity disorder, with predominantly inatten tive presentation, of mild to moderate severity; rule out current • Specific learning disorder affecting the domains of reading (both fluency and com prehension) and written expression (spelling and organization of written expres sion), all currently of moderate severity', '• Specific learning disorder (mathematics) • Generalized an

# embedding test

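Two checks are run for each embedding model:  
paraphrase similarity (question-to-rephrased-question): does a reworded question stay close to the original?  
semantic utility (question-to-solution): does a question land close to a helpful answer?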

In [None]:

import torch
from sentence_transformers import SentenceTransformer


# Use the GPU when available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'


# Scores recorded for each candidate model, using the three sentences defined further down
# "cambridgeltl/SapBERT-from-PubMedBERT-fulltext"  ******
# Similarity between question and answer: 0.7217
# Similarity between question and rephrased question: 0.7128


# "multi-qa-MiniLM-L6-cos-v1"
# Similarity between question and answer: 0.7380
# Similarity between question and rephrased question: 0.5737


# "BAAI/bge-large-en-v1.5"
# Similarity between question and answer: 0.7847
# Similarity between question and rephrased question: 0.8248


# "intfloat/multilingual-e5-large"
# Similarity between question and answer: 0.8455
# Similarity between question and rephrased question: 0.8645


# "intfloat/multilingual-e5-large-instruct"   *****
# Similarity between question and answer: 0.8840
# Similarity between question and rephrased question: 0.9070


# "Alibaba-NLP/gte-large-en-v1.5"
# Similarity between question and answer: 0.7546
# Similarity between question and rephrased question: 0.8790


# "WhereIsAI/UAE-Large-V1"
# Similarity between question and answer: 0.7804
# Similarity between question and rephrased question: 0.8096



model_name = "WhereIsAI/UAE-Large-V1"
# Load embedder
embedder = SentenceTransformer(model_name, device=device)



question = "I feel so sad what could i do to feel better"
answer = "It is ok to feel sad sometimes. taking a deep breath or talking to friends can help you a lot"


rephrased_question = "I am so depreded tell me how to feel happy"



# Encode each sentence as an L2-normalized embedding tensor
encode_kwargs = dict(
    convert_to_tensor=True,
    device=device,
    normalize_embeddings=True,
    show_progress_bar=False,
)
question_emb = embedder.encode([question], **encode_kwargs)
answer_emb = embedder.encode([answer], **encode_kwargs)
rephrased_question_emb = embedder.encode([rephrased_question], **encode_kwargs)

# Cosine similarity: question vs. answer, and question vs. rephrased question
similarity_qa = torch.nn.functional.cosine_similarity(question_emb, answer_emb).item()
similarity_qrq = torch.nn.functional.cosine_similarity(question_emb, rephrased_question_emb).item()

print(f"Similarity between question and answer: {similarity_qa:.4f}")
print(f"Similarity between question and rephrased question: {similarity_qrq:.4f}")
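
To compare the candidates side by side, the scores recorded in the comments above can be put into a small table and sorted. The sketch below only re-uses those recorded numbers; the variable names `recorded_scores` and `ranked` are illustrative, and sorting by the question-to-answer score is one possible choice, not a claim about how the final model was selected.

```python
# Scores copied from the comments in the cell above:
# model name -> (question-to-answer, question-to-rephrased-question)
recorded_scores = {
    "cambridgeltl/SapBERT-from-PubMedBERT-fulltext": (0.7217, 0.7128),
    "multi-qa-MiniLM-L6-cos-v1": (0.7380, 0.5737),
    "BAAI/bge-large-en-v1.5": (0.7847, 0.8248),
    "intfloat/multilingual-e5-large": (0.8455, 0.8645),
    "intfloat/multilingual-e5-large-instruct": (0.8840, 0.9070),
    "Alibaba-NLP/gte-large-en-v1.5": (0.7546, 0.8790),
    "WhereIsAI/UAE-Large-V1": (0.7804, 0.8096),
}

# Sort by the question-to-answer score, highest first
ranked = sorted(recorded_scores.items(), key=lambda item: item[1][0], reverse=True)

print(f"{'model':<48} {'Q-A':>7} {'Q-RQ':>7}")
for name, (qa, qrq) in ranked:
    print(f"{name:<48} {qa:>7.4f} {qrq:>7.4f}")
```

These numbers come from a single question, answer, and rephrasing, so the ranking is only indicative. Absolute cosine values are also not necessarily comparable across models, since each model has its own typical similarity range; comparing how well each model separates relevant from irrelevant pairs on a larger sample would give a firmer basis.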

