# Mental Health Chatbot

This project implements a mental health chatbot with Gradio, Transformers, and LangChain. It uses a fine-tuned Falcon-7B model to answer user questions about mental health.

## Installation
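To run this chatbot, install the required dependencies first. The commands below are written for a notebook cell, where the leading `!` passes the line to the shell; drop the `!` when running them in a regular terminal.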

```sh
# Install core dependencies
!pip install gradio==3.41.0 transformers langchain -Uqqq

# Install additional dependencies for model optimization
!pip install accelerate bitsandbytes einops peft -Uqqq
```

These packages include:
- `gradio`: For building the chatbot interface.
- `transformers`: For working with pre-trained language models.
- `langchain`: To handle the LLM chain logic.
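- `accelerate`, `bitsandbytes`, `einops`, `peft`: For efficient model loading: device placement (`accelerate`), 4-bit quantization (`bitsandbytes`), tensor operations used by the Falcon model code (`einops`), and loading the fine-tuned adapter (`peft`).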
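
Because only `gradio` is pinned to a version, it can help to record which versions actually got installed. The following is a minimal sketch using the standard library; the helper name `installed_versions` is hypothetical, and the package list is simply copied from the install commands.

```python
from importlib import metadata

# Distribution names copied from the pip commands in this section
PACKAGES = ["gradio", "transformers", "langchain",
            "accelerate", "bitsandbytes", "einops", "peft"]

def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Print a small report; missing packages are flagged instead of raising
for name, version in installed_versions().items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

Running this right after installation gives a record of the environment that can be used to pin the remaining packages later.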

## Code Modules
The implementation is structured into multiple functions and classes to maintain modularity and readability.

### 1. Model and Tokenizer Initialization
The `init_model_and_tokenizer` function loads the base Falcon-7B model with 4-bit NF4 quantization through `bitsandbytes`, then attaches the fine-tuned PEFT (Parameter-Efficient Fine-Tuning) adapter on top of it.

### 2. Custom LLM Chain
The `init_llm_chain` function defines a `CustomLLM` class that wraps the model for LangChain, then builds an `LLMChain` from it with a custom prompt template.

### 3. Chatbot Logic
Functions `user_input` and `bot_response` handle user interaction and chatbot responses while maintaining chat history.

### 4. Response Formatting
The `post_process_response` function ensures the chatbot outputs clean, readable responses by removing unnecessary tokens and formatting lists.

### 5. Gradio UI
A Gradio `Blocks` app provides the web interface: a chat window, a text box for questions, and a button that clears the chat.

In [3]:
"""
Mental Health Chatbot using Falcon-7B
=====================================
This chatbot provides educational responses about mental health topics using a fine-tuned Falcon-7B model.
It is implemented with Gradio for an interactive UI and uses LangChain for structured LLM interaction.
"""

import gradio as gr
import torch
import warnings
import re
from langchain import PromptTemplate, LLMChain
from langchain.llms.base import LLM
from transformers import (
    AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
)
from peft import PeftConfig, PeftModel

warnings.filterwarnings("ignore")

In [4]:
# ============================
# Model Initialization Module
# ============================
def init_model_and_tokenizer(PEFT_MODEL):
    """
    Initializes the model and tokenizer for text generation.
    
    Args:
        PEFT_MODEL (str): Path or Hub name of the PEFT model.
    
    Returns:
        tuple: Loaded model and tokenizer.
    """
    config = PeftConfig.from_pretrained(PEFT_MODEL)
    
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

    base_model = AutoModelForCausalLM.from_pretrained(
        config.base_model_name_or_path,
        return_dict=True,
        quantization_config=bnb_config,
        device_map="auto",
        trust_remote_code=True,
    )

    model = PeftModel.from_pretrained(base_model, PEFT_MODEL)

    tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
    tokenizer.pad_token = tokenizer.eos_token  # Reuse EOS as the pad token (avoids the attention-mask warning)

    return model, tokenizer
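
To see why the 4-bit `BitsAndBytesConfig` matters, a rough back-of-the-envelope estimate of weight memory can help. This is a sketch only: the 7e9 parameter count is a round-number assumption for a "7B" model, the helper name is hypothetical, and the figures ignore quantization constants, any layers kept in higher precision, activations, and the KV cache.

```python
def estimate_weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7e9  # assumption: round figure for a "7B" model

fp16_gb = estimate_weight_memory_gb(NUM_PARAMS, 16)  # 16-bit weights
nf4_gb = estimate_weight_memory_gb(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.1f} GB")  # ~14.0 GB
print(f"4-bit weights:  ~{nf4_gb:.1f} GB")   # ~3.5 GB
```

Under these assumptions the quantized weights need roughly a quarter of the memory, which is what lets the model fit on a single consumer or free-tier GPU.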

In [5]:
# ============================
# Custom LLM Module
# ============================
def init_llm_chain(model, tokenizer):
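    """
    Builds a LangChain LLMChain around the fine-tuned Falcon-7B model.

    Args:
        model: The loaded PEFT model.
        tokenizer: The tokenizer that matches the model.

    Returns:
        LLMChain: Chain that fills the prompt template and calls the model.
    """
    class CustomLLM(LLM):
        """
        LangChain LLM wrapper that generates a response from the model
        for the provided prompt.
        """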
        def _call(self, prompt: str, stop=None, run_manager=None) -> str:
            device = "cuda" if torch.cuda.is_available() else "cpu"
            encoding = tokenizer(prompt, return_tensors="pt").to(device)

            # Generation settings; do_sample=True is required for
            # temperature and top_p to take effect (otherwise decoding is greedy)
            output = model.generate(
                input_ids=encoding.input_ids,
                attention_mask=encoding.attention_mask,
                generation_config=GenerationConfig(
                    max_new_tokens=256,
                    do_sample=True,
                    pad_token_id=tokenizer.eos_token_id,
                    eos_token_id=tokenizer.eos_token_id,
                    temperature=0.4,
                    top_p=0.6,
                    repetition_penalty=1.3,
                    num_return_sequences=1,
                )
            )

            # output[0] holds the prompt tokens followed by the newly generated tokens
            return tokenizer.decode(output[0], skip_special_tokens=True)

        @property
        def _llm_type(self) -> str:
            """
            Identifier for this LLM type, required by LangChain's LLM base class.
            """
            return "custom"

    llm = CustomLLM()

    # Initializes the LLM chain with a custom prompt template.
    template = """
    : {query}
    :"""

    prompt = PromptTemplate(template=template, input_variables=["query"])
    llm_chain = LLMChain(prompt=prompt, llm=llm)

    return llm_chain
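
Since `_call` decodes the whole of `output[0]`, the text returned by the chain begins with the rendered prompt and only then continues with the model's answer. The sketch below shows one way to drop that echo with a plain string-prefix check. It is an illustration, not part of the notebook: `render_prompt` uses `str.format` as a stand-in for `PromptTemplate`, `strip_prompt_echo` is a hypothetical helper, and the sample answer is made up.

```python
# Same template text as in init_llm_chain, including its indentation
TEMPLATE = """
    : {query}
    :"""

def render_prompt(query: str) -> str:
    """Stand-in for PromptTemplate formatting, using plain str.format."""
    return TEMPLATE.format(query=query)

def strip_prompt_echo(prompt: str, generated: str) -> str:
    """Drop the echoed prompt from the start of the decoded text, if present."""
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    # Decoding did not reproduce the prompt verbatim; leave the text unchanged
    return generated

prompt = render_prompt("What is anxiety?")
decoded = prompt + " Anxiety is a feeling of worry or unease."  # made-up model output
print(repr(strip_prompt_echo(prompt, decoded)))
```

The fallback branch matters because a tokenizer round trip does not always reproduce the prompt character for character; slicing the generated token ids at the prompt length before decoding is a common alternative.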

In [6]:
# ============================
# Chatbot Logic Module
# ============================
def user_input(user_message, history):
    """
    Handles user input by updating the chat history.
    
    Args:
        user_message (str): The message from the user.
        history (list): The chat history.
    
    Returns:
        tuple: An empty string (clears the textbox) and the updated history.
    """
    history = history or []  # Initialize history if empty
    history.append([user_message, None])  # Append new message; the bot reply is filled in later
    return "", history

def bot_response(history):
    """
    Generates a bot response based on the latest user input.
    
    Uses the module-level `llm_chain` created in the "Initialize Model" cell.

    Args:
        history (list): The chat history.
    
    Returns:
        list: Updated chat history with the bot response.
    """
    if not history or not history[-1][0]:  
        return history  # No valid input

    query = history[-1][0]  # Get latest query
    print("DEBUG: User Query:", query)

    bot_message = llm_chain.run(query)  
    print("DEBUG: Raw Bot Response:", bot_message)

    bot_message = post_process_response(bot_message)  
    history[-1][1] = bot_message  #Update chatbot history

    return history
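
The two-step flow above (append the user turn with an empty reply slot, then fill the slot) can be tried without loading the model. This is a minimal sketch under stated assumptions: `EchoChain` is a hypothetical stand-in for the real `LLMChain`, the function names are illustrative, and the chain is passed as an argument instead of being read from a global so the functions can be tested in isolation.

```python
class EchoChain:
    """Hypothetical stand-in for the LLMChain; returns a canned reply."""
    def run(self, query: str) -> str:
        return f"(reply to: {query})"

def add_user_turn(user_message, history):
    history = history or []
    history.append([user_message, None])  # bot slot stays empty for now
    return "", history                    # "" clears the textbox

def add_bot_turn(history, chain):
    if not history or not history[-1][0]:
        return history                    # nothing to answer
    history[-1][1] = chain.run(history[-1][0])
    return history

textbox, history = add_user_turn("What is stress?", None)
print(history)  # [['What is stress?', None]]
history = add_bot_turn(history, EchoChain())
print(history)  # [['What is stress?', '(reply to: What is stress?)']]
```

Each history entry is a `[user_message, bot_message]` pair, which is the shape the notebook's `gr.Chatbot` component receives from these callbacks.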

In [7]:
# ============================
# Response Formatting Module
# ============================
import re

def post_process_response(text):
    """
    Cleans and formats the chatbot's response.
    
    Args:
        text (str): Raw text generated by the model.
    
    Returns:
        str: Cleaned response.
    """
    if not text:
        return "Sorry, I couldn't generate a response."

    # Remove a leading colon left over from the prompt template (start of the text only)
    text = re.sub(r"^\s*:\s*", "", text)

    # Drop "Begin!" and everything after it (leaked prompt text)
    text = re.sub(r"Begin!.*", "", text, flags=re.DOTALL).strip()

    # Replace numbered lists (1., 2., etc.) with bullet points (•)
    text = re.sub(r"\n\d+\.\s+", "\n• ", text)

    # Ensure properly formatted output
    return text.strip()
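
The function above strips a colon only at the very start of the text, and its list rule expects the number to follow a newline directly. When every line of the raw text carries its own `:` prefix, a per-line variant may be more suitable. The following is a sketch of such a variant, not the notebook's implementation; `clean_response` is a hypothetical name, and the sample input is a two-line excerpt in the shape of the raw debug output.

```python
import re

def clean_response(text: str) -> str:
    """Variant of post_process_response that strips the ':' prefix on every line."""
    if not text:
        return "Sorry, I couldn't generate a response."
    # Remove indentation and a ':' marker at the start of each line
    text = re.sub(r"^[ \t]*:[ \t]*", "", text, flags=re.MULTILINE)
    # Turn "1. item" at the start of a line into a bullet point
    text = re.sub(r"^\d+\.\s+", "• ", text, flags=re.MULTILINE)
    return text.strip()

raw = (
    "\n    : 1. Feeling sad or down most of the time"
    "\n    : 2. Changes in appetite or weight"
)
print(clean_response(raw))
# • Feeling sad or down most of the time
# • Changes in appetite or weight
```

Using `[ \t]*` instead of `\s*` keeps the per-line pattern from swallowing blank lines, since `\s` also matches newlines.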

In [8]:
# ============================
# Initialize Model
# ============================
model_name = "FardinAbrar/Falcon-7b-model-Finetuned-By-Mental-Health-Data"
peft_model, peft_tokenizer = init_model_and_tokenizer(model_name)
llm_chain = init_llm_chain(peft_model, peft_tokenizer)

In [9]:
# ============================
# Gradio UI
# ============================
with gr.Blocks() as demo:
    gr.HTML("<h2>🤖 Mental Health Chatbot</h2>")
    gr.Markdown("This chatbot provides educational responses about mental health topics.")

    chatbot = gr.Chatbot()
    query = gr.Textbox(label="Type your question and press 'Enter'")
    clear_btn = gr.Button(value="Clear Chat")

    query.submit(user_input, [query, chatbot], [query, chatbot], queue=False).then(
        bot_response, chatbot, chatbot
    )
    clear_btn.click(lambda: [], None, chatbot, queue=False)  #Clear button

demo.launch()

Running on local URL:  http://127.0.0.1:7860
Kaggle notebooks require sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Running on public URL: https://c60620a5fe641f6b56.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




DEBUG: Raw Bot Response: 
    : 1. Feeling sad or down most of the time
    : 2. Loss of interest or pleasure in activities that you once enjoyed
    : 3. Changes in appetite or weight
    : 4. Insomnia or sleeping too much
    : 5. Irritability or restlessness
    : 6. Fatigue or low energy
    : 7. Feelings of guilt or worthlessness
    : 8. Difficulty concentrating, remembering things, or making decisions
    : 9. Thoughts of death or suicide
    : 10. Physical symptoms such as headaches, digestive issues, or chronic pain
    : 11. Withdrawing from friends and family
    : 12. Engaging in risky behaviors such as substance abuse or self-harm
    : 13. Noticeable changes in mood, behavior, or personality
    : 14. Persistent feelings of sadness, hopelessness, or emptiness
    : 15. Aches or pains that don't go away
    : 16. Unexplained physical ailments
    : 17. Changes in sleep patterns
    : 18. Loss of interest in hobbies or activities that you once enjoyed
    : 19. Difficulty f