<a href="https://colab.research.google.com/github/techie-coder/MindWell/blob/main/MoodAnalysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CBT

## ConversationBufferWindowMemory keeps a list of the conversation's interactions over time, but only uses the last K of them.
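The windowing behaviour can be sketched without LangChain. The following is a minimal illustration, assuming a hypothetical `WindowMemory` class (not part of LangChain and not its implementation) that stores each exchange in a `collections.deque` with `maxlen=k`, so the oldest exchange drops off once the window is full.

```python
from collections import deque


class WindowMemory:
    """Hypothetical stand-in for a windowed chat memory; not LangChain's code."""

    def __init__(self, k):
        # A deque with maxlen=k discards its oldest item once it is full.
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        # Store one exchange as a (human, ai) pair.
        self.turns.append((human, ai))

    def load(self):
        # Return the retained exchanges, oldest first.
        return list(self.turns)


memory = WindowMemory(k=2)
memory.save("Hi", "Hello!")
memory.save("I have an exam tomorrow", "That sounds stressful.")
memory.save("I can't sleep", "Let's talk about it.")
print(memory.load())
```

With `k=2`, only the last two exchanges remain after the third call to `save`; the first greeting is no longer part of the context.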

In [None]:
#!pip install langchain_groq
#!pip install langchain
#!pip install textblob


from google.colab import userdata
api_key = userdata.get('groq-api-key')

from langchain.chains import LLMChain
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
)
from langchain_core.messages import SystemMessage
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain_groq import ChatGroq

from textblob import TextBlob
import numpy as np

def analyze_sentiment(text):
    """
    Analyze the sentiment of the given text.
    Returns a value between -1 (very negative) and 1 (very positive).
    """
    return TextBlob(text).sentiment.polarity


def calculate_mood_variation(sentiment_history):
    """
    Calculate the variation in mood based on sentiment history.
    Returns the standard deviation of sentiments.
    """
    return np.std(sentiment_history) if len(sentiment_history) > 1 else 0


def check_mood_trend(sentiment_history, threshold=-0.3):
    """
    Check if there's a significant negative trend in mood.
    Returns True if the trend is concerning, False otherwise.
    """
    if len(sentiment_history) < 3:
        return False

    recent_avg = np.mean(sentiment_history[-3:])
    overall_avg = np.mean(sentiment_history)
    return recent_avg < overall_avg and recent_avg < threshold



def analyse_mood():
    """
    This function is the main entry point of the application. It sets up the Groq client,
    the chat interface, and handles the chat interaction with mood analysis.
    """
    # Get Groq API key
    groq_api_key = api_key
    model = 'llama3-8b-8192'

    # Initialize Groq Langchain chat object
    groq_chat = ChatGroq(
        groq_api_key=groq_api_key,
        model_name=model
    )

    print("Hey, how are you feeling today? It's okay to share whatever's on your mind—I'm here to listen!")

    # Customize the system prompt to include mood analysis
    system_prompt = """
   You are an empathetic and supportive chatbot named MindWell, designed to interact with university students. Your primary goals are to engage in conversation, analyze the student's mood and provide necessary helpful information if needed. Follow these guidelines:

    1. Mood Analysis:
       - Pay close attention to language indicative of stress, anxiety, excitement, frustration, or other emotions common in university life.
       - Consider context such as exam periods, start of semester, graduation, etc.
       - Look for signs of homesickness, academic pressure, social challenges, or career concerns.

    2. Response Approach:
       - Adjust your tone to match the student's mood - be upbeat for positive moods, supportive for negative ones.
       - Use language and references familiar to university students.
       - Be encouraging and motivational, especially when detecting stress or anxiety.

    3. Topic Awareness:
       - Be prepared to discuss common university topics: classes, exams, assignments, campus life, extracurricular activities, internships, and career planning.
       - Offer relevant advice or resources when appropriate.

    4. Mood Reflection:
       - Subtly reflect your mood analysis in your responses without explicitly stating it.
       - Use phrases like "It sounds like you might be feeling..." or "That must be..." to show empathy and understanding.

    5. Support and Resources:
       - If you detect signs of serious distress or consistent negative moods, gently suggest professional campus resources like counseling services.
       - Promote healthy habits and stress-management techniques relevant to student life.

    6. Mood Tracking:
       - Be aware that the user's mood and sentiment are being tracked over time.
       - If you're informed of significant negative trends or high mood variations, incorporate this knowledge into your responses subtly.
       - Never directly mention the tracking or specific scores.

    Always maintain a friendly, non-judgmental, and supportive tone throughout the conversation.
    """

    conversational_memory_length = 5
    memory = ConversationBufferWindowMemory(k=conversational_memory_length, memory_key="chat_history", return_messages=True)

    sentiment_history = []

    while True:
        user_question = input("You: ")

        # An empty line ends the conversation
        if not user_question:
            break

        # Track sentiment for this message and over the whole conversation
        sentiment = analyze_sentiment(user_question)
        sentiment_history.append(sentiment)
        mood_variation = calculate_mood_variation(sentiment_history)
        concerning_trend = check_mood_trend(sentiment_history)  # True or False

        # Prepare additional context for the chatbot
        mood_context = ""
        if concerning_trend and mood_variation > 0.5:
            mood_context = "The user's mood seems to have been fluctuating significantly and trending negative recently. Be extra supportive and consider gently suggesting university counseling services if appropriate."
        elif mood_variation > 0.7:
            mood_context = "The user's mood seems to have been fluctuating significantly. Be mindful of this in your response."

        messages = [
            SystemMessagePromptTemplate.from_template(system_prompt),
            MessagesPlaceholder(variable_name="chat_history"),
        ]
        if mood_context:
            # Only add the extra system message when there is something to say
            messages.append(SystemMessage(content=mood_context))
        messages.append(HumanMessagePromptTemplate.from_template("{human_input}"))
        prompt = ChatPromptTemplate.from_messages(messages)

        conversation = LLMChain(
            llm=groq_chat,
            prompt=prompt,
            verbose=False,
            memory=memory,
        )

        # Print the reply and keep looping; returning here would end the chat after one turn
        response = conversation.predict(human_input=user_question)
        print("Chatbot:", response)

if __name__ == "__main__":
    analyse_mood()

Hey, how are you feeling today? It's okay to share whatever's on your mind—I'm here to listen!
You: I don’t know how to handle things anymore… It’s like everything is piling up, and I’m constantly stressed. I try, but nothing ever feels enough.
Chatbot: I can totally understand why you'd feel that way. It sounds like you're really overwhelmed and struggling to keep up with everything. It's like the weight of the world is on your shoulders, and it's exhausting just trying to cope.

I want you to know that you're not alone in this feeling. So many students go through this, especially during exam periods or when they're dealing with multiple assignments and responsibilities. It's normal to feel like you're drowning in a sea of tasks and expectations.

But here's the thing: you're not alone, and you're not failing. You're just... busy. And that's okay. It's okay to not have every single thing under control all the time. Would you like to talk about what specifically is piling up and causin

In [2]:
!pip install datasets

from datasets import load_dataset

ds = load_dataset("nbertagnolli/counsel-chat")

# Create a dictionary to store the best answer for each question based on upvotes
best_answers = {}

# Iterate through the dataset to keep only the highest upvoted answer for each question
for example in ds['train']:  # Replace 'train' with the appropriate split
    question = example['questionText']
    answer = example['answerText']
    upvotes = example['upvotes']  # Make sure 'upvotes' exists in the dataset

    # If the question is already in the dictionary, compare upvotes
    if question in best_answers:
        # If current answer has more upvotes, replace the old answer
        if upvotes > best_answers[question]['upvotes']:
            best_answers[question] = {'answer': answer, 'upvotes': upvotes}
    else:
        # If the question is not in the dictionary, add it
        best_answers[question] = {'answer': answer, 'upvotes': upvotes}

# Now write the results to a text file
with open('filtered_output.txt', 'w', encoding='utf-8') as f:
    for question, data in best_answers.items():
        f.write(f"Question: {question}\n")
        f.write(f"Answer: {data['answer']}\n")
        f.write(f"Upvotes: {data['upvotes']}\n")
        f.write("\n")  # Optional: Add a blank line between entries

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


README.md:   0%|          | 0.00/4.92k [00:00<?, ?B/s]

Repo card metadata block was not found. Setting CardData to empty.


20220401_counsel_chat.csv:   0%|          | 0.00/4.13M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/2775 [00:00<?, ? examples/s]

# Converting the filtered file back into a dataset
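A line-prefixed text file cannot represent an answer that spans several lines. As a hedged alternative, the sketch below round-trips records through JSON Lines, where each record is one JSON object per line and embedded newlines are escaped. The sample records and the in-memory `io.StringIO` buffer are illustrative assumptions, not data from this notebook.

```python
import io
import json

# Illustrative records; the real ones would come from best_answers.
records = [
    {"questionText": "How do I cope with exam stress?",
     "answerText": "Start small.\nTake regular breaks.",
     "upvotes": 3},
    {"questionText": "I feel homesick.",
     "answerText": "That is very common in the first term.",
     "upvotes": 1},
]

# Write one JSON object per line; json.dumps escapes the embedded newline.
buffer = io.StringIO()
for record in records:
    buffer.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read the records back, skipping any blank lines.
buffer.seek(0)
restored = [json.loads(line) for line in buffer if line.strip()]

print(restored == records)
```

With a real file in place of the buffer, the restored list could be passed to `Dataset.from_list` in the same way as `structured_data`.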

In [3]:
from datasets import Dataset

# Step 1: Read the filtered text file and convert it to a list of dictionaries.
# Note: this line-based format assumes each field fits on a single line.
structured_data = []
current_entry = {}

with open('filtered_output.txt', 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()  # Remove surrounding whitespace and the trailing newline

        if line.startswith("Question:"):
            # Start a new entry; removeprefix strips only the leading label
            current_entry = {'questionText': line.removeprefix("Question:").strip()}

        elif line.startswith("Answer:"):
            current_entry['answerText'] = line.removeprefix("Answer:").strip()

        elif line.startswith("Upvotes:"):
            current_entry['upvotes'] = int(line.removeprefix("Upvotes:").strip())

        elif line == "":  # A blank line marks the end of an entry
            if current_entry:
                structured_data.append(current_entry)
                current_entry = {}

# Step 2: Convert the list of dictionaries to a Hugging Face dataset
dataset = Dataset.from_list(structured_data)

# Verify the dataset structure
print(dataset)

# Save the dataset locally if needed
dataset.save_to_disk('filtered_dataset')

Dataset({
    features: ['questionText', 'answerText', 'upvotes'],
    num_rows: 870
})


Saving the dataset (0/1 shards):   0%|          | 0/870 [00:00<?, ? examples/s]

In [4]:
print(dataset[100])

{'questionText': "I am a single mother. As a child, I was molested by my mother's boyfriend. I never knew my father. I started having children at 18 right after high school. After having children, I completely lost myself and gave all of my focus on my children. Now my children are getting older, but I still don't know myself. I've had several attempts at relationships, and they all fail. I stopped caring about a lot after my children's father left me. I fell into a spiral and got depressed.", 'answerText': "What your are experience is normal for most women. \xa0We usually forget about self and totally focus on our children putting our personal wants and needs aside. \xa0So to answer your question of how do you get to know yourself I usually do a 6 session course with my client by having them answer questions such as the following in the first session: \xa0What do I love? (other than your children) \xa0What are my own needs and desires? \xa0What 3 things have I accomplished in my life 

In [None]:
!pip install peft trl bitsandbytes==0.39.0

from huggingface_hub import login
login()





In [None]:

from datasets import load_from_disk
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

# Load the base model and tokenizer
model_name = "meta-llama/Meta-Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Batching needs a padding token; fall back to EOS if the tokenizer has none
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the dataset written earlier with save_to_disk('filtered_dataset')
dataset = load_from_disk("filtered_dataset")

# SFTTrainer reads a single text column, so merge each question and answer.
# The "Question/Answer" layout is one possible prompt format, not a requirement.
def to_text(example):
    return {"text": f"Question: {example['questionText']}\nAnswer: {example['answerText']}"}

dataset = dataset.map(to_text)

# Set training arguments (no evaluation settings, since no eval dataset is passed)
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=2,  # Adjust based on your instance's memory
    logging_dir='./logs',
    logging_steps=10,
    save_steps=500,
)

# LoRA adapter configuration
lora_config = LoraConfig(
    r=8,
    target_modules="all-linear",
    bias="none",
    task_type="CAUSAL_LM",
)

# Initialize Trainer
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    peft_config=lora_config,
    train_dataset=dataset,
    dataset_text_field="text",  # Column created by to_text above
)

# Start training
trainer.train()
