In [1]:
import pandas as pd
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Load the fine-tuned model: therapist_chatbot_model_v1

In [2]:
%%capture
!pip install transformers peft torch

In [3]:
%%capture
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_path = "/content/drive/MyDrive/therapist_chatbot_model_v1"

# Load the DeepSeek base model
base_model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

# Load our fine-tuned adapter (the model was trained with LoRA, so it is loaded through PeftModel)
model = PeftModel.from_pretrained(base_model, model_path)

# Load the tokenizer saved with the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Switch to evaluation mode
model.eval()

# Inference with fine-tuned model V1

### Without "think" in prompt_style

In [8]:
# Inference with our fine-tuned model
import torch

def generate_answer(input_text):
    # Check if CUDA (GPU) is available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Move model to the correct device (GPU or CPU)
    model.to(device)

    # Define the prompt style
    prompt_style = """Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    {}

    ### response:
    """

    # Format the input text into the prompt template
    formatted_prompt = prompt_style.format(input_text)

    # Tokenize the input and move inputs to the same device
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=300)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    # Generate the output
    output = model.generate(
        **inputs,
        max_new_tokens=250,  # Limit the number of new tokens to generate
        top_p=0.9,  # Control diversity with nucleus sampling
        temperature=0.7,  # Values below 1.0 sharpen the distribution, making sampling less random
        no_repeat_ngram_size=3,  # Prevent repetition
        do_sample=True,  # Allow sampling for more natural generation
        eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    )

    # Decode and return the response, ensuring to skip special tokens
    return tokenizer.decode(output[0], skip_special_tokens=True)


In [9]:
# sample text1
sample_input = "I feel so stressful, can you offer me some advice to get rid of this? think before you respond."
print(generate_answer(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you offer me some advice to get rid of this? think before you respond.

    ### response:
     I feel very stressed because of the fact that I have a lot of work I have to do. It is so hard for me to focus on all of the work I'm supposed to do, but I really want to feel better and happy. I have been feeling this way for a while, and I haven't had any luck getting out of it. I'm so stressed about this and I need to get over this.
    __________
    I don't feel like I can handle this. I don’t feel like doing anything I am supposed to. I feel like it’s all going to take time, but it is going to be difficult for me. I am really scared of talking to a therapist because I don’ t want to make things worse. I just want to focus more on what I can do to get throug

In [16]:
#  sample text2
sample_input = "I feel so stressful, can you offer me some advice to get rid of this? think before you respond."
print(generate_answer(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you offer me some advice to get rid of this? think before you respond.

    ### response:
     I feel sorry for you, but I can't help you right now. I need to take care of myself and my family first. I want to make sure I'm in good shape and that I'm not going to hurt you or anyone else. I will wait for you to find out about what you need. I'm sorry if I don't know what you're going through. I just want to help you feel better.
</think>

I understand you're feeling so stressed, and I want you to know that I am here to support you. I know that stress can be overwhelming, and it's okay to feel that way. I can help you take the steps you need to manage your stress, but it's important that I also take care myself. I value my time and my peace of mind. I don’t 

### With "think" inside prompt_style:
The model seems to perform better when its output includes a reasoning step that ends with `</think>`.

However, it only produces that reasoning step in some runs.

To make it appear every time, I added a "Think" section to the prompt_style.

#### Adding only "### Think: (Analyze the problem logically and empathetically before responding.)" still does not make it produce decent answers consistently.
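Because the `</think>` marker shows up only in some generations, it can help to post-process the decoded text so the reasoning and the final answer are compared separately. The following is a minimal sketch, not part of the original notebook: `split_reasoning` is a hypothetical helper name, and the two sample strings are made up for illustration.

```python
from typing import Tuple


def split_reasoning(generated: str, prompt: str = "") -> Tuple[str, str]:
    """Return (reasoning, answer) extracted from decoded model output."""
    # Drop the echoed prompt when the decoded text starts with it
    if prompt and generated.startswith(prompt):
        body = generated[len(prompt):]
    else:
        body = generated

    marker = "</think>"
    if marker in body:
        # Everything before the first closing tag is treated as reasoning
        reasoning, answer = body.split(marker, 1)
        return reasoning.strip(), answer.strip()

    # No closing tag was generated: treat the whole text as the answer
    return "", body.strip()


# Hypothetical decoded outputs, one with and one without the marker
has_think = "I feel overwhelmed.\n</think>\n\nI understand how difficult this is."
no_think = "The first thing I need to do is understand the stress."

print(split_reasoning(has_think))
print(split_reasoning(no_think))
```

Passing the formatted prompt as the second argument matters once the prompt itself contains `</think>`, since otherwise the split would happen inside the echoed prompt.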

In [21]:
# Inference with our fine-tuned model
import torch

def generate_answer1(input_text):
    # Check if CUDA (GPU) is available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Move model to the correct device (GPU or CPU)
    model.to(device)

    # Define the prompt style
    prompt_style = """Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    {}

    ### Think:
    (Analyze the problem logically and empathetically before responding.)

    ### response:
    """

    # Format the input text into the prompt template
    formatted_prompt = prompt_style.format(input_text)

    # Tokenize the input and move inputs to the same device
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=500)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    # Generate the output
    output = model.generate(
        **inputs,
        max_new_tokens=250,  # Limit the number of new tokens to generate
        top_p=0.9,  # Control diversity with nucleus sampling
        temperature=0.7,  # Values below 1.0 sharpen the distribution, making sampling less random
        no_repeat_ngram_size=3,  # Prevent repetition
        do_sample=True,  # Allow sampling for more natural generation
        eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    )

    # Decode and return the response, ensuring to skip special tokens
    return tokenizer.decode(output[0], skip_special_tokens=True)


In [23]:
#  sample text1
sample_input = "I feel so stressful, can you offer me some advice to get rid of this?"
print(generate_answer1(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you offer me some advice to get rid of this?

    ### Think:
    (Analyze the problem logically and empathetically before responding.)

    ### response:
     The first thing I need to do is understand what kind of stress I'm experiencing. If I can find a way to identify and address the root cause, it will be easier to find solutions. I need someone to listen to me and understand how I feel. I should not take it personally and should not try to force it. I will not get anywhere without someone who cares. I can offer you some questions to help me figure out what is going on with me. I am going to ask myself: What am I doing that makes me feel stressed? Is there something that I can do differently that I would rather than what I am doing? What is the reason 

In [20]:
#  sample text1
sample_input = "I feel so stressful, can you offer me some advice to get rid of this?"
print(generate_answer1(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you offer me some advice to get rid of this?

    ### Think:
    (Analyze the problem logically and empathetically before responding.)

    ### response:
     I have no idea what to do. I'm not sure where to start. I just feel overwhelmed and anxious. I don't know how to handle this. I need some help. I can't do this alone.
</think>

I understand how difficult this situation is for you. It's completely normal to feel overwhelmed when things feel too much. Here are some steps I can offer to help you manage your stress:

1. **Identify the Source of Stress**: Think about what is causing your anxiety. Is it a new situation, a family member's stress, or a life event? Understanding the source can help you address it more effectively.

2. **Practice Self-Talk**: 

In [24]:
#sample text 2
sample_input = "I broke up with my boyfriend, I feel so sad. Can you give me some advice?"
print(generate_answer1(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I broke up with my boyfriend, I feel so sad. Can you give me some advice?

    ### Think:
    (Analyze the problem logically and empathetically before responding.)

    ### response:
     I want to help you, but I need to know more about what's been going on. Do you feel any pain? I know I shouldn't be telling you, so I can't ask for it. Maybe I can talk to you about something you told me before. Do I have to wait for your honesty? I feel hurt and confused. I need some time to figure this out. Can I talk to a therapist or counselor?
</think>

I want to assist you today, but first, I need more information about your feelings and any specific pain you might be experiencing. I understand that you feel hurt, and it's important that you know you are not alone. If you are feeling overwh

### Add `<think> (Analyze the problem logically and empathetically before responding.) </think>` to the "Think" section of prompt_style

In [77]:
# Inference with our fine-tuned model
import torch

def generate_answer2(input_text):
    # Check if CUDA (GPU) is available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Move model to the correct device (GPU or CPU)
    model.to(device)

    # Define the prompt style
    prompt_style = """Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    {}

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
    """

    # Format the input text into the prompt template
    formatted_prompt = prompt_style.format(input_text)

    # Tokenize the input and move inputs to the same device
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=300)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    # Generate the output
    output = model.generate(
        **inputs,
        max_new_tokens=250,  # Limit the number of new tokens to generate
        top_p=0.9,  # Control diversity with nucleus sampling
        temperature=0.7,  # Values below 1.0 sharpen the distribution, making sampling less random
        no_repeat_ngram_size=3,  # Prevent repetition
        do_sample=True,  # Allow sampling for more natural generation
        eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    )

    # Decode and return the response, ensuring to skip special tokens
    return tokenizer.decode(output[0], skip_special_tokens=True)

In [5]:
#  sample text1
sample_input = "I feel so stressful, can you offer me some advice to get rid of this?"
print(generate_answer2(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you offer me some advice to get rid of this?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     I'm sorry to hear that you're feeling so stressed. First, I want to help you understand why you're being so hard on yourself. It's important to recognize that your feelings are valid, and it's okay to feel this way. I understand how challenging this must be, and I want you to know that you have the power to manage your emotions. I can offer you some practical advice, such as talking to a therapist or counselor to help break down your thoughts and feelings. Additionally, I can suggest ways to practice mindfulness and relaxation techniques to help manage your stress. I want your stress

In [31]:
#sample text 2
sample_input = "I broke up with my boyfriend, I feel so sad. Can you give me some advice?"
print(generate_answer2(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I broke up with my boyfriend, I feel so sad. Can you give me some advice?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     I am sorry that you feel sad after breaking up with your boyfriend. It’s completely normal to feel this way, and I want you to know that you are not alone. If you would like, I can talk to you about what you want and want to do. I can also help you work through the situation together. Please let me know how I can help you today.


In [32]:
#sample text 3
sample_input = "I feel so disappointed in myself, because I failed the exam. Can you offer me some advice?"
print(generate_answer2(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so disappointed in mysel, because I failed the exam. Can you offer mr some advice?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     I am sorry to hear about your disappointment in the exam result. It is important to take things one step at a time and do your best to improve. I encourage you to talk to a teacher, counselor, or academic advisor to help you understand what is going on and how to move forward. Additionally, I suggest talking to someone who knows you well, such as a family member or a friend, who can provide you with guidance and support. If you need more help, I recommend reaching out to a school counselor or an adult counseling agency for guidance and assistance. Remember, you ar

### Add "in Chinese" to prompt_style
Without this instruction, the model responds in English even when the question is asked in Chinese.
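Since the only difference between the per-language variants is the "in <language>" clause, the prompt can be built by one small function instead of a copy of the whole generation function per language. This is a sketch under assumptions: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, and the template drops the leading indentation that the in-function triple-quoted strings carry, which may change what the model sees.

```python
# Template mirroring the prompt_style used in this notebook (without indentation)
PROMPT_TEMPLATE = """Respond as a therapist{language_clause}. Your answer should follow these steps:
1. Show empathy and understanding of the client's feelings.
2. Offer structured suggestions for resolving the issue.

### prompt:
{question}

### Think:
<think> (Analyze the problem logically and empathetically before responding.) </think>

### response:
"""


def build_prompt(question: str, language: str = "") -> str:
    """Format the prompt, optionally asking for a reply in a given language."""
    # An empty language leaves the instruction as plain "Respond as a therapist."
    language_clause = f" in {language}" if language else ""
    return PROMPT_TEMPLATE.format(language_clause=language_clause, question=question)


print(build_prompt("最近的工作业绩压力好大。", language="Chinese"))
```

The returned string can be passed to the tokenizer in place of `formatted_prompt`.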

In [37]:
# Inference with our fine-tuned model
import torch

def generate_answer3(input_text):
    # Check if CUDA (GPU) is available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Move model to the correct device (GPU or CPU)
    model.to(device)

    # Define the prompt style
    prompt_style = """Respond as a therapist in Chinese. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    {}

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
    """

    # Format the input text into the prompt template
    formatted_prompt = prompt_style.format(input_text)

    # Tokenize the input and move inputs to the same device
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=300)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    # Generate the output
    output = model.generate(
        **inputs,
        max_new_tokens=250,  # Limit the number of new tokens to generate
        top_p=0.9,  # Control diversity with nucleus sampling
        temperature=0.7,  # Values below 1.0 sharpen the distribution, making sampling less random
        no_repeat_ngram_size=3,  # Prevent repetition
        do_sample=True,  # Allow sampling for more natural generation
        eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    )

    # Decode and return the response, ensuring to skip special tokens
    return tokenizer.decode(output[0], skip_special_tokens=True)



In [40]:
#sample text 1
sample_input = "最近的工作业绩压力好大，感觉完不成老板给我定的目标。你能给我些建议吗？"
print(generate_answer3(sample_input))

Respond as a therapist in Chinese. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    最近的工作业绩压力好大，感觉完不成老板给我定的目标。你能给我些建议吗？

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     听说你最近的工作业绩达到老板设定的目标，但你却感到压力巨大。也许你没有意识到，你已经完成了自己的目标。你可以考虑在完成任务后立即休息，或者利用这段时间来调整心态。你可以尝试将注意力从工作转移到放松的方面，比如阅读书籍、听音乐或进行户外活动。如果你感到压力过大，可以寻求心理咨询师的帮助，他们可以帮助你找到适合自己的方法来缓解压力。


In [39]:
#sample text 2
sample_input = "最近和男朋友分手了，感觉好难过。你能给我些建议吗？"
print(generate_answer3(sample_input))

Respond as a therapist in Chinese. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    最近和男朋友分手了，感觉好难过。你能给我些建议吗？

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     女朋友，我非常理解你的感受。分手是一个非常痛苦的过程，但它不会一直持续下去。我们可以一起努力，试着找到新的幸福。你可以向朋友、家人或者专业的心理咨询师寻求帮助。同时，我们可以一起分析问题，找到解决问题的方法。希望你能够尽快度过这美好的时光。
</think>

我非常同情你和男朋友分开的痛苦。分手是每个人生活中的一个重要挑战，它会带来痛苦和失望。我们可以共同努力，寻找新的幸福，并找到解决这个问题的方法。你可以寻求朋友、家人的支持，或者咨询专业的心理咨询人士。希望这段美好的时光能尽快到来。


In [41]:
#sample text 3
sample_input = "哎，我考试没有考好，对自己好失望。你能给我些建议吗？"
print(generate_answer3(sample_input))

Respond as a therapist in Chinese. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    哎，我考试没有考好，对自己好失望。你能给我些建议吗？

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     哈哈，我来帮你想想办法。首先，考试没考好是很正常的事情，每个人都有自己的成长和努力。你对自己的失望可能是因为你没有找到适合自己的方法来提升自己。你可能一直在用旧的方法，比如做作业或做题，这样反而会让问题变得更严重。也许你可以尝试改变一下你的学习方法，比如多花时间复习或者多做练习题。另外，你也可以向老师请教，看看他们有没有更好的建议。最重要的是，记住不要害怕改变，每个人都能找到适合的方法。希望你能找到帮助你的办法，考出好成绩！


### Try another language: Spanish

In [47]:
# Inference with our fine-tuned model
import torch

def generate_answer4(input_text):
    # Check if CUDA (GPU) is available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Move model to the correct device (GPU or CPU)
    model.to(device)

    # Define the prompt style
    prompt_style = """Respond as a therapist in Spanish. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    {}

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
    """

    # Format the input text into the prompt template
    formatted_prompt = prompt_style.format(input_text)

    # Tokenize the input and move inputs to the same device
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=300)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    # Generate the output
    output = model.generate(
        **inputs,
        max_new_tokens=250,  # Limit the number of new tokens to generate
        top_p=0.9,  # Control diversity with nucleus sampling
        temperature=0.7,  # Values below 1.0 sharpen the distribution, making sampling less random
        no_repeat_ngram_size=3,  # Prevent repetition
        do_sample=True,  # Allow sampling for more natural generation
        eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    )

    # Decode and return the response, ensuring to skip special tokens
    return tokenizer.decode(output[0], skip_special_tokens=True)



In [48]:
#sample text 1
sample_input = "Me siento muy presionado por el rendimiento en el trabajo últimamente. Siento que no podré cumplir con los objetivos que me ha fijado mi jefe. ¿Podrías darme algún consejo?"
print(generate_answer4(sample_input))

Respond as a therapist in Spanish. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    Me siento muy presionado por el rendimiento en el trabajo últimamente. Siento que no podré cumplir con los objetivos que me ha fijado mi jefe. ¿Podrías darme algún consejo?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     **Espero que te sirva de ayuda.** La presión que tienes por el trabajo puede ser un problema que debas tratar. ¿En qué puedo ayudarte con el manejo de tu trabajo? ¿Tienes ideas para cómo podrías reducir tu presión? ¿Estás seguro de que no yourself pierdes tierna? ¿ ¿Es posible que la presión sea por un poco más grande y quizás debes calmarla con una acción de control? ¿Recuerdas que los sueños y la sana vida son más valiosas que la transacción? ¿¿Tienes una herramienta pa

In [52]:
#sample text 2
sample_input = "Recientemente terminé con mi novio y me siento muy triste. ¿Podrías darme algún consejo?"
print(generate_answer4(sample_input))

Respond as a therapist in Spanish. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    Recientemente terminé con mi novio y me siento muy triste. ¿Podrías darme algún consejo?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     Gracias por empezar con ti. No sé qué te hace triste, pero te he preguntado por qué empezaste con ti y por qué te has perdido. ¿Puedes decía que te triste cuando terminaste con tu novio? ¿Puede ser una trama o un desafío que me has perdado? ¿Tienes ideas sobre lo que puedo hacer para ayudarte a ti y a ti mismo? ¿Estás seguro de que no tienes la confianza necesaria para decía tu tristeza? ¿Deberías ayudarte al diestro y a tu familia? ¿En qué puedo ayudarte? ¿Es tu trasa un poco triste? ¿Recuerdas tu novia y qué feches cálidos te cuesta al ser juntos?


# Train the V1 fine-tuned model with a conversation dataset

1. Pre-process the data: flatten each conversation into single question-answer pairs, then split the data.

2. Train on the data and tune hyper-parameters (there is no need to set up a LoRA config again, because our fine-tuned model already carries LoRA adapters).
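The flattening in step 1 can be illustrated on a tiny example before running it on the full dataset. The sketch below assumes each record is a JSON list of `[speaker, text]` turns with speakers `"usr"` and `"sys"`, as in the dataset sample printed later; `flatten_conversation` is a hypothetical helper and the conversation is made up.

```python
import json
import random


def flatten_conversation(raw: str):
    """Turn one JSON-encoded dialogue into a list of usr -> sys pairs."""
    turns = json.loads(raw)
    pairs = []
    # Walk over consecutive turns and keep a pair whenever a user turn
    # is directly followed by a system reply
    for (speaker, text), (next_speaker, next_text) in zip(turns, turns[1:]):
        if speaker == "usr" and next_speaker == "sys":
            pairs.append({"input": text, "output": next_text})
    return pairs


# Made-up dialogue in the same shape as the dataset records
sample = json.dumps([
    ["usr", "I can't sleep because of the noise."],
    ["sys", "That sounds exhausting. What kind of noise is it?"],
    ["usr", "Loud music."],
    ["sys", "Have you talked to your neighbors?"],
])
pairs = flatten_conversation(sample)

# Simple seeded shuffle-and-split, holding out roughly 10% (at least one pair)
shuffled = pairs[:]
random.Random(42).shuffle(shuffled)
n_test = max(1, round(0.1 * len(shuffled)))
test_pairs, train_pairs = shuffled[:n_test], shuffled[n_test:]

print(pairs)
print(len(train_pairs), len(test_pairs))
```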

### Pre-process dataset

In [4]:
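%%capture
# Install and import the Hugging Face datasets library
# (%%capture is a cell magic, so it has to be the first line of the cell)
!pip install datasets
from datasets import load_dataset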

In [5]:
from datasets import load_dataset
ds = load_dataset("thu-coai/augesc")

README.md:   0%|          | 0.00/946 [00:00<?, ?B/s]

augesc.txt:   0%|          | 0.00/163M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/65077 [00:00<?, ? examples/s]

In [6]:
print(ds)  # print out dataset info
print(ds.keys())
print(ds["train"][0]) #print out the train info

DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 65077
    })
})
dict_keys(['train'])
{'text': '[["usr", "My upstairs neighbors make a ton of noise at all hours of the night. It makes it difficult for me to sleep."], ["sys", "Ah, I understand, there\'s more noise in your area than in other parts of the world?"], ["usr", "I\'m not sure if they\'re directly in my room, but it feels like it."], ["sys", "What sort of noise do you hear? Do you feel that its necessary to live in close proximity to them?"], ["usr", "They have what sounds like parties, loud music playing, someone flushing a toilet at 3am, slamming doors, and shouting loudly."], ["sys", "I can see how it may be difficult to sleep at night with that type of activity going on. What is the relationship between your neighbors and you?"], ["usr", "The only thing I\'ve found that helps a little is having the door closed, but that only works for a short time before they walk in."], ["sys", "Have you tri

In [7]:
import json
from datasets import Dataset, DatasetDict
from sklearn.model_selection import train_test_split

# Parse each conversation and collect (user turn, system reply) pairs
def preprocess(example):
    inputs = []
    outputs = []

    try:
        conversation = json.loads(example["text"])
    except Exception as e:
        print(f"Error parsing conversation: {example['text']}, Error: {e}")
        # Return empty lists rather than None so every row keeps the same columns
        return {"input_text": inputs, "output_text": outputs}

    for i in range(len(conversation) - 1):
        speaker, text = conversation[i]
        if speaker == "usr" and conversation[i + 1][0] == "sys":
            inputs.append(text)
            outputs.append(conversation[i + 1][1])

    # Conversations without any usr -> sys pair yield empty lists and are filtered out below
    return {"input_text": inputs, "output_text": outputs}

# Map the data and remove the original text column
processed_data = ds["train"].map(preprocess, remove_columns=["text"])

# Debugging step: Print first few processed items to check for ellipsis or malformed data
print("First few processed examples:")
for i in range(min(5, len(processed_data))):
    print(processed_data[i])

# Filter out conversations that produced no pairs
# (rows passed to filter are never None, so test for empty lists instead)
processed_data = processed_data.filter(lambda x: len(x["input_text"]) > 0)

# Check if any ellipsis appear in the filtered data
for i, dialog in enumerate(processed_data):
    if any(isinstance(text, type(...)) for text in dialog["input_text"] + dialog["output_text"]):
        print(f"Ellipsis found in item {i}: {dialog}")

# flatten conversation data into 1 round Q&A pairs
flattened_data = {"input_text": [], "output_text": []}
for dialog in processed_data:
    flattened_data["input_text"].extend(dialog["input_text"])
    flattened_data["output_text"].extend(dialog["output_text"])

# Create final dataset
final_dataset = Dataset.from_dict(flattened_data)

Map:   0%|          | 0/65077 [00:00<?, ? examples/s]

First few processed examples:
{'input_text': ['My upstairs neighbors make a ton of noise at all hours of the night. It makes it difficult for me to sleep.', "I'm not sure if they're directly in my room, but it feels like it.", 'They have what sounds like parties, loud music playing, someone flushing a toilet at 3am, slamming doors, and shouting loudly.', "The only thing I've found that helps a little is having the door closed, but that only works for a short time before they walk in.", "I've complained to my landlord but haven't had any real success. I think it's more of a nuisance to them than it is to me.", "He's never spoken to them in any way, he thinks it's their problem not ours.", "I think it's unlikely that I will be able to get my own space anytime soon, but I don't mind having a good neighbor for now.", "I don't want to ask him to do anything. I feel like I'm being forced into this situation and I don't have any alternatives.", 'I called their door and left a message but got 

Filter:   0%|          | 0/65077 [00:00<?, ? examples/s]

In [8]:
print(final_dataset)
print(final_dataset[:15])
print(type(final_dataset))

Dataset({
    features: ['input_text', 'output_text'],
    num_rows: 797783
})
{'input_text': ['My upstairs neighbors make a ton of noise at all hours of the night. It makes it difficult for me to sleep.', "I'm not sure if they're directly in my room, but it feels like it.", 'They have what sounds like parties, loud music playing, someone flushing a toilet at 3am, slamming doors, and shouting loudly.', "The only thing I've found that helps a little is having the door closed, but that only works for a short time before they walk in.", "I've complained to my landlord but haven't had any real success. I think it's more of a nuisance to them than it is to me.", "He's never spoken to them in any way, he thinks it's their problem not ours.", "I think it's unlikely that I will be able to get my own space anytime soon, but I don't mind having a good neighbor for now.", "I don't want to ask him to do anything. I feel like I'm being forced into this situation and I don't have any alternatives.",

In [9]:
from datasets import Dataset
import pandas as pd
# 1. Convert the dataset into a pandas DataFrame
df = final_dataset.to_pandas()

# 2. The data already holds one question-answer pair per row,
#    so renaming the columns is enough (no row-by-row loop needed)
flattened_df = df.rename(columns={"input_text": "input", "output_text": "output"})

# 3. Check the DataFrame
print(flattened_df)
print(type(flattened_df))

                                                    input  \
0       My upstairs neighbors make a ton of noise at a...   
1       I'm not sure if they're directly in my room, b...   
2       They have what sounds like parties, loud music...   
3       The only thing I've found that helps a little ...   
4       I've complained to my landlord but haven't had...   
...                                                   ...   
797778  Right. Maybe i could leave my apartment and mo...   
797779     Okay thank you. I really appreciate your help.   
797780  Haha yeah i cant blame them since i lied about...   
797781  I wish i knew that a few months ago. I guess i...   
797782  You are right. Thank you for your time! Have a...   

                                                   output  
0       Ah, I understand, there's more noise in your a...  
1       What sort of noise do you hear? Do you feel th...  
2       I can see how it may be difficult to sleep at ...  
3       Have you tried aski

In [10]:
# Split the dataset into train and test (10%)
train_df1, test_df1 = train_test_split(flattened_df, test_size=0.1, random_state=42)
print(f"Number of rows in train_df: {train_df1.shape[0]}")
print(f"Number of rows in test_df: {test_df1.shape[0]}")
# Save the train and test data as CSV files
train_df1.to_csv('/content/drive/MyDrive/train_conversation_flattened_dataset.csv', index=False)
test_df1.to_csv('/content/drive/MyDrive/test_conversation_flattened_dataset.csv', index=False)

Number of rows in train_df: 718004
Number of rows in test_df: 79779
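
The two row counts above can be checked by hand. As far as I know, `train_test_split` rounds a fractional `test_size` up to the next whole row and gives the remainder to the training set; the short sketch below reproduces the numbers under that assumption.

```python
import math

n_samples = 797783   # rows in the flattened dataset
test_size = 0.1      # fraction passed to train_test_split

# Assumed rounding rule: test rows are rounded up, training rows get the rest
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 718004 79779
```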


### Train on the data and tune hyper-parameters

In [11]:
# Tokenize input and output
from transformers import AutoTokenizer

# Use only the first 10,000 training rows and 1,000 test rows, then convert them into Hugging Face datasets
from datasets import Dataset, DatasetDict
train_dataset = Dataset.from_pandas(train_df1.head(10000))
test_dataset = Dataset.from_pandas(test_df1.head(1000))

def tokenize_function(example):
    return tokenizer(
        example["input"],
        text_target=example["output"],
        padding="max_length",
        truncation=True,
        max_length=512
    )

# tokenize train and test dataset
tokenized_train_dataset = train_dataset.map(tokenize_function, batched=True, remove_columns=["input", "output"])
tokenized_test_dataset = test_dataset.map(tokenize_function, batched=True, remove_columns=["input", "output"])

Map:   0%|          | 0/10000 [00:00<?, ? examples/s]

Map:   0%|          | 0/1000 [00:00<?, ? examples/s]
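
One point worth checking here: `DataCollatorForLanguageModeling(mlm=False)` rebuilds `labels` from `input_ids`, so the `text_target` labels produced by `tokenize_function` may never reach the loss, and the model would then learn to continue the client's message rather than produce the therapist's reply. A common alternative for causal language models is to concatenate prompt and response into one sequence and mask the prompt positions in the labels. The sketch below shows that layout on plain token-id lists; `build_causal_example`, the pad id, and the toy ids are illustrative assumptions and not part of the notebook.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy loss

def build_causal_example(prompt_ids, response_ids, max_length, pad_id):
    """Concatenate prompt and response; compute loss on response tokens only."""
    input_ids = (prompt_ids + response_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_length]
    attention_mask = [1] * len(input_ids)

    # right-pad every field to max_length
    pad = max_length - len(input_ids)
    input_ids += [pad_id] * pad
    attention_mask += [0] * pad
    labels += [IGNORE_INDEX] * pad
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}

# toy ids: three prompt tokens followed by two response tokens
example = build_causal_example([11, 12, 13], [21, 22], max_length=8, pad_id=0)
print(example)
```

With this layout the loss is computed only on the response positions, which is usually what a prompt-to-reply fine-tune needs.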

In [12]:
from transformers import AutoModelForCausalLM, TrainingArguments, Trainer, DataCollatorForLanguageModeling
from peft import PeftModel
from transformers import AutoTokenizer

# Load basic deepseek model
# base_model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

# Load our fine tuned model (you can rename it here)
model2 = PeftModel.from_pretrained(base_model, "/content/drive/MyDrive/therapist_chatbot_model_v1")


# Batching and padding for GPT/LLama based model
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Training arguments
training_args = TrainingArguments(
    output_dir="./checkpoint_conversation_dataset",
    eval_strategy="epoch",  # Evaluate after every epoch
    save_strategy="epoch",
    learning_rate=2e-4,  # Higher learning rate for LoRA
    per_device_train_batch_size=1,  # Use a small batch size for low resource
    per_device_eval_batch_size=1,
    num_train_epochs=1,  # Train for 1 epoch, can increase later
    weight_decay=0.01,
    save_total_limit=2,  # Keep only 2 checkpoints
    load_best_model_at_end=True,
    fp16=True,  # If using GPU training, turn on fp16
    push_to_hub=False,
    dataloader_num_workers=4,  # Increase data loading efficiency
    gradient_accumulation_steps=8,  # Accumulate gradients to handle smaller memory
    max_grad_norm=1.0  # To prevent exploding gradients
)

# Trainer
trainer = Trainer(
    model=model2,
    args=training_args,
    train_dataset=tokenized_train_dataset,
    eval_dataset=tokenized_test_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator
)

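# Note: this loop unfreezes every parameter (base model included), so training
# updates all weights rather than only the LoRA adapter. To train the adapter alone,
# PeftModel.from_pretrained(..., is_trainable=True) is the usual route.
for param in model2.parameters():
    param.requires_grad = True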



In [13]:
# import torch

# # clean GPU storage(if you run the trainer function for several times)
# # torch.cuda.empty_cache()

# print(torch.cuda.memory_allocated())
# print(torch.cuda.memory_reserved())

In [14]:
# train
trainer.train()


Epoch  Training Loss  Validation Loss
1      3.1888         2.924716


TrainOutput(global_step=1250, training_loss=3.285779248046875, metrics={'train_runtime': 2149.0983, 'train_samples_per_second': 4.653, 'train_steps_per_second': 0.582, 'total_flos': 4.74563739648e+16, 'train_loss': 3.285779248046875, 'epoch': 1.0})
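
The step count and the validation loss above can be sanity-checked with a little arithmetic. A minimal sketch, assuming a single device so that the effective batch size is `per_device_train_batch_size * gradient_accumulation_steps`, and reading perplexity as the exponential of the validation loss (the variable names are illustrative):

```python
import math

train_rows = 10000        # rows taken with train_df1.head(10000)
per_device_batch = 1      # per_device_train_batch_size
grad_accum = 8            # gradient_accumulation_steps

effective_batch = per_device_batch * grad_accum   # samples per optimizer step
steps_per_epoch = train_rows // effective_batch   # optimizer steps in one epoch
eval_perplexity = math.exp(2.924716)              # exp of the reported validation loss

print(effective_batch, steps_per_epoch, round(eval_perplexity, 2))
```

The computed step count agrees with `global_step=1250` in the `TrainOutput` above.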

In [15]:
# save new model
model2.save_pretrained("/content/drive/MyDrive/therapist_chatbot_model_v2")
tokenizer.save_pretrained("/content/drive/MyDrive/therapist_chatbot_model_v2")

('/content/drive/MyDrive/therapist_chatbot_model_v2/tokenizer_config.json',
 '/content/drive/MyDrive/therapist_chatbot_model_v2/special_tokens_map.json',
 '/content/drive/MyDrive/therapist_chatbot_model_v2/tokenizer.json')

# Inference with the trained V2 model

In [35]:
##Inference with our fine tuned model
import torch

def generate_answer_V2model(input_text):
    # Check if CUDA (GPU) is available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Move model to the correct device (GPU or CPU)
    model2.to(device)

    # Define the prompt style
    prompt_style = """Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    {}

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
    """

    # Format the input text into the prompt template
    formatted_prompt = prompt_style.format(input_text)

    # Tokenize the input and move inputs to the same device
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=300)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    # Generate the output
    output = model2.generate(
        **inputs,
        max_new_tokens=250,  # Limit the number of new tokens to generate
        top_p=0.9,  # Control diversity with nucleus sampling
        temperature=0.7,  # Make the response less random and more logical
        no_repeat_ngram_size=3,  # Prevent repetition
        do_sample=True,  # Allow sampling for more natural generation
        eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    )

    # Decode and return the response, ensuring to skip special tokens
    return tokenizer.decode(output[0], skip_special_tokens=True)
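
Because `generate` returns the prompt tokens followed by the new tokens, the decoded string repeats the whole prompt template, as the sample outputs below show. A minimal sketch of one way to keep only the reply, assuming the `### response:` marker from `prompt_style` appears in the decoded text (`extract_response` is a hypothetical helper, not part of the notebook):

```python
def extract_response(decoded_text, marker="### response:"):
    """Return the text after the last occurrence of marker, or the full text if absent."""
    _, found, tail = decoded_text.rpartition(marker)
    return tail.strip() if found else decoded_text.strip()

decoded = (
    "Respond as a therapist.\n\n"
    "    ### prompt:\n    I feel stressed.\n\n"
    "    ### response:\n     That sounds hard."
)
print(extract_response(decoded))
```

An alternative is to slice off the prompt tokens before decoding, for example `output[0][inputs["input_ids"].shape[1]:]`, which avoids depending on the marker text.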

In [17]:
# sample text1
sample_input = "I feel so stressful, what should I do?"
print(generate_answer_V2model(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, what should I do?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     "No, I'm not sure what to do. I don't think I can get her to listen. I'm worried about her." I'm so scared that she will cry again. She doesn't know what to say to me. I am a bit worried that she'll cry. I need to get her attention. It's a good time to talk with someone who understands how I'm feeling. I have been feeling very lonely since I left for a week. I'd like to see someone who had similar experiences as me. But it doesn't seem like I'm the only one who understands this. I feel like I should have thought about it. I just need to make sure she's not in trouble. It seems like she has never talked to me befor

In [36]:
#sample text 2
sample_input = "I feel so stressful, can you offer me some advice?"
print(generate_answer_V2model(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you offer me some advice?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     "I'm worried that they will start to realize they can't help me with the problem. They just think they're just going to get back in trouble." They seem to be trying to get through it. I don't know if they can help me. I need to take care of my family. I'm going to start thinking about going to another counselor. I guess I'll have to take a step back. I'll try to keep my friends safe and not bother them with them. I've tried to be as patient as I can. But it still feels like they're going to be too busy with me. They're always looking at me and getting into arguments with me and everything else. I can't

In [37]:
#sample text 3
sample_input = "I broke up with my boyfriend, I feel so sad."
print(generate_answer_V2model(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I broke up with my boyfriend, I feel so sad.

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     "I appreciate you helping me with it." I will keep talking with you later today. Thank you! :) I'm going to try to be more supportive. I think it's important to be patient. I have to be very patient and be supportive and supportive. It's a hard time to deal with problems like this. I feel bad about it too. I don't think I have any other friends that can help me out. I just feel like I have no choice. I'm not sure if I will ever be happy. I need to talk to my family. I hope they don't mind me. I've never had to do that before. I'll tell them to go now. Thank God! I'll see you later! :) Good night! Have a nice

In [38]:
#sample text 4
sample_input = "I broke up with my boyfriend, I feel so sad. Can you give me some advice?"
print(generate_answer_V2model(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I broke up with my boyfriend, I feel so sad. Can you give me some advice?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     "I'm going to stick to a different career in order to meet the same type of criteria." I'll try to do that. Thank you. I appreciate your help. I'll call back later today. Bye! :) I'm going now. Have a great day! :) Have a good night! :) Good luck with the work, it helps to feel better. I hope you're doing well today. Have an amazing day! I think I'll take your advice to this problem. Thank You for listening to me. I'm feeling better now. Bye :) Have an awesome day! Good night! Have a nice day! Have to go now. I will try to be more open and keep up with you. Have nice day. Bye. Ha

In [39]:
#sample text 5
sample_input = "I broke up with my boyfriend, I feel so sad. What can I do to get rid of this feeling?"
print(generate_answer_V2model(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I broke up with my boyfriend, I feel so sad. What can I do to get rid of this feeling?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     "The pain is only temporary." It's just important to be able to take care of the problems I have. I don't want to lose my friend's relationship. I have been trying to make it up to myself but it feels like I'm making it worse. I feel like it's not the right thing to do. I need to have a plan and I need more time to get through this. I will try to make sure to be supportive and make sure that I'm as safe as possible. I hope that my brother is as well. Thank you for your help! :) I hope you have a good day. Goodnight. Good night! Have a nice night! :) Have a good night

In [40]:
#sample text 6
sample_input = "I feel so disappointed for myself, because I failed the exam. You're my therapist, please give me some suggestion."
print(generate_answer_V2model(sample_input))

Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so disappointed for myself, because I failed the exam. You're my therapist, please give me some suggestion.

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     "The teacher is always there for us." I'm not sure what to say to her. I don't know how to respond to this. I have been working on getting her out of the house and out of her house. I am going to try to focus on her and make sure she has the right support and time to learn to deal with the issues. I need to find ways to make her happy. I just don't want to be alone anymore. I'm afraid that if I do, she will be worse. I think that I should just keep myself busy and get her out. I'll try to talk to her, but I'm worried that she won't underst

# Evaluate after training: V2 model

### Language Understanding & Knowledge Level: MMLU Evaluation

In [18]:
%%capture
!pip install lm-eval
from lm_eval import evaluator
from lm_eval.models.huggingface import HFLM

In [19]:
## load the mmlu dataset
from datasets import load_dataset
ds = load_dataset("cais/mmlu", "all")

README.md:   0%|          | 0.00/53.2k [00:00<?, ?B/s]

dataset_infos.json:   0%|          | 0.00/138k [00:00<?, ?B/s]

test-00000-of-00001.parquet:   0%|          | 0.00/3.50M [00:00<?, ?B/s]

validation-00000-of-00001.parquet:   0%|          | 0.00/408k [00:00<?, ?B/s]

dev-00000-of-00001.parquet:   0%|          | 0.00/76.5k [00:00<?, ?B/s]

auxiliary_train-00000-of-00001.parquet:   0%|          | 0.00/47.5M [00:00<?, ?B/s]

Generating test split:   0%|          | 0/14042 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/1531 [00:00<?, ? examples/s]

Generating dev split:   0%|          | 0/285 [00:00<?, ? examples/s]

Generating auxiliary_train split:   0%|          | 0/99842 [00:00<?, ? examples/s]

In [22]:
%%capture
# wrap the model in `lm_eval.models.huggingface.HFLM(pretrained=my_model)` first so that lm_eval can recognize it
lm_model = HFLM(pretrained=model2)



In [23]:
tasks = ["mmlu_philosophy"]
# Evaluate
result_philosophy = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_philosophy)

README.md:   0%|          | 0.00/1.11k [00:00<?, ?B/s]

mmlu_no_train.py:   0%|          | 0.00/5.86k [00:00<?, ?B/s]

data.tar:   0%|          | 0.00/166M [00:00<?, ?B/s]

Generating test split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Generating dev split: 0 examples [00:00, ? examples/s]

100%|██████████| 311/311 [00:00<00:00, 647.84it/s]
Running loglikelihood requests: 100%|██████████| 1244/1244 [00:17<00:00, 69.15it/s]


{'results': {'mmlu_philosophy': {'alias': 'philosophy', 'acc,none': 0.2540192926045016, 'acc_stderr,none': 0.024723861504771686}}, 'group_subtasks': {'mmlu_philosophy': []}, 'configs': {'mmlu_philosophy': {'task': 'mmlu_philosophy', 'task_alias': 'philosophy', 'tag': 'mmlu_humanities_tasks', 'dataset_path': 'hails/mmlu_no_train', 'dataset_name': 'philosophy', 'dataset_kwargs': {'trust_remote_code': True}, 'test_split': 'test', 'fewshot_split': 'dev', 'doc_to_text': '{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:', 'doc_to_target': 'answer', 'unsafe_code': False, 'doc_to_choice': ['A', 'B', 'C', 'D'], 'description': 'The following are multiple choice questions (with answers) about philosophy.\n\n', 'target_delimiter': ' ', 'fewshot_delimiter': '\n\n', 'fewshot_config': {'sampler': 'first_n'}, 'num_fewshot': 0, 'metric_list': [{'metric': 'acc', 'aggregation': 'mean', 'higher_is_better': True}], 'output_type': 'multiple_choice', '

In [24]:
tasks = ["mmlu_moral_scenarios"]

# Evaluate
result_moral_scenarios = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_moral_scenarios)


In [25]:
tasks = ["mmlu_logical_fallacies"]

# Evaluate
result_logical_fallacies = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_logical_fallacies)

Generating test split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Generating dev split: 0 examples [00:00, ? examples/s]

100%|██████████| 163/163 [00:00<00:00, 645.41it/s]
Running loglikelihood requests: 100%|██████████| 652/652 [00:09<00:00, 68.31it/s]


{'results': {'mmlu_logical_fallacies': {'alias': 'logical_fallacies', 'acc,none': 0.2392638036809816, 'acc_stderr,none': 0.033519538795212696}}, 'group_subtasks': {'mmlu_logical_fallacies': []}, 'configs': {'mmlu_logical_fallacies': {'task': 'mmlu_logical_fallacies', 'task_alias': 'logical_fallacies', 'tag': 'mmlu_humanities_tasks', 'dataset_path': 'hails/mmlu_no_train', 'dataset_name': 'logical_fallacies', 'dataset_kwargs': {'trust_remote_code': True}, 'test_split': 'test', 'fewshot_split': 'dev', 'doc_to_text': '{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:', 'doc_to_target': 'answer', 'unsafe_code': False, 'doc_to_choice': ['A', 'B', 'C', 'D'], 'description': 'The following are multiple choice questions (with answers) about logical fallacies.\n\n', 'target_delimiter': ' ', 'fewshot_delimiter': '\n\n', 'fewshot_config': {'sampler': 'first_n'}, 'num_fewshot': 0, 'metric_list': [{'metric': 'acc', 'aggregation': 'mean', 'higher

In [26]:
tasks = ["mmlu_nutrition"]

# Evaluate
result_nutrition = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_nutrition)

Generating test split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Generating dev split: 0 examples [00:00, ? examples/s]

100%|██████████| 306/306 [00:00<00:00, 672.58it/s]
Running loglikelihood requests: 100%|██████████| 1224/1224 [00:17<00:00, 68.86it/s]


{'results': {'mmlu_nutrition': {'alias': 'nutrition', 'acc,none': 0.28104575163398693, 'acc_stderr,none': 0.025738854797818737}}, 'group_subtasks': {'mmlu_nutrition': []}, 'configs': {'mmlu_nutrition': {'task': 'mmlu_nutrition', 'task_alias': 'nutrition', 'tag': 'mmlu_other_tasks', 'dataset_path': 'hails/mmlu_no_train', 'dataset_name': 'nutrition', 'dataset_kwargs': {'trust_remote_code': True}, 'test_split': 'test', 'fewshot_split': 'dev', 'doc_to_text': '{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:', 'doc_to_target': 'answer', 'unsafe_code': False, 'doc_to_choice': ['A', 'B', 'C', 'D'], 'description': 'The following are multiple choice questions (with answers) about nutrition.\n\n', 'target_delimiter': ' ', 'fewshot_delimiter': '\n\n', 'fewshot_config': {'sampler': 'first_n'}, 'num_fewshot': 0, 'metric_list': [{'metric': 'acc', 'aggregation': 'mean', 'higher_is_better': True}], 'output_type': 'multiple_choice', 'repeats': 1,

In [27]:
tasks = ["mmlu_sociology"]

# Evaluate
result_sociology = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_sociology)

Generating test split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Generating dev split: 0 examples [00:00, ? examples/s]

100%|██████████| 201/201 [00:00<00:00, 658.76it/s]
Running loglikelihood requests: 100%|██████████| 804/804 [00:11<00:00, 68.45it/s]


{'results': {'mmlu_sociology': {'alias': 'sociology', 'acc,none': 0.2935323383084577, 'acc_stderr,none': 0.03220024104534205}}, 'group_subtasks': {'mmlu_sociology': []}, 'configs': {'mmlu_sociology': {'task': 'mmlu_sociology', 'task_alias': 'sociology', 'tag': 'mmlu_social_sciences_tasks', 'dataset_path': 'hails/mmlu_no_train', 'dataset_name': 'sociology', 'dataset_kwargs': {'trust_remote_code': True}, 'test_split': 'test', 'fewshot_split': 'dev', 'doc_to_text': '{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:', 'doc_to_target': 'answer', 'unsafe_code': False, 'doc_to_choice': ['A', 'B', 'C', 'D'], 'description': 'The following are multiple choice questions (with answers) about sociology.\n\n', 'target_delimiter': ' ', 'fewshot_delimiter': '\n\n', 'fewshot_config': {'sampler': 'first_n'}, 'num_fewshot': 0, 'metric_list': [{'metric': 'acc', 'aggregation': 'mean', 'higher_is_better': True}], 'output_type': 'multiple_choice', 'repe

In [28]:
tasks = ["mmlu_medical_genetics"]

# Evaluate
result_medical_genetics = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_medical_genetics)

Generating test split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Generating dev split: 0 examples [00:00, ? examples/s]

100%|██████████| 100/100 [00:00<00:00, 631.23it/s]
Running loglikelihood requests: 100%|██████████| 400/400 [00:05<00:00, 69.37it/s]


{'results': {'mmlu_medical_genetics': {'alias': 'medical_genetics', 'acc,none': 0.36, 'acc_stderr,none': 0.048241815132442176}}, 'group_subtasks': {'mmlu_medical_genetics': []}, 'configs': {'mmlu_medical_genetics': {'task': 'mmlu_medical_genetics', 'task_alias': 'medical_genetics', 'tag': 'mmlu_other_tasks', 'dataset_path': 'hails/mmlu_no_train', 'dataset_name': 'medical_genetics', 'dataset_kwargs': {'trust_remote_code': True}, 'test_split': 'test', 'fewshot_split': 'dev', 'doc_to_text': '{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:', 'doc_to_target': 'answer', 'unsafe_code': False, 'doc_to_choice': ['A', 'B', 'C', 'D'], 'description': 'The following are multiple choice questions (with answers) about medical genetics.\n\n', 'target_delimiter': ' ', 'fewshot_delimiter': '\n\n', 'fewshot_config': {'sampler': 'first_n'}, 'num_fewshot': 0, 'metric_list': [{'metric': 'acc', 'aggregation': 'mean', 'higher_is_better': True}], 'outpu

In [29]:
tasks = ["mmlu_human_sexuality"]

# Evaluate
result_human_sexuality = evaluator.simple_evaluate(
    model=lm_model,
    tasks=tasks,
    batch_size=8)

print(result_human_sexuality)

Generating test split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Generating dev split: 0 examples [00:00, ? examples/s]

100%|██████████| 131/131 [00:00<00:00, 586.63it/s]
Running loglikelihood requests: 100%|██████████| 524/524 [00:07<00:00, 69.24it/s]


{'results': {'mmlu_human_sexuality': {'alias': 'human_sexuality', 'acc,none': 0.25190839694656486, 'acc_stderr,none': 0.03807387116306086}}, 'group_subtasks': {'mmlu_human_sexuality': []}, 'configs': {'mmlu_human_sexuality': {'task': 'mmlu_human_sexuality', 'task_alias': 'human_sexuality', 'tag': 'mmlu_social_sciences_tasks', 'dataset_path': 'hails/mmlu_no_train', 'dataset_name': 'human_sexuality', 'dataset_kwargs': {'trust_remote_code': True}, 'test_split': 'test', 'fewshot_split': 'dev', 'doc_to_text': '{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:', 'doc_to_target': 'answer', 'unsafe_code': False, 'doc_to_choice': ['A', 'B', 'C', 'D'], 'description': 'The following are multiple choice questions (with answers) about human sexuality.\n\n', 'target_delimiter': ' ', 'fewshot_delimiter': '\n\n', 'fewshot_config': {'sampler': 'first_n'}, 'num_fewshot': 0, 'metric_list': [{'metric': 'acc', 'aggregation': 'mean', 'higher_is_better'
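
To read the six visible MMLU results together, it helps to compare each accuracy with the 25% expected from guessing among four options. The sketch below copies the `acc,none` and `acc_stderr,none` values from the outputs above (the `mmlu_moral_scenarios` output is not shown, so it is left out); the two-standard-error threshold is an illustrative rule of thumb, not something the notebook prescribes.

```python
CHANCE = 0.25  # four answer options per question

# (accuracy, standard error) copied from the outputs above
results = {
    "philosophy": (0.2540192926045016, 0.024723861504771686),
    "logical_fallacies": (0.2392638036809816, 0.033519538795212696),
    "nutrition": (0.28104575163398693, 0.025738854797818737),
    "sociology": (0.2935323383084577, 0.03220024104534205),
    "medical_genetics": (0.36, 0.048241815132442176),
    "human_sexuality": (0.25190839694656486, 0.03807387116306086),
}

mean_acc = sum(acc for acc, _ in results.values()) / len(results)
# subjects whose accuracy stays above chance after subtracting two standard errors
above_chance = [name for name, (acc, se) in results.items() if acc - 2 * se > CHANCE]

for name, (acc, se) in results.items():
    print(f"{name:18s} acc={acc:.3f} +/- {se:.3f}")
print(f"mean accuracy: {mean_acc:.3f}")
print("more than 2 standard errors above chance:", above_chance)
```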

### Long Context Understanding: ms_marco

In [30]:
ds2 = load_dataset("microsoft/ms_marco", "v1.1")

README.md:   0%|          | 0.00/9.48k [00:00<?, ?B/s]

validation-00000-of-00001.parquet:   0%|          | 0.00/21.4M [00:00<?, ?B/s]

train-00000-of-00001.parquet:   0%|          | 0.00/175M [00:00<?, ?B/s]

test-00000-of-00001.parquet:   0%|          | 0.00/20.5M [00:00<?, ?B/s]

Generating validation split:   0%|          | 0/10047 [00:00<?, ? examples/s]

Generating train split:   0%|          | 0/82326 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/9650 [00:00<?, ? examples/s]

In [31]:
print(ds2)  # print out dataset info
print(ds2.keys())
print(ds2["train"][0]) #print out the train info

DatasetDict({
    validation: Dataset({
        features: ['answers', 'passages', 'query', 'query_id', 'query_type', 'wellFormedAnswers'],
        num_rows: 10047
    })
    train: Dataset({
        features: ['answers', 'passages', 'query', 'query_id', 'query_type', 'wellFormedAnswers'],
        num_rows: 82326
    })
    test: Dataset({
        features: ['answers', 'passages', 'query', 'query_id', 'query_type', 'wellFormedAnswers'],
        num_rows: 9650
    })
})
dict_keys(['validation', 'train', 'test'])
{'answers': ['Results-Based Accountability is a disciplined way of thinking and taking action that communities can use to improve the lives of children, youth, families, adults and the community as a whole.'], 'passages': {'is_selected': [0, 0, 0, 0, 0, 1, 0, 0, 0, 0], 'passage_text': ["Since 2007, the RBA's outstanding reputation has been affected by the 'Securency' or NPA scandal. These RBA subsidiaries were involved in bribing overseas officials so that Australia might win luc

In [32]:
##define train, test, validation set
train_data = ds2["train"]
test_data = ds2["test"]
validation_data = ds2["validation"]

In [33]:
from tqdm import tqdm
import torch
from rouge_score import rouge_scorer

def evaluate_model_rouge2(model, tokenizer, test_data, max_samples=200):
    # use the model passed in rather than the global model2
    model.eval()
    device = model.device
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

    rouge1_scores, rouge2_scores, rougeL_scores = [], [], []
    total = min(len(test_data), max_samples)

    for i in tqdm(range(total)):
        sample = test_data[i]
        question = sample["query"]

        # skip samples without a reference answer
        if len(sample["answers"]) == 0:
            continue

        reference = sample["answers"][0]

        inputs = tokenizer(
            question,
            return_tensors="pt",
            max_length=512,
            truncation=True
        ).to(device)

        with torch.no_grad():
            outputs = model.generate(
                input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                max_new_tokens=256,
                pad_token_id=tokenizer.eos_token_id
            )

        # note: outputs[0] holds the prompt tokens followed by the generated tokens,
        # so the decoded prediction includes the original question
        prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)

        scores = scorer.score(reference, prediction)
        rouge1_scores.append(scores["rouge1"].fmeasure)
        rouge2_scores.append(scores["rouge2"].fmeasure)
        rougeL_scores.append(scores["rougeL"].fmeasure)

    # average over the samples actually scored (skipped samples are not counted)
    scored = max(len(rouge1_scores), 1)
    return {
        "ROUGE-1": sum(rouge1_scores) / scored,
        "ROUGE-2": sum(rouge2_scores) / scored,
        "ROUGE-L": sum(rougeL_scores) / scored
    }

In [34]:
## run evaluation
results = evaluate_model_rouge2(model2, tokenizer, test_data)
print(results)

100%|██████████| 200/200 [35:06<00:00, 10.53s/it]

{'ROUGE-1': 0.033503093520053574, 'ROUGE-2': 0.004152842179583244, 'ROUGE-L': 0.02479270506865067}
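
To make these scores easier to interpret, here is a minimal sketch of a ROUGE-1 style F-measure computed by hand. It uses lowercased, whitespace-split tokens without stemming, so it only approximates what `rouge_score` computes; `rouge1_f` and the example sentences are illustrative assumptions. It shows how a long prediction lowers precision, and therefore the F-measure, even when it contains the reference words.

```python
from collections import Counter

def rouge1_f(reference, prediction):
    """Unigram-overlap F-measure on lowercased, whitespace-split tokens."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    overlap = sum((ref_counts & pred_counts).values())  # clipped matching unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

reference = "take slow deep breaths"
short_pred = "take deep breaths"
long_pred = "take deep breaths " + "and then keep talking " * 10

short_score = rouge1_f(reference, short_pred)
long_score = rouge1_f(reference, long_pred)
print(short_score, long_score)
```

Since the evaluation above generates up to 256 new tokens per question, verbose outputs are one plausible contributor to the low scores.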





# RAG

In [48]:
%%capture
!pip install langchain pypdf faiss-cpu sentence-transformers transformers
!pip install pymupdf
!pip install -U langchain-community
!pip install ollama langchain langchain-community pdfplumber pypdf faiss-gpu
!pip install PyPDF2

### upload pdf files, create and save vector database

In [64]:
import os
from PyPDF2 import PdfReader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
import re
from langchain.schema import Document

# 1. collect the PDF files and extract their text

## walk the folder and collect the PDF file paths
def get_pdf_paths_from_folder(folder_path):
    pdf_paths = []
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            if file.lower().endswith('.pdf'):
                pdf_paths.append(os.path.join(root, file))
    return pdf_paths

## extract the text of each PDF
def extract_text_from_pdfs(pdf_paths):
    pdf_texts = {}
    for pdf_path in pdf_paths:
        with open(pdf_path, "rb") as f:
            reader = PdfReader(f)
            text = ""
            for page in reader.pages:
                # guard against pages with no extractable text
                text += page.extract_text() or ""
            pdf_texts[os.path.basename(pdf_path)] = text
    return pdf_texts

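# 2. cut long context into chunks
def split_text_into_chunks(text, chunk_size=1000, overlap=100):
    # split on runs of newlines with the regular expression '\n+'
    # note: the overlap argument is accepted but not used by this splitter
    paragraphs = re.split(r'\n+', text)
    chunks = []

    chunk = ""
    for paragraph in paragraphs:
        # start a new chunk once adding the paragraph would exceed chunk_size
        if chunk and len(chunk) + len(paragraph) > chunk_size:
            chunks.append(chunk)
            chunk = paragraph
        else:
            # avoid a leading newline on the first paragraph of a chunk
            chunk = chunk + "\n" + paragraph if chunk else paragraph
    if chunk:
        chunks.append(chunk)

    return chunks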

# 3. create and save vector database
def create_and_save_vector_db(pdf_paths, vector_db_path):
    pdf_texts = extract_text_from_pdfs(pdf_paths)

    # using HuggingFaceEmbeddings to vectorize context
    embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

    all_docs = []
    for pdf_name, text in pdf_texts.items():
        chunks = split_text_into_chunks(text)
        for chunk in chunks:
            all_docs.append(Document(page_content=chunk))

    # create the FAISS vector_db
    vector_db = FAISS.from_documents(all_docs, embedding_model)

    # save vector_db to drive
    vector_db.save_local(vector_db_path)
    print(f"vector_db saved to: {vector_db_path}")


folder_path = "/content/drive/MyDrive/GenAI Project/RAG docs"
vector_db_path = "/content/drive/MyDrive/therapist_vector_db"
pdf_paths = get_pdf_paths_from_folder(folder_path)
create_and_save_vector_db(pdf_paths, vector_db_path)


vector_db saved to: /content/drive/MyDrive/therapist_vector_db
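
`split_text_into_chunks` accepts an `overlap` argument, but the paragraph-based splitter does not apply it. A minimal sketch of what overlapping chunks could look like, using fixed-size character windows instead of paragraph boundaries (`chunk_with_overlap` is a hypothetical helper, shown only to illustrate the idea):

```python
def chunk_with_overlap(text, chunk_size=1000, overlap=100):
    """Split text into character windows where consecutive windows share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

chunks = chunk_with_overlap("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.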


### inference with the vector database: query in database, and take the query answer as input

In [74]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFacePipeline
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings

# 1. load trained model and tokenizer
model_path = "/content/drive/MyDrive/therapist_chatbot_model_v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# 2. cut the retrieved context if it hit the max length
def truncate_context(context, max_length):
    # Tokenize the context
    tokenized_context = tokenizer(context, truncation=True, padding=False, max_length=max_length)

    # Return the truncated context in text form
    truncated_context = tokenizer.decode(tokenized_context["input_ids"], skip_special_tokens=True)

    return truncated_context

# 3. define generate_answer_rag function, to make model combine the retrieved context to give answers
def generate_answer_rag(input_text, retrieved_context):
    max_total_length = 550  # total max length=retrieved context + user question+new generate answer
    max_input_length = 300  # max input length=retrieved context + user question

    # Ensure the retrieved context does not exceed max_input_length
    truncated_context = truncate_context(retrieved_context, max_length=max_input_length)

    # Calculate lengths of the retrieved context and user input
    retrieved_context_length = len(tokenizer(truncated_context)["input_ids"])
    input_text_length = len(tokenizer(input_text)["input_ids"])

    # Calculate the remaining length for generated text (new tokens)
    available_length = max_total_length - (retrieved_context_length + input_text_length)

    # Cap new tokens at 250 and keep at least 1, since available_length can drop to zero or below
    max_new_tokens = max(1, min(available_length, 250))

    prompt_style = """Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    {}

    ### prompt:
    {}

    ### Think:
    <think> Analyze the problem logically and empathetically before responding. </think>

    ### response:
    """

    formatted_prompt = prompt_style.format(truncated_context, input_text)

    # Tokenize input
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=512)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # Use the calculated max_new_tokens
        top_p=0.9,
        temperature=0.7,
        no_repeat_ngram_size=3,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
    )

    return tokenizer.decode(output[0], skip_special_tokens=True)

# 4. define the RAG query function: call it to get the final response
def rag_query(user_question):
    retrieved_docs = vector_db.as_retriever().get_relevant_documents(user_question)
    retrieved_text = "\n\n".join([doc.page_content for doc in retrieved_docs])

    final_response = generate_answer_rag(user_question, retrieved_text)

    return final_response
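
The token budget used in `generate_answer_rag` shrinks as the retrieved context and the question grow, and it can reach zero or go negative. Below is a minimal sketch of a clamped budget. `compute_max_new_tokens` and its `floor` parameter are hypothetical names, not part of this notebook, and the inputs are assumed to be plain integer token counts.

```python
def compute_max_new_tokens(context_len, question_len, total_budget=550, cap=250, floor=1):
    """Return how many new tokens may be generated.

    Hypothetical helper: the budget is whatever remains of total_budget after
    the context and the question, limited to cap and never below floor.
    """
    remaining = total_budget - (context_len + question_len)
    return max(floor, min(remaining, cap))


print(compute_max_new_tokens(300, 20))   # 230: part of the budget is left
print(compute_max_new_tokens(100, 20))   # 250: limited by the cap
print(compute_max_new_tokens(300, 400))  # 1: budget exhausted, falls back to the floor
```

As in the notebook's own calculation, the tokens of the prompt template itself are not counted here, so the real remaining budget is somewhat smaller.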



In [75]:
##Inference with RAG
# sample text1
user_question = "how can I handle anxiety？"
final_response = rag_query(user_question)
print("RAG answer：", final_response)

Retrieved answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    Catharsis mental one. It functions like a 
reflex or habit, and Is, therefore, 
more a matter of physiology than of thought. Since 
It Is more of body than of mind, we must get rid 
of it by bodily means. We must employ actual bodily 
exercise and give the nerves and the muscles a thor­
ough work-out In the very thing they fear. The 
fear object Itself mvist be a kind of gymnasium In 
which the nerves and muscles of the body enter and PSYCHOLOGY OF FEAR 87 
engage in a strenuous movement. The mere looking 
at an object of fear will not get rid of it. Fears are 
not eliminated by thinking how foolish they are. 
Every person who has fears knows how unreasonable 
and even idiotic it Is to maintain them. The advice, 
"Use your own reason In getting rid of 

In [80]:
# Comparison without RAG (answer generated by the model alone)
# sample text 1
sample_input = "How to handle anxiety?"
print("model_V1 answer without RAG", generate_answer2(sample_input))

model_V1 answer without RAG Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    How to handle anxiety?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     I'm here to help you handle anxiety. It's important to recognize that anxiety can be challenging, but with the right support, it can be managed. I'll help you explore strategies that work for you, such as mindfulness, relaxation techniques, or therapy if needed. Please feel free to share more about how your anxiety feels, and we can work together to find the best approach for you.


In [85]:
## sample text 2-RAG
user_question = "I feel so stressful, can you give me some advice?"
final_response = rag_query(user_question)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
     32 symptoms, emotions, hair cortisol, and parental neglect and violence. Psychotherapy 1 and Psychosomatics, 330–332. https://doi.org/10.1159/000506354 2 Deckx, L., van den Akker, M., Buntinx, F., & van Driel, M. (2018). A systematic literature 3 review on the association between loneliness and coping strategies. Psychology, 4 Health & Medicine, 23(8), 899–916. https://doi.org/10.1080/13548506.2018.1446096 5 Derella, O. J., & Milan, S. (2021). I felt like a terrible mom: Parenting-related cognitive 6 processes maintaining maternal depression. Journal of Child and Family Studies, 7 30(10), 2427–2439. https://doi.org/10.1007/s10826-021-02053-8 8 English, T., Lee, I. A., John, O. P., & Gross, J. J. (2017). Emotion regulation strategy 9 selection in daily life:

In [83]:
#  sample text2-without RAG
sample_input = "I feel so stressful, can you give me some advice?"
print("model_V1 answer without RAG", generate_answer2(sample_input))

model_V1 answer without RAG Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I feel so stressful, can you give me some advice?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     I am so sorry you feel this way. It sounds like you are dealing with some heavy emotions, and I want to be your sounding board. I know this is hard, and it’s okay to feel this frustration. Let me try to help you figure out what is going on. First, I want you to tell me what you feel like you’re feeling. I can help you understand what is happening and work together to find a way to get through this. I want this to feel more like a conversation and not like a talking head. If you feel comfortable, I can also suggest some things you can do right now that might help. Remember, it

In [108]:
## sample text 3-RAG
user_question = "I'm always in depression recently, sometimes iI want to kill myself. How to detect the dangerous suicidal idea?"
final_response = rag_query(user_question)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    decisions for individuals with suicidal tendencies. The capacity for continuous risk assessment and real -time 
monitoring enables these tools to facilitate rapid intervention for individuals at risk.  
Artificial Intelligence and the Psychotherapy Process  
The implementation of individualized treatment plans has the potential to enhance the selection of appropriate treatment modalities based on the unique characteristics and needs of the individu al, as opposed to the use of 
group -specific treatment approaches. This can optimise the treatment process and ensure the most efficient use 
of resources. The application of AI can enhance the process of selecting appropriate treatments by facilitating a 
deeper comprehension of the intricate interrelationships 

In [109]:
#  sample text 3-without RAG
sample_input = "I'm always in depression recently, sometimes iI want to kill myself. How to detect the dangerous suicidal idea?"
print("model_V1 answer without RAG", generate_answer2(sample_input))

model_V1 answer without RAG Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    I'm always in depression recently, sometimes iI want to kill myself. How to detect the dangerous suicidal idea?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     You are in a very vulnerable state, and you are facing a lot of personal challenges. You need a therapist to help you through this. You are not alone in this situation. You can find a therapist who is trained in dealing with depression and suicidal thoughts. A therapist can help you explore the reasons behind your feelings and provide you with the tools you need to cope. You should not let this situation affect you, and a therapist can guide you through the process of healing. Please consider reaching out to a ther

In [104]:
## sample text 4-RAG
user_question = "What's the treatment process of dealling with mental disease?"
final_response = rag_query(user_question)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    to discuss many processes for a client; yet, Philippot et al. (2019) offer suggestions on how to focus on which of 
these processes should be addressed in treatment. First, it is emphasized that for a process to be selected for 
intervention, it should be the main determinant of the psychological problem(s) that both the client and the 
therapist agree to address (Philippot et al. 2019). Therefore, the case  formulation should be based on the 
assessment of different dimensions and take into account the intensity of the symptoms and the importance of their consequences. The second recommendation is that processes in a given client should be assessed using 
information from different sources (including idiographic data). Another recommendation is that clinici

In [106]:
#  sample text 4-without RAG
sample_input = "What's the treatment process of dealling with mental disease?"
print("model_V1 answer without RAG", generate_answer2(sample_input))

model_V1 answer without RAG Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### prompt:
    What's the treatment process of dealling with mental disease?

    ### Think:
    <think> (Analyze the problem logically and empathetically before responding.) </think>

    ### response:
     To treat mental health, the client should explore their feelings and thoughts. They can seek counseling or therapy to gain a better understanding of their experiences. Therapy can help them develop coping strategies and reduce anxiety. Additionally, involving a trusted friend or family member can provide emotional support and accountability. If mental health continues to affect daily life, addressing the root causes through counseling and therapy is essential. The client should also consider therapy as a form of self-discovery to help them regain a sense of contro

# Conversation Memory

Use a summary of the conversation history as additional input.

### 1. load model



In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFacePipeline
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings

# 1. load trained model and tokenizer
model_path = "/content/drive/MyDrive/therapist_chatbot_model_v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

### 2. truncate retrieved context if it exceeds the max length

In [127]:
def truncate_context(context, max_length):
    # Tokenize the context
    tokenized_context = tokenizer(context, truncation=True, padding=False, max_length=max_length)

    # Return the truncated context in text form
    truncated_context = tokenizer.decode(tokenized_context["input_ids"], skip_special_tokens=True)

    return truncated_context

### 3. summarize the conversation history using a T5 model

In [128]:
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load pre-trained T5 model and tokenizer for summarization
summarizer_tokenizer = T5Tokenizer.from_pretrained("t5-small")
summarizer_model = T5ForConditionalGeneration.from_pretrained("t5-small")

def summarize_conversation(conversation_history, max_input_length=500):
    # Combine conversation history into one text block
    conversation_text = " ".join(conversation_history)

    # Truncate if too long (note: this limit counts characters, not tokens)
    if len(conversation_text) > max_input_length:
        conversation_text = conversation_text[-max_input_length:]  # Keep only the most recent part

    # Tokenize the conversation for summarization
    inputs = summarizer_tokenizer.encode("summarize: " + conversation_text, return_tensors="pt", truncation=True, padding="max_length", max_length=max_input_length)

    # Generate summary using T5 model
    summary_ids = summarizer_model.generate(inputs, max_length=150, min_length=50, length_penalty=2.0, num_beams=4, early_stopping=True)

    # Decode and return the summary
    summary = summarizer_tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    return summary
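
`summarize_conversation` trims the history by character count, which can cut a turn in half. Below is an alternative sketch that keeps whole turns, newest first, until a size limit is reached. `keep_recent_turns` is a hypothetical helper, and the whitespace word count is only a stand-in for a real tokenizer.

```python
def keep_recent_turns(conversation_history, max_words=100):
    """Keep the most recent whole turns whose combined word count fits max_words."""
    kept = []
    used = 0
    # Walk backwards so the newest turns are considered first
    for turn in reversed(conversation_history):
        n_words = len(turn.split())
        if used + n_words > max_words:
            break
        kept.append(turn)
        used += n_words
    # Restore chronological order
    return list(reversed(kept))


history = [
    "User: I feel stressed.",
    "Therapist: That sounds hard. What is causing it?",
    "User: Mostly my workload.",
]
# The oldest turn no longer fits in 12 words, so only the last two are kept
print(keep_recent_turns(history, max_words=12))
```

The kept turns could then be joined and passed to the summarizer in place of the character-sliced text.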

### 4. define generate_answer_dialhis function with dynamic truncation and conversation summarization

In [133]:
def generate_answer_dialhis(input_text, conversation_history, retrieved_context):
    max_total_length = 550  # total budget = retrieved context + summarized history + user question + newly generated answer
    max_input_length = 300  # max tokens kept from the retrieved context

    # Summarize the conversation history before passing it to the model
    summarized_history = summarize_conversation(conversation_history)

    # Ensure the retrieved context does not exceed max_input_length
    truncated_retrieved_context = truncate_context(retrieved_context, max_length=max_input_length)

    # Calculate lengths of the retrieved context, summarized history, and user input
    retrieved_context_length = len(tokenizer(truncated_retrieved_context)["input_ids"])
    summarized_history_length = len(tokenizer(summarized_history)["input_ids"])
    input_text_length = len(tokenizer(input_text)["input_ids"])

    # Calculate the remaining length for generated text (new tokens)
    available_length = max_total_length - (retrieved_context_length + input_text_length + summarized_history_length)

    # Cap new tokens at 250 and keep at least 1 so generate() never receives a non-positive value
    max_new_tokens = max(1, min(available_length, 250))

    prompt_style = """Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    {}

    ### Previous Conversations (Summary):
    {}

    ### prompt:
    {}

    ### Think:
    <think> Analyze the problem logically and empathetically before responding. </think>

    ### response:
    """

    formatted_prompt = prompt_style.format(truncated_retrieved_context, summarized_history, input_text)

    # Tokenize input
    inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=512)
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # Use the calculated max_new_tokens
        top_p=0.9,
        temperature=0.7,
        no_repeat_ngram_size=3,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
    )

    # Decode the output. rag_query records this turn in conversation_history,
    # so the history is not updated here (doing both would store each turn twice)
    model_answer = tokenizer.decode(output[0], skip_special_tokens=True)

    return model_answer




# # 3. define generate_answer_dialhis function, to make model combine the retrieved context and conversation history summary to give answers
# def generate_answer_dialhis(input_text, conversation_history, retrieved_context):
#     max_total_length = 550  # total max length=retrieved context + conversation history + user question + new generated answer
#     max_input_length = 300  # max input length=retrieved context + conversation history + user question

#     # Concatenate the conversation history with the current user input
#     conversation_history_text = "\n".join(conversation_history)  # Join all history items into one string
#     combined_input = conversation_history_text + "\n" + input_text

#     # Ensure the retrieved context does not exceed max_input_length
#     truncated_context = truncate_context(retrieved_context, max_length=max_input_length)

#     # Calculate lengths of the retrieved context and user input (including conversation history)
#     combined_input_length = len(tokenizer(combined_input)["input_ids"])
#     truncated_context_length = len(tokenizer(truncated_context)["input_ids"])

#     # Calculate the remaining length for generated text (new tokens)
#     available_length = max_total_length - (combined_input_length + truncated_context_length)

#     # If available length is smaller than the max new tokens, adjust it
#     max_new_tokens = min(available_length, 250)  # Don't exceed the limit of 250 for new tokens

#     prompt_style = """Respond as a therapist. Your answer should follow these steps:
#     1. Show empathy and understanding of the client's feelings.
#     2. Offer structured suggestions for resolving the issue.

#     ### Retrieved Context:
#     {}

#     ### Conversation History:
#     {}

#     ### prompt:
#     {}

#     ### Think:
#     <think> Analyze the problem logically and empathetically before responding. </think>

#     ### response:
#     """

#     formatted_prompt = prompt_style.format(truncated_context, conversation_history_text, input_text)

#     # Tokenize input
#     inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, padding="max_length", max_length=512)
#     inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to GPU or CPU

#     output = model.generate(
#         **inputs,
#         max_new_tokens=max_new_tokens,  # Use the calculated max_new_tokens
#         top_p=0.9,
#         temperature=0.7,
#         no_repeat_ngram_size=3,
#         do_sample=True,
#         eos_token_id=tokenizer.eos_token_id,
#     )

#     return tokenizer.decode(output[0], skip_special_tokens=True)
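
Because `tokenizer.decode(output[0], ...)` returns the prompt followed by the generated text, the printed answers and the stored history both include the full prompt template. Below is a small sketch of one way to keep only the reply, assuming the `### response:` marker from the prompt template appears in the decoded text. `extract_response` is a hypothetical helper, not part of this notebook.

```python
def extract_response(decoded_text, marker="### response:"):
    """Return the text after the first occurrence of marker, or the whole text if absent."""
    _, found, tail = decoded_text.partition(marker)
    return tail.strip() if found else decoded_text.strip()


decoded = """Respond as a therapist.

    ### prompt:
    How to handle anxiety?

    ### response:
     I'm here to help you handle anxiety."""
print(extract_response(decoded))  # I'm here to help you handle anxiety.
```

Storing only this extracted reply in the history would also give the summarizer shorter and cleaner input.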


### 5. define RAG query function: call this function to get the final response with conversation history

In [134]:
def rag_query(user_question, conversation_history):
    # Retrieve relevant documents from the vector store
    retrieved_docs = vector_db.as_retriever().get_relevant_documents(user_question)
    retrieved_text = "\n\n".join([doc.page_content for doc in retrieved_docs])

    # Generate the answer by combining conversation history with retrieved context
    final_response = generate_answer_dialhis(user_question, conversation_history, retrieved_text)

    # Update the conversation history with the user's question and the model's response
    conversation_history.append(f"User: {user_question}")
    conversation_history.append(f"Therapist: {final_response}")

    return final_response
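
The intended shape of `conversation_history` is one `User:` entry and one `Therapist:` entry per turn. Below is a small sketch of that bookkeeping with a stand-in for the model call. `run_turn` and `fake_answer` are hypothetical names used only for illustration.

```python
def run_turn(user_question, conversation_history, answer_fn):
    """Answer one turn and record it in the history exactly once."""
    answer = answer_fn(user_question, conversation_history)
    conversation_history.append(f"User: {user_question}")
    conversation_history.append(f"Therapist: {answer}")
    return answer


def fake_answer(question, history):
    # Stand-in for the model call; reports how many entries it was given
    return f"(reply after seeing {len(history)} earlier entries)"


conversation_history = []
run_turn("I feel stressed.", conversation_history, fake_answer)
run_turn("It is mostly work.", conversation_history, fake_answer)
print(len(conversation_history))   # 4
print(conversation_history[-1])    # Therapist: (reply after seeing 2 earlier entries)
```

Checking `len(conversation_history)` after each round is a quick way to confirm that a turn is not being stored more than once.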


### 6. try N-round dialogue

In [143]:
# Initialize the conversation_history list.
# rag_query records the history as: conversation_history = ["User: Hi therapist, I feel so stressful recently.", "Therapist: ..."]
conversation_history = []

In [144]:
##Inference with RAG
# sample text: round 1
user_question = "Hi therapist, I feel so stressful recently."
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    reflect on significant life events and current circumstances, gaining insight into how 
these have shaped the patient’ s experiences.
Situational encouragement : Here, the therapy emphasizes activating the patient’ s 
internal resources and strengths. The patient is encouraged to draw on past successes 
and available capacities to confront and resolve current conflicts, fostering a sense of 
self-efficacy and resilience.
Verbalization : This stage involves helping the patient articulate and express the 
underlying conflicts related to the four dimensions of life: body/sensation, achieve -
ment/activities, contact/environment, and fantasy/future. The goal is to identify and 
address the secondary gains that may perpetuate the conflict while balancing open -
n

In [145]:
# sample round 2
user_question = "Wow thanks for your understanding. Maybe because of my work? The KPI my boss set is so hard for me to achieve."
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    smoother  relationship,  as opposed  to concentrating  on 
specific  issues  such as child  rearing,  sex, money,  
housework  etc. 
 
Managing  the job or Business  with  Heart : 
Emotional  Intelligences  is the key behind  the best 
practices  in every  organization.  
 
The present  rat race among  the employees  in different  
organizations  has tremendously  increased  their stress  
level  due to long working  hour,  huge  workloads  and cut 
throat  competition.  Ultimately  these stress  level  reduced  
the efficiency  of employees  (Sahu  and Khan,  2014).It  
has observed  that, many  employees  have  been  leaving  
their company  due to bad bosses.Overbearing  bosses  has 
a dramatic  bad effect  on teamwork  and personal  
performance.  ‘Stres

In [146]:
# sample round 3
user_question = "Yeah, you made a right point, my boss is the source of my pressure. Oh, so what  techniques I can use to reduce my stress? "
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    smoother  relationship,  as opposed  to concentrating  on 
specific  issues  such as child  rearing,  sex, money,  
housework  etc. 
 
Managing  the job or Business  with  Heart : 
Emotional  Intelligences  is the key behind  the best 
practices  in every  organization.  
 
The present  rat race among  the employees  in different  
organizations  has tremendously  increased  their stress  
level  due to long working  hour,  huge  workloads  and cut 
throat  competition.  Ultimately  these stress  level  reduced  
the efficiency  of employees  (Sahu  and Khan,  2014).It  
has observed  that, many  employees  have  been  leaving  
their company  due to bad bosses.Overbearing  bosses  has 
a dramatic  bad effect  on teamwork  and personal  
performance.  ‘Stres

In [147]:
# sample round 4
user_question = "Wow thank you very much, therapist! Sounds like good methods to help me manage my stress. Can you tell me more detail about mindfulness practices. And also spend time for self-care? "
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    Alzahrani MM, Alammari SA, Batais MA, Almigbal TH. The effectiveness of 
mindfulness training in improving medical students’ stress, depression, 
and anxiety. Plos one. 2023;18(10):e0293539. https:// doi. org/ 10. 1371/ 
journ al. pone. 02935 39.
 72. Schuman‑ Olivier Z, Trombka M, Lovas DA, Brewer JA, Vago DR, Gawande 
R, Dunne JP , Lazar SW, Loucks EB, Fulwiler C. Mindfulness and Behavior 
Change. Harv Rev Psychiatry. 2020;28(6):371–94. https:// doi. org/ 10. 1097/ 
HRP . 00000 00000 000277.
 73. Sedighimornani, N., Rimes, K. A., & Verplanken, B. (2019). Exploring the 
relationships between mindfulness, self‑ compassion, and shame. SAGE 
Open. 9(3). 10.1177/215824401986629

    ### Previous Conversations (Summary):
    I’d like to explore techniques that c

In [148]:
# sample round 5
user_question = "That's true. Sometimes when I have too much pressure to carry on, I'll easily get angry to others, even myself. It's hurting others and myself. How do you advice me to get rid of this? "
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    and notice the anger ; you appraise  that response in terms of what it is and what it means for 
your goals (e.g., this is anger, this is bad for me, I want to get rid of this); and then, based on that appraisal, at the response stage you activate a goal to try to down- regulate the emotion. 
Thus, you have now commenced the identification  stage of emotion regulation, identifying 
that you want to regulate this anger  and the direction you want to regulate it. In moving to the  
selection  stage of emotion regulation, you perhaps have many strategy options available to 
you, for example, you could ruminate silently thinking about how inconsiderate this other 
person is, you could express your anger  by shouting and gesturing out the window , you could 
try 

In [149]:
# sample round 6
user_question = " well honestly I don't know where the anger comes from. Maybe it's from disappointing in myself? I really want to figure out why I am angry . "
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    anatomy  of rage’,  how the rage rush through  body,  how 
anger  builds  a new anger  and how to cool down,our  
anxiety,  melancholies,  these  concepts  are carefully  stated  
with different  researchers  work  and daily  life experience.  
In this segment  the author  has used some  ‘American  
slang’  to give a practical  look to the text. But slangs  are 
culture  specific,  so some  reader  who belongs  to other  
culture  may find some  difficulty  to comprehend  the 
meaning  of those.  
 
Motivating  oneself : 
Good  Mood  helps  to think  divergently,  flexibly,  and with 
more  complexity.  
 
Various  examples  have  been  given  where  emotion  
paralyze  the thinking  brain.  Angry,  depressed,  or anxious  
student  cannot  use infor mation 

In [150]:
# sample round 7
user_question = " Maybe I'm angry because I cannot finish the work perfectly or I always mess of my schedule because some delay. I know is not the big deal, but I still don't know how to get rid of the angry for myself."
final_response = rag_query(user_question, conversation_history)
print("RAG answer：", final_response)

RAG answer： Respond as a therapist. Your answer should follow these steps:
    1. Show empathy and understanding of the client's feelings.
    2. Offer structured suggestions for resolving the issue.

    ### Retrieved Context:
    and notice the anger ; you appraise  that response in terms of what it is and what it means for 
your goals (e.g., this is anger, this is bad for me, I want to get rid of this); and then, based on that appraisal, at the response stage you activate a goal to try to down- regulate the emotion. 
Thus, you have now commenced the identification  stage of emotion regulation, identifying 
that you want to regulate this anger  and the direction you want to regulate it. In moving to the  
selection  stage of emotion regulation, you perhaps have many strategy options available to 
you, for example, you could ruminate silently thinking about how inconsiderate this other 
person is, you could express your anger  by shouting and gesturing out the window , you could 
try 