In [23]:
from datasets import load_dataset

# Load the dataset
ds = load_dataset("ShenLab/MentalChat16K")

# Preview
print(ds)
print(ds['train'][0])  # View a sample

DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 16084
    })
})
{'instruction': "You are a helpful mental health counselling assistant, please answer the mental health questions based on the patient's description. \nThe assistant gives helpful, comprehensive, and appropriate answers to the user's questions. ", 'input': "I've been struggling with my mental health for a while now, and I can't seem to find a way to cope with it. I've tried visualization, positive thinking, and even medication, but nothing seems to work. I've been feeling lost and helpless, and I don't know what to do next. My mind is a whirlwind of thoughts and emotions, and I can't seem to make sense of it all. I feel like I'm drowning in a sea of confusion, and I can't seem to find my way out.", 'output': "I understand that you've been dealing with a sense of confusion and chaos in your thoughts and emotions for some time now. It's been a challenging journey, an

In [24]:
from langchain.schema import Document

docs = []

if 'ds' in globals():
    for row in ds["train"]:
        user_input = row['input']
        bot_reply = row['output']
        
        docs.append(Document(
            page_content=f"Q: {user_input}\nA: {bot_reply}",
            metadata={"source": "MentalChat16K"}
        ))
else:
    print("Please run the cell that loads the dataset into 'ds' first.")


In [25]:
print(len(docs))
print(docs[1].page_content)


16084
Q: I've been feeling overwhelmed with my caregiving responsibilities, and it's been a struggle to balance these duties with my personal relationships. I've tried to communicate my limitations to my friends and church members, but they don't seem to understand or respect my boundaries. I've been dealing with high anxiety levels, which makes it even harder for me to focus on my own needs. I've tried to take care of myself, but it feels like an insurmountable task.
A: Your situation is complex, and it's important to acknowledge the challenges you're facing. Balancing caregiving responsibilities with personal relationships can be a delicate dance, and it's common to encounter resistance when setting boundaries. I want to help you explore strategies for communicating your needs more effectively and setting clearer boundaries. Additionally, I see that your anxiety levels are significantly impacting your ability to focus on self-care. We can work together to identify the root causes of 

In [26]:
from langchain.schema import Document

# Rebuild the list of Q&A documents as a list comprehension (equivalent to the loop in the earlier cell)
docs = [
    Document(
        page_content=f"Q: {row['input']}\nA: {row['output']}",
        metadata={"source": "MentalChat16K"}
    )
    for row in ds["train"]
]


In [27]:
from langchain.text_splitter import CharacterTextSplitter

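# Note: CharacterTextSplitter splits only on `separator`, so any single line
# longer than chunk_size is kept whole and triggers a "Created a chunk of size ..." warning.
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=500,       # measured in characters (length_function=len); roughly 100–150 tokens
    chunk_overlap=50,     # small overlap keeps context across chunk boundaries
    length_function=len,
)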

chunked_docs = text_splitter.split_documents(docs)
chunked_docs[:5]

Created a chunk of size 818, which is longer than the specified 500
Created a chunk of size 565, which is longer than the specified 500
Created a chunk of size 656, which is longer than the specified 500
Created a chunk of size 505, which is longer than the specified 500
Created a chunk of size 571, which is longer than the specified 500
Created a chunk of size 537, which is longer than the specified 500
Created a chunk of size 628, which is longer than the specified 500
Created a chunk of size 920, which is longer than the specified 500
Created a chunk of size 530, which is longer than the specified 500
Created a chunk of size 693, which is longer than the specified 500
Created a chunk of size 517, which is longer than the specified 500
Created a chunk of size 723, which is longer than the specified 500
Created a chunk of size 533, which is longer than the specified 500
Created a chunk of size 562, which is longer than the specified 500
Created a chunk of size 556, which is longer tha

[Document(metadata={'source': 'MentalChat16K'}, page_content="Q: I've been struggling with my mental health for a while now, and I can't seem to find a way to cope with it. I've tried visualization, positive thinking, and even medication, but nothing seems to work. I've been feeling lost and helpless, and I don't know what to do next. My mind is a whirlwind of thoughts and emotions, and I can't seem to make sense of it all. I feel like I'm drowning in a sea of confusion, and I can't seem to find my way out."),
 Document(metadata={'source': 'MentalChat16K'}, page_content="A: I understand that you've been dealing with a sense of confusion and chaos in your thoughts and emotions for some time now. It's been a challenging journey, and it's commendable that you've tried various approaches like visualization, positive thinking, and medication to manage your symptoms. However, it's clear that these methods haven't been effective for you. It's essential to acknowledge that mental health issues
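
The warnings above appear because the splitter breaks text only at the given separator. Each document here is one "Q: ..." line and one "A: ..." line, so any line longer than 500 characters cannot be divided further. The following is a simplified, pure-Python illustration of that behavior. It is not LangChain's implementation, it ignores chunk_overlap, and the function name split_on_separator and the sample strings are hypothetical.

```python
def split_on_separator(text, separator, chunk_size):
    """Toy splitter: split on one separator, then greedily merge pieces up to chunk_size."""
    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A piece with no separator inside it stays whole, even above chunk_size
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Hypothetical document: a 303-character question line and an 803-character answer line
question = "Q: " + "x" * 300
answer = "A: " + "y" * 800
chunks = split_on_separator(question + "\n" + answer, "\n", 500)
sizes = [len(c) for c in chunks]
print(sizes)
```

If the 500-character limit matters downstream, one option is a splitter that falls back to finer separators such as spaces, for example RecursiveCharacterTextSplitter. That would change the chunks shown in this notebook.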

In [28]:
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings  # langchain.embeddings is the deprecated import path
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA



In [29]:
prompt = PromptTemplate.from_template("""
You are a helpful and empathetic mental health assistant.
Use the following context to answer the user's question.

Context:
{context}

User Question:
{question}

Answer:
""")

In [30]:
# Create embeddings using HuggingFaceEmbeddings (sentence-transformers model all-MiniLM-L6-v2)
huggingface_embeddings = HuggingFaceEmbeddings(
    model_name="all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'},
    encode_kwargs={'normalize_embeddings': True}  # return unit-length vectors
)

In [31]:
import numpy as np

# Embed one chunk once, then inspect the vector and its dimensionality
vector = np.array(huggingface_embeddings.embed_query(chunked_docs[0].page_content))
print(vector)
print(vector.shape)

[ 1.00227967e-01 -4.58985902e-02 -1.32571887e-02  1.67972352e-02
  6.06495552e-02  6.65660426e-02  1.67618338e-02  3.89710590e-02
  3.90160829e-02 -8.79540145e-02 -8.89472812e-02 -5.11576496e-02
 -1.75623633e-02 -1.07180234e-02  2.54784655e-02  5.85092884e-03
 -9.63734686e-02  1.73081513e-02 -4.11460847e-02  7.88984671e-02
 -8.08711722e-02  6.51032627e-02 -5.90800010e-02  1.20766442e-02
 -6.76951036e-02  1.32014990e-01 -8.15812871e-02 -4.25389707e-02
 -4.89839502e-02 -3.44678760e-02  1.03357710e-01 -6.29848167e-02
 -8.28355178e-02  3.73951085e-02  5.79311848e-02  3.87340598e-02
 -9.92183015e-02  6.07130490e-02  2.47510634e-02  2.03340650e-02
 -1.43516818e-02  4.69626784e-02  1.57353207e-02  4.45137508e-02
  3.43918130e-02 -6.38742819e-02 -6.22050650e-02  7.69000351e-02
  4.95540425e-02 -1.10678501e-01 -8.88385251e-02  2.15602759e-02
  1.31380372e-02  2.21031178e-02  3.21647502e-03  6.72739223e-02
  9.97070447e-02  1.46288779e-02 -4.40492667e-02  6.29532561e-02
  5.15529253e-02 -8.23065
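
Because the embeddings are created with normalize_embeddings set to True, each vector should have unit length, and the dot product of two such vectors equals their cosine similarity. The sketch below checks that identity in pure Python on two small hypothetical vectors. Real all-MiniLM-L6-v2 embeddings have 384 dimensions, and the helper names are illustrative only.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity computed from the raw (unnormalized) vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical 3-dimensional "embeddings"
a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

a_unit, b_unit = l2_normalize(a), l2_normalize(b)
unit_length = math.sqrt(dot(a_unit, a_unit))

print(unit_length)                        # length after normalization
print(cosine(a, b), dot(a_unit, b_unit))  # the two values should agree
```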

In [32]:
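# Create a Chroma vector store and index the chunked documents
from langchain_community.vectorstores import Chroma

db = Chroma.from_documents(
    documents=chunked_docs,
    embedding=huggingface_embeddings,
    collection_name="example_collection",
    persist_directory="./chroma_db",  # where to save data locally; remove to keep the store in memory
)
db.persist()  # only needed on Chroma < 0.4; newer versions persist automatically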

In [33]:
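# Recreate the embedding model (must be the same model used for indexing)
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Reload the vector store from disk; the collection name must match the one used when indexing
db = Chroma(
    collection_name="example_collection",
    persist_directory="./chroma_db",
    embedding_function=embedding
)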


In [34]:
retriever = db.as_retriever()
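
Conceptually, a similarity retriever embeds the query and returns the k stored chunks whose vectors lie closest to it. The toy sketch below shows that idea with a brute-force search over hypothetical 2-dimensional vectors. It is not how Chroma is implemented (Chroma uses an approximate nearest-neighbor index), and the names squared_l2, top_k, and store are made up for illustration.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, store, k=4):
    """Return the texts of the k stored items closest to the query vector."""
    ranked = sorted(store, key=lambda item: squared_l2(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Hypothetical store of (chunk text, embedding) pairs
store = [
    ("chunk about panic attacks", [0.9, 0.1]),
    ("chunk about caregiving", [0.1, 0.9]),
    ("chunk about breathing exercises", [0.8, 0.3]),
]

query_vec = [1.0, 0.0]  # stand-in for the embedded user query
results = top_k(query_vec, store, k=2)
print(results)
```

With the real retriever, the number of returned chunks can be set explicitly, for example db.as_retriever(search_kwargs={"k": 4}).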


In [35]:
from langchain.chains import RetrievalQA
from langchain_community.llms import Ollama

llm = Ollama(model="phi3:3.8b")

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,  # retriever defined in the previous cell
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt}
)
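
With the default "stuff" chain type, the retrieved chunks are concatenated into a single context string and substituted into the prompt together with the question. The pure-Python sketch below approximates that step so the final prompt can be inspected. The template text is copied from the prompt cell, while build_prompt, the blank-line separator, and the sample chunks are assumptions for illustration, not the chain's actual code.

```python
PROMPT_TEMPLATE = """
You are a helpful and empathetic mental health assistant.
Use the following context to answer the user's question.

Context:
{context}

User Question:
{question}

Answer:
"""


def build_prompt(retrieved_chunks, question):
    """Join retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Hypothetical retrieved chunks; real ones would come from the retriever
sample_chunks = [
    "Q: I get panic attacks at work.",
    "A: Grounding techniques can help during an attack.",
]
filled = build_prompt(sample_chunks, "How can I cope with panic attacks?")
print(filled)
```

Printing the filled prompt makes it clear that every retrieved chunk adds to the prompt length, so chunk size and the number of retrieved chunks both affect how much text the model receives.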


In [36]:
query = "I'm having frequent panic attacks. How can I cope?"

result = qa_chain.invoke({"query": query})
print(result['result'])


It's understandable that experiencing regular panic attacks is incredibly challenging, but there are effective ways to manage them and build resilience against future episodes. Firstly, practicing grounding techniques like the "5-4-3-2-1" method can help during an attack by anchoring you in the present moment with your five senses: look at 5 things you see around you, focus on four things you can touch or feel, find three things you hear right now, list two smells that are close to you (perhaps there is a cologne or scented lotion), and name one thing you tasted today.

Additionally, deep breathing exercises such as the "box" method—wherein you take slow, deliberate breaths in through your nose for four seconds, hold it at the bottom of your lungs for seven seconds while counting to eight mentally, and then slowly exhale through pursed lips aiming each out-breath towards a mental box or image that visually represents containment—can significantly reduce panic symptoms.

Seeking profess