# Set-Up

In [3]:
!pip install langchain -q
!pip install langchain_chroma -q
!pip install langchain_community -q
!pip install langchain_groq -q
!pip install grandalf -q
!pip install numpy -q
!pip install pandas -q
!pip install sentence-transformers -q
!pip install groq -q



In [4]:
import os

os.environ['GROQ_API_KEY'] = '<YOUR_GROQ_API_KEY>'  # placeholder: never commit a real key to a notebook
os.environ['TOKENIZERS_PARALLELISM'] = 'false'  # silence the tokenizers parallelism warning

In [6]:
from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


# Usage example

## Groq Llama API

In [7]:
youtube_data_path = '/content/drive/MyDrive/itmo-things/psycho_rag_hggg'

In [10]:
from langchain_groq import ChatGroq
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings

embed_model = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
rag_llm = ChatGroq(model="llama3-8b-8192")  # Used for RAG


## Vectorization

In [11]:
from langchain_chroma import Chroma
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [12]:
loader = DirectoryLoader(youtube_data_path, use_multithreading=True, loader_cls=TextLoader)
text_splitter = RecursiveCharacterTextSplitter(
    separators=['\n\n', '\n', ' ', ''],
    chunk_size=3000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)

documents = loader.load_and_split(text_splitter=text_splitter)
vectorstore = Chroma.from_documents(documents, embedding=embed_model, collection_name="groq_rag")
retriever = vectorstore.as_retriever()
print(f"Documents indexed: {len(documents)}")

Documents indexed: 517
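
The splitter above is configured with `chunk_size=3000` and `chunk_overlap=200`. As a rough illustration of what those two numbers mean, here is a minimal sketch of fixed-size character windows with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries the separators in order); `sliding_chunks` is a hypothetical helper written only for this example.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into character windows of chunk_size that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk repeats the last three characters of the previous one, which is the role the 200-character overlap plays at the larger scale: text cut at a chunk boundary still appears intact in one of the two neighbouring chunks.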


In [17]:
await retriever.ainvoke("How to fix my ADHD?")

[Document(metadata={'source': '/content/drive/MyDrive/itmo-things/psycho_rag_hggg/Why ADHD is Linked with Addiction [HNje-HuIYdI].txt'}, page_content="things to do and none of them are easy and none of them are simple. At the top of the list is get sober. I'm sorry but you can't convince yourself anymore that drugs are like an acceptable thing to do if you've got ADHD. If you have ADHD that is not well controlled and it's negatively impacting your life, honestly the first thing to do is to get sober. Second thing to do is ideally work with a dual diagnosis clinician who can do therapy to help you work through shame as well as like do CBT around ADHD and stuff like that and can help you like get into the process of recovery. Third thing that you can do is a dopamine detox. So this is going to be really tough for people with ADHD because dopamine detox, the primary problem is essentially boredom. We have a video about dopamine detox that y'all can check out. But dopamine detoxes are goin
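
The retriever returns the chunks whose embeddings are closest to the embedding of the query. The sketch below shows that idea with toy three-dimensional vectors and cosine similarity. Both the vectors and the choice of cosine similarity are assumptions made for illustration; the metric Chroma actually applies depends on how the collection is configured, and real embeddings from `BAAI/bge-small-en-v1.5` have far more dimensions.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: List[float], index: Dict[str, List[float]], k: int) -> List[str]:
    """Return the ids of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy index: one made-up vector per chunk id.
toy_index = {
    "adhd": [0.9, 0.1, 0.0],
    "sleep": [0.1, 0.9, 0.0],
    "diet": [0.0, 0.2, 0.9],
}
ranked = top_k([1.0, 0.2, 0.0], toy_index, k=2)
print(ranked)
```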

## RAG

In [13]:
from langchain_core.documents import Document
from langchain.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from typing import List

In [27]:
RAG_SYSTEM_PROMPT = """\
You are a psychological assistant that answers questions about mental health. \
You have a knowledge base from which context is retrieved based on the question. \
Use the retrieved context given within the delimiters below to answer the human's questions.
```
{context}
```
If you don't know the answer, say that you don't know: giving a wrong answer is dangerous.\
"""

RAG_HUMAN_PROMPT = "{input}"

RAG_PROMPT = ChatPromptTemplate.from_messages([
    ("system", RAG_SYSTEM_PROMPT),
    ("human", RAG_HUMAN_PROMPT)
])

def format_docs(docs: List[Document]):
    """Format the retrieved documents"""
    return "\n".join(doc.page_content for doc in docs)

rag_chain = (
    {
        "context": retriever | format_docs,  # retrieve docs from the vector store, then join them into one string
        "input": RunnablePassthrough(),  # propagate the user's question unchanged
    }
    | RAG_PROMPT  # fill the prompt with the 'context' and 'input' variables
    | rag_llm  # get a response from the LLM using the formatted prompt
    | StrOutputParser()  # extract the plain string from the LLM response
)

In [24]:
await rag_chain.ainvoke("How to be more social if I'm afraid of people?")

'I totally understand your concern! It\'s great that you\'re willing to work on being more social, despite feeling afraid. Let\'s break it down into smaller, manageable steps.\n\nFirst, acknowledge that it\'s normal to feel anxious or afraid in social situations. Everyone does! It\'s not about being perfect; it\'s about being willing to take small steps outside your comfort zone.\n\nHere are some tips that might help:\n\n1. **Start small**: Don\'t try to tackle everything at once. Begin with tiny, low-stakes interactions, like saying hello to a cashier or smiling at a neighbor. Gradually increase the frequency and duration of these interactions.\n2. **Identify your internal challenges**: Reflect on what specifically makes you anxious or uncomfortable in social situations. Is it fear of being judged, fear of saying something wrong, or something else? Write down your concerns and try to reframe them in a more positive light.\n3. **Reframe your thoughts**: Challenge those negative thought
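
The first step of `rag_chain` is a dict whose values all receive the same input (the question) and whose results are collected under the dict's keys before reaching the prompt. The plain-Python imitation below sketches that data flow. It does not use LangChain, and `run_branches`, `toy_retrieve`, `join_passages` and `fill_prompt` are hypothetical stand-ins for the dict step, the retriever, `format_docs` and `RAG_PROMPT`.

```python
from typing import Callable, Dict, List


def run_branches(branches: Dict[str, Callable], value):
    """Feed the same input to every branch and collect the results by key."""
    return {key: branch(value) for key, branch in branches.items()}


def toy_retrieve(question: str) -> List[str]:
    # Hypothetical stand-in for the retriever: returns canned passages.
    return [f"Passage 1 about: {question}", f"Passage 2 about: {question}"]


def join_passages(passages: List[str]) -> str:
    # Same idea as format_docs: one passage per line.
    return "\n".join(passages)


def fill_prompt(variables: Dict[str, str]) -> str:
    # Stand-in for the prompt template with 'context' and 'input' slots.
    return f"Context:\n{variables['context']}\n\nQuestion: {variables['input']}"


question = "How do I sleep better?"
variables = run_branches(
    {
        "context": lambda q: join_passages(toy_retrieve(q)),  # retrieve, then format
        "input": lambda q: q,  # pass the question through unchanged
    },
    question,
)
prompt_text = fill_prompt(variables)
print(prompt_text)
```

Because every branch sees the same input, a chain that is invoked with a dict instead of a string needs each branch to select the key it cares about; a bare passthrough would forward the whole dict.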

# Class Implementation

In [72]:
from typing import List
from langchain_groq import ChatGroq
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
from langchain.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser

class PsychoRag:
    def __init__(self, data_path: str):
        # Initialize the embedding model and the LLM
        self.embed_model = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
        self.rag_llm = ChatGroq(model="llama3-8b-8192")

        # Load and split the documents
        self.loader = DirectoryLoader(data_path, use_multithreading=True, loader_cls=TextLoader)
        self.text_splitter = RecursiveCharacterTextSplitter(
            separators=['\n\n', '\n', ' ', ''],
            chunk_size=3000,
            chunk_overlap=200,
            length_function=len,
            is_separator_regex=False,
        )

        self.documents = self.loader.load_and_split(text_splitter=self.text_splitter)

        self.vectorstore = Chroma.from_documents(self.documents, embedding=self.embed_model, collection_name="groq_rag")
        self.retriever = self.vectorstore.as_retriever()

        self.conversation_history = []

        # Define the prompts
        RAG_SYSTEM_PROMPT = """\
            You are a psychological assistant that answers questions about mental health. \
            You have a knowledge base from which context is retrieved based on the question.

            Take the conversation history so far into account:
            '''
            {history}
            '''

            Use the following pieces of retrieved context given within delimiters to answer the human's questions:
            '''
            {context}
            '''

            If you don't know the answer, say that you don't know: giving a wrong answer is dangerous.\
            """

        RAG_HUMAN_PROMPT = """\
          {input}
          """

        self.RAG_PROMPT = ChatPromptTemplate.from_messages([("system", RAG_SYSTEM_PROMPT),
         ("human", RAG_HUMAN_PROMPT)])


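        self.rag_chain = (
            {
                # Every branch receives the full {"history", "input"} dict,
                # so each one selects the key it needs.
                "context": RunnableLambda(lambda x: x["input"]) | self.retriever | self.format_docs,
                "history": RunnableLambda(lambda x: x["history"]),
                "input": RunnableLambda(lambda x: x["input"]),
            }
            | self.RAG_PROMPT
            | self.rag_llm
            | StrOutputParser()
        )

    @staticmethod
    def format_docs(docs: List[Document]) -> str:
        """Join the retrieved documents into a single context string."""
        return "\n".join(doc.page_content for doc in docs)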

    async def ask(self, user_input: str) -> str:
        formatted_history = self._format_history()
        answer = await self.rag_chain.ainvoke({
            "history": formatted_history,
            "input": user_input
        })

        # Save the exchange to the conversation history
        self.conversation_history.append(("User", user_input))
        self.conversation_history.append(("Assistant", answer))

        return answer

    def _format_history(self):
        lines = []
        for speaker, text in self.conversation_history:
            lines.append(f"{speaker}: {text}")
        return "\n".join(lines)

    async def end_session(self):
        """End the dialogue and generate a session report."""

        SUMMARY_PROMPT = """\
          You are a psychologist who just completed a session with a client (the user).
          Below is the entire conversation you had with the user.

          Conversation:
          '''
          {history}
          '''

          Based on the entire conversation:
          1. Summarize the user's main psychological concerns and issues that came up.
          2. Provide a few possible methods or strategies for the user to work on these issues.
          3. Maintain an empathetic, understanding tone.
          4. If there is insufficient information to conclude on something, mention that gently.
          """

        formatted_history = self._format_history()
        # Keep {history} as a template variable so that curly braces inside
        # the transcript are not parsed as placeholders.
        report_prompt = ChatPromptTemplate.from_messages([
            ("system", SUMMARY_PROMPT)
        ])

        summary_chain = report_prompt | self.rag_llm | StrOutputParser()

        report = await summary_chain.ainvoke({"history": formatted_history})
        return report
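
A note on filling `{history}` in the summary prompt: splicing the transcript into the template text and passing it as a template variable behave differently once the transcript contains curly braces. The sketch below shows the difference with plain `str.format`, used here only as a stand-in for a format-style prompt template; the template and transcript strings are made up for the example.

```python
template = "Conversation:\n{history}\nSummarize the session."
transcript = "User: my settings look like {debug: true}"

# Option 1: splice the transcript into the template text, then format.
# The braces from the transcript are now parsed as a placeholder.
spliced = template.replace("{history}", transcript)
try:
    spliced.format()
    splice_failed = False
except (KeyError, ValueError, IndexError):
    splice_failed = True

# Option 2: pass the transcript as the value of the {history} variable.
# Substituted values are not re-parsed, so the braces survive as text.
filled = template.format(history=transcript)

print(splice_failed)
print(filled)
```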


In [73]:
rag_sys = PsychoRag(youtube_data_path)

In [74]:
answer = await rag_sys.ask("hello.")
print(answer)

Thank you for starting our conversation! It's great to have you here.

I noticed that our previous conversation was about your experience with ADHD and how it affects your mind. You mentioned that you feel like people in general might not understand what you're going through, and that's okay. I'm here to listen and try to understand you better.

Is there anything specific you'd like to talk about or ask today? I'm all ears!


In [75]:
answer = await rag_sys.ask("What kind of meditation would you suggest for me?")

print(answer)

Thank you for reaching out to me about meditation! I'm happy to help.

From our previous conversation, I know that you're looking for a meditation technique that can help you control your emotions and thoughts. You also mentioned that you've been doing your own version of meditation, which involves sitting and thinking really fast. I think it's great that you're taking the initiative to explore meditation, and I'm happy to offer some suggestions.

Based on your interest in controlling your emotions, I think a technique that can help you cultivate awareness and introspection would be beneficial. One approach I'd like to suggest is the "charging the laser beam" technique, which you mentioned earlier. This technique involves sitting comfortably, taking off your glasses and cap, and focusing your attention inward. It may help you quiet your mind and become more aware of your thoughts and emotions.

However, I want to emphasize that everyone's experience with meditation is unique, and what 

In [76]:
report = await rag_sys.end_session()
print(report)

Based on the conversation, here is a summary:

**Summary of User's Main Psychological Concerns and Issues:**

The user is struggling with ADHD and feeling misunderstood by others. They are looking for a meditation technique to help them control their emotions and thoughts.

**Possible Methods or Strategies:**

I suggested the "charging the laser beam" technique, which involves sitting comfortably, taking off glasses and cap, and focusing attention inward to cultivate awareness and introspection. I emphasized that everyone's experience with meditation is unique, and it's essential to experiment with different techniques to find what works best.

**Empathetic and Understanding Tone:**

I acknowledged the user's efforts to explore meditation and expressed empathy for their feelings of being misunderstood. I also acknowledged their initiative to try to control their emotions and thoughts.

**Insufficient Information:**

I mentioned that what works for one person may not work for another, a