<a href="https://colab.research.google.com/github/peremartra/Large-Language-Model-Notebooks-Course/blob/main/3-LangChain/3_4_Medical_Assistant_Agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div>
    <h1>Large Language Models Projects</h1>
    <h3>Apply and Implement Strategies for Large Language Models</h3>
    <h2>3.4-Create a Medical Assistant RAG System chat with LangChain & ChromaDB</h2>
</div>

by [Pere Martra](https://www.linkedin.com/in/pere-martra/)

_______
Models: OpenAI.

Colab Environment: CPU.

Keys:
* RAG
* ChromaDB
* Embeddings
* OpenAI
_______


# Installing libraries & Loading Dataset

In [None]:
!pip install -q langchain==0.1.4
!pip install -q langchain-openai==0.0.5
!pip install -q langchainhub==0.1.14
!pip install -q datasets==2.16.1
!pip install -q chromadb==0.4.22

We will download the dataset from the Hugging Face Hub with the `datasets` library. It contains questions and answers about diseases.

In [None]:
from datasets import load_dataset

data = load_dataset("keivalya/MedQuad-MedicalQnADataset", split='train')


In [None]:
data = data.to_pandas()
data.head(10)

In [None]:
# Limit the data to the first 100 rows so the embedding step runs faster. Comment out this line to use the full dataset.
data = data[0:100]

As you can see, the medical information in the dataset is well-organized, and to someone like me, who is not an expert in the field, it appears to be quite valuable. This information could be a useful addition to any general medicine book to support primary care doctors.

Import the LangChain classes used to load the documents and to store them in the vector database.

In [None]:
from langchain.document_loaders import DataFrameLoader
from langchain.vectorstores import Chroma

The document text is in the `Answer` column, and the other columns become metadata.

In [None]:
df_loader = DataFrameLoader(data, page_content_column="Answer")


In [None]:
df_document = df_loader.load()
display(df_document[:2])
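
To make the split between content and metadata concrete, here is a minimal sketch that uses only the standard library. `SimpleDocument` and `rows_to_documents` are hypothetical stand-ins, not LangChain classes, and the `Question` column and the placeholder rows are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def rows_to_documents(rows, page_content_column):
    """Turn each row into a document: one column is the text, the rest is metadata."""
    documents = []
    for row in rows:
        metadata = {k: v for k, v in row.items() if k != page_content_column}
        documents.append(SimpleDocument(row[page_content_column], metadata))
    return documents


# Placeholder rows; the "Question" column name is an assumption for illustration.
rows = [
    {"Question": "placeholder question 1", "Answer": "placeholder answer 1"},
    {"Question": "placeholder question 2", "Answer": "placeholder answer 2"},
]
docs = rows_to_documents(rows, page_content_column="Answer")
print(docs[0])
```

Keeping the remaining columns as metadata means they travel with each document and can be inspected later, even though only the `Answer` text is what gets embedded.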

Next we split the documents into chunks. The chunk size is a design decision: the larger the chunks, the larger the prompt will be and the slower the model's response.

We also need to consider the maximum prompt size and ensure that the document does not exceed it.

In [None]:
from langchain.text_splitter import CharacterTextSplitter

In [None]:
text_splitter = CharacterTextSplitter(chunk_size=1250,
                                      separator="\n",
                                      chunk_overlap=100)
texts = text_splitter.split_documents(df_document)


The warnings appear because the splitter cannot always produce chunks of the requested size. It only divides the text at a line break (the `\n` separator), so a stretch of text without one stays longer than `chunk_size`.
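
The behaviour can be reproduced with a small standard-library sketch. It is a simplified greedy splitter written for illustration, not LangChain's implementation, and it leaves out the overlap logic. `split_on_separator` is a hypothetical name.

```python
def split_on_separator(text, chunk_size, separator="\n"):
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece that is longer than chunk_size cannot be divided further,
    so it is emitted as an oversized chunk.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # may itself exceed chunk_size
    if current:
        chunks.append(current)
    return chunks


# The middle line has 40 characters and no line break, so it cannot fit in 25.
sample = "short line\n" + "x" * 40 + "\nanother short line"
chunks = split_on_separator(sample, chunk_size=25)
oversized = [c for c in chunks if len(c) > 25]
print(len(chunks), "chunks,", len(oversized), "oversized")
```

Choosing a separator that occurs more often in the text, or a splitter that falls back to smaller separators, is one way to reduce the number of oversized chunks.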

In [None]:
first_doc = texts[1]
print(first_doc.page_content)

### Initialize the Embedding Model and Vector DB

We load the text-embedding-ada-002 model from OpenAI.

In [None]:
from getpass import getpass
OPENAI_API_KEY = getpass("OpenAI API Key: ")

In [None]:
from langchain_openai import OpenAIEmbeddings

model_name = 'text-embedding-ada-002'

embed = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=OPENAI_API_KEY
)

The next cell may take 3 to 5 minutes to run. If you want it to be faster, you can reduce the number of records in the dataset.

In [None]:
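# Persist location. This path is on Google Drive only if Drive has been mounted at /content/drive.
directory_cdb = '/content/drive/MyDrive/chromadb'
# Note: this indexes the unsplit documents. Pass `texts` instead to index the chunks created earlier.
chroma_db = Chroma.from_documents(
    df_document, embed, persist_directory=directory_cdb
)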
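
Conceptually, a vector database keeps one embedding per document and answers a query by returning the documents whose embeddings are closest to the query's embedding. The sketch below shows the idea with hand-made three-dimensional vectors and cosine similarity. The vectors, the document labels and the `top_k` helper are invented for illustration. Real `text-embedding-ada-002` embeddings have 1536 dimensions, and the distance measure ChromaDB actually applies depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "vector store": label -> made-up embedding (illustrative values only).
toy_store = {
    "doc_botulism": [0.9, 0.1, 0.0],
    "doc_glaucoma": [0.1, 0.9, 0.0],
    "doc_asthma": [0.0, 0.2, 0.9],
}


def top_k(query_vector, k):
    """Return the labels of the k stored vectors most similar to the query."""
    ranked = sorted(
        toy_store,
        key=lambda label: cosine_similarity(query_vector, toy_store[label]),
        reverse=True,
    )
    return ranked[:k]


query = [0.8, 0.2, 0.1]  # made-up embedding of a question about botulism
print(top_k(query, k=2))
```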

We are going to create three objects:

* The language model, which can be any of those from OpenAI.
* The memory, which keeps the recent conversation history so it can be included in the prompt.
* The retrieval chain, which fetches relevant information stored in ChromaDB and uses it to answer the question.
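
The idea behind a windowed memory can be sketched with the standard library alone. `WindowMemory` is a hypothetical class written for illustration, not the LangChain implementation, and the `Human:`/`AI:` prefixes are an assumed formatting choice. It keeps only the last `k` exchanges and discards older ones.

```python
from collections import deque


class WindowMemory:
    """Hypothetical sketch: remember only the k most recent exchanges."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest item automatically when full.
        self.exchanges = deque(maxlen=k)

    def save(self, user_input, ai_output):
        self.exchanges.append((user_input, ai_output))

    def history(self):
        """Render the remembered exchanges as text that could go into a prompt."""
        lines = []
        for user_input, ai_output in self.exchanges:
            lines.append(f"Human: {user_input}")
            lines.append(f"AI: {ai_output}")
        return "\n".join(lines)

    def clear(self):
        self.exchanges.clear()


memory = WindowMemory(k=2)
for i in range(1, 4):
    memory.save(f"question {i}", f"answer {i}")

# Only exchanges 2 and 3 remain; exchange 1 fell out of the window.
print(memory.history())
```

A small window keeps the prompt short, at the cost of the model forgetting anything said more than `k` exchanges ago.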

In [None]:
from langchain_openai import OpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.chains import RetrievalQA

llm=OpenAI(openai_api_key=OPENAI_API_KEY,
           temperature=0.0)

conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=4, # Number of most recent exchanges (human/AI message pairs) kept in memory
    return_messages=True # Return the history as a list of message objects instead of one string
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=chroma_db.as_retriever()
)
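
The `"stuff"` chain type places all the retrieved documents into a single prompt together with the question. A minimal sketch of that idea follows. `build_stuff_prompt` and the prompt wording are hypothetical and are not the template LangChain uses.

```python
def build_stuff_prompt(question, documents):
    """Concatenate every retrieved document into one context block, then ask the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


retrieved = ["context A", "context B"]  # placeholders for retrieved documents
stuff_prompt = build_stuff_prompt("What is X?", retrieved)
print(stuff_prompt)
```

Because every retrieved document ends up in the same prompt, this approach is simple but is limited by the maximum prompt size mentioned earlier.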

We can test the retrieval chain in isolation to see if the information it returns is relevant.




In [None]:
qa.run("What is the main symptom of LCM?")

Perfect! The information returned is exactly what we desired.

## Creating the Agent.

In [None]:
from langchain.agents import Tool

#Defining the list of tool objects to be used by LangChain.
tools = [
    Tool(
        name='Medical KB',
        func=qa.run,
        description=(
            "use this tool when answering medical knowledge queries to get "
            "more information about the topic"
        )
    )
]
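
The description matters because the model only sees the tools as text inside the prompt, typically one line per tool with its name and description, and decides from that text whether a tool is worth calling. The sketch below assumes that layout. `SimpleTool` and `render_tools` are hypothetical stand-ins, and the stub function replaces the real retrieval chain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a name, a callable and a description."""
    name: str
    func: Callable[[str], str]
    description: str


def render_tools(tools):
    """Render the tools as 'name: description' lines for inclusion in a prompt."""
    return "\n".join(f"{tool.name}: {tool.description}" for tool in tools)


demo_tools = [
    SimpleTool(
        name="Medical KB",
        func=lambda query: f"(stub) looked up: {query}",  # stands in for the QA chain
        description=(
            "use this tool when answering medical knowledge queries to get "
            "more information about the topic"
        ),
    )
]
rendered = render_tools(demo_tools)
tool_names = ", ".join(tool.name for tool in demo_tools)
print(rendered)
```

A vague description makes it harder for the model to tell when the tool applies, so it is worth wording it as carefully as a prompt.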

In [None]:
from langchain.agents import create_react_agent
from langchain import hub

prompt = hub.pull("hwchase17/react-chat")
agent = create_react_agent(
    tools=tools,
    llm=llm,
    prompt=prompt,
)

In [None]:
# Create an agent executor by passing in the agent and tools
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(agent=agent,
                               tools=tools,
                               verbose=True,
                               memory=conversational_memory,
                               max_iterations=30,
                               max_execution_time=600,
                               handle_parsing_errors=True
                               )
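
To show what the executor does with the agent, here is a simplified sketch of a ReAct-style loop. It assumes the model writes its steps as `Action:` / `Action Input:` lines and finishes with `Final Answer:`. The model is replaced by a list of scripted outputs, and `run_react_loop` is a hypothetical function, not the LangChain implementation.

```python
import re


def run_react_loop(scripted_llm_outputs, tools, max_iterations=30):
    """Run tool calls requested by the 'model' until it gives a final answer.

    scripted_llm_outputs stands in for successive model responses.
    tools maps a tool name to a callable that takes and returns a string.
    """
    scratchpad = []  # observations collected so far
    outputs = iter(scripted_llm_outputs)
    for _ in range(max_iterations):
        text = next(outputs)
        final = re.search(r"Final Answer:\s*(.*)", text, re.DOTALL)
        if final:
            return final.group(1).strip(), scratchpad
        action = re.search(r"Action:\s*(.*?)\nAction Input:\s*(.*)", text, re.DOTALL)
        if action is None:
            # Comparable in spirit to tolerating parsing errors: note it and continue.
            scratchpad.append("Observation: could not parse the output")
            continue
        name, tool_input = action.group(1).strip(), action.group(2).strip()
        scratchpad.append(f"Observation: {tools[name](tool_input)}")
    return "Agent stopped: iteration limit reached", scratchpad


stub_tools = {"Medical KB": lambda query: f"KB result for: {query}"}
scripted = [
    "Thought: Do I need to use a tool? Yes\nAction: Medical KB\nAction Input: botulism diagnosis",
    "Thought: Do I need to use a tool? No\nFinal Answer: done",
]
answer, steps = run_react_loop(scripted, stub_tools)
print(answer, steps)
```

The iteration limit plays the same protective role as `max_iterations` in the executor: it stops a run in which the model never reaches a final answer.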

### Using the Conversational Agent

To make queries we simply call `agent_executor.invoke` with the user input.

First I will try a request that is not related to the medical field.

In [None]:
agent_executor.invoke({"input": "Give me the area of square of 2x2"})

Perfect, the model has responded without accessing the configured knowledge database.

Now I will try with a question that is also not related to health.

In [None]:
agent_executor.invoke({"input": "Do you know who is Clark Kent?"})

It has not accessed the database this time either, as the model has been able to identify that the question is not related to the knowledge base available to it as a tool.

Now it's time to try with a question related to Medicine. Let's see if the model can understand that it should first look for information in the vector database at its disposal.

In [None]:
agent_executor.memory.clear()

In [None]:
agent_executor.invoke({"input": """I have a patient that can have Botulism,
how can I confirm the diagnosis?"""})

Perfect. The most important thing for us is that it has been able to identify that it should go to the medical database to search for information about the diagnosis.

In [None]:
agent_executor.invoke({"input": "Is this an important illness?"})

And the memory works perfectly. We can maintain a conversation, taking into account that the model knows the previous questions and answers.

# Conclusions.
The experiment has been a small success. The vector database has been configured and filled with information from the dataset. A LangChain agent has been created, and it has been able to retrieve information from the database only when necessary. Don't forget that our chatbot also has memory.

All of this in just a few lines of code!


---