# **PAN IIT GEN-AI Hackathon**
## **Building a Medical Healthcare Chatbot Using the Mixtral 8x7B Model, FAISS, LangChain, RAG, and a Question-Answering Pipeline**
- This is a healthcare assistance chatbot: an intelligent virtual assistant that provides health and wellness support and information through natural language conversations.
- Using artificial intelligence (AI) and natural language processing (NLP), we aim to make healthcare services more accessible and efficient by offering a user-friendly interface for seeking medical information, advice, or assistance.
- We built a healthcare website with Next.js and used the Mixtral 8x7B model to power the chatbot.
- We loaded the model from Hugging Face and used a Hugging Face API token for authorization.
- We used RAG (retrieval-augmented generation) to improve answers by combining our dataset with the output generated by the LLM.

**Data Information**
   - The data consists of *The Gale Encyclopedia of Medicine*.
   
**Loading the document and splitting text:**
   - Load the PDF files with `PyPDFLoader` and extract their text.
   - Split the text into smaller chunks with `RecursiveCharacterTextSplitter` from `langchain.text_splitter`.
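
To make the effect of `chunk_size=500` and `chunk_overlap=50` concrete, here is a minimal sketch of overlapping chunking in plain Python. It is a simplified stand-in, not the splitter the notebook uses: `RecursiveCharacterTextSplitter` also prefers to break on separators such as paragraphs and spaces, while this hypothetical `split_with_overlap` helper cuts at fixed character offsets.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij" * 120  # 1,200 characters of stand-in text (assumption)
chunks = split_with_overlap(sample)
print(len(chunks), [len(c) for c in chunks])
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still fully present in at least one chunk.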

**Text Embeddings:**
   - Text embeddings are generated using the `HuggingFaceEmbeddings` class from `langchain_community.embeddings`.
   - The model used for embedding is 'sentence-transformers/all-MiniLM-L6-v2', and it is configured to run on the CPU.

**Converting to vectors and saving it**
   - The text chunks are converted to vectors with the `FAISS` class from `langchain_community.vectorstores`, and the resulting index is saved locally.

In [5]:
#<-------------------------------------------------------------------------------------------------->
#Importing the required libraries
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter 
#<-------------------------------------------------------------------------------------------------->
DATA_PATH = 'data/'  #Path containing my data
DB_FAISS_PATH = 'vectorstore/db_faiss'  #Path where we store the embeddings of the data
#<-------------------------------------------------------------------------------------------------->
#Function for creating embeddings of my data
def create_vector_db():
    #Loading the data
    loader = DirectoryLoader(DATA_PATH,glob='*.pdf',loader_cls=PyPDFLoader)
    documents = loader.load()

    #Splitting the data
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500,chunk_overlap=50)
    texts = text_splitter.split_documents(documents)

    #Converting to embeddings using sentence transformer model 
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.from_documents(texts, embeddings)

    #Saving the FAISS index to disk
    db.save_local(DB_FAISS_PATH)

    #Returns the database
    return db
#<-------------------------------------------------------------------------------------------------->

# **Creating a RAG Using LangChain and FAISS**

    Calling the function created above which converts the text data into embeddings

In [6]:
#<-------------------------------------------------------------------------------------------------->
#Storing the embeddings into db1
db1=create_vector_db()
#<-------------------------------------------------------------------------------------------------->

- The next cell creates a retriever from the vector store (`db1`). It is configured for similarity search and returns the 4 chunks most similar to a given query (`k=4`).

In [7]:
#<-------------------------------------------------------------------------------------------------->
#It checks similar content
retriever = db1.as_retriever(
   search_type="similarity",
   search_kwargs={'k': 4}
)
#<-------------------------------------------------------------------------------------------------->
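
As a rough illustration of what a top-`k` similarity search does, the sketch below ranks a few toy vectors by their distance to a query vector. The vectors, the `top_k` helper, and the choice of squared Euclidean distance are all assumptions made for the example; the real index works on the sentence-transformer embeddings, and the metric depends on how the FAISS index was built.

```python
import heapq


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two equal-length vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    scored = [(sq_dist(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    return [i for _, i in heapq.nsmallest(k, scored)]


# Toy 3-dimensional "embeddings" (made up for illustration)
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.8, 0.2, 0.1],
]
query_vec = [1.0, 0.0, 0.0]
nearest = top_k(query_vec, toy_vectors, k=2)
print(nearest)
```

A larger `k` returns more chunks, which gives the language model more context but also more loosely related text.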

- Check that the vector database can retrieve relevant chunks for a given prompt.
- At this stage the result comes from the vector database only; no language model is involved.

In [8]:
#<-------------------------------------------------------------------------------------------------->
#Query to ask from the database
query = "Tell what should i do to cure fever?"

#Fetching it from above
docs = db1.similarity_search(query)
print(docs[0].page_content)
#<-------------------------------------------------------------------------------------------------->

In addition to relieving pain and reducing inflamma-
tion, aspirin also lowers fever by acting on the part of thebrain that regulates temperature. The brain then signalsthe blood vessels to widen, which allows heat to leavethe body more quickly.
Recommended dosage
Adults
TO RELIEVE PAIN OR REDUCE FEVER. one to two
tablets every three to four hours, up to six times per day.
TO REDUCE THE RISK OF STROKE. one tablet four
times a day or two tablets twice a day.


    Importing the libraries

In [9]:
#Importing the required libraries
#<-------------------------------------------------------------------------------------------------->
import os
import torch
import transformers
from transformers import (
  AutoTokenizer,
  AutoModelForCausalLM,
  BitsAndBytesConfig,
  pipeline
)
from langchain.llms import HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
#<-------------------------------------------------------------------------------------------------->

# **Building an LLM Chain for Question-Answering**

- Fetch the API token from Hugging Face and load the model from the Hugging Face Hub.
- Define a prompt template, then create an LLM chain to answer our prompts.


In [10]:
#<-------------------------------------------------------------------------------------------------->
#API token read from the environment so the secret is not stored in the notebook
api_token = os.environ["HUGGINGFACEHUB_API_TOKEN"]

# Load the model from Hugging Face Hub
llm = HuggingFaceHub(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    model_kwargs={"temperature": 1, "max_length": 10000},
    huggingfacehub_api_token=api_token
)
#<-------------------------------------------------------------------------------------------------->
#Generating template of prompt to give to my model
prompt_template = """
### [INST]
Instruction: Answer the question based on your
healthcare knowledge. Here is context to help:

{context}

### QUESTION:
{question}

[/INST]
"""

# Create prompt from prompt template
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)

# Create LLM chain
llm_chain = LLMChain(llm=llm, prompt=prompt)
#<-------------------------------------------------------------------------------------------------->
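
To show what the model actually receives, the sketch below fills the same template text with plain `str.format`. This mirrors how the two placeholders are substituted but is not the `PromptTemplate` class itself, and the context string is a placeholder written for the example.

```python
PROMPT_TEMPLATE = """
### [INST]
Instruction: Answer the question based on your
healthcare knowledge. Here is context to help:

{context}

### QUESTION:
{question}

[/INST]
"""

# Placeholder context (assumption); in the notebook it comes from the retriever
filled = PROMPT_TEMPLATE.format(
    context="Glaucoma: a condition in which pressure in the eye is abnormally high.",
    question="give causes of glaucoma?",
)
print(filled)
```

Printing the filled prompt is a quick way to check that the context lands between the instruction and the question before sending anything to the model.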




- Testing our LLM chain without giving context

In [11]:
#Checking output of our llm chain without giving context
llm_chain.invoke({"context":"",
                 "question": "give causes of glaucoma?"})

{'context': '',
 'question': 'give causes of glaucoma?',
 'text': 'Glaucoma is a group of eye conditions that damage the optic nerve, often caused by an abnormally high pressure in the eye (intraocular pressure). The following are some common causes of glaucoma:\n\n1. Increased fluid production: Overproduction of aqueous humor, the clear fluid that flows inside the front part of the eye, can lead to an increase in intraocular pressure and cause glaucoma.\n2. Red'}

# **Creating a RAG Chain**
Create a RAG chain so that the model receives retrieved context along with the query.

- A retriever is created from the vector store db1 using the as_retriever method.
- The retriever is configured for similarity search, aiming to retrieve the top 20 documents similar to a given query.

In [12]:
#Searching into top 20 docs for the query
#<-------------------------------------------------------------------------------------------------->
from langchain_core.runnables import RunnablePassthrough
query = "Tell symptoms of glaucoma and tell how to cure it"
retriever = db1.as_retriever(
   search_type="similarity",
   search_kwargs={'k': 20}
)
#<-------------------------------------------------------------------------------------------------->

- A RAG (Retrieval-Augmented Generation) chain is constructed using the rag_chain variable.
- The chain includes a retriever for providing context and a language model chain (llm_chain) for generating responses.

- The RAG chain is invoked with a specific query ("Tell symptoms of glaucoma and tell how to cure it").
- The retriever in the chain fetches relevant documents based on similarity to the query.
- The language model chain (llm_chain) then generates responses based on the retrieved context and the given question.

In [13]:
#Building up RAG Chain
#<-------------------------------------------------------------------------------------------------->
rag_chain = (
{"context": retriever, "question": RunnablePassthrough()}
   | llm_chain
)
#<-------------------------------------------------------------------------------------------------->
#Invoking query in pipeline
rag_chain.invoke(query)
#<-------------------------------------------------------------------------------------------------->
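
In the chain above, the retriever result is handed to the prompt as a list of `Document` objects, so the prompt appears to receive their full representation, metadata included. One common option is to join only the chunk text first. The sketch below shows that idea with a hypothetical `Doc` stand-in class and a hypothetical `format_docs` helper; the sample texts and page numbers are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document chunk."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs, max_chars=4000):
    """Join chunk texts with blank lines and trim the result to max_chars."""
    joined = "\n\n".join(d.page_content.strip() for d in docs)
    return joined[:max_chars]


# Made-up chunks for illustration
retrieved = [
    Doc("Glaucoma: pressure in the eye is abnormally high.", {"page": 1}),
    Doc("If not treated, glaucoma may lead to blindness.", {"page": 2}),
]
context = format_docs(retrieved)
print(context)
```

Under the assumption that the chain is built with LangChain's runnable syntax, such a helper could be placed after the retriever (for example `retriever | format_docs`) so that only clean text fills `{context}`; that wiring is not tested here.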

{'context': [Document(page_content='and thinning and spotting of the skin. Applying topicalsteroids to the area around the eyes can cause glaucoma .\nOral antihistamines , such as diphenhydramine\n(Benadryl), can relieve symptoms of allergy-related atopicdermatitis. More concentrated topical steroids are recom-mended for persistent symptoms. A mild tranquilizer maybe prescribed to reduce stress and help the patient sleep,and antibiotics are used to treat secondary infections.\nCortisone ointments should be used sparingly, and', metadata={'source': 'data\\71763-gale-encyclopedia-of-medicine.-vol.-1.-2nd-ed.pdf', 'page': 417}),
  Document(page_content='Glaucoma —A condition in which pressure in the\neye is abnormally high. If not treated, glaucomamay lead to blindness.\nHallucination —A false or distorted perception of\nobjects, sounds, or events that seems real. Hallucina-tions usually result from drugs or mental disorders.\nHeat stroke —A severe condition caused by pro-', metadata={'so

    Finally, test the model on a new query; the RAG chain supplies retrieved context so the model can give better answers.

In [14]:
#<-------------------------------------------------------------------------------------------------->
#Query to be asked
query = "Tell about skin cancer recent things?"
#Invoking the query in the pipeline once and keeping the full result
result = rag_chain.invoke(query)
#Storing the output text
output = result["text"]
print(output)
#<-------------------------------------------------------------------------------------------------->


Based on the provided documents, there are two main types of skin cancer: basal cell carcinomas and malignant melanomas. Malignant melanomas are more common on areas of the body exposed to the sun and are cancers that develop from skin cells that produce the brown pigment called melanin.

There is no recent information provided in the documents about skin cancer. However, it is mentioned that approximately 3,500 Americans will be diagnosed


In [23]:
from fastapi import FastAPI, Query
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
origins = [
    "http://localhost:3000",  # Add other allowed origins as needed
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
@app.get("/generate_response/")
async def generate_response(prompt: str = Query(..., title="Prompt", description="Enter your prompt here")):
    #Run the RAG chain once on the prompt and return only the generated text
    output = rag_chain.invoke(prompt)["text"]
    return {"response": output}

In [24]:
import uvicorn
#Top-level await works here because Jupyter already runs an event loop
if __name__ == "__main__":
    config = uvicorn.Config(app)
    server = uvicorn.Server(config)
    await server.serve()

INFO:     Started server process [14076]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


INFO:     127.0.0.1:62868 - "GET /generate_response/?prompt=I%20am%20having%20serious%20headache%20and%20backpain HTTP/1.1" 200 OK
INFO:     127.0.0.1:62958 - "GET /generate_response/?prompt=%20%22Recently%2C%20I%27ve%20had%20a%20persistent%20itchiness%20in%20various%20parts%20of%20my%20body%2C%20and%20there%20are%20red%2C%20raised%20rashes.%20I%27m%20unsure%20if%20it%27s%20an%20allergy%20or%20something%20else.%20Can%20you%20help%20diagnose%20the%20cause%20of%20these%20skin%20issues%20and%20suggest%20a%20suitable%20treatment%20plan%3F%22 HTTP/1.1" 200 OK
INFO:     127.0.0.1:63016 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:63020 - "GET /generate_response/?prompt=Tell%20about%20skin%20cancer%20recent%20things%3F HTTP/1.1" 200 OK
INFO:     127.0.0.1:63020 - "GET /generate_response/?prompt=Tell%20about%20skin%20cancer%20recent%20things%3F HTTP/1.1" 200 OK
INFO:     127.0.0.1:53073 - "GET /generate_response/?prompt=Recently%2C%20I%27ve%20had%20a%20persistent%20itchiness%20in%20var
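
The percent-encoded URLs in the log above can be reproduced from a plain prompt string. The sketch below builds such a URL with the standard library and defines a small client function. The host and port are taken from the Uvicorn log, the helper names `build_request_url` and `ask` are hypothetical, and `ask` is not called here because it needs the server to be running.

```python
import json
from urllib.parse import urlencode, quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # host and port shown in the Uvicorn log


def build_request_url(prompt: str, base_url: str = BASE_URL) -> str:
    """Encode the prompt as a query parameter (spaces become %20)."""
    query = urlencode({"prompt": prompt}, quote_via=quote)
    return f"{base_url}/generate_response/?{query}"


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the running server and return the 'response' field."""
    with urlopen(build_request_url(prompt), timeout=timeout) as resp:
        return json.load(resp)["response"]


url = build_request_url("Tell about skin cancer recent things?")
print(url)
```

The frontend needs the same encoding step, because characters such as spaces, quotes, and question marks cannot appear raw in a query string.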

In [None]:
# from fastapi import FastAPI, Request
# from fastapi.responses import HTMLResponse
# from fastapi.staticfiles import StaticFiles
# from fastapi.templating import Jinja2Templates

# app = FastAPI()

# # Define static files directory
# app.mount("/static", StaticFiles(directory="static"), name="static")

# # Define templates directory
# templates = Jinja2Templates(directory="templates")

# @app.get("/", response_class=HTMLResponse)
# async def index(request: Request):
#     return templates.TemplateResponse("index.html", {"request": request})

# @app.post("/generate")
# async def generate(query: str):
#     output = rag_chain.invoke(query)["text"]
#     return {"response": output}