# **PAN IIT GEN-AI Hackathon**
## **Building a Chatbot for Medical Healthcare using the Mixtral 8x7B Model, FAISS, LangChain, RAG, and a Question-Answering Pipeline**
- This is a healthcare assistance chatbot: an intelligent virtual assistant designed to provide support and information related to health and wellness through natural language conversations.
- Leveraging artificial intelligence (AI) and natural language processing (NLP) technologies, we aim to make healthcare services more accessible and efficient by offering a user-friendly interface for seeking medical information, advice, or assistance.
- We created a healthcare website using Next.js and used the Mixtral 8x7B model to power the chatbot on the website.
- We loaded the model from Hugging Face and used a Hugging Face access token for authorization.
- We used RAG to improve the answers by combining passages retrieved from our dataset with the output generated by the LLM.

**Data Information**
   - The data consists of *The Gale Encyclopedia of Medicine*.
   
**Loading the document and splitting text:**
   - The PDF file is loaded with `PyPDFLoader`, which extracts the text from the PDF.
   - The text is split into smaller chunks using `RecursiveCharacterTextSplitter` from `langchain.text_splitter`.
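
To make the chunk size and overlap settings concrete, here is a minimal sketch of splitting with overlap. It is a simplification under our own assumptions: it cuts fixed-size character windows, whereas `RecursiveCharacterTextSplitter` prefers to break on natural separators such as paragraphs and lines. The function name `split_with_overlap` and the sample text are hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Cut text into character windows of chunk_size that overlap by chunk_overlap.

    Simplified illustration only; not the LangChain implementation.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical sample: 1200 characters of repeating digits
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample, chunk_size=500, chunk_overlap=50)
print([len(c) for c in chunks])
```

The last 50 characters of each chunk reappear at the start of the next one, so a sentence cut at a chunk boundary is still fully present in at least one chunk.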

**Text Embeddings:**
   - Text embeddings are generated using the `HuggingFaceEmbeddings` class from `langchain_community.embeddings`.
   - The model used for embedding is 'sentence-transformers/all-MiniLM-L6-v2', and it is configured to run on the CPU.

**Converting to vectors and saving them**
   - The text chunks are converted to vectors using the `FAISS` class from `langchain_community.vectorstores`, and the resulting index is saved to disk.

In [1]:
#<-------------------------------------------------------------------------------------------------->
#Importing the required libraries
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter 
#<-------------------------------------------------------------------------------------------------->
DATA_PATH = 'data/'  #Path containing my data
DB_FAISS_PATH = 'vectorstore/db_faiss'  #Path where we store the embeddings of the data
#<-------------------------------------------------------------------------------------------------->
#Function for creating embeddings of my data
def create_vector_db():
    #Loading the data
    loader = DirectoryLoader(DATA_PATH,glob='*.pdf',loader_cls=PyPDFLoader)
    documents = loader.load()

    #Splitting the data
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500,chunk_overlap=50)
    texts = text_splitter.split_documents(documents)

    #Converting to embeddings using sentence transformer model 
    embeddings = HuggingFaceEmbeddings(model_name=r'C:\Users\shiva\all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.from_documents(texts, embeddings)

    #Saving the model
    db.save_local(DB_FAISS_PATH)

    #Returns the database
    return db
#<-------------------------------------------------------------------------------------------------->

In [2]:
!pip install langchain_community




In [3]:
!pip install pypdf
!pip install faiss-cpu
!pip install sentence-transformers
!pip install huggingface-hub
!pip install transformers==4.10.0
!pip install torch==2.2.0

Collecting transformers==4.10.0
  Using cached transformers-4.10.0-py3-none-any.whl.metadata (51 kB)
Collecting sacremoses (from transformers==4.10.0)
  Using cached sacremoses-0.1.1-py3-none-any.whl.metadata (8.3 kB)
Collecting tokenizers<0.11,>=0.10.1 (from transformers==4.10.0)
  Using cached tokenizers-0.10.3.tar.gz (212 kB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Using cached transformers-4.10.0-py3-none-any.whl (2.8 MB)
Using cached sacremoses-0.1.1-py3-none-any.whl (897 kB)
Building wheels for collected packages: tokenizers
  Building wheel for tokenizers (pyproject.toml): started
  Building wheel for tokenizers (pyproject.toml): still running...
  Building wheel for tokenizers (pyproject.toml)

  error: subprocess-exited-with-error
  
  × Building wheel for tokenizers (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [417 lines of output]
      !!
      
              ********************************************************************************
              Please consider removing the following classifiers in favor of a SPDX license expression:
      
              License :: OSI Approved :: Apache Software License
      
              See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
              ********************************************************************************
      
      !!
        self._finalize_license_expression()
      running bdist_wheel
      running build
      running build_py
      creating build\lib.win-amd64-cpython-311\tokenizers
      copying py_src\tokenizers\__init__.py -> build\lib.win-amd64-cpython-311\tokenizers
      creating build\lib.win-amd64-cpython-311\tokenizers\mode



# **Creating a RAG Using LangChain and FAISS**

    Calling the function created above which converts the text data into embeddings

In [5]:
#<-------------------------------------------------------------------------------------------------->

from sentence_transformers import SentenceTransformer

# Use local folder path
model = SentenceTransformer(r"C:\Users\shiva\all-MiniLM-L6-v2")

db1=create_vector_db()
#<-------------------------------------------------------------------------------------------------->



In [6]:
import shutil
shutil.rmtree(r"C:\Users\shiva\.cache\huggingface\hub\models--sentence-transformers--all-MiniLM-L6-v2", ignore_errors=True)


- The next cell creates a retriever from the vector store (`db1`). The retriever is configured for similarity search, so it returns the documents most similar to a given query.

In [7]:
#<-------------------------------------------------------------------------------------------------->
#It checks similar content
retriever = db1.as_retriever(
   search_type="similarity",
   search_kwargs={'k': 4}
)
#<-------------------------------------------------------------------------------------------------->

- Checking whether our vector database can retrieve chunks of content similar to a given prompt
- At this stage the output is fetched from the vector database only; the LLM is not involved yet
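
As a rough illustration of what a similarity search does, the sketch below ranks a few toy vectors by distance to a query vector and keeps the closest `k`. The two-dimensional vectors and chunk labels are made up, and the use of Euclidean distance is an assumption on our part; the real index works on the sentence-transformer embeddings of the PDF chunks.

```python
import math


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    distances.sort()  # smallest distance first
    return [idx for _, idx in distances[:k]]


# Hypothetical toy data standing in for embedded chunks
toy_chunks = ["fever treatment", "eye pressure", "skin rash", "fever in children"]
toy_vectors = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5), (0.8, 0.2)]
toy_query = (1.0, 0.0)

nearest = top_k(toy_query, toy_vectors, k=2)
print([toy_chunks[i] for i in nearest])
```

A vector store such as FAISS performs the same kind of ranking, but over many more vectors and with data structures built for speed.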

In [8]:
#<-------------------------------------------------------------------------------------------------->
#Query to ask from the database
query = "Tell what should i do to cure fever?"

#Fetching it from above
docs = db1.similarity_search(query)
print(docs[0].page_content)
#<-------------------------------------------------------------------------------------------------->

are not growing and are in a resting state. Alternatively, a
“broad spectrum” antibiotic may be used which would
kill many different kinds of bacteria.
Aspirin or other medications which reduce the pain
and the fever may also be given. Medications which
reduce any inflammation of the infected region may also
be provided. The patient is likely to be hospitalized to
administer the antibiotic and other medications and to
closely monitor his or her condition. Surgical drainage of


    Importing the libraries

In [9]:
#Importing the required libraries
#<-------------------------------------------------------------------------------------------------->
import os
import torch
import transformers
from transformers import (
  AutoTokenizer,
  AutoModelForCausalLM,
  BitsAndBytesConfig,
  pipeline
)
from langchain.llms import HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
#<-------------------------------------------------------------------------------------------------->

# **Building an LLM Chain for Question-Answering**

- Fetching the API token from Hugging Face and loading the model from the Hugging Face Hub
- Generating a prompt template and then creating an LLM chain to answer our prompts


In [10]:
#<-------------------------------------------------------------------------------------------------->
#API token read from an environment variable so the secret is not stored in the notebook
api_token = os.environ["HUGGINGFACEHUB_API_TOKEN"]

# Load the model from Hugging Face Hub
llm = HuggingFaceHub(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    model_kwargs={"temperature": 1, "max_length": 10000},
    huggingfacehub_api_token=api_token
)
#<-------------------------------------------------------------------------------------------------->
#Generating template of prompt to give to my model
prompt_template = """
### [INST]
Instruction: Answer the question based on your
healthcare knowledge. Here is context to help:

{context}

### QUESTION:
{question}

[/INST]
"""

# Create prompt from prompt template
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)

# Create LLM chain
llm_chain = LLMChain(llm=llm, prompt=prompt)
#<-------------------------------------------------------------------------------------------------->
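
To see what the model actually receives, the sketch below fills the same template text with plain `str.format`. This is only an approximation of what `PromptTemplate` does with its default settings, and the context sentence is a made-up placeholder.

```python
# Same template text as in the cell above, repeated so this sketch is self-contained
template = """
### [INST]
Instruction: Answer the question based on your
healthcare knowledge. Here is context to help:

{context}

### QUESTION:
{question}

[/INST]
"""

# Hypothetical context; in the RAG chain this comes from the retriever
filled = template.format(
    context="Glaucoma damages the optic nerve.",
    question="give causes of glaucoma?",
)
print(filled)
```

Both placeholders are replaced, and the prompt ends with `[/INST]`, after which the model writes its answer.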




- Testing our LLM chain without giving context

In [11]:
#Checking output of our llm chain without giving context
llm_chain.invoke({"context":"",
                 "question": "give causes of glaucoma?"})



{'context': '',
 'question': 'give causes of glaucoma?',
 'text': "\n### [INST]\nInstruction: Answer the question based on your\nhealthcare knowledge. Here is context to help:\n\n\n\n### QUESTION:\ngive causes of glaucoma?\n\n[/INST]\nGlaucoma is a group of eye conditions that damage the optic nerve, often caused by an abnormally high pressure in the eye (intraocular pressure). The following are some common causes of glaucoma:\n\n1. Increased fluid production: Overproduction of aqueous humor, the clear fluid that flows inside the front part of the eye, can lead to an increase in intraocular pressure and cause glaucoma.\n2. Reduced fluid drainage: The eye's drainage system, called the trabecular meshwork, may not function properly, leading to a buildup of aqueous humor and increased intraocular pressure.\n3. Eye injuries: Trauma to the eye can damage the eye's drainage system and cause glaucoma, even years after the initial injury.\n4. Certain medications: Corticosteroids, especially wh
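
The `text` field above echoes the whole prompt before the model's answer. A small sketch for keeping only the answer, assuming the answer always follows the last `[/INST]` marker; the helper name `extract_answer` and the shortened sample string are hypothetical.

```python
def extract_answer(generated_text: str, marker: str = "[/INST]") -> str:
    """Return the text after the last marker, or the whole text if the marker is absent."""
    _, found, answer = generated_text.rpartition(marker)
    return answer.strip() if found else generated_text.strip()


# Shortened, hypothetical sample in the same shape as the output above
raw = (
    "\n### [INST]\nInstruction: ...\n\n### QUESTION:\n"
    "give causes of glaucoma?\n\n[/INST]\nGlaucoma is a group of eye conditions."
)
print(extract_answer(raw))
```

Applied to the chain output, this would be called as `extract_answer(result["text"])`, where `result` is the dictionary returned by `llm_chain.invoke`.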

# **Creating a RAG Chain**
Creating a RAG chain so that the model has context for the query/prompt

- A retriever is created from the vector store db1 using the as_retriever method.
- The retriever is configured for similarity search, aiming to retrieve the top 20 documents similar to a given query.

In [12]:
#Searching into top 20 docs for the query
#<-------------------------------------------------------------------------------------------------->
from langchain_core.runnables import RunnablePassthrough
query = "Tell symptoms of glaucoma and tell how to cure it"
retriever = db1.as_retriever(
   search_type="similarity",
   search_kwargs={'k': 20}
)
#<-------------------------------------------------------------------------------------------------->

- A RAG (Retrieval-Augmented Generation) chain is constructed using the rag_chain variable.
- The chain includes a retriever for providing context and a language model chain (llm_chain) for generating responses.

- The RAG chain is invoked with a specific query ("Tell symptoms of glaucoma and tell how to cure it").
- The retriever in the chain fetches relevant documents based on similarity to the query.
- The language model chain (llm_chain) then generates responses based on the retrieved context and the given question.
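
One detail worth noting: when the retriever output is passed to the prompt unchanged, each retrieved document is rendered with its id and metadata, as the cell outputs in this notebook show. A minimal sketch of a helper that keeps only the chunk text is given below. The `Doc` class is a hypothetical stand-in for LangChain's `Document`, and `format_docs` is our own name, not part of the library.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document, for illustration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of each retrieved chunk, dropping ids and metadata."""
    return "\n\n".join(doc.page_content for doc in docs)


# Made-up chunks loosely based on the retrieved passages
retrieved = [
    Doc("Applying topical steroids around the eyes can cause glaucoma.", {"page": 417}),
    Doc("Excessive sun exposure can cause melanoma.", {"page": 587}),
]
print(format_docs(retrieved))
```

In the chain itself, it should be possible to use `retriever | format_docs` as the `"context"` entry so the prompt receives plain text; we have not verified this against the LangChain version used here.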

In [13]:
#Building up RAG Chain
#<-------------------------------------------------------------------------------------------------->
rag_chain = (
{"context": retriever, "question": RunnablePassthrough()}
   | llm_chain
)
#<-------------------------------------------------------------------------------------------------->
#Invoking query in pipeline
rag_chain.invoke(query)
#<-------------------------------------------------------------------------------------------------->



{'context': [Document(id='8276c459-4b9f-4f3f-8553-5ac0d45a729d', metadata={'producer': 'PDFlib+PDI 5.0.0 (SunOS)', 'creator': 'PyPDF', 'creationdate': '2004-12-18T17:00:02-05:00', 'moddate': '2004-12-18T16:15:31-06:00', 'source': 'data\\71763-gale-encyclopedia-of-medicine.-vol.-1.-2nd-ed.pdf', 'total_pages': 637, 'page': 417, 'page_label': '418'}, page_content='not use products that contain coal tar. Topical steroids can\ncause itching, burning,acne, permanent stretch marks,\nand thinning and spotting of the skin. Applying topical\nsteroids to the area around the eyes can cause glaucoma.\nOral antihistamines , such as diphenhydramine\n(Benadryl), can relieve symptoms of allergy-related atopic\ndermatitis. More concentrated topical steroids are recom-\nmended for persistent symptoms. A mild tranquilizer may'),
  Document(id='706a65f1-3cfa-44e0-b376-21e6a6fca312', metadata={'producer': 'PDFlib+PDI 5.0.0 (SunOS)', 'creator': 'PyPDF', 'creationdate': '2004-12-18T17:00:02-05:00', 'moddate':

    Finally, testing our model on a given query, using the RAG chain to get better answers

In [14]:
#<-------------------------------------------------------------------------------------------------->
#Query to be asked
query = "Tell about skin cancer recent things?"
#Invoking query in pipeline once and storing output text
output=rag_chain.invoke(query)["text"]
print(output)
#<-------------------------------------------------------------------------------------------------->




### [INST]
Instruction: Answer the question based on your
healthcare knowledge. Here is context to help:

[Document(id='c1b17961-c593-477b-9595-1bace4edd807', metadata={'producer': 'PDFlib+PDI 5.0.0 (SunOS)', 'creator': 'PyPDF', 'creationdate': '2004-12-18T17:00:02-05:00', 'moddate': '2004-12-18T16:15:31-06:00', 'source': 'data\\71763-gale-encyclopedia-of-medicine.-vol.-1.-2nd-ed.pdf', 'total_pages': 637, 'page': 587, 'page_label': '588'}, page_content='tribute to development of intestinal cancers\n• smoking, which causes lung cancer\n• excessive use of alcohol, which is associated with liver\ncancer\n• excessive exposure to the sun, which can cause\nmelanoma (a deadly form of skin cancer).\nMonthly self-examinations of the breasts and testi-\ncles can detect breast and testicular cancer at their earli-\nest, most curable stages.\nResources\nBOOKS\nThe Editors of Time-Life Books, Inc. The Medical Advisor:'), Document(id='f3a55828-077a-4db5-8dfe-b48feb5878ca', metadata={'producer': 'PD

In [15]:
!pip install fastapi



In [16]:
from fastapi import FastAPI, Query
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
origins = [
    "http://localhost:3000",  # Add other allowed origins as needed
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
@app.get("/generate_response/")
async def generate_response(prompt: str = Query(..., title="Prompt", description="Enter your prompt here")):
    # Run the RAG chain on the prompt once and return the generated text
    output=rag_chain.invoke(prompt)["text"]
    return {"response": output}
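
A minimal sketch of how a client could build the request URL for this endpoint. The base address is an assumption that matches uvicorn's default local address, and `build_request_url` is a hypothetical helper; the prompt must be URL-encoded because it is sent as a query parameter.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8000"  # assumed local address of the running server


def build_request_url(prompt: str) -> str:
    """Build the GET URL for the /generate_response/ endpoint with an encoded prompt."""
    return f"{BASE_URL}/generate_response/?{urlencode({'prompt': prompt})}"


url = build_request_url("Tell symptoms of glaucoma")
print(url)
```

With the server running, the URL can be fetched with `urllib.request.urlopen(url)` or from the Next.js frontend, and the JSON body carries the answer under the `response` key.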

In [17]:
#asyncio ships with the Python standard library, so only uvicorn needs installing
!pip install uvicorn



In [None]:
import asyncio
import uvicorn
if __name__ == "__main__":
    config = uvicorn.Config(app)
    server = uvicorn.Server(config)
    await server.serve()

INFO:     Started server process [27580]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


INFO:     127.0.0.1:52043 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:52043 - "GET /favicon.ico HTTP/1.1" 404 Not Found


In [None]:
# from fastapi import FastAPI, Request
# from fastapi.responses import HTMLResponse
# from fastapi.staticfiles import StaticFiles
# from fastapi.templating import Jinja2Templates

# app = FastAPI()

# # Define static files directory
# app.mount("/static", StaticFiles(directory="static"), name="static")

# # Define templates directory
# templates = Jinja2Templates(directory="templates")

# @app.get("/", response_class=HTMLResponse)
# async def index(request: Request):
#     return templates.TemplateResponse("index.html", {"request": request})

# @app.post("/generate")
# async def generate(query: str):
#     output = rag_chain.invoke(query)["text"]
#     return {"response": output}