<a href="https://colab.research.google.com/github/Manish1176/RAG_with_HuggingFace/blob/main/Medical_Chatbot_FileUploader.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Medical Chatbot**

**Steps to Build the Medical Chatbot using LangChain & Hugging Face**

**1. Set Up the Environment**

Install the required libraries (the notebook cells below also install streamlit and PyPDF2):

!pip install langchain langchain_community langchain_huggingface transformers sentence-transformers faiss-cpu pypdf

**2. Initialize the Streamlit App**

Create a Streamlit title and a chat input field for user queries.

Add a sidebar file uploader for one or more medical PDFs, with a notification showing how many files were uploaded.

**3. Load and Process the Medical PDF**

Use PyPDFLoader to read each uploaded medical PDF.

Split the text into overlapping chunks using RecursiveCharacterTextSplitter (the app uses 200-character chunks with a 50-character overlap).

Store these document chunks for retrieval.
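To make the chunking step concrete, here is a minimal pure-Python sketch of fixed-size splitting with overlap. It is a simplification and not LangChain's algorithm: RecursiveCharacterTextSplitter additionally tries to break on separators such as paragraph and line boundaries. The function name `split_with_overlap` is hypothetical; the default sizes mirror the chunk_size=200 and chunk_overlap=50 used in app.py.

```python
def split_with_overlap(text: str, chunk_size: int = 200, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so text cut at one
    boundary still appears intact in a neighbouring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Example: 500 characters become three 200-character chunks
sample_text = "".join(chr(97 + i % 26) for i in range(500))
sample_chunks = split_with_overlap(sample_text)
```

A larger overlap lowers the chance that an answer is split across two chunks, at the cost of storing and embedding more redundant text.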

**4. Generate Text Embeddings**

Use HuggingFaceEmbeddings (sentence-transformers/all-MiniLM-L6-v2) to convert text into embeddings.

Store these embeddings in a FAISS vector database for efficient retrieval.
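Conceptually, the vector store answers one question: which stored vectors lie closest to the query vector? The sketch below does this by brute force with Euclidean distance in pure Python. It is only a stand-in for FAISS, which uses optimized index structures, and the 3-dimensional vectors are invented for illustration (real sentence embeddings have hundreds of dimensions). The name `top_k_nearest` is hypothetical.

```python
import math


def top_k_nearest(query, vectors, k=3):
    """Return the indices of the k stored vectors closest to query."""
    distances = [(math.dist(query, vec), idx) for idx, vec in enumerate(vectors)]
    distances.sort()  # smallest distance first; ties broken by index
    return [idx for _, idx in distances[:k]]


# Toy "embeddings" with invented values, one per document chunk
chunk_vectors = [
    [0.9, 0.1, 0.0],  # chunk 0
    [0.0, 1.0, 0.0],  # chunk 1
    [0.8, 0.2, 0.1],  # chunk 2
    [0.0, 0.0, 1.0],  # chunk 3
]
query_vector = [1.0, 0.0, 0.0]

nearest = top_k_nearest(query_vector, chunk_vectors, k=2)
```

Here chunks 0 and 2 point in nearly the same direction as the query, so they are the two returned.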

**5. Set Up the Hugging Face Model**

Load the Flan-T5 model and tokenizer from Hugging Face (google/flan-t5-base).

Configure a text2text-generation pipeline (Flan-T5 is a sequence-to-sequence model) with parameters like max_length, temperature, and top_p.
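To show what temperature and top_p control during sampling, here is a small pure-Python sketch: a temperature-scaled softmax followed by nucleus (top-p) filtering. It illustrates the idea only and is not the transformers implementation; the function names and the example logits are made up.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.95):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Invented logits for a 4-token vocabulary
example_logits = [2.0, 1.0, 0.1, -3.0]
example_probs = softmax_with_temperature(example_logits, temperature=0.7)
candidates = top_p_filter(example_probs, top_p=0.95)  # token index -> probability
```

Sampling then draws the next token from `candidates`, so very unlikely tokens are never chosen while some variety remains among the likely ones.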

**6. Create the Retrieval-Augmented Generation (RAG) Chain**

Define a prompt template to format input for the model.

Implement a retrieval system using FAISS to find the chunks most relevant to the user's question (the app uses maximal marginal relevance search and returns 3 chunks).

Construct a LangChain processing pipeline that integrates retrieval and LLM response generation.
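The shape of such a chain can be sketched without any libraries: retrieve, format the prompt, generate. In the sketch below the retriever and the model are plain stand-in functions so the example runs on its own; `make_rag_chain`, `fake_retrieve`, `fake_generate`, and the shortened template are all hypothetical and only mirror the structure of the LangChain pipeline.

```python
from typing import Callable

PROMPT_TEMPLATE = "Context:\n{context}\n\nQuestion:\n{question}\n\nAnswer:"


def make_rag_chain(
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
    template: str = PROMPT_TEMPLATE,
) -> Callable[[str], str]:
    """Compose retrieval, prompt formatting, and generation into one callable."""

    def chain(question: str) -> str:
        chunks = retrieve(question)        # 1. fetch relevant text chunks
        context = "\n\n".join(chunks)      # 2. merge them into one context string
        prompt = template.format(context=context, question=question)  # 3. fill the template
        return generate(prompt).strip()    # 4. run the model on the prompt

    return chain


# Stand-ins so the sketch runs without FAISS or Flan-T5
def fake_retrieve(question: str) -> list[str]:
    return ["First retrieved chunk.", "Second retrieved chunk."]


def fake_generate(prompt: str) -> str:
    # Echoes the prompt so you can inspect exactly what the model would receive
    return prompt


rag = make_rag_chain(fake_retrieve, fake_generate)
answer = rag("What does the book say about this topic?")
```

Swapping the stand-ins for a real retriever call and a real model call keeps the same four steps.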

**7. Define the Response Generation Function**

Retrieve the most relevant document chunks.

Truncate each chunk if necessary so the context fits within the model’s token limit.

Format the chunks and the question into a prompt and pass it through the Flan-T5 model.

Return the model’s generated answer.
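One way to handle truncation is to give all retrieved chunks a shared budget instead of truncating each chunk separately. The sketch below is an assumption about how that could look, not what app.py does: `fit_chunks_to_budget` is a hypothetical helper, and it counts whitespace-separated words as a rough proxy for tokens, so a real model tokenizer would report different numbers.

```python
def fit_chunks_to_budget(chunks, max_tokens=400, tokenize=str.split):
    """Keep chunks in rank order until the shared token budget is used up.

    The first chunk that does not fully fit is cut to the remaining budget,
    and any chunks after it are dropped. Whitespace inside a chunk is
    normalized to single spaces as a side effect of re-joining the tokens.
    """
    kept = []
    remaining = max_tokens
    for chunk in chunks:
        if remaining <= 0:
            break
        tokens = tokenize(chunk)
        if not tokens:
            continue  # skip empty chunks
        kept.append(" ".join(tokens[:remaining]))
        remaining -= min(len(tokens), remaining)
    return kept


ranked_chunks = ["a b c", "d e f g", "h i"]
trimmed = fit_chunks_to_budget(ranked_chunks, max_tokens=5)
```

Because chunks arrive in relevance order, the budget is spent on the most relevant text first.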

**8. Build the Streamlit Interface**

Create a chat input box for user queries.

On submission, retrieve the relevant chunks and generate a response.

Display the question and answer in the Streamlit app as part of the chat history.

**9. Run the Application**

Start the Streamlit server using:

streamlit run app.py

Then ask medical-related questions and receive AI-generated answers.

In [1]:
! pip install streamlit -q


In [2]:
!wget -q -O - ipv4.icanhazip.com

34.125.91.157


In [3]:
!pip install langchain langchain_community langchain_huggingface transformers sentence-transformers faiss-cpu PyPDF2 pypdf

Collecting langchain_community
  Downloading langchain_community-0.3.22-py3-none-any.whl.metadata (2.4 kB)
Collecting langchain_huggingface
  Downloading langchain_huggingface-0.1.2-py3-none-any.whl.metadata (1.3 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.10.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (4.4 kB)
Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Collecting pypdf
  Downloading pypdf-5.4.0-py3-none-any.whl.metadata (7.3 kB)
Collecting langchain-core<1.0.0,>=0.3.51 (from langchain)
  Downloading langchain_core-0.3.55-py3-none-any.whl.metadata (5.9 kB)
Collecting langchain
  Downloading langchain-0.3.24-py3-none-any.whl.metadata (7.8 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain_community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)
  Downloading pydantic_settings-2.9.1-py3-none-any.whl.metadata (3.8 kB)
Collecting httpx-sse

In [4]:
%%writefile app.py
import streamlit as st
import warnings
import torch
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.prompts import PromptTemplate
from langchain_huggingface import HuggingFacePipeline
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from transformers import pipeline, AutoModelForSeq2SeqLM, AutoTokenizer
import os

# Suppress warnings
warnings.filterwarnings("ignore")

# App title
st.set_page_config(page_title="📚 Medical Chatbot", layout="wide")
st.title("📚 Medical Chatbot")
st.write("Ask me questions about your uploaded medical PDFs!")

# ✅ Upload and process multiple PDFs
uploaded_files = st.sidebar.file_uploader("📄 Upload Medical PDF(s)", type=["pdf"], accept_multiple_files=True)

docs = []

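def load_and_process_pdf(pdf_path):
    """Load one PDF and split its pages into small overlapping chunks."""
    loader = PyPDFLoader(pdf_path)
    pages = loader.load()  # one Document per PDF page
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)
    # Named `chunks` so it does not shadow the module-level `docs` list
    chunks = text_splitter.split_documents(pages)
    return chunks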

if uploaded_files:
    st.sidebar.success(f"✅ {len(uploaded_files)} file(s) uploaded")
    for uploaded_file in uploaded_files:
        with open(uploaded_file.name, "wb") as f:
            # getvalue() returns the whole file regardless of the current read position
            f.write(uploaded_file.getvalue())
        new_docs = load_and_process_pdf(uploaded_file.name)
        docs.extend(new_docs)
else:
    st.warning("📄 Please upload at least one PDF file to begin.")

if docs:
    # ✅ Load Hugging Face Embeddings
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

    # ✅ Store embeddings in FAISS
    vectorstore = FAISS.from_documents(documents=docs, embedding=embeddings)
    retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 3})

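    # ✅ Load T5 Model and build the pipeline once.
    # Streamlit reruns this script on every interaction; st.cache_resource
    # keeps the loaded model so it is not reloaded for each question.
    @st.cache_resource
    def load_llm():
        tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
        # device_map="auto" requires the accelerate package
        model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base", device_map="auto")

        # ✅ Setup Hugging Face Pipeline
        text_generation_pipeline = pipeline(
            task="text2text-generation",
            model=model,
            tokenizer=tokenizer,
            max_length=300,
            min_length=50,
            num_return_sequences=1,
            temperature=0.7,
            do_sample=True,
            top_p=0.95,
            repetition_penalty=1.2
        )
        return HuggingFacePipeline(pipeline=text_generation_pipeline)

    llm = load_llm()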

    # ✅ Prompt Template
    prompt_template = """
    You are a knowledgeable AI trained on medical information. Read the following context carefully and provide a clear, concise, and well-structured answer.

    Context:
    {context}

    Question:
    {question}

    Provide an answer in full sentences, ensuring clarity and completeness.
    """

    prompt = PromptTemplate(input_variables=["context", "question"], template=prompt_template)

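    # ✅ Truncate helper
    def truncate_text(text, max_tokens=400):
        # Approximation: counts whitespace-separated words, not model tokens,
        # so the token count after tokenization is usually higher.
        tokens = text.split()
        return " ".join(tokens[:max_tokens])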

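    # ✅ LLM Chain
    llm_chain = prompt | llm | StrOutputParser()

    # Join retrieved Documents into plain text; piping the retriever in directly
    # would put the string form of the Document list into the prompt.
    def format_docs(documents):
        return "\n\n".join(doc.page_content for doc in documents)

    # Note: generate_response below calls the retriever and llm directly, not this chain.
    rag_chain = {"context": retriever | format_docs, "question": RunnablePassthrough()} | llm_chain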

    # ✅ Generate Answer
    def generate_response(question):
        retrieved_contexts = retriever.invoke(question)
        combined_context = "\n\n".join([truncate_text(doc.page_content, max_tokens=500) for doc in retrieved_contexts])

        if not combined_context.strip():
            return "I couldn't find relevant information in the provided context."

        formatted_prompt = prompt.format(context=combined_context, question=question)
        response = llm.invoke(formatted_prompt)
        return response.strip()

    # ✅ Chat history
    if "messages" not in st.session_state:
        st.session_state.messages = []

    # ✅ User input box
    user_question = st.chat_input("💬 Ask a medical question:")

    if user_question:
        with st.spinner("Thinking..."):
            answer = generate_response(user_question)
        st.session_state.messages.append({"role": "user", "text": user_question})
        st.session_state.messages.append({"role": "bot", "text": answer})

    # ✅ Display chat history
    for msg in st.session_state.messages:
        if msg["role"] == "user":
            st.markdown(f"🧑‍💬 **You:** {msg['text']}")
        else:
            st.markdown(f"🤖 **Bot:** {msg['text']}")


Writing app.py
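
The retriever in app.py is created with search_type="mmr" (maximal marginal relevance), which trades relevance to the query against redundancy among the chunks already chosen. The following pure-Python sketch shows the greedy selection idea on invented 2-dimensional vectors. It is an illustration under those assumptions, not the LangChain or FAISS code; `cosine` and `mmr_select` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, vectors, k=3, lambda_mult=0.5):
    """Greedily pick k indices that are relevant to the query but not
    redundant with the indices picked so far."""
    candidates = list(range(len(vectors)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, vectors[i])
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Vectors 0 and 1 are near-duplicates; vector 2 is less relevant but different
mmr_query = [1.0, 0.0]
mmr_vectors = [[1.0, 0.1], [1.0, 0.12], [0.7, -0.7]]

diverse = mmr_select(mmr_query, mmr_vectors, k=2)                      # balances both terms
by_relevance = mmr_select(mmr_query, mmr_vectors, k=2, lambda_mult=1.0)  # relevance only
```

With lambda_mult=1.0 the redundancy term vanishes and the two near-duplicates are returned; with 0.5 the second pick is the dissimilar vector instead.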


In [None]:
! streamlit run app.py & npx localtunnel --port 8501

[1G[0K⠙
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
[0m
[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K[1G[0JNeed to install the following packages:
localtunnel@2.0.2
Ok to proceed? (y) [20G[0m
[34m[1m  You can now view your Streamlit app in your browser.[0m
[0m
[34m  Local URL: [0m[1mhttp://localhost:8501[0m
[34m  Network URL: [0m[1mhttp://172.28.0.12:8501[0m
[34m  External URL: [0m[1mhttp://34.125.91.157:8501[0m
[0m
y

[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K⠧[1G[0K⠇[1G[0K⠏[1G[0K⠋[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K⠧[1G[0K⠇[1G[0K⠏[1G[0K⠋[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0Kyour url is: https://wet-oranges-draw.loca.lt
2025-04-23 06:14:03.757899: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1745388843.983