# 🏥 Healthcare RAG Chatbot
This notebook demonstrates a fully featured Retrieval-Augmented Generation (RAG) chatbot tailored for the **healthcare domain**. It includes:

- Conversational memory
- Hybrid retrieval (semantic + keyword search)
- Cross-encoder reranking
- Feedback collection for fine-tuning
- An interaction loop that always shows the answer and exits cleanly

## 📦 Step 1: Install Dependencies

In [2]:
!pip install langchain langchain-community faiss-cpu sentence-transformers transformers openai rank_bm25 nltk



## 📂 Step 2: Load and Chunk Healthcare Documents
**Purpose:** Load multiple `.txt` files containing medical content and split them into overlapping chunks. Smaller chunks give the retriever focused passages to match, and the overlap reduces the chance that a sentence cut at a chunk boundary is lost.

In [3]:
# 📦 Install dependencies
!pip install -U langchain langchain-community

# Import libraries
import os
import nltk
nltk.download('punkt')

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load healthcare text documents
data_path = "/content/healthcare_docs"  # Make sure this folder is uploaded to Colab
documents = []

for filename in os.listdir(data_path):
    if filename.endswith(".txt"):
        file_path = os.path.join(data_path, filename)
        loader = TextLoader(file_path)
        documents.extend(loader.load())

# Split into overlapping chunks for retrieval
splitter = RecursiveCharacterTextSplitter(
    chunk_size=700,      # Each chunk size
    chunk_overlap=100    # Context overlap
)

split_docs = splitter.split_documents(documents)

# Display results
print(f"✅ Total Chunks Created: {len(split_docs)}")
print("📄 Example Chunk Preview:\n")
print(split_docs[0].page_content)


✅ Total Chunks Created: 7
📄 Example Chunk Preview:

Basic first aid instructions:
- For cuts: Clean the wound and apply a bandage
- For burns: Run under cool water and cover with a sterile dressing
- For choking: Perform the Heimlich maneuver
- For unconsciousness: Check breathing and call emergency services
- For sprains: Rest, Ice, Compression, and Elevation (R.I.C.E)


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
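
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking as a plain sliding window over characters. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line breaks, which this sketch ignores. The function name `sliding_chunks` is hypothetical and is not part of LangChain.

```python
def sliding_chunks(text, chunk_size=700, chunk_overlap=100):
    """Split text into fixed-width character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Tiny values make the overlap visible: each chunk repeats the last
# character of the previous one.
print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

With the notebook's settings (700 and 100), each window would advance 600 characters, so consecutive chunks share 100 characters.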


## 🔎 Step 3: Create Embeddings and FAISS Vector Store
**Purpose:** Convert text into dense vectors and store them in a FAISS index for semantic similarity search.

In [4]:
# Community integrations are imported from langchain_community
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings()
vectorstore = FAISS.from_documents(split_docs, embedding_model)
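
Conceptually, similarity search embeds the query and returns the chunks whose vectors lie closest to it. The sketch below does this by brute force over hand-made 3-dimensional vectors, assuming Euclidean distance. Real embeddings have hundreds of dimensions, and the metric and index structure an actual FAISS store uses depend on how it is configured. `nearest_chunks` and the toy vectors are hypothetical.

```python
import math


def nearest_chunks(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunk vectors closest to query_vec."""
    distances = [
        (math.dist(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)
    ]
    distances.sort(key=lambda pair: pair[0])  # smallest distance first
    return [idx for _, idx in distances[:k]]


# Hand-made stand-ins for chunk embeddings (not real model output).
toy_chunk_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
toy_query_vec = [1.0, 0.0, 0.0]

print(nearest_chunks(toy_query_vec, toy_chunk_vecs, k=2))  # [0, 1]
```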



## 🗝️ Step 4: Add Keyword-Based BM25 Retriever
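**Purpose:** Enable keyword-based search for cases where semantic search misses an exact term.
BM25 scores each chunk by how often the query's terms appear in it, gives more weight to terms that are rare across the corpus, and normalizes for chunk length.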

In [8]:
# Step 4: Build a BM25 keyword index over the chunks from Step 2
from nltk.tokenize import TreebankWordTokenizer  # Rule-based tokenizer; does not require the punkt download
from rank_bm25 import BM25Okapi

tokenizer = TreebankWordTokenizer()

# Reuse split_docs from Step 2 so BM25 and FAISS index the same chunks
# Tokenize each lowercased chunk to form the BM25 corpus
bm25_corpus = [tokenizer.tokenize(doc.page_content.lower()) for doc in split_docs]

# Build the BM25 index
bm25 = BM25Okapi(bm25_corpus)

print(f"✅ Total Chunks: {len(split_docs)}")
print(f"📄 Preview Chunk:\n{split_docs[0].page_content}")


✅ Total Chunks: 7
📄 Preview Chunk:
Basic first aid instructions:
- For cuts: Clean the wound and apply a bandage
- For burns: Run under cool water and cover with a sterile dressing
- For choking: Perform the Heimlich maneuver
- For unconsciousness: Check breathing and call emergency services
- For sprains: Rest, Ice, Compression, and Elevation (R.I.C.E)
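
To show what "keyword frequency and rarity" means in practice, here is a from-scratch sketch of Okapi BM25 scoring. It assumes a non-empty corpus, the common defaults `k1=1.5` and `b=0.75`, and a non-negative IDF variant. The `rank_bm25` library may differ in its IDF details, so the numbers are illustrative rather than a reproduction of `BM25Okapi`. The function name and toy corpus are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus, k1=1.5, b=0.75):
    """Score every tokenized document in corpus against query_tokens."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in term_freq:
                continue
            # Rare terms get a larger IDF weight.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term frequency saturates and is normalized by document length.
            tf = term_freq[term]
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


toy_corpus = [
    ["apply", "bandage", "to", "cuts"],
    ["cool", "water", "for", "burns"],
    ["rest", "and", "ice", "for", "sprains"],
]
toy_scores = bm25_scores(["burns", "water"], toy_corpus)
best = max(range(len(toy_scores)), key=lambda i: toy_scores[i])
print(best)  # 1
```

Only the second document contains either query term, so it is the only one with a non-zero score.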


## ⚡ Step 5: Hybrid Retriever Function
**Purpose:** Combine keyword search (BM25) and semantic search (FAISS) for better accuracy.

In [9]:
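def hybrid_retrieve(query, top_k=5):
    # Tokenize the query the same way as the BM25 corpus (Treebank tokenizer, lowercased)
    query_tokens = tokenizer.tokenize(query.lower())
    bm25_scores = bm25.get_scores(query_tokens)
    # Sort on the score only; Document objects cannot be compared when scores tie
    bm25_ranked = sorted(zip(bm25_scores, split_docs), key=lambda pair: pair[0], reverse=True)
    bm25_results = [doc for _, doc in bm25_ranked[:top_k]]
    faiss_results = vectorstore.similarity_search(query, k=top_k)
    # Interleave both rankings so neither crowds out the other, skipping duplicates
    combined, seen = [], set()
    for bm25_doc, faiss_doc in zip(bm25_results, faiss_results):
        for doc in (bm25_doc, faiss_doc):
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                combined.append(doc)
    return combined[:top_k]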
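
A different way to merge two rankings is Reciprocal Rank Fusion (RRF), sketched below. This is not the strategy `hybrid_retrieve` uses; it is shown as an alternative that rewards chunks ranked highly by both retrievers. The chunk ids are hypothetical placeholders, and `k=60` is a conventional smoothing constant, taken here as an assumption.

```python
def reciprocal_rank_fusion(rankings, k=60, top_k=5):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)[:top_k]


# Hypothetical rankings, as if returned by BM25 and by FAISS.
bm25_ranking = ["c1", "c2", "c3"]
faiss_ranking = ["c2", "c4", "c1"]

print(reciprocal_rank_fusion([bm25_ranking, faiss_ranking]))
# ['c2', 'c1', 'c4', 'c3']
```

`c2` comes first because it sits near the top of both lists, while `c3` and `c4` each appear in only one.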

## 🧠 Step 6: Conversational Memory Setup
**Purpose:** Store chat history to maintain multi-turn context using LangChain's memory module.

In [10]:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
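
As a rough mental model of what a buffer memory does, the sketch below stores question/answer turns in a list and renders them as text for the next prompt. `SimpleBufferMemory` is a hypothetical stand-in, not LangChain's implementation; the optional `max_turns` cap is an added assumption to keep the history from growing without bound.

```python
class SimpleBufferMemory:
    """Keep past (question, answer) turns, optionally only the latest few."""

    def __init__(self, max_turns=None):
        self.turns = []
        self.max_turns = max_turns

    def save_turn(self, question, answer):
        self.turns.append((question, answer))
        # Drop the oldest turns once the cap is exceeded.
        if self.max_turns is not None and len(self.turns) > self.max_turns:
            del self.turns[: len(self.turns) - self.max_turns]

    def as_text(self):
        return "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)


demo_memory = SimpleBufferMemory(max_turns=2)
demo_memory.save_turn("What is a sprain?", "A stretched or torn ligament.")
demo_memory.save_turn("How do I treat it?", "Rest, ice, compression, elevation.")
demo_memory.save_turn("How long should I rest?", "Ask a clinician if unsure.")
print(demo_memory.as_text())
```

Because the cap is 2, only the last two turns are printed; the first question has been dropped.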



## 🤖 Step 7: Load GPT Model and Create Chatbot Chain
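**Purpose:** Use an OpenAI GPT model and LangChain to create a conversational retrieval chain. As written, the chain retrieves through the FAISS retriever only; `hybrid_retrieve` from Step 5 is not wired into it.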

In [11]:
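import os
from getpass import getpass

# Prompt for the key at run time instead of hardcoding it in the notebook
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
from langchain_community.llms import OpenAI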
from langchain.chains import ConversationalRetrievalChain

llm = OpenAI(temperature=0.3)
retrieval_chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=vectorstore.as_retriever(), memory=memory)
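
A conversational retrieval chain has to combine three things into the model's input: the chat history, the retrieved chunks, and the new question. Here is a minimal sketch of that assembly step. The template wording and the `build_prompt` name are hypothetical; LangChain's actual prompts differ and are not reproduced here.

```python
def build_prompt(question, context_chunks, history):
    """Assemble one prompt string from history, retrieved context, and question."""
    history_text = "\n".join(f"Human: {q}\nAI: {a}" for q, a in history)
    context_text = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Chat history:\n{history_text}\n\n"
        f"Question: {question}\nAnswer:"
    )


demo_prompt = build_prompt(
    question="What about burns?",
    context_chunks=["For burns: Run under cool water and cover with a sterile dressing"],
    history=[("How do I treat a cut?", "Clean the wound and apply a bandage.")],
)
print(demo_prompt)
```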



## 🧪 Step 8: Add Cross-Encoder Reranker
**Purpose:** Improve the ordering of retrieved chunks by scoring each (query, chunk) pair with a BERT-style cross-encoder and keeping the highest-scoring ones.

In [12]:
from sentence_transformers import CrossEncoder
cross_encoder = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

def rerank_with_cross_encoder(query, docs, top_n=3):
    pairs = [[query, doc.page_content] for doc in docs]
    scores = cross_encoder.predict(pairs)
    scored_docs = sorted(zip(scores, docs), key=lambda x: x[0], reverse=True)
    return [doc for _, doc in scored_docs[:top_n]]
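
Reranking is typically the second stage of a two-stage pipeline: a fast retriever gathers candidates, then a slower scorer reorders them. The sketch below shows that flow with stand-in functions so it runs without any model. `toy_retrieve` and `toy_score` are hypothetical; in the notebook their roles would be played by `hybrid_retrieve` and the cross-encoder.

```python
def retrieve_then_rerank(query, retrieve, score_pair, top_k=5, top_n=3):
    """Fetch top_k candidates, rescore each against the query, keep top_n."""
    candidates = retrieve(query, top_k)
    scored = [(score_pair(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


TOY_CHUNKS = [
    "For cuts clean the wound",
    "For choking perform the Heimlich maneuver",
    "For burns run under cool water",
]


def toy_retrieve(query, top_k):
    # Stand-in retriever: returns chunks in stored order, ignoring the query.
    return TOY_CHUNKS[:top_k]


def toy_score(query, text):
    # Stand-in scorer: counts words shared by the query and the chunk.
    return len(set(query.lower().split()) & set(text.lower().split()))


print(retrieve_then_rerank("how to treat burns with water", toy_retrieve, toy_score, top_n=1))
# ['For burns run under cool water']
```

The burns chunk was last in the candidate list but is moved to the top by the scoring stage.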


## 📝 Step 9: Feedback Logging Function
**Purpose:** Record whether users found each answer helpful (Yes/No) for future RLHF-style improvement.

In [13]:
import datetime

def log_feedback(question, answer, feedback):
    with open("feedback_log.txt", "a", encoding="utf-8") as f:  # UTF-8 so emoji and non-ASCII text are written safely
        f.write(f"{datetime.datetime.now()}\n")
        f.write(f"Question: {question}\n")
        f.write(f"Answer: {answer}\n")
        f.write(f"Feedback: {feedback}\n")
        f.write("-"*50 + "\n")
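
Once some feedback has accumulated, the log can be tallied. The sketch below counts the `Feedback:` lines in the format written by `log_feedback`. It assumes no answer text itself contains a line starting with `Feedback: `, which would be miscounted. `summarize_feedback` is a hypothetical helper.

```python
from collections import Counter


def summarize_feedback(lines):
    """Count feedback values from lines in the log_feedback format."""
    prefix = "Feedback: "
    counts = Counter()
    for line in lines:
        if line.startswith(prefix):
            counts[line[len(prefix):].strip().lower()] += 1
    return counts


# Sample lines shaped like the log file (timestamps are made up).
sample_log_lines = [
    "2024-01-01 10:00:00\n",
    "Question: How can I improve my heart health?\n",
    "Answer: Exercise regularly.\n",
    "Feedback: yes\n",
    "-" * 50 + "\n",
    "2024-01-01 10:05:00\n",
    "Question: How much sleep does a toddler need?\n",
    "Answer: 11-14 hours.\n",
    "Feedback: No\n",
    "-" * 50 + "\n",
]
print(summarize_feedback(sample_log_lines))
```

To run it on the real log, pass an open file handle, since iterating a file yields its lines: `summarize_feedback(open("feedback_log.txt", encoding="utf-8"))`.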

## 💬 Step 10: Chat Interface (with All Fixes)
**Purpose:** Chat with the bot in a loop that exits cleanly on `exit`, handles the no-answer case, and logs feedback.

In [16]:
while True:
    try:
        question = input("\n🩺 Ask a healthcare question (or type 'exit'): ").strip()
        if question.lower() == "exit":
            print("👋 Goodbye!")
            break

        # Run the LangChain retrieval chain (invoke replaces the deprecated run method)
        result = retrieval_chain.invoke({"question": question})["answer"]

        # If result is empty or None
        if not result or not result.strip():
            print("🤖 RAGBot: Sorry, I couldn't find a good answer. Please try rephrasing your question.")
            continue

        # Otherwise show answer
        print("🤖 RAGBot:", result)

        # Ask for feedback
        feedback = input("Was this helpful? (Yes/No): ").strip()
        log_feedback(question, result, feedback)

    except Exception as e:
        print("❌ Error:", e)



🩺 Ask a healthcare question (or type 'exit'): How can I improve my heart health?
🤖 RAGBot:  Some ways to improve heart health include eating plenty of fruits and vegetables, reducing salt and saturated fat intake, exercising regularly, avoiding smoking, maintaining a healthy weight, and managing blood pressure and cholesterol through regular health screenings.
Was this helpful? (Yes/No): yes

🩺 Ask a healthcare question (or type 'exit'): How much sleep does a toddler need?
🤖 RAGBot:  The recommended amount of sleep for toddlers is between 11-14 hours per day, including naps. This can vary based on their age, with younger toddlers needing more sleep and older toddlers needing slightly less. For example, a 1-year-old may need 12-14 hours of sleep, while a 3-year-old may only need 11-13 hours. It is important to monitor a child's individual sleep needs and consult with a pediatrician if there are concerns.
Was this helpful? (Yes/No): no

🩺 Ask a healthcare question (or type 'exit'): What

## ✅ Sample Questions to Test
- What are symptoms of the flu?
- What foods should diabetics avoid?
- How can I improve my heart health?
- What should I do if someone is choking?
- How much sleep does a toddler need?
- What should I avoid if I have high blood pressure?
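
The sample questions can also be run in one batch as a quick smoke test. The sketch below calls an answer function on each question and collects the ones that returned an empty answer or raised an error. `run_smoke_test` and `stub_answer` are hypothetical; the stub stands in for the chatbot so the example runs without an API key.

```python
def run_smoke_test(questions, answer_fn):
    """Return the questions for which answer_fn gave no usable answer."""
    failures = []
    for question in questions:
        try:
            answer = answer_fn(question)
        except Exception:
            failures.append(question)
            continue
        if not answer or not answer.strip():
            failures.append(question)
    return failures


sample_questions = [
    "What are symptoms of the flu?",
    "What foods should diabetics avoid?",
    "What should I do if someone is choking?",
]


def stub_answer(question):
    # Stand-in for the chatbot: returns nothing for one question.
    return "" if "diabetics" in question else "Stub answer."


print(run_smoke_test(sample_questions, stub_answer))
# ['What foods should diabetics avoid?']
```

In the notebook, `answer_fn` could wrap the retrieval chain. Note that the conversational memory would then carry over from one question to the next.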