<a href="https://colab.research.google.com/github/muthuraman2002/RAG-system-in-colab/blob/main/Chatbot_using_RAG_and_LangChain_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Task
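Implement a retrieval-augmented generation (RAG) system that uses the GPT-2 model, loaded through Hugging Face `transformers`, to answer questions about a healthcare CSV dataset.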

## Install necessary libraries

### Subtask:
Install all required libraries for setting up the RAG system: `transformers`, `sentence-transformers`, `langchain`, `langchain-community`, and `faiss-cpu`.


**Reasoning**:
Install the required libraries using pip.



In [None]:
%pip install transformers sentence-transformers langchain langchain-community faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.12.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.1 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.10.1-py3-none-any.whl.metadata (3.4 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain-community)
  Downloading httpx_sse-0.4.1-py3-none-any.whl.metadata (9.4 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading marshmallow-3.26.1-py3-none-any.whl.metadata (7.3 kB)
Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)
Collecting python-dotenv>=0.

## Load and process data

### Subtask:
Load the documents for the RAG system and process them into a suitable format for embedding.


**Reasoning**:
Load the CSV file, convert each row into a LangChain `Document`, and split the text into chunks suitable for embedding using `RecursiveCharacterTextSplitter`. An earlier attempt failed because of a typo in the class name; the cell below uses the corrected name.



In [1]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document
import pandas as pd


# 1. Load the CSV file
df = pd.read_csv("../content/healthcare_dataset.csv")

# 2. Convert each row into a LangChain Document made of "column: value" pairs
documents = []
for _, row in df.iterrows():
    content = " ".join([f"{col}: {row[col]}" for col in df.columns])
    documents.append(Document(page_content=content))

# Alternative input for quick experiments: a hard-coded list of strings.
# If this is used instead of the CSV, pass the list directly to create_documents().
# documents = [
#     "This is the first document. It talks about the basics of Large Language Models and their applications.",
#     "The second document discusses the architecture of the GPT series of models, including GPT-2.",
#     "Document three focuses on Retrieval Augmented Generation (RAG) systems and how they combine retrieval and generation techniques.",
#     "The fourth document explores the use of vector databases like FAISS for efficient similarity search in RAG systems.",
#     "This is the fifth and final document. It provides an overview of the Langchain framework and its components for building LLM applications."
# ]

# 3. Instantiate a RecursiveCharacterTextSplitter
#    (chunks of at most 100 characters, with 20 characters shared between neighbours)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20
)

# 4. Extract the page_content (string) from each Document object
#    and split the strings into chunks with create_documents
text_content = [doc.page_content for doc in documents]
docs = text_splitter.create_documents(text_content)

# Display the first two processed chunks
print(docs[:2])

[Document(metadata={}, page_content='Name: Bobby JacksOn Age: 30 Gender: Male Blood Type: B- Medical Condition: Cancer Date of Admission:'), Document(metadata={}, page_content='Date of Admission: 2024-01-31 Doctor: Matthew Smith Hospital: Sons and Miller Insurance Provider:')]
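The two chunks printed above both contain `Date of Admission:`, which is the effect of `chunk_overlap`. The following is a simplified sketch of fixed-size chunking with overlap, written to make that behaviour visible. It is a plain character-window illustration and not the actual algorithm of `RecursiveCharacterTextSplitter`, which prefers to break on separators such as whitespace; `chunk_text` is a hypothetical helper, and the record string is taken from the output above.

```python
def chunk_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# One record, reassembled from the two chunks printed by the notebook
record = (
    "Name: Bobby JacksOn Age: 30 Gender: Male Blood Type: B- "
    "Medical Condition: Cancer Date of Admission: 2024-01-31 "
    "Doctor: Matthew Smith Hospital: Sons and Miller"
)

chunks = chunk_text(record)
for chunk in chunks:
    print(repr(chunk))
```

With a chunk size of 100, a single patient record is spread over several chunks, so a retrieved chunk may contain the medical condition but not the patient's name. A larger `chunk_size` that fits a whole row may be worth trying.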


## Create embeddings

### Subtask:
Generate embeddings for the processed documents using a sentence transformer model.


**Reasoning**:
Generate embeddings for the processed documents using a sentence transformer model.



**Reasoning**:
The previous command failed because the `langchain_community` module was not found. This is likely because the installed version of `langchain` does not include this module or it needs to be installed separately. Install `langchain-community` to resolve this issue.



In [3]:
%pip install langchain-community

Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.10.1-py3-none-any.whl.metadata (3.4 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain-community)
  Downloading httpx_sse-0.4.1-py3-none-any.whl.metadata (9.4 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading marshmallow-3.26.1-py3-none-any.whl.metadata (7.3 kB)
Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)
Collecting python-dotenv>=0.21.0 (from pydantic-settings<3.0.0,>=2.4.0->langchain-community)
  Downloading python_dotenv-1.1.1-py3-none-any.whl.metadata (24 k

**Reasoning**:
Now that `langchain-community` is installed, retry generating embeddings for the processed documents using a sentence transformer model.



In [4]:
from langchain_community.embeddings import SentenceTransformerEmbeddings

# Instantiate SentenceTransformerEmbeddings
# Ensure you have internet access to download the model if not already cached
try:
    embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    print("SentenceTransformerEmbeddings model loaded successfully.")
except Exception as e:
    print(f"Error loading SentenceTransformerEmbeddings model: {e}")
    embeddings = None


# Generate embeddings for the processed documents
if embeddings is not None and docs: # Use 'docs' which contains the split documents
    print(f"Generating embeddings for {len(docs)} documents...")
    try:
        doc_embeddings = embeddings.embed_documents([doc.page_content for doc in docs])
        print(f"Generated {len(doc_embeddings)} embeddings.")
        if doc_embeddings:
            print(f"Example embedding (first 10 dimensions): {doc_embeddings[0][:10]}")
    except Exception as e:
        print(f"Error generating embeddings: {e}")
        doc_embeddings = None
else:
    print("Skipping embedding generation due to issues with embeddings model or processed documents.")
    doc_embeddings = None

SentenceTransformerEmbeddings model loaded successfully.
Generating embeddings for 31939 documents...
Generated 31939 embeddings.
Example embedding (first 10 dimensions): [-0.08565466850996017, 0.03265903890132904, -0.12203414738178253, 0.01496089156717062, -0.015038768760859966, 0.015352633781731129, 0.04373319819569588, 0.04880908504128456, -0.031047934666275978, -0.04581812024116516]
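Each embedding is a vector of floats, and retrieval works by comparing the query's vector with the document vectors. The sketch below illustrates that idea in plain Python using cosine similarity on toy three-dimensional vectors (the real `all-MiniLM-L6-v2` vectors have 384 dimensions). `cosine_similarity` and `top_k` are hypothetical helpers for illustration, not part of LangChain.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = sorted(
        enumerate(doc_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [index for index, _ in scored[:k]]


# Toy vectors standing in for real embeddings
toy_docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.0, 0.0]

ranking = top_k(toy_query, toy_docs, k=2)
print(ranking)
```

A vector store performs the same kind of ranking, but with index structures that scale to tens of thousands of vectors such as the 31939 generated here.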


## Build a vector store

### Subtask:
Build a vector store (e.g., using FAISS) from the document embeddings.


**Reasoning**:
Import the necessary FAISS class and create a FAISS index from the documents and embeddings.



In [6]:
from langchain_community.vectorstores import FAISS

# Create a FAISS index from the documents and embeddings
vectorstore = None
if docs and embeddings: # Use 'docs' which contains the split documents
    try:
        vectorstore = FAISS.from_documents(docs, embeddings)
        print("FAISS vectorstore created successfully.")
    except Exception as e:
        print(f"Error creating FAISS vectorstore: {e}")
else:
    print("Skipping FAISS vectorstore creation due to missing processed documents or embeddings.")

Error creating FAISS vectorstore: Could not import faiss python package. Please install it with `pip install faiss-gpu` (for CUDA supported GPU) or `pip install faiss-cpu` (depending on Python version).
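The error above means the `faiss` module could not be imported in the running session. One possible cause, which is an assumption here, is that the install cell was not executed in this session, or that the runtime needs a restart after installation. A minimal sketch for checking which modules are importable before building the vector store follows; `missing_packages` is a hypothetical helper, and the module list is based on the packages this notebook installs.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names (not pip names): faiss-cpu is imported as "faiss",
# sentence-transformers as "sentence_transformers", and so on.
required = ["faiss", "langchain", "langchain_community", "sentence_transformers", "transformers"]

missing = missing_packages(required)
if missing:
    print(f"Missing modules: {missing}. Install them and re-run the cell.")
else:
    print("All required modules are importable.")
```

If `faiss` is reported as missing, running `%pip install faiss-cpu` and then re-running the vector store cell should allow `FAISS.from_documents` to proceed.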


## Set up the RAG chain

### Subtask:
Configure the RAG chain using LangChain, combining the vector store and the GPT-2 model for generation.


**Reasoning**:
Configure the RAG chain in three steps: wrap the GPT-2 model and tokenizer in a LangChain LLM instance, create a retriever from the FAISS vector store, and instantiate `RetrievalQA` with the retriever and the LLM. The cell that implements this step appears further down, in the testing section.
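To make the role of the chain concrete, here is a rough sketch of what a "stuff"-style chain does with the retrieved chunks: it concatenates them into one prompt together with the question and hands that prompt to the LLM. `build_stuff_prompt` is a hypothetical helper, and the template wording is an assumption modelled on common question-answering prompts, not necessarily the exact default used by `RetrievalQA`.

```python
def build_stuff_prompt(context_chunks, question):
    """Join all retrieved chunks into a single context block and append the question."""
    context = "\n\n".join(context_chunks)
    return (
        "Use the following pieces of context to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Chunks copied from the notebook's own outputs; the question is illustrative
retrieved = [
    "Name: Bobby JacksOn Age: 30 Gender: Male Blood Type: B- Medical Condition: Cancer Date of Admission:",
    "Date of Admission: 2024-01-31 Doctor: Matthew Smith Hospital: Sons and Miller Insurance Provider:",
]
prompt = build_stuff_prompt(retrieved, "Which doctor treated Bobby Jackson?")
print(prompt)
```

Because everything is placed in one prompt, the number of retrieved chunks and their size must fit within the model's context window, which is 1024 tokens for GPT-2.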



## Implement the RAG query function

### Subtask:
Create a function that takes a user query, retrieves relevant documents from the vector store, and uses the RAG chain to generate a response.


**Reasoning**:
Define a function `answer_query` that takes a query string, calls the `qa_chain` with the query, and returns the result.



In [7]:
def answer_query(query: str):
    """
    Answers a user query using the configured RAG chain.

    Args:
        query: The user's question.

    Returns:
        The response generated by the RAG chain.
    """
    result = qa_chain.invoke({"query": query})
    return result

# Example usage (optional, for testing)
# query = "What is RAG?"
# response = answer_query(query)
# print(response)
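When `RetrievalQA` is created with `return_source_documents=True`, the value returned by `answer_query` is a dictionary with the keys `query`, `result`, and `source_documents`. The sketch below shows one way to turn such a dictionary into readable text. `format_response` is a hypothetical helper, and the example uses a stand-in object instead of a real chain result, so the answer text is a placeholder.

```python
from types import SimpleNamespace


def format_response(result: dict) -> str:
    """Render the answer and the retrieved source chunks as plain text."""
    lines = [f"Answer: {result.get('result', '').strip()}"]
    sources = result.get("source_documents", [])
    if sources:
        lines.append("Sources:")
        lines.extend(f"- {doc.page_content}" for doc in sources)
    return "\n".join(lines)


# Stand-in for a chain result; SimpleNamespace mimics an object with .page_content
fake_result = {
    "query": "Which hospital is mentioned?",
    "result": " (generated text would appear here) ",
    "source_documents": [
        SimpleNamespace(page_content="Date of Admission: 2024-01-31 Doctor: Matthew Smith Hospital: Sons and Miller Insurance Provider:"),
    ],
}
print(format_response(fake_result))
```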

## Test the RAG system

### Subtask:
Test the implemented RAG system with sample queries to ensure it is working correctly.


**Reasoning**:
Define sample queries and call the answer_query function for each, then print the results.
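A small harness for this step might look like the sketch below. `run_queries` is a hypothetical helper that calls any answer function on a list of queries and records errors instead of stopping at the first failure. The stub `fake_answer` and the sample queries are made up for illustration; in the notebook, `answer_query` would be passed in once `qa_chain` exists.

```python
def run_queries(answer_fn, queries):
    """Call answer_fn on each query and collect the results, or the error message."""
    results = {}
    for query in queries:
        try:
            results[query] = answer_fn(query)
        except Exception as exc:
            results[query] = f"Error: {exc}"
    return results


def fake_answer(query):
    """Stub standing in for answer_query so the sketch runs without a model."""
    if not query.strip():
        raise ValueError("empty query")
    return {"query": query, "result": "(stub answer)"}


sample_queries = ["Which patients have asthma?", " "]
outcomes = run_queries(fake_answer, sample_queries)
for query, outcome in outcomes.items():
    print(repr(query), "->", outcome)
```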



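**Reasoning**:
Before any query can be answered, `qa_chain` must exist. Load GPT-2 and its tokenizer, wrap them in a Hugging Face text-generation pipeline, create a retriever from the FAISS vector store, and build the `RetrievalQA` chain.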

In [14]:
from langchain.chains import RetrievalQA
from langchain_community.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Define the model and tokenizer
model_id = "gpt2"  # Using GPT-2 as requested
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Create a HuggingFace pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100) # Adjust max_new_tokens as needed

# Create a Langchain LLM instance
llm = HuggingFacePipeline(pipeline=pipe)

# Create a retriever from the FAISS vector store
retriever = vectorstore.as_retriever()

# Instantiate RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff", # "stuff" places all retrieved chunks into a single prompt
    retriever=retriever,
    return_source_documents=True # Optional: return the source documents
)

print("RAG chain created successfully.")

Device set to use cpu


RAG chain created successfully.


## Add User Prompt and Get Response

### Subtask:
Prompt the user for a query, use the RAG system to find relevant information in the CSV data, and display the generated response.

In [17]:
# Prompt the user for queries until they type "exit" or "quit"
while True:
    user_query = input("Enter your query about the healthcare data: ")
    if user_query.strip().lower() in {"exit", "quit"}:
        break

    # Get the response from the RAG system
    if 'qa_chain' in globals():
        try:
            response = qa_chain.invoke({"query": user_query})
            # Generated answer
            print(response['result'])
            # Chunks retrieved from the vector store for this query
            for doc in response.get('source_documents', []):
                print(f"- {doc.page_content}")
        except Exception as e:
            print(f"Error during RAG query: {e}")
    else:
        print("RAG chain is not initialized. Please run the previous cells to set up the RAG system.")
        break

Enter your query about the healthcare data: hii


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


- Name: ANDrEA HuERTA Age: 29 Gender: Female Blood Type: A- Medical Condition: Hypertension Date of
- Name: branDI sUlLIvAn Age: 26 Gender: Male Blood Type: O+ Medical Condition: Asthma Date of
- Name: BRanDI cROss Age: 65 Gender: Male Blood Type: B+ Medical Condition: Asthma Date of Admission:
- Name: mIChAeL sAliNas Age: 25 Gender: Female Blood Type: A+ Medical Condition: Arthritis Date of
Enter your query about the healthcare data: hich type of peoples are easly  atack the cance ?


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


- Name: sARA caNTrelL Age: 19 Gender: Male Blood Type: O- Medical Condition: Diabetes Date of
- Name: anTHoNY canTREll Age: 46 Gender: Female Blood Type: A- Medical Condition: Arthritis Date of
- Name: ALiSHA cantreLl Age: 39 Gender: Male Blood Type: B- Medical Condition: Obesity Date of
- Name: sIErrA whITe Age: 50 Gender: Male Blood Type: O- Medical Condition: Asthma Date of Admission:
Enter your query about the healthcare data: give average of the cancer affected perops


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


- Cancer Date of Admission: 2020-10-06 Doctor: Anthony Gaines Hospital: Cox, Moore Pugh and Insurance
- Cancer Date of Admission: 2021-09-19 Doctor: Katrina Martin Hospital: Cook and Craig, Herrera
- Cancer Date of Admission: 2020-08-29 Doctor: Stephen Atkinson Hospital: Smith, Chavez Berg and
- Cancer Date of Admission: 2022-09-16 Doctor: Patrick Smith Hospital: Boyd, Gamble and Cruz
Enter your query about the healthcare data: give a count of cancer peoples


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


- Cancer Date of Admission: 2020-08-29 Doctor: Stephen Atkinson Hospital: Smith, Chavez Berg and
- Cancer Date of Admission: 2023-05-13 Doctor: Annette Williams Hospital: White and Smith Villegas,
- Cancer Date of Admission: 2024-03-25 Doctor: Sandra Hanson Hospital: Taylor Hughes and Smith,
- Cancer Date of Admission: 2021-09-19 Doctor: Katrina Martin Hospital: Cook and Craig, Herrera
Enter your query about the healthcare data: i need a count of the cancer  dises affetd people


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


- Cancer Date of Admission: 2020-08-29 Doctor: Stephen Atkinson Hospital: Smith, Chavez Berg and
- Cancer Date of Admission: 2023-05-13 Doctor: Annette Williams Hospital: White and Smith Villegas,
- Cancer Date of Admission: 2022-04-11 Doctor: Meghan Jennings Hospital: Terry Jordan and Jones,
- Cancer Date of Admission: 2021-09-19 Doctor: Katrina Martin Hospital: Cook and Craig, Herrera
Enter your query about the healthcare data: how to clear the cear the cancer issue


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


- Cancer Date of Admission: 2023-05-13 Doctor: Annette Williams Hospital: White and Smith Villegas,
- Cancer Date of Admission: 2023-04-15 Doctor: Frances Contreras Hospital: Jordan, Robles and
- Cancer Date of Admission: 2020-03-05 Doctor: James Espinoza Hospital: Harper Jackson Jordan, and
- Cancer Date of Admission: 2023-10-08 Doctor: Timothy Young Jr. Hospital: Fletcher Garcia and


KeyboardInterrupt: Interrupted by user
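Several of the queries above ask for a count or an average over the whole dataset. A retriever that returns only a handful of chunks per query cannot supply that, so such questions are probably better answered by aggregating the data directly. The sketch below shows the idea with the standard library on a few rows copied from the outputs in this notebook; `count_by_condition` and `average_age` are hypothetical helpers. On the full CSV, something like `df["Medical Condition"].value_counts()` would play the same role, assuming the column names match those in the printed chunks.

```python
from collections import Counter
from statistics import mean

# A few rows taken from the outputs shown earlier (names as they appear in the data)
rows = [
    {"Name": "Bobby JacksOn", "Age": 30, "Medical Condition": "Cancer"},
    {"Name": "ANDrEA HuERTA", "Age": 29, "Medical Condition": "Hypertension"},
    {"Name": "branDI sUlLIvAn", "Age": 26, "Medical Condition": "Asthma"},
    {"Name": "BRanDI cROss", "Age": 65, "Medical Condition": "Asthma"},
    {"Name": "mIChAeL sAliNas", "Age": 25, "Medical Condition": "Arthritis"},
]


def count_by_condition(records):
    """Count how many records list each medical condition."""
    return Counter(record["Medical Condition"] for record in records)


def average_age(records, condition):
    """Mean age of the records with the given condition, or None if there are none."""
    ages = [record["Age"] for record in records if record["Medical Condition"] == condition]
    return mean(ages) if ages else None


counts = count_by_condition(rows)
print(counts["Cancer"])
print(average_age(rows, "Asthma"))
```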