# Hybrid Search RAG Pipeline
This notebook demonstrates how to create a hybrid search Retrieval-Augmented Generation (RAG) pipeline using the LangChain library. The pipeline combines BM25 keyword search with embedding-based vector search, so retrieval can surface both exact term matches and semantically related passages for question-answering tasks.

## Installation and Imports
First, let's install the necessary dependencies:

In [None]:
!pip install langchain langchain_community chromadb requests sentence-transformers pypdf





Now, we can import the required modules:

In [None]:
import os
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.retrievers import BM25Retriever
from langchain.retrievers import EnsembleRetriever

## Set up Environment Variables
Before we can use the Hugging Face Hub API, we need to set the API token as an environment variable. In Colab, we read the token from the notebook's secrets with google.colab.userdata and export it with os.environ.

In [None]:

from google.colab import userdata

In [None]:
import os

# Read the token stored as a Colab secret
HUGGINGFACEHUB_API_TOKEN = userdata.get("HUGGINGFACEHUB_API_TOKEN")

# Set the API token in the environment variable
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACEHUB_API_TOKEN

## Load and Split Documents
Here we load the PDF documents from the specified directory and split them into smaller chunks using the RecursiveCharacterTextSplitter. The chunk size is set to 500 characters with a 50-character overlap.

In [None]:

# Load your documents (assuming they are PDFs in a directory)
loader = PyPDFDirectoryLoader('/content/sample_data/Data')
documents = loader.load()

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
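
To make the chunk_size and chunk_overlap settings concrete, here is a minimal sketch of overlapping chunking. It is a simplification: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs, lines, and spaces and treats chunk_size as an upper bound, whereas the hypothetical `split_with_overlap` helper below uses a fixed-width sliding window.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Fixed-width sliding window (illustrative only, not LangChain's algorithm)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each new chunk starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical 1200-character sample text
sample = "".join(str(i % 10) for i in range(1200))
pieces = split_with_overlap(sample)
print([len(p) for p in pieces])  # [500, 500, 300]
```

The last 50 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still seen whole in at least one chunk.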


## Create Prompt Template

In [None]:
from langchain.prompts.prompt import PromptTemplate

prompt_template = """
<|system|>
You are an AI Assistant that follows instructions extremely well.
Please be truthful and give direct answers. Please tell 'I don't know' if user query is not in CONTEXT

CONTEXT: {context}
</s>
<|user|>
{query}
</s>
<|assistant|>
Your answer:
"""
prompt = PromptTemplate.from_template(prompt_template)

## Initialize Embeddings and Vector Store
We initialize the Hugging Face embeddings model and use it to create a Chroma vector store from the document chunks.

In [None]:
embeddings = HuggingFaceEmbeddings(model_name="thenlper/gte-large")

In [None]:
vectorstore = Chroma.from_documents(chunks, embeddings)

In [None]:
vectorstore

<langchain_community.vectorstores.chroma.Chroma at 0x7cfb70636da0>

In [None]:
query = "what is electrolysis"
search = vectorstore.similarity_search(query)

## Create BM25 and Vector Retrievers
Here, we create the BM25 and vector retrievers. The BM25 retriever is created directly from the document chunks, while the vector retriever is created from the Chroma vector store.

In [None]:
!pip install rank_bm25



In [None]:
bm25_retriever = BM25Retriever.from_documents(chunks)
vector_retriever = vectorstore.as_retriever()
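
For intuition about what the BM25 side contributes, the following is a minimal sketch of BM25 scoring in plain Python. It assumes whitespace tokenization and a Lucene-style IDF; the rank_bm25 package installed above may differ in its exact IDF formula and defaults, and the function name, parameters, and sample documents here are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with a textbook BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by document length via b
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "electrolysis uses electric current to split a compound",
    "an electrolyte conducts electricity when dissolved in water",
    "the recipe needs two cups of flour",
]
scores = bm25_scores("what is electrolysis", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document contains the exact token "electrolysis", so it is the only one with a non-zero score. The second document is topically related but shares no query token, which is the kind of match the vector retriever is meant to recover.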

## Set Up the EnsembleRetriever
The EnsembleRetriever combines the BM25 and vector retrievers. The weights parameter is set to 0.5 for each retriever, giving them equal importance in the ensemble.

In [None]:
from langchain.retrievers.ensemble import EnsembleRetriever

retrievers = [bm25_retriever, vector_retriever]
ensemble_retriever = EnsembleRetriever(retrievers=retrievers, weights=[0.5, 0.5])
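
To illustrate how the weights influence the final ordering, here is a sketch of weighted Reciprocal Rank Fusion, the rank-based fusion scheme that EnsembleRetriever is documented to use. The constant `c=60` is the commonly used default and is an assumption here, as are the function name and the chunk identifiers.

```python
def weighted_rrf(rankings: list[list[str]], weights: list[float], c: int = 60) -> list[str]:
    """Fuse several ranked lists: each list adds weight / (rank + c) to a document's score."""
    scores: dict[str, float] = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank + c)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical result lists from the two retrievers, best match first
bm25_ranking = ["chunk_a", "chunk_b", "chunk_c"]
vector_ranking = ["chunk_b", "chunk_d", "chunk_a"]

fused = weighted_rrf([bm25_ranking, vector_ranking], weights=[0.5, 0.5])
print(fused)  # ['chunk_b', 'chunk_a', 'chunk_d', 'chunk_c']
```

Documents returned by both retrievers accumulate score from each list, so they tend to rise above documents found by only one. Raising one weight, for example `[0.7, 0.3]`, shifts the ordering toward that retriever's ranking.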


## Initialize the Large Language Model

In [None]:
from langchain_community.llms import HuggingFaceHub

llm = HuggingFaceHub(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    model_kwargs={
        "max_new_tokens": 512,
        "top_k": 30,
        "temperature": 0.1,
        "repetition_penalty": 1.1,
        "return_full_text": False
    },
)

## Create the RAG Pipeline

In [None]:
from langchain_core.output_parsers import StrOutputParser

In [None]:
from langchain_core.runnables import RunnablePassthrough

In [None]:
output_parser = StrOutputParser()

In [None]:
retriever = ensemble_retriever
chain = (
    {"context": retriever, "query": RunnablePassthrough()}
    | prompt
    | llm
    | output_parser
)
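
In the chain above, the retriever returns a list of document objects, which the prompt will most likely render through their string representation, metadata included. A common pattern is to join only the text of each document before it reaches the prompt. The sketch below assumes each document exposes a `page_content` attribute, as LangChain documents do; `FakeDocument` is a hypothetical stand-in so the example runs without LangChain.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Hypothetical stand-in with the same page_content attribute as a LangChain Document."""
    page_content: str


def format_docs(docs) -> str:
    """Join the text of the retrieved chunks, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [
    FakeDocument("Electrolysis uses current."),
    FakeDocument("Electrolytes contain ions."),
]
context = format_docs(retrieved)
print(context)
```

A helper like this could be used as `retriever | format_docs` for the "context" entry of the chain, since LangChain generally accepts plain functions when they are piped after a runnable. Treat that wiring as a suggestion to verify against the installed version.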

## Run a Query


In [None]:
query = "what is an electrolyte"


In [None]:
response = chain.invoke(query)

In [None]:
print(response)

An electrolyte is a substance that, when dissolved in a solvent (such as water), becomes an electrically conductive solution. Electrolytes contain ions (charged particles) that can move through a membrane or between two electrodes, allowing electrical current to flow. Examples of electrolytes include table salt (sodium chloride), lemon juice (citric acid), and baking soda (sodium bicarbonate). In the human body, electrolytes such as sodium, potassium, calcium, magnesium, and chloride play important roles in various physiological processes, including muscle contractions, nerve impulses, and maintaining proper fluid balance.


In [None]:
print(chain.invoke("what is electrolysis?"))

Electrolysis is a chemical process that uses electricity to break down a compound, usually in a solution, into its constituent elements or simpler compounds. In other words, it is the process of using electric current to drive nonspontaneous chemical reactions. Electrolysis is commonly used in industry for the production of metals such as aluminum, chlorine, and sodium hydroxide (caustic soda). It can also be used to purify water by removing impurities like minerals and gases through a process called electrodeionization.
