<a href="https://colab.research.google.com/github/bhgtankita/LangChain/blob/main/HR_Helper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -q -U google-generativeai
!pip install pypdf2
!pip install chromadb
!pip install langchain-google-genai
!pip install langchain
!pip install langchain_community
!pip install pypdf


# create_stuff_documents_chain from langchain.chains.combine_documents

**Purpose:** `create_stuff_documents_chain` builds a chain that takes a list of documents, formats them into a single prompt, and passes that prompt to a language model. It is useful when an answer must draw on several documents at once.

**Use Cases:**
- Summarizing several documents into one cohesive answer.
- Answering questions whose supporting data is spread across different documents.
- Concatenating ("stuffing") multiple documents into a single prompt so the model can produce one coherent response.

**How It Works:** The chain "stuffs" all the documents into the prompt's `{context}` variable, and the language model generates a response from the combined text. Because the documents are not shortened or summarized first, the combined input must fit within the model's context window, so this approach works best with a small number of documents or a model with a large context window.
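
The "stuffing" idea can be sketched in plain Python. This is only an illustration of the strategy, not LangChain's implementation: the function name `stuff_documents`, the separator, and the sample document strings are all assumptions made up for the example.

```python
# Illustrative stand-in for the "stuff" strategy; not LangChain's own code.
PROMPT_TEMPLATE = (
    "Answer based on the context provided.\n"
    "context: {context}\n"
    "input: {input}\n"
    "answer:"
)


def stuff_documents(docs, question, separator="\n\n"):
    """Join every document into one context string and fill the prompt."""
    context = separator.join(docs)
    return PROMPT_TEMPLATE.format(context=context, input=question)


# Made-up sample passages, used only to show the shape of the final prompt.
docs = [
    "Personal leave requests must be made in writing.",
    "Requests are due fifteen calendar days before the leave starts.",
]
prompt_text = stuff_documents(docs, "How do I apply for personal leave?")
print(prompt_text)
```

The whole prompt grows with every document added, which is why the context-window limit matters for this strategy.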



# create_retrieval_chain from langchain.chains

**Purpose:** `create_retrieval_chain` builds a chain that first retrieves the documents most relevant to a user's query and then passes them, together with the query, to a document-combining chain that produces the answer.

**Use Cases:**
- Searching large databases or knowledge bases for relevant information.
- Building question-answering systems where information retrieval is a key step.
- Creating chatbots that need to pull specific answers from a document collection.

**How It Works:** The chain takes a retriever, typically backed by a vector store, and a combine-documents chain. For each query it fetches the most relevant documents, hands them to the combine-documents chain as context, and returns the result, with the generated answer under the `answer` key.
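
The retrieve-then-combine flow can be sketched without any external library. This is a toy model under stated assumptions: the word-count "embedding", the helper names (`embed`, `cosine`, `retrieve`, `toy_retrieval_chain`), and the sample documents are hypothetical, and a real system would use a learned embedding model and a vector store instead.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real system uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def toy_retrieval_chain(query, docs, combine, k=2):
    """Retrieve first, then hand the retrieved documents to `combine`."""
    context = retrieve(query, docs, k)
    return {"input": query, "context": context, "answer": combine(context, query)}


# Made-up sample documents for illustration only.
docs = [
    "personal leave requests must be made in writing",
    "the cafeteria opens at noon",
    "leave requests need fifteen days notice",
]
result = toy_retrieval_chain(
    "how do I request personal leave",
    docs,
    combine=lambda context, query: f"{len(context)} passages retrieved",
)
print(result["context"])
```

Here `combine` stands in for the language-model step; the unrelated cafeteria sentence scores zero and is left out of the context.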

In [3]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_community.vectorstores import Chroma
from secret import GOOGLE_API_KEY
import os

In [4]:
os.environ["GOOGLE_API_KEY"]=GOOGLE_API_KEY

In [6]:
# Load the chat model and the embedding model
llm = ChatGoogleGenerativeAI(model="gemini-pro")
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Load the PDF and split it into chunks
loader = PyPDFLoader("handbook.pdf")
text_splitter = CharacterTextSplitter(
    separator=".",
    chunk_size=250,
    chunk_overlap=50,
    length_function=len,
    is_separator_regex=False,
)
pages = loader.load_and_split(text_splitter)

# Turn the chunks into embeddings and store them in Chroma
vectordb = Chroma.from_documents(pages, embeddings)

# Configure Chroma as a retriever that returns the top 5 chunks (k=5)
retriever = vectordb.as_retriever(search_kwargs={"k": 5})

# Build the prompt and create the retrieval chain
template = """
You are a helpful AI assistant.
Answer based on the context provided.
context: {context}
input: {input}
answer:
"""
prompt = PromptTemplate.from_template(template)
combine_docs_chain = create_stuff_documents_chain(llm, prompt)
retrieval_chain = create_retrieval_chain(retriever, combine_docs_chain)

# Invoke the retrieval chain
response = retrieval_chain.invoke({"input": "How do I apply for personal leave?"})
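
To get a feel for what `separator`, `chunk_size`, and `chunk_overlap` do in the cell above, here is a simplified pure-Python sketch of separator-based chunking with overlap. It is an assumption-laden approximation, not `CharacterTextSplitter`'s actual algorithm: the function name `split_text` is hypothetical, and details such as separator handling and oversized pieces differ in the real library.

```python
def split_text(text, separator=".", chunk_size=250, chunk_overlap=50):
    """Split on `separator`, then greedily pack the pieces into chunks of at
    most `chunk_size` characters, carrying trailing pieces forward as overlap.
    A single piece longer than `chunk_size` is emitted as its own chunk."""
    joiner = separator + " "
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks, current = [], []
    for piece in pieces:
        if current and len(joiner.join(current + [piece])) > chunk_size:
            # The current chunk is full: emit it.
            chunks.append(joiner.join(current))
            # Drop leading pieces until what remains fits the overlap budget
            # and still leaves room for the next piece.
            while current and (
                len(joiner.join(current)) > chunk_overlap
                or len(joiner.join(current + [piece])) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(joiner.join(current))
    return chunks


# Tiny sizes so the overlap is easy to see.
demo_chunks = split_text("aaa. bbb. ccc. ddd", chunk_size=10, chunk_overlap=4)
print(demo_chunks)  # ['aaa. bbb', 'bbb. ccc', 'ccc. ddd']
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a chunk boundary still appears whole in at least one chunk.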



In [7]:
#Print the answer to the question
print("\n ******Result******")
print(response["answer"])


 ******Result******
Such requests shall be made in writing to t he Employer not less than fifteen (15) calendar days prior to the start thereof, and shall Employer the starting and ending dates of the requested leave
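
Besides `answer`, the dictionary returned by the retrieval chain also carries the retrieved documents under `context`, which helps when checking where an answer came from. The helper below is a hypothetical sketch (`format_sources` is not a LangChain function), and it is demonstrated on a stand-in response with placeholder values rather than real output from this notebook.

```python
from types import SimpleNamespace


def format_sources(response, preview_chars=60):
    """Summarize each retrieved document as 'page N: <start of its text>'."""
    lines = []
    for doc in response["context"]:
        page = doc.metadata.get("page", "?")
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        lines.append(f"page {page}: {preview}")
    return lines


# Stand-in for a chain response; the page number and text are placeholders.
sample_response = {
    "input": "How do I apply for personal leave?",
    "context": [
        SimpleNamespace(
            page_content="Such requests shall be made in writing.",
            metadata={"source": "handbook.pdf", "page": 12},
        )
    ],
    "answer": "(model answer here)",
}
for line in format_sources(sample_response):
    print(line)
```

With the real chain, the same helper could be called as `format_sources(response)`, assuming the retrieved documents expose `page_content` and `metadata` as the stand-in objects do.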
