**Installing necessary packages**

In [2]:
!pip install langchain langchain_community # framework for building LLM applications
%pip install --upgrade --quiet huggingface_hub
!pip install faiss-cpu # vector store
!pip install pypdf # PDF parser used by the document loader
!pip install langchain_huggingface # Hugging Face integrations for LangChain
!pip install chromadb # vector store
!pip install langchain_core # core LangChain abstractions

Note: you may need to restart the kernel to use updated packages.


**Importing libraries**

In [3]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
hf_tkn = user_secrets.get_secret("HUGGINGFACEHUB_API_TOKEN") # for accessing my secret HuggingFace token

from langchain_huggingface import HuggingFaceEndpoint # for accessing Hugging Face models
from langchain_huggingface import HuggingFaceEmbeddings # for embedding the documents stored in the vector store
from langchain_huggingface import ChatHuggingFace # chat model wrapper
from langchain.prompts import ChatPromptTemplate
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS, Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

**Let's now load the document**

In [4]:
pdfloader = PyPDFLoader('/kaggle/input/consti/Constitution of Kenya 2010.pdf')
docs = pdfloader.load()

**Let's split the documents into chunks**

In [5]:
# Chunks of at most 1000 characters; chunk_overlap=0 means neighbouring chunks share no text
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = splitter.split_documents(docs)
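
To make the two parameters concrete, here is a minimal sketch of fixed-size character windows with an overlap. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on separators such as paragraphs and sentences), and `split_with_overlap` is a hypothetical helper written only for this illustration.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into character windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # this window already reached the end
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
no_overlap = split_with_overlap(sample, chunk_size=10, chunk_overlap=0)
with_overlap = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(no_overlap)    # ['abcdefghij', 'klmnopqrst', 'uvwxyz']
print(with_overlap)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

With an overlap of 0, as in the cell above, a sentence that straddles a chunk boundary is cut in two; a small non-zero overlap is one common way to reduce that risk.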

**We now create embeddings from the chunks and store them in the vector store**

In [6]:
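# Sentence-transformers model that maps each chunk to a dense vector
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
# Embed every chunk and index the vectors in an in-memory Chroma collection
db = Chroma.from_documents(texts, embedding=embeddings)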
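
Conceptually, a vector store keeps one vector per chunk and returns the chunks whose vectors lie closest to the query vector. The sketch below illustrates that idea with hand-made 3-dimensional vectors and cosine similarity. The vectors, the chunk labels and the `top_k` helper are all made up for illustration, and the distance metric and index that Chroma actually uses may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored items most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy store of (text, vector) pairs; real embeddings have hundreds of dimensions
toy_store = [
    ("chunk about sovereignty", [0.9, 0.1, 0.0]),
    ("chunk about land", [0.0, 0.2, 0.9]),
    ("chunk about citizenship", [0.6, 0.7, 0.1]),
]
best = top_k([1.0, 0.0, 0.0], toy_store, k=2)
print(best)  # ['chunk about sovereignty', 'chunk about citizenship']
```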


**Initializing our chat model**

In [13]:
llm = HuggingFaceEndpoint(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
    huggingfacehub_api_token=hf_tkn
)

chat_model = ChatHuggingFace(llm=llm)

**Defining our prompt for the LLM**

In [8]:
prompt = ChatPromptTemplate.from_template("""
        Answer the following question based only on the provided context.
        Think step by step before providing a detailed answer.
        <context>
        {context}
        </context>
        Question: {input}""")
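
The template has two placeholders, `{context}` and `{input}`, which are filled in when the chain runs. As a rough picture of that substitution, the sketch below uses plain `str.format` on a copy of the same text. It is a stand-in for what the prompt template does, not its implementation, and the context string is a placeholder rather than real retrieval output.

```python
# Plain-string copy of the prompt text, with the same two placeholders
TEMPLATE = """Answer the following question based only on the provided context
Think step by step before providing a detailed answer
<context>
{context}
</context>
Question: {input}"""


def fill_prompt(context: str, question: str) -> str:
    """Substitute the retrieved context and the user's question into the template."""
    return TEMPLATE.format(context=context, input=question)


filled = fill_prompt(
    context="(retrieved chunk text goes here)",  # placeholder, not real retrieval output
    question="What is the sovereignty of the people?",
)
print(filled)
```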

**Let's create our retriever**

In [9]:
retriever = db.as_retriever()

**Let's create the chain**

In [14]:
# The retriever fills {context} with the chunks most similar to the question;
# RunnablePassthrough forwards the user's question unchanged into {input}
retrieval_chain = (
    {"context": retriever, "input": RunnablePassthrough()}
    | prompt
    | chat_model
    | StrOutputParser()
)
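
In this chain the retriever's output is a list of document objects, so the prompt receives that list's string representation, metadata included. A common refinement is to join only the text of each chunk before it reaches the prompt. The sketch below shows such a helper. `format_docs` is a hypothetical name, and `FakeDocument` is a stand-in that models only the `page_content` attribute so the example runs without LangChain installed.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a retrieved document; only page_content is modelled."""
    page_content: str


def format_docs(docs):
    """Join the text of each retrieved chunk, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [FakeDocument("First chunk."), FakeDocument("Second chunk.")]
context = format_docs(retrieved)
print(context)

# Assumed wiring inside the chain (not executed here):
#   {"context": retriever | format_docs, "input": RunnablePassthrough()}
```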

**Sample input**

In [18]:
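def capitalize_first_letter(response):
    """Upper-case the first character; empty strings are returned unchanged."""
    return response[0].upper() + response[1:] if response else response

input_text = "What's the sovereignty of the people?"

# Strip the boilerplate lead-in (only an exact match is removed), then fix the capitalization
response = retrieval_chain.invoke(input_text).replace("Based on the provided context, ", "")
response = capitalize_first_letter(response)
print(response)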

Based solely on the provided context, which includes excerpts from the Constitution of Kenya, we can infer that sovereignty of the people refers to the ultimate authority and power that resides with the citizenry of a country. The Constitution acknowledges this concept through its recognition in the preamble, which states that "ALL POWER BELONGS TO THE PEOPLE" and that they have "enacted this Constitution." This principle is also evident in Article 1 (
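
Note that the answer above still opens with "Based solely on the provided context," because the exact-match `replace` only removes the phrase "Based on the provided context, ". A more tolerant cleanup could use a regular expression, as in the sketch below. `LEAD_IN` and `clean_response` are hypothetical names, and the pattern covers only the variants listed in it.

```python
import re

# Matches "Based on / Based solely on / Based only on the provided context," at the start
LEAD_IN = re.compile(
    r"^\s*based\s+(?:solely\s+|only\s+)?on\s+the\s+provided\s+context,?\s*",
    re.IGNORECASE,
)


def clean_response(text: str) -> str:
    """Drop a leading 'Based ... on the provided context' phrase and capitalize the rest."""
    stripped = LEAD_IN.sub("", text, count=1)
    return stripped[:1].upper() + stripped[1:]


print(clean_response("Based solely on the provided context, we can infer X."))
print(clean_response("Sovereignty belongs to the people."))
```

This reads well only when the answer follows the lead-in directly. When the model continues the opening clause, as in the output above, stripping the phrase leaves a sentence fragment, so instructing the model in the prompt not to use such lead-ins may be the more robust option.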
