<a href="https://colab.research.google.com/github/csetanmayjain/LangChain/blob/main/LangChain_SarvamAI2B_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **LLM LangChain Demo**

This project shows how to use [LangChain](https://www.langchain.com) with the [Sarvam AI 2B-v0.5](https://huggingface.co/sarvamai/sarvam-2b-v0.5) LLM to build a Retrieval-Augmented Generation (RAG) application. The application fetches the content of a URL, creates embeddings with [Google BERT multilingual uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased), and answers user queries from the passages retrieved through those embeddings.


## Developer: Tanmay Jain
- **Email:** csetanmayjain@gmail.com
- **GitHub:** [github.com/csetanmayjain](https://github.com/csetanmayjain)
- **LinkedIn:** [linkedin.com/in/csetanmayjain](https://www.linkedin.com/in/csetanmayjain)

In [None]:
!pip install --quiet langchain_core gradio streamlit langchain_community faiss-gpu sentence_transformers

In [None]:
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
import gradio as gr
import streamlit as st
from huggingface_hub import login



In [None]:
# Set up the Hugging Face token (insert your own access token)
login(token="")

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful
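
Pasting the token into the cell makes it easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the token is stored in an environment variable named `HF_TOKEN`; the helper `read_hf_token` and its parameters are hypothetical names used only for illustration.

```python
import os


def read_hf_token(env=None, var_name="HF_TOKEN"):
    """Return the token stored under var_name, or raise if it is missing or blank.

    read_hf_token and its parameters are hypothetical names used for illustration.
    """
    env = os.environ if env is None else env
    token = (env.get(var_name) or "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable before logging in.")
    return token


# Usage (assumes HF_TOKEN is set in the environment):
# from huggingface_hub import login
# login(token=read_hf_token())
```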


In [None]:
def load_model():
    """Load the multilingual embedding model and the Sarvam LLM onto the GPU."""

    # Load the multilingual embeddings (used here for Hindi text)
    huggingface_embeddings_hi = HuggingFaceBgeEmbeddings(
        model_name="google-bert/bert-base-multilingual-uncased",
        model_kwargs={'device':'cuda'},
        encode_kwargs={'normalize_embeddings':True})

    # Load the LLM as a text-generation pipeline on GPU 0
    llm_hi = HuggingFacePipeline.from_model_id(
        model_id="sarvamai/sarvam-2b-v0.5",
        task="text-generation",
        pipeline_kwargs={"temperature": 0.1, "max_new_tokens": 500},
        device=0)

    return huggingface_embeddings_hi, llm_hi
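
Setting `normalize_embeddings` to `True` asks the embedding model to return unit-length vectors. With unit-length vectors, ranking by dot product, by cosine similarity, or by Euclidean distance gives the same order, which is why the setting is common in retrieval setups. The sketch below illustrates this with made-up 3-dimensional vectors (not model output); the helper names are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query, docs, k):
    """Return indices of the k docs with the highest dot product against query."""
    scores = [dot(query, d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]


# Toy 3-dimensional "embeddings" (made-up numbers, not model output).
raw_docs = [[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 5.0]]
raw_query = [2.0, 0.5, 0.0]

docs = [normalize(d) for d in raw_docs]
query = normalize(raw_query)

for raw, unit in zip(raw_docs, docs):
    # Dot product of unit vectors equals cosine similarity of the raw vectors.
    assert abs(dot(query, unit) - cosine(raw_query, raw)) < 1e-9
    # Squared Euclidean distance of unit vectors is 2 - 2 * dot product.
    assert abs(squared_l2(query, unit) - (2 - 2 * dot(query, unit))) < 1e-9

ranking = top_k(query, docs, k=2)
print(ranking)  # → [0, 1]
```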

In [None]:
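# Prompt template
def get_prompt():
    """Return the Hindi prompt template with {context} and {question} placeholders."""

    # English translation of the template text:
    #   "Use the following context to answer the question asked.
    #    Please answer based only on the context."
    #   "प्रश्न" means "Question" and "उत्तर" means "Answer".
    prompt_template_hi = """
    पूछे गए प्रश्न का उत्तर देने के लिए निम्नलिखित संदर्भ का उपयोग करें।
    कृपया उत्तर केवल संदर्भ के आधार पर दे।

    {context}
    प्रश्न: {question}

    उत्तर:
    """

    return prompt_template_hi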
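
The `{context}` and `{question}` placeholders are filled in at query time. The sketch below uses plain `str.format` on an English stand-in for the Hindi template to show what the model ends up seeing. The sample chunks and question are made up, and joining chunks with a blank line is an assumption about how a "stuff"-style chain concatenates retrieved documents.

```python
# English stand-in for the Hindi template above (same two placeholders).
template = """
Use the following context to answer the question asked.
Please answer based only on the context.

{context}
Question: {question}

Answer:
"""

# Made-up retrieved chunks; a "stuff"-style chain concatenates them into one context string.
chunks = ["Delhi is the capital of India.", "Mumbai is in Maharashtra."]
context = "\n\n".join(chunks)

prompt = template.format(context=context, question="What is the capital of India?")
print(prompt)
```

Because the template ends with the answer marker, the model's continuation is read as the answer to the question.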

In [None]:
def get_retriever_prompt(url, huggingface_embeddings, prompt_template, llm):
    """
    Load and process the documents at the given URL, split them into
    manageable chunks, index them in a FAISS vector store, and set up a
    retrieval-based question-answering chain using an LLM.
    These components are rebuilt only when the URL changes, to avoid repeated work.
    """
    # Initialize session state variables if they don't exist
    if 'session_id' not in st.session_state:
        st.session_state.session_id = 0
        st.session_state.previous_url = None
        st.session_state.retrievalQA = None

    # Retrieve stored session state values
    previous_url = st.session_state.previous_url

    # Rebuild the pipeline only when the URL has changed
    if url != previous_url:

        # Create a WebBaseLoader instance with the URL
        st.session_state.loader = WebBaseLoader(url)

        # Load the content of the URL
        st.session_state.documents = st.session_state.loader.load()

        # Create a text splitter
        st.session_state.text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

        # Split the documents into chunks
        st.session_state.split_docs = st.session_state.text_splitter.split_documents(st.session_state.documents)

        # Build the FAISS vector store from the chunks
        st.session_state.vectorstore = FAISS.from_documents(st.session_state.split_docs, huggingface_embeddings)

        # Retrieve the 3 most similar chunks for each query
        st.session_state.retriever = st.session_state.vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})

        st.session_state.prompt = PromptTemplate(template=prompt_template, input_variables=["context", "question"])

        # "stuff" places all retrieved chunks into a single prompt
        st.session_state.retrievalQA = RetrievalQA.from_chain_type(
            llm=llm,
            chain_type="stuff",
            retriever=st.session_state.retriever,
            return_source_documents=False,
            chain_type_kwargs={"prompt": st.session_state.prompt}
        )

        # Record the URL only after a successful build, so a failed load is retried
        st.session_state.previous_url = url

    return st.session_state.retrievalQA
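
With `chunk_size=1000` and `chunk_overlap=200`, consecutive chunks can share up to 200 characters, so a sentence that straddles a boundary still appears whole in at least one chunk. The sketch below shows the idea with a simplified fixed-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class also tries to break on separators such as paragraph and line breaks), and `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


text = "".join(str(i % 10) for i in range(2500))  # 2500 placeholder characters
chunks = split_with_overlap(text, chunk_size=1000, chunk_overlap=200)
print([len(c) for c in chunks])  # → [1000, 1000, 900]
```

In this simplified version each window starts 800 characters after the previous one, so the last 200 characters of one chunk are the first 200 of the next.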

In [None]:
# Load the embedding model, the LLM, and the prompt template into memory
huggingface_embeddings_hi, llm_hi = load_model()
prompt_template_hi = get_prompt()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [None]:
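# Main driver: build (or reuse) the QA chain for the URL, then answer the query
def driver_code(url, query):
    retrievalQA_hi = get_retriever_prompt(url, huggingface_embeddings_hi, prompt_template_hi, llm_hi)
    result = retrievalQA_hi.invoke({"query": query})
    # The generated text may echo the prompt; keep the text after the first "उत्तर:" ("Answer:") marker
    parts = result['result'].split("उत्तर:")
    return parts[1].strip() if len(parts) > 1 else result['result'].strip()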
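
Because `driver_code` calls `get_retriever_prompt` on every request, the URL check is what prevents the page from being downloaded and re-indexed for each question. The sketch below shows the same rebuild-on-change pattern with a plain dictionary in place of `st.session_state`. The names `_cache`, `build_chain`, and `get_chain` are hypothetical, and `build_chain` is only a stand-in for the load, split, and index steps.

```python
_cache = {"url": None, "chain": None}  # hypothetical module-level cache
build_calls = []  # records every rebuild so the effect of caching is visible


def build_chain(url):
    """Hypothetical stand-in for the expensive load, split, and index steps."""
    build_calls.append(url)
    return f"chain-for-{url}"


def get_chain(url):
    """Rebuild the chain only when the URL differs from the previous request."""
    if url != _cache["url"]:
        _cache["chain"] = build_chain(url)
        _cache["url"] = url  # recorded after the build, so a failed build is retried
    return _cache["chain"]


get_chain("https://example.com/a")
get_chain("https://example.com/a")  # same URL: reuses the cached chain
get_chain("https://example.com/b")  # new URL: rebuilds
print(build_calls)  # → ['https://example.com/a', 'https://example.com/b']
```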


In [None]:
# Create the Gradio interface
app = gr.Interface(
    fn=driver_code,
    inputs=[
        gr.Textbox(label="Input URL"),
        gr.Textbox(label="Input Query")
    ],
    outputs=gr.Textbox(label="Output")
)

# Launch the app
app.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://e577e79979e2ec4dec.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


