# Building a Chat application using RAG (Retrieval-augmented generation) and LangChain

### Importance of RAG and how it is better than fine-tuning the LLM or building it from scratch

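1. Building an LLM from scratch is by far the most expensive option: pretraining typically consumes trillions of tokens and takes a long time. Fine-tuning needs much less data, but it still requires curated training examples, compute, and time.
2. A fine-tuned model needs custom serving and ongoing maintenance.
3. A fine-tuned model does not update automatically. When the source documents change, it has to be retrained, whereas with RAG you only update the knowledge base.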

### RAG Workflow:

1. **Retrieve**: Extract relevant information from a knowledge base or documents.
2. **Generate**: Use an LLM to generate human-like responses based on the retrieved information.
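
The two steps above can be sketched without any external library. The example below is only an illustration: it ranks documents by simple word overlap instead of vector embeddings, the knowledge base sentences are made up for the demo, and the "generate" step stops at building the prompt that would be sent to an LLM.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents that share the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Assemble the text that would be sent to the LLM in the generate step."""
    context = "\n\n".join(context_docs)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

# Made-up knowledge base for the demo
knowledge_base = [
    "Google may add or remove features from its services.",
    "Streamlit is a Python library for building web apps.",
    "Google gives advance notice before discontinuing a service.",
]

question = "Does Google give notice before discontinuing a service?"
top_docs = retrieve(question, knowledge_base, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The rest of this notebook replaces the word-overlap ranking with embedding similarity and sends the assembled prompt to a real model.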

### Key Components of this Project:

1. **Knowledge Base/Vector Store** - To store the relevant documents in a searchable format (vector embeddings).
2. **LLM (Large Language Model)** - For generating intelligent, context-aware answers.
3. **LangChain** - Acts as the glue that connects the components (retriever, LLM, database).
4. **Frontend** - Built with Streamlit.

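Create an API key from OpenAI and store it as an environment variable on your system. The notebook also instantiates an Anthropic model, so create an Anthropic API key as well if you want to use it.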

Source - https://platform.openai.com/api-keys
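
Before running the rest of the notebook, it can help to confirm that the keys are actually visible to Python. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper written for this example, and `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are the variable names the respective client libraries read by default.

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Variable names read by default by the OpenAI and Anthropic clients
required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
missing = missing_env_vars(required)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys found.")
```

If the keys live in a `.env` file, run this check after `dotenv.load_dotenv()` so the file has already been read.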

## Importing necessary libraries

In [1]:
import os
import dotenv
from pathlib import Path

from langchain_core.messages import AIMessage, HumanMessage
from langchain_community.document_loaders.text import TextLoader
from langchain_community.document_loaders import (
    WebBaseLoader, 
    PyPDFLoader, 
    Docx2txtLoader,
)
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

dotenv.load_dotenv()

USER_AGENT environment variable not set, consider setting it to identify your requests.


True

The document used is the "**Google Terms of Service**", saved as a PDF and a DOCX file.

**Source** - https://policies.google.com/terms?hl=en

In [2]:
from langchain_community.document_loaders.text import TextLoader
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader
from pathlib import Path

# Define document paths
doc_paths = [
    "test.pdf",
    "test.docx",
]

docs = []
for doc_file in doc_paths:
    file_path = Path(doc_file)

    try:
        # Check the file extension and select the appropriate loader
        if doc_file.endswith(".pdf"):
            loader = PyPDFLoader(str(file_path))  # Ensure correct path format
        elif doc_file.endswith(".docx"):
            loader = Docx2txtLoader(str(file_path))
        else:
            print(f"Document type {file_path.suffix} not supported.")
            continue

        # Load the document and extend the docs list
        docs.extend(loader.load())

    except Exception as e:
        print(f"Error loading document {file_path.name}: {e}")

# Report how many Document objects were loaded (PyPDFLoader returns one per page)
print(f"Loaded {len(docs)} documents.")


Loaded 21 documents.
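
The `if`/`elif` chain above picks a loader from the file extension. One possible alternative, sketched here under the assumption that more file types may be added later, is a lookup table keyed by suffix. The mapping stores loader names as plain strings so the example runs without LangChain installed; in the notebook the values would be the loader classes themselves. `pick_loader` is a hypothetical helper.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to loader name; in the notebook these
# values would be the PyPDFLoader and Docx2txtLoader classes.
LOADER_BY_SUFFIX = {
    ".pdf": "PyPDFLoader",
    ".docx": "Docx2txtLoader",
}

def pick_loader(path_str):
    """Return the loader name for a path, or None if the type is unsupported."""
    suffix = Path(path_str).suffix.lower()  # treat ".PDF" and ".pdf" alike
    return LOADER_BY_SUFFIX.get(suffix)

for name in ["test.pdf", "test.docx", "REPORT.PDF", "notes.txt"]:
    print(name, "->", pick_loader(name))
```

Lowercasing the suffix also handles upper-case extensions, which a plain `endswith(".pdf")` check would reject.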


In [3]:
docs

[Document(metadata={'source': 'test.pdf', 'page': 0}, page_content='GOOGLE TERMS OF SERVICE\nEffective May 22, 2024 | Archived versions\nWhat’s covered in these terms\nWe know it’s tempting to skip these Terms of\nService, but it’s important to establish what you\ncan expect from us as you use Google services,\nand what we expect from you.\nThese Terms of Service re\x00ect the way Google’s business works, the laws that apply to\nour company, and certain things we’ve always believed to be true. As a result, these Terms\nof Service help de\x00ne Google’s relationship with you as you interact with our services. For\nexample, these terms include the following topic headings:\nWhat you can expect from us, which describes how we provide and develop our\nservices\nWhat we expect from you, which establishes certain rules for using our services\nContent in Google services, which describes the intellectual property rights to the\ncontent you \x00nd in our services — whether that content belongs 

In [4]:
# Split docs

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=5000,
    chunk_overlap=1000,
)

document_chunks = text_splitter.split_documents(docs)
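
To see what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified sketch that cuts fixed-size character windows. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and lines), so real chunk boundaries will differ; the sketch only illustrates the two parameters.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last four characters of the previous one. The overlap serves the same purpose in the notebook: a sentence that straddles a chunk boundary still appears whole in at least one chunk.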

In [7]:
# Embed the document chunks and load them into the vector store

vector_db = Chroma.from_documents(
    documents=document_chunks,
    embedding=OpenAIEmbeddings(),
)
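
Conceptually, a vector store keeps one embedding vector per chunk and returns the chunks whose vectors are closest to the query vector. The toy class below is a sketch of that idea only: `ToyVectorStore` is hypothetical, the three-dimensional vectors are written by hand rather than produced by an embedding model, and cosine similarity is an assumption for the demo (the metric Chroma uses depends on its configuration).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class ToyVectorStore:
    """In-memory list of (vector, text) pairs with brute-force search."""

    def __init__(self):
        self.entries = []

    def add(self, vector, text):
        self.entries.append((vector, text))

    def search(self, query_vector, k=1):
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]

store = ToyVectorStore()
# Hand-written "embeddings"; real ones come from an embedding model
store.add([1.0, 0.0, 0.0], "chunk about service changes")
store.add([0.0, 1.0, 0.0], "chunk about content ownership")
store.add([0.7, 0.7, 0.0], "chunk about both topics")

results = store.search([0.9, 0.1, 0.0], k=2)
print(results)
```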

In [8]:
# Retrieve

def _get_context_retriever_chain(vector_db, llm):
    retriever = vector_db.as_retriever()
    prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="messages"),
        ("user", "{input}"),
        ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation, focusing on the most recent messages."),
    ])
    retriever_chain = create_history_aware_retriever(llm, retriever, prompt)

    return retriever_chain
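
The idea behind a history-aware retriever is that a follow-up such as "How do I share it?" is a poor search query on its own, so the conversation is first condensed into a standalone query. The sketch below shows only that concept and is not LangChain's implementation; `fake_rewrite` and `fake_search` are hypothetical stand-ins for the LLM call and the vector store retriever.

```python
def history_aware_search(history, user_input, rewrite_query, search):
    """Turn the latest message into a standalone query, then search with it."""
    if history:
        query = rewrite_query(history, user_input)
    else:
        query = user_input  # nothing to resolve, so search with the raw input
    return query, search(query)

# Hypothetical stand-ins for the LLM call and the retriever
def fake_rewrite(history, user_input):
    topic = history[-1]  # pretend the model resolves "it" to the last topic
    return user_input.replace("it", topic)

def fake_search(query):
    return [f"document matching '{query}'"]

history = ["Google Drive"]
query, results = history_aware_search(
    history, "How do I share it?", fake_rewrite, fake_search
)
print(query)
print(results)
```

LangChain's documented examples for `create_history_aware_retriever` pass the conversation under the key `chat_history`, so it is worth verifying that the rewrite step is actually triggered when a different key such as `messages` is used.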

In [9]:
def get_conversational_rag_chain(llm):
    retriever_chain = _get_context_retriever_chain(vector_db, llm)

    prompt = ChatPromptTemplate.from_messages([
        ("system",
        """You are a helpful assistant. You will have to answer the user's queries.
        You will have some context to help with your answers, but it will not always be completely related or helpful.
        You can also use your own knowledge to assist in answering the user's queries.\n
        {context}"""),
        MessagesPlaceholder(variable_name="messages"),
        ("user", "{input}"),
    ])
    stuff_documents_chain = create_stuff_documents_chain(llm, prompt)

    return create_retrieval_chain(retriever_chain, stuff_documents_chain)
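
"Stuffing" means that all retrieved chunks are concatenated and placed into the `{context}` slot of a single prompt. A minimal sketch of that step, using plain string formatting instead of LangChain's prompt classes (`stuff_documents` is a hypothetical helper, and the separator is an assumption for the demo):

```python
def stuff_documents(template, documents, separator="\n\n", **variables):
    """Join all document texts and substitute them into the {context} slot."""
    context = separator.join(documents)
    return template.format(context=context, **variables)

template = "Use this context to answer.\n{context}\n\nQuestion: {input}"
retrieved = ["First retrieved chunk.", "Second retrieved chunk."]
prompt_text = stuff_documents(template, retrieved, input="What can I expect?")
print(prompt_text)
```

Because every retrieved chunk ends up in one prompt, the chunk size multiplied by the number of retrieved chunks has to fit within the model's context window.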

In [10]:
# Augmented Generation

llm_stream_openai = ChatOpenAI(
    model="gpt-4o",  # Here you could use "o1-preview" or "o1-mini" if you already have access to them
    temperature=0.3,
    streaming=True,
)

llm_stream_anthropic = ChatAnthropic(
    model="claude-3-5-sonnet-20240620",
    temperature=0.3,
    streaming=True,
)

llm_stream = llm_stream_openai  # Select between OpenAI and Anthropic models for the response

messages = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hi there! How can I assist you today?"},
    {"role": "user", "content": "Can you tell me what i can expect from Google?"},
]
messages = [HumanMessage(content=m["content"]) if m["role"] == "user" else AIMessage(content=m["content"]) for m in messages]

conversation_rag_chain = get_conversational_rag_chain(llm_stream)
response_message = "*(RAG Response)*\n"
for chunk in conversation_rag_chain.pick("answer").stream({"messages": messages[:-1], "input": messages[-1].content}):
    response_message += chunk
    print(chunk, end="", flush=True)

# messages now holds HumanMessage/AIMessage objects, so append the reply in the same form
messages.append(AIMessage(content=response_message))

Certainly! When using Google services, you can expect the following:

1. **A Broad Range of Useful Services**: Google provides a variety of services, including apps and sites like Search and Maps, platforms like Google Shopping, integrated services like Maps embedded in other apps or sites, and devices like Google Nest and Pixel. These services are designed to work together to enhance your experience.

2. **Development, Improvement, and Updates**: Google is constantly developing new technologies and features to improve its services. This includes using artificial intelligence and machine learning for tasks like translations and spam detection. Google may add or remove features, adjust service limits, or introduce new services. Software updates may occur automatically on your device when new versions are available.

3. **Advance Notice for Material Changes**: If there are significant changes that negatively impact your use of services, or if a service is discontinued, Google will provid