## RAG Tutorial

This notebook follows the official LangChain RAG tutorial:

https://python.langchain.com/docs/tutorials/rag/

### Prerequisites

A Google API key:
https://cloud.google.com/docs/authentication/api-keys

### Setting up required packages

In [1]:
%pip install --quiet --upgrade langchain-text-splitters langchain-community langgraph

Note: you may need to restart the kernel to use updated packages.


### Set LangSmith properties for tracing

In [4]:
import os
import getpass

os.environ['LANGSMITH_TRACING'] = "true"
os.environ['LANGSMITH_API_KEY'] = getpass.getpass()

 ········


In [8]:
os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for google gemini: ")

Enter API key for google gemini:  ········


In [9]:
os.environ["GOOGLE_PROJECT"] = getpass.getpass("Enter project id: ")

Enter project id:  ········
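
The cells above prompt for each secret every time they run. A minimal sketch of a helper that prompts only when the variable is not already set is shown below; `ensure_env` is a hypothetical name introduced here for illustration and is not part of LangChain or the tutorial.

```python
import getpass
import os


def ensure_env(name: str, prompt: str) -> str:
    """Return os.environ[name], prompting for it only when it is unset or empty.

    Hypothetical helper for this notebook; not a LangChain API.
    """
    if not os.environ.get(name):
        # getpass hides the typed value, so the secret never appears in the output cell.
        os.environ[name] = getpass.getpass(prompt)
    return os.environ[name]
```

For example, `ensure_env("GOOGLE_API_KEY", "Enter API key for google gemini: ")` would skip the prompt on a re-run if the key is already present in the environment.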


### Setting up components

#### Setting up chat model (LLM)

In [10]:
%pip install -qU "langchain[google-genai]"

Note: you may need to restart the kernel to use updated packages.


In [11]:
import getpass
import os

from langchain.chat_models import init_chat_model
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages.base import BaseMessage


llm = init_chat_model('gemini-2.0-flash', model_provider="google_genai")

#### Setting up the embedding model

In [12]:
%pip install -qU "langchain-google-vertexai"

Note: you may need to restart the kernel to use updated packages.


In [13]:
from langchain_google_vertexai import VertexAIEmbeddings

embeddings = VertexAIEmbeddings(project=os.environ.get('GOOGLE_PROJECT'), model="text-embedding-004")

#### Setting up the vector store

In [14]:
from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)

### Indexing

In [15]:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict

USER_AGENT environment variable not set, consider setting it to identify your requests.


#### Loading

In [16]:
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()

#### Splitting

In [17]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(docs)
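
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sketch of fixed-size chunking with overlap. It is an assumption-laden illustration, not what `RecursiveCharacterTextSplitter` does internally: the real splitter prefers to break on separators such as paragraphs, lines, and spaces, whereas the hypothetical `sliding_chunks` function below cuts purely by character position.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij"
chunks = sliding_chunks(sample, chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the same idea as the 200-character overlap used above: text near a chunk boundary appears in both neighbours, so a retrieved chunk is less likely to lose its surrounding context.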

#### Storing

In [18]:
_ = vector_store.add_documents(documents=all_splits)
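
Conceptually, the vector store keeps one embedding vector per chunk and later ranks chunks by how close their vectors are to the query vector. The sketch below illustrates that idea with cosine similarity on hand-made two-dimensional vectors. The names `cosine_similarity`, `top_k`, and `toy_store` are hypothetical, and the toy vectors stand in for real embeddings; this is not the `InMemoryVectorStore` implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the texts of the k stored entries most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": made-up 2-D vectors, not output from a real embedding model.
toy_store = [
    ("planning", [1.0, 0.0]),
    ("memory", [0.0, 1.0]),
    ("tool use", [0.7, 0.7]),
]
best = top_k([0.9, 0.1], toy_store, k=2)
print(best)  # ['planning', 'tool use']
```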

### RAG

In [23]:
prompt = hub.pull("rlm/rag-prompt")

class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

def retrieve(state: State) -> dict:
    # Each node returns a partial state update, not the retrieved documents themselves.
    print(f"Message state for retrieve step: {state}")
    retrieved_docs: list[Document] = vector_store.similarity_search(state["question"])
    print(retrieved_docs)
    return {"context": retrieved_docs}

def generate(state: State) -> dict:
    # Join the retrieved chunks into one context string for the prompt.
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}
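
Both nodes return only the keys they change, and those partial updates are merged into the shared state as the graph runs. The sketch below mimics that flow in plain Python, assuming the simplest case where each returned key overwrites the existing value. `run_sequence`, `fake_retrieve`, and `fake_generate` are hypothetical stand-ins introduced for illustration; LangGraph's actual execution model is more involved.

```python
from typing import Callable


def run_sequence(state: dict, steps: list[Callable[[dict], dict]]) -> dict:
    """Run steps in order, merging each step's partial update into a copy of the state."""
    current = dict(state)  # copy so the caller's input is not mutated
    for step in steps:
        current.update(step(current))
    return current


def fake_retrieve(state: dict) -> dict:
    # Stand-in for a vector store lookup: returns one made-up "document".
    return {"context": [f"doc about {state['question']}"]}


def fake_generate(state: dict) -> dict:
    # Stand-in for the LLM call: reports how much context it received.
    return {"answer": f"Answer based on {len(state['context'])} document(s)."}


result = run_sequence(
    {"question": "What is Task Decomposition?"},
    [fake_retrieve, fake_generate],
)
print(result["answer"])  # Answer based on 1 document(s).
```

The final state carries `question`, `context`, and `answer` together, which is why the notebook can read `response["answer"]` from the value returned by `graph.invoke`.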


### Testing

In [24]:
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

In [25]:
response = graph.invoke({"question": "What is Task Decomposition?"})
print(response["answer"])

Message state for retrieve step: {'question': 'What is Task Decomposition?'}
[Document(id='ff782c3a-d247-4b1b-a430-4d449e2d10e9', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='Component One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree st