## LangChain - RAG Indexing

### Part 1: Overview

In [1]:
import bs4
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from dotenv import load_dotenv

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [2]:
load_dotenv()

True

In [3]:
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()

In [4]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

In [5]:
embedding = AzureOpenAIEmbeddings(
    model="text-embedding-3-large"
)

In [6]:
vectorstore = Chroma.from_documents(documents=splits, embedding=embedding)

In [7]:
retriever = vectorstore.as_retriever()

In [8]:
prompt = hub.pull("rlm/rag-prompt")

In [9]:
llm = AzureChatOpenAI(
    model="gpt-4o-mini",
    api_version="2024-12-01-preview"
)

In [10]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
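
To make the helper concrete, here is a minimal sketch of what `format_docs` produces. `FakeDoc` is a hypothetical stand-in for LangChain's `Document` (only `page_content` is modeled), so the snippet runs without LangChain installed.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document; only page_content is modeled."""
    page_content: str


def format_docs(docs):
    # Same logic as the notebook helper: join the chunk texts with blank lines.
    return "\n\n".join(doc.page_content for doc in docs)


chunks = [FakeDoc("First chunk."), FakeDoc("Second chunk.")]
formatted = format_docs(chunks)
print(formatted)
```

The result is one plain string, which is what ends up in the `{context}` slot of the prompt when the retriever is piped through `format_docs`.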

In [11]:
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [12]:
rag_chain.invoke("What is Task Decomposition?")

'Task decomposition is the process of breaking down a complex task into smaller, manageable subgoals or steps. It can be achieved through various methods, such as simple prompting to outline steps, task-specific instructions, or human inputs. Techniques like Chain of Thought and Tree of Thoughts further enhance this process by providing structured reasoning paths for better clarity and performance in task execution.'

### Part 2: Indexing

In [13]:
question = "What kinds of pets do I like?"
document = "My favorite pet is dog."

In [14]:
import tiktoken

def num_tokens_from_str(string: str, encoding_name: str) -> int:
    """Return the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

num_tokens_from_str(question, "cl100k_base")

8

In [15]:
query_result = embedding.embed_query(question)
document_result = embedding.embed_query(document)
len(query_result)

3072

In [16]:
import numpy as np

def cosine_similarity(vec1, vec2):
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    return dot_product / (norm_vec1 * norm_vec2)

cosine_similarity(query_result, document_result)

np.float64(0.5722581282709265)
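
As a sanity check on the formula, the sketch below re-implements cosine similarity in pure Python (an assumption made only so the snippet has no dependencies; the notebook itself uses NumPy) and evaluates it on toy vectors whose relationship is known in advance.

```python
import math


def cosine_similarity(vec1, vec2):
    # Dot product divided by the product of the Euclidean norms.
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    return dot / (norm1 * norm2)


# Toy vectors, not real embeddings.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])       # opposite directions
print(same_direction, orthogonal, opposite)
```

Parallel vectors score close to 1, perpendicular vectors score 0, and opposite vectors score -1. The score depends only on direction, not on vector length.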

In [17]:
# load
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

In [18]:
# split
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300,
    chunk_overlap=50
)
splits = text_splitter.split_documents(blog_docs)
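
The effect of `chunk_size` and `chunk_overlap` can be sketched with a plain sliding window over a token list. This is a deliberate simplification and not how `RecursiveCharacterTextSplitter` works internally (it splits on separators and then merges pieces up to the size limit). The function name `chunk_with_overlap` and the toy token ids are hypothetical.

```python
def chunk_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


tokens = list(range(10))  # toy "token ids"
chunks = chunk_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window starts `chunk_size - chunk_overlap` tokens after the previous one, so neighbouring chunks share their boundary tokens. That overlap is intended to keep a sentence that straddles a boundary retrievable from either side.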

In [19]:
vectorstore = Chroma.from_documents(documents=splits, embedding=embedding)
retriever = vectorstore.as_retriever()

### Part 3: Retrieval

In [20]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
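
Conceptually, `k` controls how many of the closest chunks come back for a query. The sketch below illustrates this with a brute-force cosine ranking over toy 2-d vectors. It is an assumption for illustration only: Chroma uses its own index and configured distance metric, and `top_k` and the sample data are hypothetical.

```python
import math


def cosine_similarity(vec1, vec2):
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    return dot / (norm1 * norm2)


def top_k(query_vec, indexed, k=1):
    """Return the k texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,  # highest similarity first
    )
    return [text for text, _ in ranked[:k]]


# Toy 2-d "embeddings"; the real embeddings in this notebook have 3072 dimensions.
index = [
    ("chunk about planning", [0.9, 0.1]),
    ("chunk about memory", [0.1, 0.9]),
    ("chunk about tools", [0.6, 0.6]),
]
result = top_k([1.0, 0.0], index, k=1)
print(result)
```

With `k=1` only the single best-matching chunk is returned, which mirrors the `search_kwargs={"k": 1}` setting used here.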

In [21]:
docs = retriever.invoke("What is Decomposition?")

In [22]:
len(docs)

1

In [23]:
docs

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='Component One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a

### Part 4: Generation

In [24]:
from langchain.prompts import ChatPromptTemplate

In [25]:
# Prompt
template = """
    Answer the question based on the following context: 

    Context: {context}
    Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
prompt

ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='\n    Answer the question based on the following context: \n\n    Context: {context}\n    Question: {question}\n'), additional_kwargs={})])

In [26]:
chain = prompt | llm

In [27]:
chain.invoke({"context": docs, "question": "What is Task Decomposition?"})

AIMessage(content='Task Decomposition is a technique used to break down complicated tasks into smaller, more manageable steps. It allows an agent to plan ahead by identifying individual components of a task. Two prominent methods for task decomposition are:\n\n1. **Chain of Thought (CoT)**: This approach encourages the model to "think step by step," enhancing performance on complex tasks by transforming a big task into several simpler tasks, thereby providing insights into the model’s reasoning process.\n\n2. **Tree of Thoughts**: This method builds on CoT by allowing the exploration of multiple reasoning paths at each step. It decomposes a problem into several thought steps and generates multiple thoughts for each step, forming a tree structure that can be evaluated through techniques like breadth-first search (BFS) or depth-first search (DFS).\n\nTask decomposition can also be achieved through various approaches, such as simple prompting, task-specific instructions, or human input.',

In [28]:
docs

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='Component One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a

In [29]:
prompt_hub_rag = hub.pull("rlm/rag-prompt")

In [30]:
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt_hub_rag
    | llm
    | StrOutputParser()
)

In [31]:
rag_chain.invoke("What is Task Decomposition?")

'Task Decomposition is the process of breaking down a complex task into smaller, manageable steps to facilitate problem-solving. Techniques such as Chain of Thought (CoT) allow models to think step by step, while the Tree of Thoughts method explores multiple reasoning possibilities at each step. This can be achieved through various means, including prompting, task-specific instructions, or human inputs.'

In [32]:
rag_chain.invoke("What does the heuristic function determine?")

'The heuristic function determines when a trajectory is inefficient or contains hallucination and should be stopped. Inefficient planning refers to trajectories that take too long without achieving success, while hallucination involves encountering repeated actions that yield the same observations.'