# Retrieval-Augmented Generation Demo - Base

## Prerequisites

### Make a `.env` file

```sh
cp .env.example .env
```

### Provide API keys in the `.env` file

- LANGCHAIN_API_KEY (optional, used only for tracing)
  - Sign up at https://smith.langchain.com/
  - Go to Settings -> API Keys
  - Create an API key
- GROQ_API_KEY
  - Sign up at https://console.groq.com/
  - Go to API Keys
  - Create an API key
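
Before running the notebook, it can help to confirm that the keys are actually visible to Python. The sketch below is a minimal check, assuming the two variable names listed above; `missing_keys` is a hypothetical helper, not part of any library, and in the notebook it would be called after `load_dotenv()`.

```python
import os

# Names taken from the list above; LANGCHAIN_API_KEY is optional.
REQUIRED_KEYS = ("GROQ_API_KEY",)
OPTIONAL_KEYS = ("LANGCHAIN_API_KEY",)


def missing_keys(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Report against the current process environment.
for name in missing_keys(REQUIRED_KEYS):
    print(f"Missing required key: {name}")
for name in missing_keys(OPTIONAL_KEYS):
    print(f"Optional key not set: {name}")
```

An empty value counts as missing, which matches how the tracing cell further down treats an empty `LANGCHAIN_API_KEY`.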

### Check that everything is working

- Check that the configured Kernel is `.venv`
  - In the navigation bar, click `Kernel`
  - Click `Change Kernel...`
- Check that all cells are working
  - In the navigation bar, click `Kernel`
  - Click `Restart Kernel...`
  - In the navigation bar, click `Edit`
  - Click `Clear Outputs of All Cells`
  - In the navigation bar, click `Run`
  - Click `Run All Cells`

## Set up environment

In [20]:
import warnings
from dotenv import load_dotenv

warnings.filterwarnings('ignore')
load_dotenv()

True

## Set up tracing (Optional)
To enable tracing, define `LANGCHAIN_API_KEY` in your `.env` file. If the key is missing or empty, tracing is turned off.

In [21]:
import os

# Enable tracing only when a non-empty API key is present.
tracing_enabled = bool(os.environ.get("LANGCHAIN_API_KEY"))
os.environ["LANGCHAIN_TRACING_V2"] = "true" if tracing_enabled else "false"

os.environ["LANGCHAIN_TRACING_V2"]

'true'

## Set up LLM
Define `GROQ_API_KEY` in your `.env` file.

In [22]:
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama3-8b-8192")

## Indexing - Load documents

In [23]:
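import os

# Set USER_AGENT before importing WebBaseLoader: recent langchain_community
# versions read it when the loader module is first imported.
os.environ["USER_AGENT"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"

import bs4
from langchain_community.document_loaders import WebBaseLoader

# Parse only the post title, header, and body; skip the rest of the page.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))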
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()

len(docs[0].page_content)

43131

## Indexing - Split documents

In [24]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    add_start_index=True,
)
all_splits = text_splitter.split_documents(docs)

len(all_splits)

66
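
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified sketch that cuts fixed-size windows sharing a fixed number of characters. It is not how `RecursiveCharacterTextSplitter` works internally: the real splitter breaks on separators such as paragraphs, lines, and spaces, so its chunks are usually shorter than `chunk_size` and the overlap is not exact. `split_with_overlap` is a hypothetical helper written for this illustration.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start_index": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Synthetic 2500-character text, used only to make the window positions visible.
sample = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(sample, chunk_size=1000, chunk_overlap=200)
for chunk in chunks:
    print(chunk["start_index"], len(chunk["text"]))
```

With these settings each window advances by 800 characters, so consecutive chunks share 200 characters. The `start_index` field plays the same role as the metadata recorded by `add_start_index=True`.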

## Indexing - Embed and store documents

In [25]:
import time
from langchain_chroma import Chroma
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

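print("Indexing started...")
start = time.perf_counter()  # monotonic clock, better suited to measuring durations

# Embed each chunk and store the vectors in an in-memory Chroma collection.
vectorstore = Chroma.from_documents(documents=all_splits, embedding=HuggingFaceEmbeddings())

end = time.perf_counter()
print(f"Indexing completed in {round(end - start, 2)}s.")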

Indexing started...
Indexing completed in 7.17s.
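
Conceptually, this step turns every chunk into a vector and keeps the vector next to the original text. The sketch below illustrates that idea with a toy word-count "embedding". This is an assumption made for readability: `HuggingFaceEmbeddings` uses a neural model rather than word counts, and `embed` and `build_index` are hypothetical helpers, not library functions.

```python
from collections import Counter


def embed(text, vocabulary):
    """Toy embedding: count how often each vocabulary word appears in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def build_index(chunks, vocabulary):
    """Store each chunk together with its vector, like a tiny in-memory vector store."""
    return [{"text": chunk, "vector": embed(chunk, vocabulary)} for chunk in chunks]


vocabulary = ["task", "memory", "tool"]
index = build_index(["Task planning splits a task", "Memory stores history"], vocabulary)
for entry in index:
    print(entry["vector"], entry["text"])
```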


## Retrieval

In [26]:
import os

# Set explicitly to silence the Hugging Face tokenizers warning about forked processes.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 6},
)
retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")

len(retrieved_docs)

6
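
A similarity search with `k` results boils down to scoring every stored vector against the query vector and keeping the best `k`. The sketch below shows this with cosine similarity on toy vectors. It is an illustration only: the vector store's actual distance metric and index structure may differ, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed, k):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vector, vec)) for doc_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
toy_index = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.9, 0.1, 0.0],
    "chunk-c": [0.0, 1.0, 0.0],
    "chunk-d": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.05, 0.0], toy_index, k=2)
print([doc_id for doc_id, _ in results])
```

Raising `k` returns more context for the generation step, at the cost of a longer prompt.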

## Generation

In [27]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")
example_messages = prompt.invoke({
    "context": "This document is a filler and will be used for example contexts",
    "question": "Is this an example question?",
}).to_messages()

example_messages

[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: Is this an example question? \nContext: This document is a filler and will be used for example contexts \nAnswer:")]

In [28]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

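def format_docs(docs):
    """Join the retrieved chunks into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)

# The dict step sends the question to the retriever (to build the context)
# and also passes it through unchanged as the "question" input of the prompt.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)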

for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)

Task decomposition is the process of breaking down a problem into smaller, manageable subtasks or thought steps. This can be done through simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", using task-specific instructions, or with human inputs.
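
The data flow through the chain can be easier to follow when written as plain function calls. The sketch below mirrors the four stages with stand-ins. Everything prefixed `fake_` is hypothetical and exists only for this illustration: no retrieval or model call happens, and `format_docs` here takes plain strings instead of document objects.

```python
def format_docs(docs):
    """Join chunk texts with blank lines (plain strings in this sketch)."""
    return "\n\n".join(docs)


def fake_retriever(question):
    # Hypothetical stand-in: a real retriever would run a similarity search.
    return ["Chunk one about planning.", "Chunk two about subgoals."]


def fake_prompt(inputs):
    # Hypothetical template, loosely modeled on the prompt shown earlier.
    return f"Question: {inputs['question']}\nContext: {inputs['context']}\nAnswer:"


def fake_llm(prompt_text):
    # Hypothetical stand-in: reports the prompt size instead of calling a model.
    return f"(model reply to a {len(prompt_text)}-character prompt)"


def build_inputs(question):
    # The dict step: the same question feeds both branches.
    return {
        "context": format_docs(fake_retriever(question)),
        "question": question,
    }


def run_chain(question):
    # Build the inputs, fill the prompt, then call the model.
    return fake_llm(fake_prompt(build_inputs(question)))


print(run_chain("What is Task Decomposition?"))
```

The pipe syntax in the cell above expresses the same composition declaratively, which is also what lets the chain be consumed with `.stream()`.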