# Exercise - Knowledge Base Agent - STARTER

In this exercise, you’ll build a Knowledge Base Agent using LangGraph, which can:

- Efficiently process long documents using text embedding and chunking.
- Retrieve information from a vector database.
- Augment user queries with retrieved contextual documents.
- Generate accurate responses using an LLM.


**Challenge**

Your task is to create a LangGraph Workflow that includes:

- A document loading and vectorization process for a knowledge base.
- An Agent Node capable of:
    - Retrieving relevant knowledge.
    - Augmenting responses with contextual documents.
    - Generating accurate answers.
- Conditional routing to control query resolution.
- Optimization techniques such as text chunking and embedding search.

By the end of this exercise, you’ll have built an AI-powered Knowledge Base Agent that uses a structured process to generate accurate answers.



## 0. Import the necessary libraries

In [None]:
from typing import List
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFLoader
from langgraph.graph import START, END, StateGraph
from langgraph.graph.message import MessagesState
from IPython.display import Image, display

## 1. Instantiate Chat Model with your API Key

To connect to OpenAI, instantiate a chat model client with your OpenAI API key.

You can pass the `api_key` argument directly.
```python
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.0,
    api_key="sk-",
)
```
Usually the OpenAI API key is a long string starting with `sk-`.


Alternatively, you can do this implicitly. To use this approach, you need a `.env` file that defines a variable called `OPENAI_API_KEY`.
```python
from dotenv import load_dotenv
load_dotenv()
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.0,
)
```

Loading the key from an environment variable keeps it out of your code.
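
Before instantiating the model, it can help to confirm that the key is actually available. The following is a minimal sketch, assuming the variable is named `OPENAI_API_KEY` as described above; `require_api_key` is a hypothetical helper written for this exercise, not part of any library.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment, or fail with a clear error."""
    # Treat a missing variable and an empty/whitespace-only value the same way.
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell")
    return key
```

If you use the `.env` approach, call this after `load_dotenv()` so the variable has already been loaded into the environment.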

In [None]:
# FILL IN - Instantiate your chat model
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.0,
    api_key="YOUR_API_KEY_HERE",
)

In [None]:
# FILL IN - Instantiate your embeddings model
embeddings_fn = OpenAIEmbeddings(
    model="text-embedding-3-large",
    api_key="YOUR_API_KEY_HERE",
)

## 2. Load and Process Documents

In [None]:
# FILL IN - Create your Chroma vector store with a collection name
# and the embedding function
vector_store = 

In [None]:
file_path = "compact-guide-to-large-language-models.pdf"

In [None]:
loader = PyPDFLoader(file_path)

In [None]:
pages = []
async for page in loader.alazy_load():
    pages.append(page)

In [None]:
# FILL IN - Create a text splitter with chunk_size and chunk_overlap 
# values of 1000 and 200, respectively
text_splitter = RecursiveCharacterTextSplitter()
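
To build intuition for what `chunk_size` and `chunk_overlap` control, here is a simplified sketch of overlapping chunking in plain Python. It is an illustration only: `chunk_text` is a hypothetical function that slices fixed-size character windows, whereas `RecursiveCharacterTextSplitter` splits on separators and will generally produce different chunk boundaries.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With `chunk_size=1000` and `chunk_overlap=200`, each window in this sketch repeats the last 200 characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk.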

In [None]:
all_splits = text_splitter.split_documents(pages)

In [None]:
_ = vector_store.add_documents(documents=all_splits)
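
Once the chunks are stored, retrieval comes down to comparing the query embedding with the stored chunk embeddings. The sketch below shows that idea with cosine similarity over small hand-written vectors. It is an assumption-laden illustration: `cosine_similarity` and `top_k` are hypothetical helpers, and the vector store may use a different distance metric and an approximate index internally.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Real embeddings have hundreds or thousands of dimensions, but the ranking logic is the same: score every candidate against the query and keep the best few.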

## 3. Define State Schema

We define a State Schema for managing:

- User query
- Retrieved documents
- Generated answer

In [None]:
# FILL IN - Create your state schema named State inheriting from MessagesState
# with question(str), documents(List) and answer(str) attributes
class State

## 4. RAG Nodes

The agent should:
- fetch relevant document chunks based on the user query
- combine the retrieved documents and use them as context
- invoke the LLM to generate a response
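
The three steps above can be sketched without any external services to show how data flows between them. Everything in this block is a stand-in: `CORPUS`, the `toy_*` functions, and `run_pipeline` are hypothetical names, retrieval is simple word overlap rather than embedding search, and the "LLM" just echoes the top document. The point is only the pattern of each step reading the state and returning a partial update.

```python
import re
from typing import Dict, List, Set

# Toy corpus standing in for the vector store (illustrative data only).
CORPUS: List[str] = [
    "Chunking splits long documents into smaller pieces.",
    "Embeddings map text to vectors for similarity search.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_retrieve(state: Dict) -> Dict:
    """Rank corpus entries by how many words they share with the question."""
    words = tokenize(state["question"])
    ranked = sorted(CORPUS, key=lambda doc: len(words & tokenize(doc)), reverse=True)
    return {"documents": ranked[:1]}


def toy_augment(state: Dict) -> Dict:
    """Combine the retrieved documents and the question into one prompt."""
    context = "\n\n".join(state["documents"])
    return {"prompt": f"Context:\n{context}\n\nQuestion: {state['question']}"}


def toy_generate(state: Dict) -> Dict:
    """Stand-in for an LLM call: answer with the top retrieved document."""
    return {"answer": state["documents"][0]}


def run_pipeline(question: str) -> Dict:
    """Run the three steps in order, merging each partial update into the state."""
    state: Dict = {"question": question}
    for node in (toy_retrieve, toy_augment, toy_generate):
        state.update(node(state))
    return state
```

In the exercise, the graph plays the role of `run_pipeline`, and the node functions below follow the same convention of returning only the keys they change.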

In [None]:
# FILL IN - Use the vector store to retrieve similar documents to the question
def retrieve(state: State):
    question = state["question"]
    retrieved_docs = 
    return {"documents": retrieved_docs}

In [None]:
# FILL IN - Create a RAG ChatPromptTemplate with question and context variables
def augment(state: State):
    question = state["question"]
    documents = state["documents"]
    docs_content = "\n\n".join(doc.page_content for doc in documents)

    template = 

    messages = template.invoke(
        {"context": docs_content, "question": question}
    ).to_messages()

    return {"messages": messages}

In [None]:
# FILL IN - Invoke the LLM
def generate(state: State):
    ai_message = 
    return {"answer": ai_message.content, "messages": ai_message}

## 5. Build the LangGraph Workflow

In [None]:
# FILL IN - Add all the nodes and edges

workflow = StateGraph(State)

In [None]:
graph = workflow.compile()

display(
    Image(
        graph.get_graph().draw_mermaid_png()
    )
)

## 6. Invoke the Agent with a Query

Run the graph and print the retrieved documents to check retrieval accuracy.

In [None]:
output = graph.invoke(
    {"question": "What are Open source models?"}
)

In [None]:
output["answer"]

In [None]:
for message in output["messages"]:
    message.pretty_print()

## 7. Break Things

Now that you understand how it works, experiment with new things:

- Change the embedding model
- Change the parameters of `RecursiveCharacterTextSplitter` (`chunk_size` and `chunk_overlap`)
- Use your own document
- Add more file types
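
For the last idea, one possible starting point is to choose a loader based on the file extension. This is a rough sketch under assumptions: `LOADER_BY_SUFFIX`, `loader_name_for`, and `build_loader` are hypothetical names, the mapping covers only three formats, and it assumes `PyPDFLoader`, `TextLoader`, and `CSVLoader` from `langchain_community.document_loaders` each accept the file path as their first argument.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to a loader class name in
# langchain_community.document_loaders; extend it as you add formats.
LOADER_BY_SUFFIX = {
    ".pdf": "PyPDFLoader",
    ".txt": "TextLoader",
    ".csv": "CSVLoader",
}


def loader_name_for(path: str) -> str:
    """Return the loader class name for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADER_BY_SUFFIX:
        raise ValueError(f"No loader configured for '{suffix}' files")
    return LOADER_BY_SUFFIX[suffix]


def build_loader(path: str):
    """Instantiate the matching loader; imports the library only when called."""
    import langchain_community.document_loaders as loaders

    loader_cls = getattr(loaders, loader_name_for(path))
    return loader_cls(path)
```

Keeping the extension lookup separate from the loader construction makes it easy to check the mapping without installing any extra parsing dependencies.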