https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_crag.ipynb

some deviations from the source code because i dont wanna pay for embeddings from openai, or hit openai models. All openAI integration is replaced with ollama.

I also removed langsmith integration. don't think it's needed. just a frontend for LLM debugging which i can achieve with `langchain.debug = True`



In [1]:
from langchain_ollama.chat_models import ChatOllama
from langchain_ollama import OllamaEmbeddings

In [2]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=250, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

# Add to vectorDB
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="rag-chroma",
    embedding= OllamaEmbeddings(
        model="nomic-embed-text:v1.5"
    ),
)
retriever = vectorstore.as_retriever()

USER_AGENT environment variable not set, consider setting it to identify your requests.


Try to write it yourself

I don't really wanna set up Tavilysearch. Will hold off for now.

Define graph state

In [3]:
from langchain.schema import Document
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel
from typing import Literal
from pydantic import BaseModel

In [4]:
from typing import List
from typing_extensions import TypedDict

class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        user_query: question
        generation: LLM generation
        search: whether to add search
        docs: list of documents
        filtered_docs: list of documents that passed LLM grader
    """
    user_query: str
    generation: str
    search: bool
    docs: List[str]
    filtered_docs: List[str]

1. retrieve
2. grade
   1. rewrite -> web search
3. generation

In [14]:
# Key point to note is that we always manipulate state, reminds me of ansible and idempotence
def retrieve(state):
    # Note that retriever is a global obj
    user_query = state['user_query']
    return {
        'user_query': user_query, 
        'docs': retriever.get_relevant_documents(user_query)
    }

def grading(state):
    # Data model
    class GradeDocuments(BaseModel):
        """Binary score for relevance check on retrieved documents."""
        binary_score: str = Literal['Yes', "No"]

    structured_llm_grader = ChatOllama(
        model="llama3.2:3b-instruct-q8_0",
        temperature=0,
        format=GradeDocuments.model_json_schema())

    # Prompt
    grade_prompt_template = """You are a grader assessing relevance of a retrieved document to a user question.
If the document contains keyword(s) or semantic meaning related to the question, grade it as relevant.
Give a binary score 'Yes' or 'No' to indicate whether the document is relevant to the question.

question:
{user_query}

document:
{document}
"""
    grade_prompt = ChatPromptTemplate.from_template(grade_prompt_template)


    user_query = state['user_query']
    docs = state['docs']
    filtered_docs = []
    search = False

    for i, doc in enumerate(docs):
        relevant = (grade_prompt | structured_llm_grader | StrOutputParser()).invoke({
            'user_query': user_query,
            'document': doc.page_content.strip()
        })
        if relevant == 'Yes':
            filtered_docs.append(doc)
            print(f'Doc {i+1} is relevant.')
        else:
            search = True
            print(f'Doc {i+1} is not relevant. Search will be done. Sample:\n{doc.page_content.strip()[:500]}\n')
    return {
        'user_query': user_query, 
        'docs': docs,
        'filtered_docs': filtered_docs,
        'search': search
    }

def decide_to_search(state):
    search = state['search']
    if search:
        return "rewrite_and_search"
    else:
        return "generate"

def rewrite_and_search(state):
    user_query = state['user_query']
    docs = state['docs']
    filtered_docs = state['filtered_docs']
    search = state['search']

    # rewrite question for the purposes of search
    rewrite_template = """You are a master of search engines and are tasked to rewrite a user query for optimum web search.
    You are only to respond with the rewritten user query, without any additional instructions or explanations.

    Rewrite the following user query:
    {user_query}"""
    rewrite_prompt = ChatPromptTemplate.from_template(rewrite_template)

    llm = ChatOllama(
        model="llama3.2:3b-instruct-q8_0",
        temperature=0,
    )
    rewritten_question = (rewrite_prompt | llm | StrOutputParser()).invoke({'user_query': user_query})
    print(f'Do a websearch here, using the rewritten question: {rewritten_question}')
    return {
        'user_query': user_query, 
        'docs': docs,
        'filtered_docs': filtered_docs + [Document(page_content='sample websearch results')],
        'search': search
    }


def generate(state):
    user_query = state['user_query']
    docs = state['docs']
    filtered_docs = state['filtered_docs']
    search = state['search']

    gen_template = """You are a helpful assistant and your job is to answer a user query given the following context.

    context:
    {documents}

    user query:
    {user_query}"""
    gen_prompt = ChatPromptTemplate.from_template(gen_template)

    llm = ChatOllama(
        model="llama3.2:3b-instruct-q8_0",
        temperature=0,
    )
    gen_chain = (
        {
            'documents': itemgetter('documents') | RunnableLambda(lambda docs: '\n---next document---\n'.join([d.page_content.strip() for d in docs])),
            'user_query': RunnablePassthrough()
        }
        | gen_prompt 
        | llm 
        | StrOutputParser()
    )
    generation = gen_chain.invoke({'user_query': user_query, 'documents': filtered_docs})

    return {
        'user_query': user_query, 
        'docs': docs,
        'filtered_docs': filtered_docs + ['sample search result'],
        'search': search,
        'generation': generation
    }

In [15]:
from langgraph.graph import END, StateGraph, START

workflow = StateGraph(GraphState)

# Define the nodes
workflow.add_node("retrieve", retrieve)  # retrieve
workflow.add_node("grading", grading)  # grade documents
workflow.add_node("generate", generate)  # generatae
workflow.add_node("rewrite_and_search", rewrite_and_search)  # transform_query

# Build graph
workflow.add_edge(START, "retrieve")
workflow.add_edge("retrieve", "grading")
workflow.add_conditional_edges(
    "grading",
    decide_to_search,
    {
        "rewrite_and_search": "rewrite_and_search",
        "generate": "generate",
    },
)
workflow.add_edge("rewrite_and_search", "generate")
workflow.add_edge("generate", END)

# Compile
app = workflow.compile()

In [18]:
# Run
inputs = {"user_query": "What are the types of agent memory?"}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        print(f"Node '{key}':")
        # Optional: print full state at each node
        # pprint.pprint(value["keys"], indent=2, width=80, depth=None)
    print("\n---\n")

Node 'retrieve':

---





Doc 1 is not relevant. Search will be done. Sample:
Fig. 7. Comparison of AD, ED, source policy and RL^2 on environments that require memory and exploration. Only binary reward is assigned. The source policies are trained with A3C for "dark" environments and DQN for watermaze.(Image source: Laskin et al. 2023)
Component Two: Memory#
(Big thank you to ChatGPT for helping me draft this section. I’ve learned a lot about the human brain and data structure for fast MIPS in my conversations with ChatGPT.)
Types of Memory#
Memory can be defined as the p

Doc 2 is not relevant. Search will be done. Sample:
They also discussed the risks, especially with illicit drugs and bioweapons. They developed a test set containing a list of known chemical weapon agents and asked the agent to synthesize them. 4 out of 11 requests (36%) were accepted to obtain a synthesis solution and the agent attempted to consult documentation to execute the procedure. 7 out of 11 were rejected and among these 7 rejected c

I can see that i should adjust the prompt for rewriting a bit better, so that it doesn't spit out so much unrelated content. 

In [19]:
print(value["generation"])

Based on the provided context, I can help answer the user's query about the types of agent memory.

According to my knowledge, there are several types of memory in artificial intelligence and robotics, which can be categorized into two main types: declarative memory and procedural memory.

1. **Declarative Memory**: This type of memory stores factual information, such as knowledge about the world, rules, and concepts. It is also known as "what" memory because it answers questions like "What is this?" or "What does this do?"

Examples of declarative memories include:

* Knowledge graphs
* Ontologies
* Expert systems

2. **Procedural Memory**: This type of memory stores procedures, skills, and habits. It is also known as "how" memory because it answers questions like "How do I do that?" or "What's the best way to do this?"

Examples of procedural memories include:

* Neural networks
* Rule-based systems
* Sensorimotor integration

Additionally, there are other types of agent memory, such