### Graph
 ![image](../image/graph.png)

### Ref

* [Langchain/.../State.py](https://github.com/langchain-ai/langgraph/blob/main/libs/langgraph/langgraph/graph/state.py)

In [72]:
# Set environment variables
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
# os.environ["LANGCHAIN_API_KEY"] = "sk-"
os.environ['OPENAI_API_KEY'] = "sk-"

local_llm = "llama3"

In [49]:
# Docs Retrieval

from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_nomic.embeddings import NomicEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]
# Split the documents into chunks of about 250 tokens
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=250, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)
# Add to vectorDB
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="rag-project-chroma", # Customizing
    embedding=NomicEmbeddings(model="nomic-embed-text-v1.5",
                             inference_mode='local'),
)
retriever = vectorstore.as_retriever()

In [67]:
retriever_q = "what is agent memory?"
# print(retriever.invoke(retriever_q))

retriever_docs = retriever.invoke(retriever_q)
for doc in retriever_docs:
    print(doc.page_content)


They also discussed the risks, especially with illicit drugs and bioweapons. They developed a test set containing a list of known chemical weapon agents and asked the agent to synthesize them. 4 out of 11 requests (36%) were accepted to obtain a synthesis solution and the agent attempted to consult documentation to execute the procedure. 7 out of 11 were rejected and among these 7 rejected cases, 5 happened after a Web search while 2 were rejected based on prompt only.
Generative Agents Simulation#
Generative Agents (Park, et al. 2023) is super fun experiment where 25 virtual characters, each controlled by a LLM-powered agent, are living and interacting in a sandbox environment, inspired by The Sims. Generative agents create believable simulacra of human behavior for interactive applications.
The design of generative agents combines LLM with memory, planning and reflection mechanisms to enable agents to behave conditioned on past experience, as well as to interact with other agents.


In [73]:
## Relevance Checker

### (Retrieval) Relevance Checker (Grader)

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# LLM
# llm = ChatOllama(model=local_llm, format="json", temperature=0)
# prompt = PromptTemplate(
#     template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing relevance 
#     of a retrieved document to a user question. If the document contains keywords related to the user question, 
#     grade it as relevant. It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \n
#     Give a binary score, 'yes' or 'no', to indicate whether the document is relevant to the question. \n
#     Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.
#      <|eot_id|><|start_header_id|>user<|end_header_id|>
#     Here is the retrieved document: \n\n {document} \n\n
#     Here is the user question: {question} \n <|eot_id|><|start_header_id|>assistant<|end_header_id|>
#     """,
#     input_variables=["question", "document"],
# )

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
prompt = PromptTemplate(
    template="""system: You are a grader assessing the relevance 
    of a retrieved document to a user question. If the document contains keywords related to the user question, 
    grade it as relevant. It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \n
    Give a binary score, 'yes' or 'no', to indicate whether the document is relevant to the question. \n
    Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.\n
     user: Here is the retrieved document: 
      {document} 
    Here is the user question:
     {question} 
     assistant:
    """,
    input_variables=["question", "document"],
)

relevance_checker = prompt | llm | JsonOutputParser()


# Run
# question = "agent memory"
# docs = retriever.invoke(question)
# doc_txt = docs[1].page_content
# print(relevance_checker.invoke({"question": question, "document": doc_txt}))

In [77]:
doc_txt = retriever_docs[0].page_content
print(relevance_checker.invoke({"question": "what is Generative Agents", "document": doc_txt}))
print(doc_txt)

{'score': 'yes'}
They also discussed the risks, especially with illicit drugs and bioweapons. They developed a test set containing a list of known chemical weapon agents and asked the agent to synthesize them. 4 out of 11 requests (36%) were accepted to obtain a synthesis solution and the agent attempted to consult documentation to execute the procedure. 7 out of 11 were rejected and among these 7 rejected cases, 5 happened after a Web search while 2 were rejected based on prompt only.
Generative Agents Simulation#
Generative Agents (Park, et al. 2023) is super fun experiment where 25 virtual characters, each controlled by a LLM-powered agent, are living and interacting in a sandbox environment, inspired by The Sims. Generative agents create believable simulacra of human behavior for interactive applications.
The design of generative agents combines LLM with memory, planning and reflection mechanisms to enable agents to behave conditioned on past experience, as well as to interact with

In [78]:
### Generate Answer

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

# Prompt
# prompt = PromptTemplate(
#     template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an assistant for question-answering tasks. 
#     Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. 
#     Use three sentences maximum and keep the answer concise <|eot_id|><|start_header_id|>user<|end_header_id|>
#     Question: {question} 
#     Context: {context} 
#     Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
#     input_variables=["question", "context"],
# )
# 
# llm = ChatOllama(model=local_llm, temperature=0)

prompt = PromptTemplate(
    template="""
    You are an assistant trained to provide concise and accurate answers using the provided context. If the answer is not known, state that clearly. Please keep the response limited to three sentences.
    Question: {question}
    Context: {context}
    """,
    # Must match the placeholders used in the template above
    input_variables=["question", "context"],
)

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Chain
answer_generator = prompt | llm | StrOutputParser()

# Run
# question = "agent memory"
# docs = retriever.invoke(question)
# generation = answer_generator.invoke({"context": format_docs(docs), "question": question})
# print(generation)
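
`format_docs` turns a list of documents into one string that can fill the `{context}` slot of the prompt. The sketch below shows only that flattening step. `FakeDoc` is a hypothetical stand-in for LangChain's `Document`, used here so the snippet runs without any dependencies, and the two sample texts are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document (only page_content is needed)."""
    page_content: str


def format_docs(docs: List[FakeDoc]) -> str:
    # Join the text of every document, separated by a blank line
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    FakeDoc("Short-term memory is in-context learning."),
    FakeDoc("Long-term memory uses an external vector store."),
]
context = format_docs(docs)
print(context)
```

Passing a plain string like this keeps document metadata and object reprs out of the prompt.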

In [80]:
generation = answer_generator.invoke({"question": "what is Generative Agents", "context": format_docs(retriever_docs)})
print(generation)

AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: sk-M74AX***************************************RzZs. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}

In [52]:
## Hallucination Checker

# LLM
llm = ChatOllama(model=local_llm, format="json", temperature=0)

# Prompt
prompt = PromptTemplate(
    template=""" <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing whether 
    an answer is grounded in / supported by a set of facts. Give a binary 'yes' or 'no' score to indicate 
    whether the answer is grounded in / supported by a set of facts. Provide the binary score as a JSON with a 
    single key 'score' and no preamble or explanation. <|eot_id|><|start_header_id|>user<|end_header_id|>
    Here are the facts:
    \n ------- \n
    {documents} 
    \n ------- \n
    Here is the answer: {generation}  <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
    input_variables=["generation", "documents"],
)

hallucination_checker = prompt | llm | JsonOutputParser()

# Run
# hallucination_checker.invoke({"documents": docs, "generation": generation})

In [53]:
### Router

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

# LLM
llm = ChatOllama(model=local_llm, format="json", temperature=0)

prompt = PromptTemplate(
    template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert at routing a 
    user question to a vectorstore or web search. Use the vectorstore for questions on LLM agents, 
    prompt engineering, and adversarial attacks. You do not need to be stringent with the keywords 
    in the question related to these topics. Otherwise, use web-search. Give a binary choice 'web_search' 
    or 'vectorstore' based on the question. Return a JSON with a single key 'datasource' and 
    no preamble or explanation. Question to route: {question} <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
    input_variables=["question"],
)

question_router = prompt | llm | JsonOutputParser()

# Run
# question = "llm agent memory"
# print(question_router.invoke({"question": question}))

In [40]:
### Web Search

from langchain_community.tools.tavily_search import TavilySearchResults
os.environ["TAVILY_API_KEY"] = "tvly-"

web_search_tool = TavilySearchResults(max_results=3)  # cap the number of search hits returned

In [54]:
from pprint import pprint
from typing import List

from langchain_core.documents import Document
from typing_extensions import TypedDict

from langgraph.graph import END, StateGraph

In [55]:
### State

class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: question
        generation: LLM generation
        web_search: whether to add search
        documents: list of documents
        source: sources of documents
    """

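    question: str
    generation: str
    web_search: str
    documents: List[Document]  # nodes store Document objects, not plain strings
    source: str  # where the documents came from: "vectorstore" or "web_search"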
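
Each node in this notebook returns only the keys it changed, and the graph merges that partial dict into the running state. The sketch below mimics that merge in plain Python, assuming the default behaviour in which a returned key replaces the previous value. `SketchState` and `apply_update` are hypothetical names for illustration and are not part of LangGraph.

```python
from typing import List, TypedDict


class SketchState(TypedDict, total=False):
    """Illustrative copy of the graph state; total=False lets keys be absent."""
    question: str
    generation: str
    web_search: str
    documents: List[str]
    source: str


def apply_update(state: SketchState, update: SketchState) -> SketchState:
    # Keys present in `update` replace the old values; all other keys are kept
    merged: SketchState = {**state, **update}
    return merged


state: SketchState = {"question": "What is agent memory?"}
# A retrieval step adds documents and records their source
state = apply_update(state, {"documents": ["doc A", "doc B"], "source": "vectorstore"})
# A filtering step overwrites the document list and sets the flag
state = apply_update(state, {"documents": ["doc A"], "web_search": "No"})
print(state)
```

Because the merge overwrites, a node that wants to extend `documents` has to read the old list from the state and return the combined list itself.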

In [56]:
### Nodes

def do_retrieve(state):
    """
    Retrieve documents from vectorstore

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, documents, that contains retrieved documents
    """
    print("---RETRIEVE FROM VECTOR STORE---")
    question = state["question"]

    # Retrieval
    documents = retriever.invoke(question)
    return {"documents": documents, "question": question, "source": "vectorstore"}


def do_generate(state):
    """
    Generate answer using RAG on retrieved documents

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]

    # RAG generation (documents are flattened to plain text for the prompt)
    generation = answer_generator.invoke({"context": format_docs(documents), "question": question})
    return {"documents": documents, "question": question, "generation": generation}

def do_relevance_check(state):
    """
    Determines whether the retrieved documents are relevant to the question
    If any document is not relevant, we will set a flag to run web search

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Filtered out irrelevant documents and updated web_search state
    """

    print("---CHECK DOCUMENT RELEVANCE TO QUESTION---")
    question = state["question"]
    documents = state["documents"]

    # Score each doc
    filtered_docs = []
    web_search = "No"
    for d in documents:
        score = relevance_checker.invoke(
            {"question": question, "document": d.page_content}
        )
        grade = score["score"]
        # Document relevant
        if grade.lower() == "yes":
            print("---GRADE: DOCUMENT RELEVANT---")
            filtered_docs.append(d)
        # Document not relevant
        else:
            print("---GRADE: DOCUMENT NOT RELEVANT---")
            # We do not include the document in filtered_docs
            # We set a flag to indicate that we want to run web search
            web_search = "Yes"
            continue
    return {"documents": filtered_docs, "question": question, "web_search": web_search}

def do_web_search(state):
    """
    Web search based on the question

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Appended web results to documents
    """

    print("---WEB SEARCH---")
    question = state["question"]
    documents = state["documents"]

    # Web search
    docs = web_search_tool.invoke({"query": question})
    web_results = "\n".join([d["content"] for d in docs])
    web_results = Document(page_content=web_results)
    if documents is not None:
        documents.append(web_results)
    else:
        documents = [web_results]
    return {"documents": documents, "question": question, "source": "web_search"}

# def do_hallucination_checker(state):
#     """
#     Determines whether the generation is grounded in the document and answers question.
# 
#     Args:
#         state (dict): The current graph state
# 
#     Returns:
#         str: Decision for next node to call
#     """
# 
#     print("---CHECK HALLUCINATIONS---")
#     question = state["question"]
#     documents = state["documents"]
#     generation = state["generation"]
# 
#     score = hallucination_checker.invoke(
#         {"documents": documents, "generation": generation}
#     )
#     grade = score["score"]
# 
#     # Check hallucination
#     if grade == "yes":
#         return "useful"
#     else:
#         pprint("---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---")
#         return "not useful"
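
The relevance node follows a simple pattern: grade every document, keep the ones graded relevant, and raise a flag as soon as one is rejected. The sketch below isolates that pattern with a stub grader so it runs without an LLM. `filter_relevant` and `keyword_grader` are hypothetical helpers, and the keyword rule is a toy assumption rather than the prompt-based grading used in this notebook.

```python
from typing import Callable, Dict, List, Tuple


def filter_relevant(
    question: str,
    documents: List[str],
    grade: Callable[[str, str], Dict[str, str]],
) -> Tuple[List[str], str]:
    """Keep documents graded 'yes'; flag a web search if any document is rejected."""
    kept: List[str] = []
    web_search = "no"
    for text in documents:
        score = grade(question, text).get("score", "no")
        # Normalise the grade so "Yes", "yes " and "YES" are treated alike
        if str(score).strip().lower() == "yes":
            kept.append(text)
        else:
            web_search = "yes"
    return kept, web_search


def keyword_grader(question: str, document: str) -> Dict[str, str]:
    """Toy grader: relevant if any question word longer than 3 characters appears."""
    words = [w.lower().strip("?") for w in question.split()]
    hit = any(w in document.lower() for w in words if len(w) > 3)
    return {"score": "Yes" if hit else "no"}


docs = ["Agents keep a memory stream of experiences.", "Recipe for sourdough bread."]
kept, flag = filter_relevant("What is agent memory?", docs, keyword_grader)
print(kept, flag)
```

Injecting the grader as a callable makes the filtering logic testable on its own, separate from the model call.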

In [57]:
### Conditional Edge

# def edge_route_question(state):
#     """
#     Route question to web search or RAG.
# 
#     Args:
#         state (dict): The current graph state
# 
#     Returns:
#         str: Next node to call
#     """
# 
#     print("---ROUTE QUESTION---")
#     question = state["question"]
#     print(f"bot --> : {question}")
#     source = question_router.invoke({"question": question})
#     print(f"bot --> target source is {source['datasource']}")
#     if source["datasource"] == "web_search":
#         print("---ROUTE QUESTION TO WEB SEARCH---")
#         return "websearch"
#     elif source["datasource"] == "vectorstore":
#         print("---ROUTE QUESTION TO RAG---")
#         return "vectorstore"
    
def edge_decide_to_generate(state):
    """
    Determines whether to generate an answer, or add web search

    Args:
        state (dict): The current graph state

    Returns:
        str: Binary decision for next node to call
    """

    print("---ASSESS GRADED DOCUMENTS---")
    web_search = state["web_search"]

    # do_relevance_check stores "Yes"/"No", so compare case-insensitively
    if web_search.lower() == "yes":
        # At least one document was filtered out by do_relevance_check,
        # so supplement the remaining documents with a web search
        print(
            "---DECISION: NOT ALL DOCUMENTS ARE RELEVANT TO QUESTION, INCLUDE WEB SEARCH---"
        )
        return "websearch"
    else:
        # We have relevant documents, so generate answer
        print("---DECISION: GENERATE---")
        return "generate"
    
def edge_hallucination_check(state):
    """
    Determines whether the generation is grounded in the document and answers question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Decision for next node to call
    """

    print("---CHECK HALLUCINATIONS---")
    question = state["question"]
    documents = state["documents"]
    generation = state["generation"]

    score = hallucination_checker.invoke(
        {"documents": documents, "generation": generation}
    )
    grade = score["score"]

    # Check hallucination (case-insensitive, in case the model returns "Yes")
    if grade.lower() == "yes":
        return "useful"
    else:
        pprint("---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---")
        return "not useful"
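
The relevance node writes the `web_search` flag and the edge function reads it, so the two have to agree on spelling and case or the web-search branch is never taken. A minimal sketch of a case-insensitive decision follows. `decide_next` is a hypothetical name used only here, and the returned labels mirror the ones used by the conditional edge.

```python
def decide_next(web_search_flag: str) -> str:
    """Map the web_search flag to the label of the next node."""
    # Strip whitespace and ignore case so "Yes", "yes" and " YES " all match
    if web_search_flag.strip().lower() == "yes":
        return "websearch"
    return "generate"


for flag in ["Yes", "yes", " YES ", "No", ""]:
    print(repr(flag), "->", decide_next(flag))
```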

In [58]:
# Workflow 
workflow = StateGraph(GraphState)
# Define the nodes
workflow.add_node("websearch", do_web_search)
workflow.add_node("retrieve", do_retrieve)
workflow.add_node("relevance_check", do_relevance_check) 
workflow.add_node("generate", do_generate)
# workflow.add_node("hallucination_check", do_hallucination_checker)

In [59]:
### Build graph

workflow.set_entry_point("retrieve")

workflow.add_edge("retrieve", "relevance_check")

workflow.add_conditional_edges(
    "relevance_check",
    edge_decide_to_generate,
    {
        "websearch": "websearch",
        "generate": "generate",
    },
)
workflow.add_edge("websearch", "relevance_check")
# workflow.add_edge("generate", "hallucination_check")


workflow.add_conditional_edges(
    "generate",
    edge_hallucination_check,
    {
        "useful": END,
        "not useful": "generate",
    },
)
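
The wiring in this cell can be read as a small state machine: a fixed edge always moves to the same node, while a conditional edge asks a function for a label and follows the mapping. The sketch below imitates that control flow in plain Python with stub nodes. It is not how LangGraph is implemented; `run`, the stub nodes, the `attempts` counter and the `max_steps` guard are all assumptions added for illustration. The guard is there because `generate` loops back to itself whenever the hallucination check fails.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
END = "__end__"  # sentinel used only in this sketch


# Stub nodes: each returns a partial update, like the do_* functions
def retrieve(state: State) -> State:
    return {"documents": ["doc"]}

def relevance_check(state: State) -> State:
    return {"web_search": "no"}

def websearch(state: State) -> State:
    return {"documents": state["documents"] + ["web"]}

def generate(state: State) -> State:
    return {"generation": "answer", "attempts": state.get("attempts", 0) + 1}


NODES: Dict[str, Callable[[State], State]] = {
    "retrieve": retrieve,
    "relevance_check": relevance_check,
    "websearch": websearch,
    "generate": generate,
}
EDGES = {"retrieve": "relevance_check", "websearch": "relevance_check"}
CONDITIONAL = {
    "relevance_check": (
        lambda s: "websearch" if s["web_search"] == "yes" else "generate",
        {"websearch": "websearch", "generate": "generate"},
    ),
    # Stub check: pretend the answer is grounded on the second attempt
    "generate": (
        lambda s: "useful" if s["attempts"] >= 2 else "not useful",
        {"useful": END, "not useful": "generate"},
    ),
}


def run(entry: str, state: State, max_steps: int = 10) -> Tuple[State, List[str]]:
    trace: List[str] = []
    node = entry
    for _ in range(max_steps):  # guard against endless retry loops
        if node == END:
            break
        state = {**state, **NODES[node](state)}
        trace.append(node)
        if node in EDGES:
            node = EDGES[node]
        else:
            chooser, mapping = CONDITIONAL[node]
            node = mapping[chooser(state)]
    return state, trace


final_state, trace = run("retrieve", {"question": "q"})
print(trace)
```

In this run the stub hallucination check fails once, so `generate` appears twice in the trace before the walk reaches the end sentinel.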

In [61]:
# Compile
app = workflow.compile()

# Test

inputs = {"question": "What are the types of agent memory?"}
for output in app.stream(inputs):
    for key, value in output.items():
        pprint(f"Finished running: {key}:")
        
        
pprint(f"bot -> : {value}")

---RETRIEVE FROM VECTOR STORE---
'Finished running: retrieve:'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
'Finished running: relevance_check:'
---GENERATE---
---CHECK HALLUCINATIONS---
'Finished running: generate:'
("bot -> : {'question': 'What are the types of agent memory?', 'generation': "
 "'According to the provided context, there are several types of memory in "
 'human brains. The specific type mentioned is Sensory Memory, which includes '
 'iconic memory (visual), echoic memory (auditory), and haptic memory (touch). '
 'Additionally, a "Memory stream" is mentioned as a long-term memory module '
 "that records a comprehensive list of agents\\' experience in natural "
 "language.', 'documents': [Document(page_content='Fig. 7. Comparison of AD, "
 'ED, source policy and RL^2 on environments that require m