# Agents: Some key concepts
---

### What is an Agent?

In the context of large language models (`LLMs`), an agent is an autonomous or semi-autonomous system that makes decisions and takes actions on behalf of a user, within a given environment and state, in order to accomplish tasks. More concretely, agents are `LLMs`, or systems powered by `LLMs`, that have access to a user environment and to external sources (such as data), and that accomplish user tasks by breaking each request into smaller components. An agent performs these tasks using the components described below:

### Tools
Tools are simple functions (for example, Python functions) that agents use to interact with `APIs` or external data sources. Tools come in many kinds, but they fall into two categories:

- `Built-in tools`: Several agentic frameworks ship with ready-made tools that your agent can use, such as:
    1. Web search tools to retrieve information from the web (such as `DuckDuckGo`, `Bing Search`, `Wikipedia API`).
    1. Database tools such as `SQL Database connector` to execute SQL queries against relational databases.
    1. Different frameworks provide a host of different built-in tools. For example, view some built-in tools offered by `LangChain` [here](https://python.langchain.com/docs/concepts/tools/).

- `Custom tools`: You can build your own tools that perform various actions by interacting with custom or proprietary `APIs` of your choice. Some examples:
    1. **Document parsing**: Extract and analyze text from PDF documents, create summaries of long documents, etc. 
    1. **Task management**: Create and assign tasks in project management systems for your specific use cases and customer stories.
    1. **Budget analyzer**: Categorize and analyze spending patterns across your company infrastructure using custom `APIs`.
    1. **Code analyzer**: Review code for bugs or improvements using your own fine-tuned model.

**Note**: Writing a tool is as simple as writing a function. Different frameworks offer different ways to write tools. In this lab, we will go over how to write tools in `LangGraph`.
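
To illustrate the idea that a tool is just a function, here is a minimal plain-Python sketch. It does not use any agent framework: the `TOOLS` registry, the `register_tool` decorator and the `categorize_expense` tool are hypothetical names invented for this example, and real frameworks provide their own ways of declaring tools.

```python
from typing import Callable, Dict

# Hypothetical registry that maps tool names to plain Python functions.
TOOLS: Dict[str, Callable[..., str]] = {}


def register_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function as a tool under its own name."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def categorize_expense(description: str) -> str:
    """Toy budget-analyzer tool: map an expense description to a category."""
    text = description.lower()
    if "server" in text or "cloud" in text:
        return "infrastructure"
    if "flight" in text or "hotel" in text:
        return "travel"
    return "other"


def call_tool(name: str, **kwargs: str) -> str:
    """Look up a tool by name and call it with keyword arguments."""
    return TOOLS[name](**kwargs)


print(call_tool("categorize_expense", description="Monthly cloud hosting bill"))
```

An agent would pick the tool name and arguments from the model's output; here the call is made by hand to keep the sketch self-contained.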

### Workflows
Workflows are predefined sequences of operations that define how an agent processes information and takes action. For example, a single agent can have access to multiple tools and be prompted to use them in one of the following ways:

1. `Sequential`: the agent executes tools one by one.
1. `Branching`: the agent dynamically determines which tool to call based on the step it took previously.
1. `Cyclic`: the agent repeats the same set of steps, with a human in the loop to review and iterate on the new results.
1. A combination of these workflows, with multiple agents interacting with one another.
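
The three basic workflow shapes can be sketched in plain Python. This is only an illustration under simplifying assumptions: each step is a function from text to text, and the helper names (`run_sequential`, `run_branching`, `run_cyclic`) are hypothetical rather than part of any framework. The `is_approved` callback stands in for the human reviewer.

```python
from typing import Callable, Dict, List

Step = Callable[[str], str]


def run_sequential(steps: List[Step], text: str) -> str:
    """Run every step in order, feeding each output into the next step."""
    for step in steps:
        text = step(text)
    return text


def run_branching(router: Callable[[str], str], branches: Dict[str, Step], text: str) -> str:
    """Ask the router which branch applies, then run only that branch."""
    return branches[router(text)](text)


def run_cyclic(step: Step, is_approved: Callable[[str], bool], text: str, max_rounds: int = 3) -> str:
    """Repeat a step until the reviewer approves or the round limit is reached."""
    for _ in range(max_rounds):
        text = step(text)
        if is_approved(text):
            break
    return text


sequential = run_sequential([str.strip, str.upper], "  hello ")
branched = run_branching(
    lambda t: "question" if t.endswith("?") else "statement",
    {"question": lambda t: "answer: " + t, "statement": lambda t: "ack: " + t},
    "ok",
)
cycled = run_cyclic(lambda t: t + "!", lambda t: t.count("!") >= 2, "draft")
print(sequential, branched, cycled)
```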

### Persistence/Memory
Imagine an agent powering a chatbot that serves thousands of users. In that case, it becomes important to retain the memory and the state of the agent for each user's interactions. This is essential for:
1. Maintaining conversation context.
1. Resuming interrupted operations.
1. Tracking progress across multiple user sessions.

Using `LangGraph`, you can retain two kinds of memory:
- `short term memory`: thread-scoped memory that can be recalled at any time from within a single conversational thread with a user.
- `long term memory`: memory that is shared across conversational threads and can be recalled at any time and in any thread. Memories are scoped to any custom namespace, not just to a single thread ID.
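
The difference in scope can be sketched with two plain dictionaries. This is not how `LangGraph` stores memory internally; it is only a toy model, and `short_term`, `long_term`, `remember_turn` and `remember_fact` are hypothetical names. Short-term memory is keyed by a thread ID, while long-term memory is keyed by a namespace tuple that any thread can read.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

# Short-term memory: one message list per conversational thread.
short_term: Dict[str, List[str]] = defaultdict(list)
# Long-term memory: key/value items grouped under a namespace tuple.
long_term: Dict[Tuple[str, ...], Dict[str, Any]] = defaultdict(dict)


def remember_turn(thread_id: str, message: str) -> None:
    """Append a message to the history of a single thread."""
    short_term[thread_id].append(message)


def remember_fact(namespace: Tuple[str, ...], key: str, value: Any) -> None:
    """Store a fact under a namespace that is independent of any thread."""
    long_term[namespace][key] = value


remember_turn("thread-1", "I like boating")
remember_fact(("preferences", "user-42"), "hobby", "boating")

print(short_term["thread-1"])   # history of the thread that was written to
print(short_term["thread-2"])   # a different thread starts with no history
print(long_term[("preferences", "user-42")]["hobby"])  # readable from any thread
```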

### Human-in-the-loop (`HITL`)
Agents act autonomously, so their behavior needs to be evaluated. This requires iterative testing and a human who can check the steps an agent takes and make sure the agent is executing the right steps and tools. For this, it is important to have a human in the loop.
- This workflow integrates human input into automated processes, allowing a person to make decisions, validate results, and apply corrections.
- It is useful in LLM-based applications for checking accuracy.
- Examples include:
    1. **Reviewing tool calls**: Humans can review, edit or approve the tool calls that the agent proposes.
    1. **Validating `LLM` outputs**: Humans can review, edit or approve the outputs generated by `LLMs`.
    1. **Providing context**: You can have the `agent` or `LLM` return control to the user. This means that the agent will pause and ask the human or user for validation before executing an action. (For example, asking for more information, or clarifying steps to take)
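
As a rough illustration of reviewing tool calls, the sketch below gates every tool call behind an approval callback. It is a plain-Python stand-in, not a framework feature: `ToolCall`, `execute_with_review` and the `reviewer` function are hypothetical, and in a real application the callback would pause and wait for an actual person.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    name: str
    args: Dict[str, str]


def execute_with_review(
    call: ToolCall,
    tools: Dict[str, Callable[..., str]],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Run the tool only if the reviewer approves the proposed call."""
    if not approve(call):
        return f"Tool call '{call.name}' was rejected by the reviewer."
    return tools[call.name](**call.args)


def reviewer(call: ToolCall) -> bool:
    """Stand-in for a human: only allow emails to an internal domain."""
    return call.args.get("to", "").endswith("@example.com")


tools = {"send_email": lambda to: f"email sent to {to}"}
approved = execute_with_review(ToolCall("send_email", {"to": "ana@example.com"}), tools, reviewer)
rejected = execute_with_review(ToolCall("send_email", {"to": "bob@elsewhere.org"}), tools, reviewer)
print(approved)
print(rejected)
```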

### Prompting
Prompting is a technique for giving an `LLM` instructions that control how it processes user inputs and generates responses. In agentic systems, prompts are essential for:
1. **Defining the agent's role or capabilities**: Your agent solution might have several agents working together. In that case, prompts let you give each agent its own instructions and guidance.
1. **Structuring the agent's thinking process**: Your agent's thinking process is defined by the instructions that you provide in the prompt. This includes, for example, the order in which the agent should execute certain tools.

**Note**: Prompt engineering is a separate topic and more information on this can be found [here](https://aws.amazon.com/what-is/prompt-engineering/). Prompt templates differ from model to model, so tune them for the model you use to get the best performance for your agent use case.

### State management
State management in agents refers to tracking and updating the variables that the agent uses, based on the conversation history, task progress, available resources and user preferences.
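
A minimal sketch of this idea, assuming the state is a typed dictionary and every step returns only the fields it changed. `AgentState` and `apply_update` are hypothetical names used for illustration; merging into a new dictionary keeps the previous snapshot intact, which makes each step easy to inspect.

```python
from typing import Dict, List, TypedDict


class AgentState(TypedDict):
    history: List[str]            # conversation history
    steps_done: int               # task progress
    preferences: Dict[str, str]   # user preferences


def apply_update(state: AgentState, update: Dict[str, object]) -> AgentState:
    """Return a new state with `update` merged in; the input state is left untouched."""
    return {**state, **update}


initial: AgentState = {"history": [], "steps_done": 0, "preferences": {"budget": "low"}}
after = apply_update(initial, {"history": ["user: hi"], "steps_done": 1})
print(after)
```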

## Agent frameworks
There are several agent frameworks that can help us build stateful, persistent agents. These frameworks are either open source (such as [LangGraph](https://www.langchain.com/langgraph), [Letta](https://www.letta.com/), [LangChain](https://www.langchain.com/), [CrewAI](https://www.crewai.com/)) or offered by AWS, such as [Agents on Amazon Bedrock](https://aws.amazon.com/bedrock/agents/). In this lab, we will use `LangGraph` as an example to build a simple agent with memory.

# Build a simple agent using `LangGraph`

In this lab, we will use `LangGraph` to build a simple agent. `LangGraph` is an open-source, low-level orchestration framework for building controllable agents. This library enables agent orchestration, offering customizable architectures and features such as long-term memory, `HITL`, and custom tools. `LangGraph` provides low-level supporting infrastructure that sits underneath any workflow or agent. It does not abstract prompts or architecture, and provides three central benefits:

#### Persistence
`LangGraph` has a persistence layer, which offers a number of benefits:

1. `Memory`: LangGraph persists arbitrary aspects of your application's state, supporting memory of conversations and other updates within and across user interactions;
1. `Human-in-the-loop`: Because state is `checkpointed`, execution can be interrupted and resumed, allowing for decisions, validation, and corrections via human input.

#### Streaming
`LangGraph` also provides support for streaming workflow / agent state to the user (or developer) over the course of execution. `LangGraph` supports streaming of both events (such as feedback from a tool call) and tokens from LLM calls embedded in an application.

#### Debugging and Deployment
`LangGraph` provides an easy onramp for testing, debugging, and deploying applications via `LangGraph` Platform. This includes Studio, an IDE that enables visualization, interaction, and debugging of workflows or agents. This also includes numerous options for deployment.

#### In this lab we will:

1. Build a simple agent powered by an `LLM`.

1. Create the `LangGraph` graph, whose `nodes` are Python functions and whose `edges` control the flow between the nodes.

1. Invoke the agent.

1. Persist short term and long term memory within the agent.

### Set up and imports

In [None]:
# first, let's import the libraries required to build the agent in this notebook
from typing import TypedDict, Annotated, List
from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_core.runnables.graph import MermaidDrawMethod
from IPython.display import Image, display

### LangGraph components
---

`LangGraph`, at its core, models agent workflows as graphs. These graphs contain nodes, edges and a state. LangGraph revolves around the concept of a stateful graph, where each node represents a step in the application (an agent or an action) and the graph maintains a state that is passed around and updated as the computation progresses. You can define the behavior of your agent using these components:

1. `State`: A shared data structure that represents the current snapshot of your application. The State schema serves as the input schema for all Nodes and Edges in the graph.

1. `StateGraph`: The graph class that you instantiate with your `State` schema and to which you then add nodes and edges.

1. `Nodes`: These are Python functions that contain the logic of your agents. There are two main kinds of nodes:
    - `Agent node`: An agent node is responsible for deciding on which actions to take.
    - `Tool node`: The tool node will orchestrate calling a respective tool, performing some computation and then returning the output back to the user with the updated state.

1. `Edges`: These control the flow between nodes. These are python functions that contain logic to determine which `Node` to call next based on the current `state`. This is a big part of how your agents work and how different nodes communicate with one another. There are a few concepts to edges:

    - **Normal Edges**: Go directly from one node to the next.
    - **Conditional Edges**: Call a function to determine which node(s) to go to next.
    - **Entry Point**: Which node to call first when user input arrives.
    - **Conditional Entry Point**: Call a function to determine which node(s) to call first when user input arrives.
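
To make nodes, normal edges and conditional edges concrete, here is a small plain-Python graph runner. It is a teaching sketch and not `LangGraph` itself: `run_graph`, `STOP` and the three node functions are hypothetical names. A normal edge is a fixed node name, and a conditional edge is a function that inspects the state and returns the name of the next node.

```python
from typing import Any, Callable, Dict, Union

State = Dict[str, Any]
Node = Callable[[State], State]
Edge = Union[str, Callable[[State], str]]  # fixed target, or a router function
STOP = "__stop__"  # hypothetical terminal marker for this sketch


def run_graph(nodes: Dict[str, Node], edges: Dict[str, Edge], entry: str, state: State) -> State:
    """Walk the graph from the entry point, merging each node's update into the state."""
    current = entry
    while current != STOP:
        state = {**state, **nodes[current](state)}
        edge = edges[current]
        current = edge(state) if callable(edge) else edge
    return state


def agent(state: State) -> State:
    # Agent node: decide whether a tool is needed.
    return {"needs_tool": "weather" in state["question"]}


def weather_tool(state: State) -> State:
    # Tool node: do some work and record the result.
    return {"answer": "sunny"}


def respond(state: State) -> State:
    return {"answer": state.get("answer", "no tool needed")}


nodes = {"agent": agent, "tool": weather_tool, "respond": respond}
edges = {
    "agent": lambda s: "tool" if s["needs_tool"] else "respond",  # conditional edge
    "tool": "respond",                                            # normal edge
    "respond": STOP,
}

with_tool = run_graph(nodes, edges, "agent", {"question": "what is the weather?"})
without_tool = run_graph(nodes, edges, "agent", {"question": "hello"})
print(with_tool["answer"], "|", without_tool["answer"])
```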



### Step 1: Define the Agent State

We will first define the agent state that will remain throughout the operation. This is the state of the graph and the state schema serves as the input for all nodes and edges in the graph.

In [None]:
class PlannerState(TypedDict):
    """
    This is the state of the planner agent. It contains the following fields:
    - `messages`: The messages in the conversation.
    - `itinerary`: The itinerary generated by the agent.
    - `city`: The city for which the itinerary is generated.
    - `user_message`: The user message containing the query.
    """
    messages: Annotated[List[HumanMessage | AIMessage], "The messages in the conversation"]
    itinerary: str
    city: str
    user_message: str

### Step 2: Set up language models and prompts

Next, we will set up the LLM and a prompt template that our agent will use to plan the trip.

In [None]:
# represents the global variables used across this notebook
BEDROCK_RUNTIME: str = 'bedrock-runtime'
# Model ID used by the agent
AMAZON_NOVA_LITE_MODEL_ID: str = "us.amazon.nova-lite-v1:0"
PROVIDER_ID : str = 'amazon'
# Inference parameters
TEMPERATURE: float = 0.1
MAX_TOKENS: int = 512

In [None]:
import boto3
import logging
# We are importing this to use any model supported on Amazon Bedrock. In this example
# we will be using the Amazon Nova lite model.
from langchain_aws import ChatBedrockConverse
# This helps checkpoint the memory state of the agent for short term/long term memory
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.runnables.config import RunnableConfig
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

In [None]:
# set a logger
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [None]:
session = boto3.session.Session()
region = session.region_name
logger.info(f"Running this example in region: {region}")

# Initialize the Bedrock runtime client in the session's region, using the service name defined above
bedrock_client = session.client(BEDROCK_RUNTIME, region_name=region)

In [None]:
# Create the llm used by the agent
llm = ChatBedrockConverse(
    model=AMAZON_NOVA_LITE_MODEL_ID,
    provider=PROVIDER_ID, 
    temperature=TEMPERATURE, 
    max_tokens=MAX_TOKENS,
    client=bedrock_client,
)

# Initialize the itinerary prompt that will be used across the agent workflow. It gives the agent instructions and guidance
# for executing tasks based on the user request. This prompt will be used by the LLM.

# Note that this prompt contains placeholder variables {city} and {user_message} that will be replaced with the city name and the user input respectively.
ITINERARY_PROMPT = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful travel assistant. Create a day trip itinerary for {city} based on the user's interests. 
    Follow these instructions:
    1. Use the below chat conversation and the latest input from Human to get the user interests.
    2. Always account for travel time and meal times - if it's not possible to do everything, then say so.
    3. If the user hasn't stated a time of year or season, assume summer season in {city} and state this assumption in your response.
    4. If the user hasn't stated a travel budget, assume a reasonable dollar amount and state this assumption in your response.
    5. Provide a brief, bulleted itinerary in chronological order with specific hours of day."""),
    MessagesPlaceholder("chat_history"),
    ("human", "{user_message}"),
])
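
Conceptually, filling in a chat prompt like this yields an ordered list of messages: the system instructions with their placeholders substituted, then the prior chat history, then the latest human message. The sketch below imitates that ordering in plain Python. It is not the `ChatPromptTemplate` implementation, and `assemble_messages` and the `(role, content)` tuples are assumptions made for illustration.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def assemble_messages(
    system_template: str,
    history: List[Message],
    user_message: str,
    **variables: str,
) -> List[Message]:
    """Fill the system template, then splice in the history before the new human message."""
    system = system_template.format(**variables)
    return [("system", system), *history, ("human", user_message)]


history = [("human", "Plan a day in Seattle"), ("ai", "Here is a draft itinerary...")]
messages = assemble_messages(
    "You are a helpful travel assistant. Create a day trip itinerary for {city}.",
    history,
    "Can you add a picnic?",
    city="Seattle",
)
for role, content in messages:
    print(f"{role}: {content}")
```

Keeping the history between the system message and the newest human message is what lets a follow-up request such as "add a picnic" be read in the context of the earlier itinerary.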

### Define the nodes and Edges
---

Next, we will add nodes, edges and persistent memory to the `StateGraph` before we compile it. The workflow covers the following steps:

1. Accept the user's travel plans.
1. Invoke the model on Amazon Bedrock.
1. Generate the travel plan for the day.
1. Allow the user to add to or modify the plan.

In [None]:
# The first node passes the user message through and makes sure the message history exists
def input_interest(state: PlannerState) -> PlannerState:
    """
    Entry node of the graph. The user message is already present in the state
    (it is supplied when the graph is invoked), so this node only initializes
    the `messages` history to an empty list when no history exists yet.
    """
    # Keep any history restored by the checkpointer; otherwise start with an empty list
    return {
        **state,
        'messages': state.get('messages') or []
    }

# The next node invokes the LLM to create the itinerary based on the user's interests
def create_itinerary(state: PlannerState) -> PlannerState:
    """
    This function takes a PlannerState object as input and invokes the LLM with the itinerary prompt.
    It appends the user message and the model response to `messages`, stores the response
    in `itinerary`, and returns the updated state.
    """
    # Format the prompt with the city, the user message and the chat history accumulated so far.
    response = llm.invoke(ITINERARY_PROMPT.format_messages(city=state['city'], user_message=state['user_message'], chat_history=state['messages']))
    print("\nItinerary plan:")
    print(response.content)
    return {
        **state,
        'messages': [
            *state['messages'],
            HumanMessage(content=state['user_message']),
            AIMessage(content=response.content)
        ],
        'itinerary': response.content
    }

### Create and Compile the Graph
---

In this portion of the notebook, we will create our `LangGraph` workflow and then compile it.

1. First, we initialize the `StateGraph` with the `State` class that we defined above.
1. Then, we add our nodes and edges.
1. We call `set_entry_point` to indicate where the graph starts. This is equivalent to adding an edge from the special `START` node, which sends user input to the graph.
1. The `END` node is a special node that represents a terminal node.

In [None]:
# initialize the StateGraph
workflow = StateGraph(PlannerState)
# Next, we will add our nodes to this workflow
workflow.add_node("input_user_interests", input_interest)
workflow.add_node("create_itinerary", create_itinerary)
workflow.set_entry_point("input_user_interests")
# Next, we will add a direct edge between input interests and create itinerary
workflow.add_edge("input_user_interests", "create_itinerary")
workflow.add_edge("create_itinerary", END)

### What is Memory?
Memory is a cognitive function that allows people to store, retrieve, and use information to understand their present and future.

As AI agents undertake more complex tasks involving numerous user interactions, equipping them with memory becomes equally crucial for efficiency and user satisfaction. With memory, agents can learn from feedback and adapt to users' preferences. This guide covers two types of memory based on recall scope:

1. `Short-term memory`, or thread-scoped memory, can be recalled at any time from within a single conversational thread with a user. `LangGraph` manages short-term memory as a part of your agent's state. State is persisted to a database using a `checkpointer` so the thread can be resumed at any time. Short-term memory updates when the graph is invoked or a step is completed, and the State is read at the start of each step.

1. `Long-term memory` is shared across conversational threads. It can be recalled at any time and in any thread. Memories are scoped to any custom namespace, not just within a single thread ID. `LangGraph` provides stores to let you save and recall long-term memories.

Both are important to understand and implement for your application.

In [None]:
# Next, we create a checkpointer so that the graph can persist its state.
# MemorySaver keeps the checkpoints in memory and covers the state of the entire graph.
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [None]:
# Now that we have compiled the graph, we can view it in a mermaid diagram. You can use Mermaid
# to generate diagrams from markdown-like text.

display(
    Image(
        app.get_graph().draw_mermaid_png(
            draw_method=MermaidDrawMethod.API,
        )
    )
)

From the graph above, we can see that the user's message (their interests for the itinerary) enters at the start point. The next node then invokes the LLM with the custom prompt template to create the itinerary. This is a "hello world" example of a simple agent on `LangGraph`.

### Define the function that runs the graph
---

When we compile the graph, we turn it into a `LangChain` Runnable, which automatically enables calling `.invoke()`, `.stream()` and `.batch()` with your inputs. In the following example, we run `.stream()` to invoke the graph with inputs.

In [None]:
def run_travel_planner(user_request: str, city_request: str, config_dict: dict):
    print(f"Current User Request: {user_request}\n")
    print(f"City of interest: {city_request}\n")
    init_input = {"user_message": user_request,"city" : city_request}

    for output in app.stream(init_input, config=config_dict, stream_mode="values"):
        pass  # The nodes themselves now handle all printing
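
To show what consuming a stream of state values looks like, here is a plain-Python sketch in which a generator yields a snapshot of the full state, first for the input and then after each step. It is an analogy rather than `LangGraph`'s implementation, and `stream_values`, `add_city` and `add_itinerary` are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterator, List

State = Dict[str, Any]


def stream_values(steps: List[Callable[[State], State]], state: State) -> Iterator[State]:
    """Yield a copy of the state for the input and again after every step."""
    yield dict(state)
    for step in steps:
        state = {**state, **step(state)}
        yield dict(state)


def add_city(state: State) -> State:
    return {"city": "Seattle"}


def add_itinerary(state: State) -> State:
    return {"itinerary": f"Day trip in {state['city']}"}


snapshots = list(stream_values([add_city, add_itinerary], {"user_message": "plan a trip"}))
for snapshot in snapshots:
    print(snapshot)
```

A caller that only needs the final result can ignore the intermediate snapshots, which is what the `pass` in the loop above does.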

In [None]:
# let's run the example

# thread_id within the configurable object serves as a unique identifier for the conversation session. The graph uses this ID to
# track which conversation a message belongs to and to maintain separate conversation histories for different users or sessions
config = {"configurable": {"thread_id": "1"}}

# Example request: I want to create an itinerary for a day trip in Seattle with boating and swimming options. Make it extremely comprehensive and include meals as well
user_request = input("Enter your travel request: ")
# Seattle
city = input("Enter the city: ")

# Use them in the function call
run_travel_planner(user_request, city, config)

### Leverage the memory saver to manipulate the Graph State

1. Since the conversation messages are part of the graph state, we can use them in follow-up requests.

1. However, the graph state is tied to a session: the session ID is passed in as the `thread_id`.

1. If we send a request with a different thread ID, it creates a new session that does not have the previous interests.

1. The checkpoint also holds the other state variables, so this pattern works well for asynchronous workflows.

In [None]:
# let's run a follow-up request on the same thread

# Reusing thread_id "1" continues the conversation session from the previous cell,
# so the graph restores that session's message history
config = {"configurable": {"thread_id": "1"}}

# Can you add picnic to this itinerary?
user_request = input("Enter your travel request: ")
# Seattle
city = input("Enter the city: ")

# Use them in the function call
run_travel_planner(user_request, city, config)

A new section for the picnic should be added to the previous response, since we used the same thread ID for this call:

```
Picnic Lunch at Gas Works Park
- **Location:** 1421 N. Lake Washington Blvd, Seattle, WA 98109
- **Description:** Head to Gas Works Park for a picnic lunch. The park offers beautiful views of the Seattle skyline and Lake Union.
```

In [None]:
# Now, let's run with another thread ID. That thread has no history, so we should see a different response:
# a new itinerary that is unrelated to the two examples above.
config = {"configurable": {"thread_id": "13"}}

# Can you add picnic to this itinerary?
user_request = input("Enter your travel request: ")
# Seattle
city = input("Enter the city: ")

# Use them in the function call
run_travel_planner(user_request, city, config)

### Memory

Memory is key for any agentic conversation that is `Multi-Turn` or involves `Multi-Agent` collaboration, and even more so if it spans multiple days. The three main aspects of agents are:

1. Tools
1. Memory
1. Planners

### Types of Memory

Consider the frustration of working with a colleague who forgets everything you tell them, requiring constant repetition! Agents without memory create the same problem. As covered in the earlier section on memory, there are two types based on recall scope: short-term (thread-scoped) memory, which `LangGraph` persists as part of the agent's state through a checkpointer, and long-term memory, which is shared across threads and saved in stores under custom namespaces.

In this example lab, we will apply `multi-thread`, `multi-session` persistence to chat messages. That way, we can retrieve memory across sessions and chats for several users. In production, you would ideally use a persistent store such as Redis to save messages per session.

### Memory Management

1. Several patterns are possible. Each agent can have its own session memory.
1. Alternatively, the whole graph can have a combined memory, within which each agent gets its own section.

The `MemorySaver` and the Store separate sections of memory by namespace or by thread ID. These can be used in two ways: (1) use graph-level messages or memory, or (2) give each agent its own memory, either through its own space in the saver or through its own saver, as is done in the `ReAct` agent.

In [None]:
from langgraph.store.base import BaseStore, Item, Op, Result
from langgraph.store.memory import InMemoryStore
from typing import Any, Iterable, Literal, NamedTuple, Optional, Union, cast

# CustomMemoryStore is a wrapper class that implements the BaseStore interface
class CustomMemoryStore(BaseStore):
    def __init__(self, ext_store):
        # Initialize with an external store that will handle the actual storage
        self.store = ext_store

    def get(self, namespace: tuple[str, ...], key: str) -> Optional[Item]:
        # Retrieve an item from the store using namespace and key
        return self.store.get(namespace, key)

    def put(self, namespace: tuple[str, ...], key: str, value: dict[str, Any]) -> None:
        # Store a value in the store using namespace and key
        return self.store.put(namespace, key, value)
        
    def batch(self, ops: Iterable[Op]) -> list[Result]:
        # Execute multiple operations in a batch
        return self.store.batch(ops)
        
    async def abatch(self, ops: Iterable[Op]) -> list[Result]:
        # Execute multiple operations asynchronously in a batch.
        # Await the inner call so that callers receive results rather than a coroutine object.
        return await self.store.abatch(ops)
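
When a wrapper class delegates an async method to an inner object, the inner call needs to be awaited so that callers get results instead of a coroutine object. The sketch below shows the delegation pattern in isolation with the standard library only. `InnerStore` and `StoreWrapper` are hypothetical stand-ins and do not implement the `BaseStore` interface.

```python
import asyncio
from typing import List


class InnerStore:
    """Stand-in for a real store: doubles each operation value asynchronously."""

    async def abatch(self, ops: List[int]) -> List[int]:
        await asyncio.sleep(0)  # simulate asynchronous I/O
        return [op * 2 for op in ops]


class StoreWrapper:
    """Delegates every call to the wrapped store."""

    def __init__(self, inner: InnerStore) -> None:
        self.inner = inner

    async def abatch(self, ops: List[int]) -> List[int]:
        # Await the inner coroutine so the caller receives the results themselves.
        return await self.inner.abatch(ops)


results = asyncio.run(StoreWrapper(InnerStore()).abatch([1, 2, 3]))
print(results)
```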

In [None]:
in_memory_store = CustomMemoryStore(InMemoryStore())
namespace_u = ("chat_messages", "user_id_1")
key_u="user_id_1"
in_memory_store.put(namespace_u, key_u, {"data":["list a"]})
item_u = in_memory_store.get(namespace_u, key_u)
print(item_u.value, item_u.value['data'])

in_memory_store.list_namespaces()

### Create a similar graph as earlier. Note that the graph state will not hold any messages, since they have been externalized to the store

In [None]:
class PlannerState(TypedDict):
    itinerary: str
    city: str
    user_message: str

In [None]:
def input_interests(state: PlannerState, config: RunnableConfig, *, store: BaseStore) -> PlannerState:
    # The user message is already in the state (supplied at invocation), so pass the state through unchanged
    return {
        **state,
    }

def create_itinerary(state: PlannerState, config: RunnableConfig, *, store: BaseStore) -> PlannerState:
    # Get this thread's chat history from the store; each thread ID maps to its own namespace and key
    user_u = f"user_id_{config['configurable']['thread_id']}"
    namespace_u = ("chat_messages", user_u)
    store_item = store.get(namespace=namespace_u, key=user_u)
    # Fall back to an empty history on the first request of a thread
    chat_history_messages = store_item.value['data'] if store_item else []
    print(user_u, chat_history_messages)

    response = llm.invoke(ITINERARY_PROMPT.format_messages(city=state['city'], user_message=state['user_message'], chat_history=chat_history_messages))
    print("\nFinal Itinerary:")
    print(response.content)

    # Append the new exchange to the history and write it back to the store
    store.put(
        namespace=namespace_u,
        key=user_u,
        value={"data": chat_history_messages + [HumanMessage(content=state['user_message']), AIMessage(content=response.content)]},
    )
    
    return {
        **state,
        "itinerary": response.content
    }

In [None]:
in_memory_store_n = CustomMemoryStore(InMemoryStore())

workflow = StateGraph(PlannerState)

workflow.add_node("input_interests", input_interests)
workflow.add_node("create_itinerary", create_itinerary)
workflow.set_entry_point("input_interests")
workflow.add_edge("input_interests", "create_itinerary")
workflow.add_edge("create_itinerary", END)


app = workflow.compile(store=in_memory_store_n)

In [None]:
def run_travel_planner(user_request: str, config_dict: dict):
    print(f"Current User Request: {user_request}\n")
    init_input = {"user_message": user_request,"city" : "London"}

    for output in app.stream(init_input, config=config_dict, stream_mode="values"):
        pass  # The nodes themselves now handle all printing

config = {"configurable": {"thread_id": "2"}}

user_request = "Can you create an itinerary for a day trip in London? I need a complete plan that budgets for travel time and meal time."
run_travel_planner(user_request, config)

In [None]:
config = {"configurable": {"thread_id": "2"}}

user_request = "Can you add something else to this?"
run_travel_planner(user_request, config)

### Quick look at the store
This shows the history of the chat messages.

In [None]:
print(in_memory_store_n.list_namespaces())
print(in_memory_store_n.get(('chat_messages', 'user_id_2'),'user_id_2').value)