# Deployment

## Review

We built up to an agent with memory:

**act** - let the model call specific tools \
**observe** - pass the tool output back to the model \
**reason** - let the model reason about the tool output to decide what to do next (e.g., call another tool or respond directly) \
**persist state** - use an in-memory checkpointer to support long-running conversations with interruptions
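To make the four pieces concrete, here is a toy sketch in plain Python. It is not the LangGraph implementation: `fake_model` is a scripted stand-in for an LLM, `TOOLS` and `CHECKPOINTS` are hypothetical names, and the dictionary keyed by thread ID only imitates what an in-memory checkpointer does.

```python
from typing import Callable, Dict, List


def multiply(a: int, b: int) -> int:
    """Toy tool; the name mirrors the tool called later in this notebook."""
    return a * b


# Hypothetical tool registry: tool name -> callable.
TOOLS: Dict[str, Callable[..., int]] = {"multiply": multiply}

# Stand-in for an in-memory checkpointer: thread ID -> message history.
CHECKPOINTS: Dict[str, List[dict]] = {}


def fake_model(messages: List[dict]) -> dict:
    """Scripted stand-in for an LLM: request a tool once, then answer."""
    last = messages[-1]
    if last["type"] == "tool":
        # reason: a tool result is available, so respond directly
        return {"type": "ai", "content": f"The result is {last['content']}.", "tool_calls": []}
    # act: request the multiply tool (arguments are hard-coded in this sketch)
    return {"type": "ai", "content": "",
            "tool_calls": [{"name": "multiply", "args": {"a": 3, "b": 2}}]}


def run_agent(thread_id: str, user_text: str) -> List[dict]:
    # persist state: reuse the history stored for this thread, if any
    messages = CHECKPOINTS.setdefault(thread_id, [])
    messages.append({"type": "human", "content": user_text})
    while True:
        ai = fake_model(messages)
        messages.append(ai)
        if not ai["tool_calls"]:
            return messages
        for call in ai["tool_calls"]:
            result = TOOLS[call["name"]](**call["args"])
            # observe: feed the tool output back to the model
            messages.append({"type": "tool", "content": str(result)})


history = run_agent("thread-1", "Multiply 3 by 2.")
print(history[-1]["content"])
```

Calling `run_agent` again with the same thread ID appends to the stored history, which is the behavior the checkpointer provides for the real agent.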

**Goals**

Now we'll cover how to deploy our agent, both locally to Studio and to LangGraph Cloud.

In [1]:
from langgraph_sdk import get_client

In [2]:
# Replace this with the URL of your own deployed graph
URL = "http://localhost:52563"
client = get_client(url=URL)

# Search all hosted graphs
assistants = await client.assistants.search()

In [4]:
assistants

[{'assistant_id': 'fe096781-5601-53d2-b2f6-0d3403f7e9ca',
  'graph_id': 'agent',
  'created_at': '2024-11-24T11:50:25.808584+00:00',
  'updated_at': '2024-11-24T11:50:25.808584+00:00',
  'config': {},
  'metadata': {'created_by': 'system'},
  'version': 1,
  'name': 'agent'},
 {'assistant_id': '228f9934-0cdd-5383-92c8-ee8422522cc2',
  'graph_id': 'router',
  'created_at': '2024-11-23T18:07:20.528540+00:00',
  'updated_at': '2024-11-23T18:07:20.528540+00:00',
  'config': {},
  'metadata': {'created_by': 'system'},
  'version': 1,
  'name': 'router'}]
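The search result is a plain list of dictionaries, so picking out one graph is ordinary list filtering. The sketch below uses a trimmed copy of the output above; `find_assistant` is a hypothetical helper, not part of the SDK.

```python
from typing import List, Optional


def find_assistant(assistants: List[dict], graph_id: str) -> Optional[dict]:
    """Return the first assistant whose graph_id matches, or None."""
    return next((a for a in assistants if a["graph_id"] == graph_id), None)


# Trimmed copy of the search output shown above (only the keys used here).
sample_assistants = [
    {"assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca", "graph_id": "agent"},
    {"assistant_id": "228f9934-0cdd-5383-92c8-ee8422522cc2", "graph_id": "router"},
]

agent = find_assistant(sample_assistants, "agent")
print(agent["assistant_id"])
```

Returning `None` for an unknown graph lets the caller decide whether a missing deployment is an error.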

In [5]:
# We create a thread for tracking the state of our run
thread = await client.threads.create()

Now we can run our agent with client.runs.stream, passing:

The **thread_id** \
The **graph_id** \
The **input** \
The **stream_mode**

We'll discuss streaming in depth in a future module.

For now, just recognize that we are streaming the full value of the state after each step of the graph with stream_mode="values".

Each chunk carries that state in chunk.data.
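The shape of that data can be sketched without a running server. Here `FakeChunk` is a hypothetical stand-in that assumes only the `event` and `data` attributes used by the streaming loop in this section, and `last_message` is an illustrative helper rather than an SDK function.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk: just an event name and data."""
    event: str
    data: Any


def last_message(chunk: FakeChunk) -> Optional[dict]:
    """Return the newest message in the streamed state, or None if there is none."""
    if chunk.event == "metadata" or not chunk.data:
        return None
    messages = chunk.data.get("messages") if isinstance(chunk.data, dict) else None
    return messages[-1] if messages else None


# Simulated stream: one metadata event, then the full state after each step.
chunks = [
    FakeChunk("metadata", {"run_id": "hypothetical-run-id"}),
    FakeChunk("values", {"messages": [{"type": "human", "content": "Multiply 3 by 2."}]}),
    FakeChunk("values", {"messages": [
        {"type": "human", "content": "Multiply 3 by 2."},
        {"type": "ai", "content": "",
         "tool_calls": [{"name": "multiply", "args": {"a": 3, "b": 2}}]},
    ]}),
]

latest = [m for m in map(last_message, chunks) if m is not None]
for message in latest:
    print(message["type"])
```

Because stream_mode="values" resends the whole state at every step, reading only the last message avoids printing earlier messages again.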

In [6]:
from langchain_core.messages import HumanMessage

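# Input: a single human message (named to avoid shadowing the built-in input())
input_message = HumanMessage(content="Multiply 3 by 2.")

# Stream the full state after each step of the graph
async for chunk in client.runs.stream(
    thread['thread_id'],
    "agent",
    input={"messages": [input_message]},
    stream_mode="values",
):
    # Skip the metadata event and print the newest message in the state
    if chunk.data and chunk.event != "metadata":
        print(chunk.data['messages'][-1])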

{'content': 'Multiply 3 by 2.', 'additional_kwargs': {'example': False, 'additional_kwargs': {}, 'response_metadata': {}}, 'response_metadata': {}, 'type': 'human', 'name': None, 'id': 'f99c5584-1210-4473-b15e-1e9868c00a15', 'example': False}
{'content': '', 'additional_kwargs': {'tool_calls': [{'id': 'call_m8dqeQGGZGdrIYR6BY86gn6x', 'function': {'arguments': '{"a":3,"b":2}', 'name': 'multiply'}, 'type': 'function'}], 'refusal': None}, 'response_metadata': {'token_usage': {'completion_tokens': 17, 'prompt_tokens': 135, 'total_tokens': 152, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0705bf87c0', 'finish_reason': 'tool_calls', 'logprobs': None}, 'type': 'ai', 'name': None, 'id': 'run-e291ce47-7364-4c6b-b7ea-82a27740b409-0', 'example': False, 'tool_calls': [{'name