# Deployment

## Concepts

`LangGraph`
- Python and JS library for building agentic workflows

`LangGraph API`
- Bundles the code
- Provides a task queue for managing asynchronous operations
- Offers persistence for maintaining state across interactions

`LangGraph Cloud`
- Hosts the API
- Allows graph deployment from GitHub repos
- Provides monitoring and tracing for deployed graphs
- Accessible via a unique URL for each deployment

`LangGraph Studio`
- Integrated Development Environment (IDE) for LangGraph applications
- Uses the API as its backend, allowing real-time testing and exploration of graphs
- Can be run locally or against a cloud deployment

`LangGraph SDK`
- Python library for programmatically interacting with LangGraph graphs
- Provides a consistent interface for working with graphs, whether served locally or in the cloud
- Allows creation of clients, access to assistants, thread management, and execution of runs

In [1]:
from langgraph_sdk import get_client

In [6]:
# This is the URL of the local development server
URL = "http://127.0.0.1:2024"
client = get_client(url=URL)

# Search all hosted graphs
assistants = await client.assistants.search()

In [7]:
assistants

[{'assistant_id': 'fe096781-5601-53d2-b2f6-0d3403f7e9ca',
  'graph_id': 'agent',
  'config': {},
  'metadata': {'created_by': 'system'},
  'name': 'agent',
  'created_at': '2025-06-22T16:41:55.422640+00:00',
  'updated_at': '2025-06-22T16:41:55.422640+00:00',
  'version': 1,
  'description': None},
 {'assistant_id': '228f9934-0cdd-5383-92c8-ee8422522cc2',
  'graph_id': 'router',
  'config': {},
  'metadata': {'created_by': 'system'},
  'name': 'router',
  'created_at': '2025-06-22T16:41:55.374179+00:00',
  'updated_at': '2025-06-22T16:41:55.374179+00:00',
  'version': 1,
  'description': None},
 {'assistant_id': '28d99cab-ad6c-5342-aee5-400bd8dc9b8b',
  'graph_id': 'simple_graph',
  'config': {},
  'metadata': {'created_by': 'system'},
  'name': 'simple_graph',
  'created_at': '2025-06-22T16:41:01.324689+00:00',
  'updated_at': '2025-06-22T16:41:01.324689+00:00',
  'version': 1,
  'description': None}]
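
The search result is a plain list of dictionaries, so picking out the assistant for a given graph is ordinary list handling. The sketch below is illustrative only: `find_assistant_id` is a hypothetical helper, not part of the SDK, and `assistants_sample` is a trimmed copy of two entries from the output above so that the snippet runs without a server.

```python
from typing import Optional

# Two entries copied from the search output above, trimmed to the fields used here.
assistants_sample = [
    {"assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca", "graph_id": "agent"},
    {"assistant_id": "228f9934-0cdd-5383-92c8-ee8422522cc2", "graph_id": "router"},
]


def find_assistant_id(assistants: list, graph_id: str) -> Optional[str]:
    """Hypothetical helper: return the assistant_id of the first assistant
    whose graph_id matches, or None if no assistant serves that graph."""
    for assistant in assistants:
        if assistant.get("graph_id") == graph_id:
            return assistant["assistant_id"]
    return None


agent_id = find_assistant_id(assistants_sample, "agent")
print(agent_id)
```

In the notebook, the same helper could be called on the live `assistants` list instead of the sample.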

In [8]:
# We create a thread for tracking the state of our run
thread = await client.threads.create()
thread

{'thread_id': 'fd2ed3f6-15ee-4ca5-b294-33fd93a2477a',
 'created_at': '2025-06-22T16:48:39.990282+00:00',
 'updated_at': '2025-06-22T16:48:39.990282+00:00',
 'metadata': {},
 'status': 'idle',
 'config': {},
 'values': None}

Now we can run our agent with [`client.runs.stream`](https://langchain-ai.github.io/langgraph/concepts/low_level/#stream-and-astream), passing:

* The `thread_id`
* The `graph_id`
* The `input`
* The `stream_mode`

The state is captured in `chunk.data`.
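
Some events, such as `metadata`, do not carry graph state, so it can help to guard the lookup before indexing into the payload. The following is a minimal sketch, assuming the graph state exposes a `messages` list as this agent's does. `latest_message` is a hypothetical helper, not an SDK function, and `sample_data` is a trimmed stand-in for a real `chunk.data` payload.

```python
def latest_message(data):
    """Hypothetical helper: return the last message in a state payload,
    or None when the payload has no messages (e.g. a metadata event)."""
    if not isinstance(data, dict):
        return None
    messages = data.get("messages") or []
    return messages[-1] if messages else None


# Trimmed stand-in for chunk.data; a real payload carries more fields per message.
sample_data = {"messages": [{"type": "human", "content": "Multiply 3 by 2."}]}

msg = latest_message(sample_data)
print(msg["type"], msg["content"])
```

Inside the streaming loop, `latest_message(chunk.data)` could stand in for the direct `chunk.data['messages'][-1]` lookup when payload shapes vary between events.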

In [10]:
from langchain_core.messages import HumanMessage

# Input
input = {"messages": [HumanMessage(content="Multiply 3 by 2.")]}

# Stream
async for chunk in client.runs.stream(
    thread["thread_id"],
    "agent",
    input=input,
    stream_mode='values'
):
    if chunk.data and chunk.event != "metadata":
        print(chunk.data['messages'][-1])

{'content': 'Multiply 3 by 2.', 'additional_kwargs': {'additional_kwargs': {}, 'response_metadata': {}, 'example': False}, 'response_metadata': {}, 'type': 'human', 'name': None, 'id': '102ef3fc-63ac-4c7e-8d84-cd8b4225de5f', 'example': False}
{'content': '', 'additional_kwargs': {'tool_calls': [{'id': 'call_qBBnhpbBsMi6GgvTkMamJXOK', 'function': {'arguments': '{"a":3,"b":2}', 'name': 'multiply'}, 'type': 'function'}], 'refusal': None}, 'response_metadata': {'token_usage': {'completion_tokens': 17, 'prompt_tokens': 188, 'total_tokens': 205, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_07871e2ad8', 'finish_reason': 'tool_calls', 'logprobs': None}, 'type': 'ai', 'name': None, 'id': 'run-7bbcbabf-cee7-4e5f-8f96-b04c297aff07-0', 'example': False, 'tool_calls': [{'name': 'm