In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Building and Deploying a Human-in-the-Loop LangGraph Application with Reasoning Engine on Vertex AI

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/reasoning-engine/langgraph_human_in_the_loop.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Freasoning-engine%2Flanggraph_human_in_the_loop.ipynb">
      <img width="32px" src="https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>    
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/reasoning-engine/langgraph_human_in_the_loop.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/reasoning-engine/langgraph_human_in_the_loop.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>


| | |
|-|-|
| Author(s) | [Xiaolong Yang](https://github.com/shawn-yang-google) |

## Overview

[Reasoning Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/reasoning-engine/overview) (LangChain on Vertex AI) is a managed service designed to help you build and deploy agent reasoning frameworks. [LangGraph](https://langchain-ai.github.io/langgraph/) is a library for constructing stateful, multi-actor applications with LLMs, enabling the creation of sophisticated agent and multi-agent workflows.

This notebook demonstrates how to build, deploy, and test a Human-in-the-Loop LangGraph application using [Reasoning Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/reasoning-engine/overview) on Vertex AI. You'll learn how to combine LangGraph's powerful workflow orchestration with the scalability of Vertex AI to build Human-in-the-Loop generative AI applications.

The previous [notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/reasoning-engine/intro_reasoning_engine.ipynb) covered: Defining Tools, Defining a Router, Building a LangGraph Application, Local Testing, Deploying to Vertex AI, Remote Testing, and Cleaning Up Resources.

This notebook expands on those concepts and explores the following Human-in-the-Loop features:

- **Reviewing Tool Calls:** Implement human oversight after tool use, allowing for verification and correction of actions before proceeding.
- **Fetching State History:** Retrieve the complete execution history of the LangGraph application for auditing, analysis, and potential state reversion.
- **Time Travel:** Examine the state of the agent at a specific point in time to understand past decisions.
- **Replay:** Restart execution from a specific checkpoint without modifications to ensure consistent results.
- **Branching:** Create alternative execution paths based on a past state, enabling the agent to explore different possibilities or correct previous errors.

By the end of this notebook, you'll possess the skills to build and deploy customized Human-in-the-Loop generative AI applications using LangGraph, Reasoning Engine, and Vertex AI.

## Get started

### Install the Vertex AI SDK and Required Packages

In [1]:
%pip install --upgrade --user --quiet \
    "google-cloud-aiplatform[langchain,reasoningengine]" \
    requests --force-reinstall

### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [2]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>

### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [1]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud Project Information and Initialize the Vertex AI SDK

Before using Vertex AI, ensure you have an existing Google Cloud project and have [enabled the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Refer to the documentation for more details on [setting up a project and development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [2]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}
STAGING_BUCKET = "gs://[your-staging-bucket]"  # @param {type:"string"}

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=STAGING_BUCKET)

## Building and Deploying a LangGraph App on Reasoning Engine

The following sections guide you through building and deploying a LangGraph application using Reasoning Engine on Vertex AI.

### Import Libraries

Import the required Python libraries. These libraries provide the necessary tools for interacting with LangGraph, Vertex AI, and other components of the application.

In [3]:
from langchain.load import load as langchain_load
from vertexai.preview import reasoning_engines
import requests

### Define Tools

Begin by defining the tools for your LangGraph application. Each tool is a plain Python function that the agent can call.

In this example, you'll create a simple tool that retrieves the exchange rate requested by the user. In practice, you can define functions that call APIs, query databases, or perform any other task your agent needs to execute.

In [4]:
def get_exchange_rate(
    currency_from: str = "USD",
    currency_to: str = "EUR",
    currency_date: str = "latest",
):
    """Retrieves the exchange rate between two currencies on a specified date.

    Uses the Frankfurter API (https://api.frankfurter.app/) to obtain
    exchange rate data.

    Args:
        currency_from: The base currency (3-letter currency code).
            Defaults to "USD" (US Dollar).
        currency_to: The target currency (3-letter currency code).
            Defaults to "EUR" (Euro).
        currency_date: The date for which to retrieve the exchange rate.
            Defaults to "latest" for the most recent exchange rate data.
            Can be specified in YYYY-MM-DD format for historical rates.

    Returns:
        dict: A dictionary containing the exchange rate information.
            Example: {"amount": 1.0, "base": "USD", "date": "2023-11-24",
                "rates": {"EUR": 0.95534}}
    """

    response = requests.get(
        f"https://api.frankfurter.app/{currency_date}",
        params={"from": currency_from, "to": currency_to},
        timeout=30,  # Avoid hanging indefinitely if the API is unreachable.
    )
    return response.json()
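
The dictionary returned by the tool still has to be turned into an answer. As a minimal sketch, assuming the response has the shape shown in the docstring above, a hypothetical helper `convert_amount` (not part of this notebook's agent or of the Frankfurter API) applies the returned rate to an amount:

```python
def convert_amount(rate_response: dict, amount: float, currency_to: str) -> float:
    """Apply the rate in a Frankfurter-style response to `amount`.

    Hypothetical helper for illustration only. It assumes the response shape
    documented in the `get_exchange_rate` docstring.
    """
    rates = rate_response.get("rates", {})
    if currency_to not in rates:
        raise KeyError(f"No rate for {currency_to!r} in response")
    # The quoted rate applies to `amount` units of the base currency
    # (1.0 in the sample), so normalize to a per-unit rate first.
    per_unit_rate = rates[currency_to] / rate_response.get("amount", 1.0)
    return amount * per_unit_rate


# Sample payload copied from the docstring example, not a live API result.
sample = {
    "amount": 1.0,
    "base": "USD",
    "date": "2023-11-24",
    "rates": {"EUR": 0.95534},
}
converted = convert_amount(sample, 100.0, "EUR")
print(round(converted, 2))
```

In the agent itself, the LLM reads the raw dictionary and phrases the answer, so a helper like this is only needed if you post-process tool output yourself.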

### Define [`Checkpointers`](https://langchain-ai.github.io/langgraph/concepts/persistence/)

In LangGraph, [memory is implemented as checkpointing (persistence)](https://github.com/langchain-ai/langgraph/discussions/352#discussioncomment-9290376). Checkpointing saves the [state](https://langchain-ai.github.io/langgraph/concepts/low_level/#checkpointer-state) of the agent's execution at each node in the graph, which is crucial for resuming execution, debugging and inspection, and asynchronous operations.

LangGraph provides a [Checkpointer Interface](https://langchain-ai.github.io/langgraph/concepts/persistence/#checkpointer-interface), defining methods for saving and loading the state. Several built-in checkpointers are available to implement this interface.

Next, you'll define the arguments for your LangGraph application's checkpointer and create a custom Python function to act as the checkpointer builder. This example uses the in-memory checkpointer `MemorySaver`, which takes no arguments (so `checkpointer_kwargs` is `None`) and keeps checkpoints only in process memory.

In [5]:
checkpointer_kwargs = None

def checkpointer_builder(**kwargs):
    from langgraph.checkpoint.memory import MemorySaver

    return MemorySaver()
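
To make the idea of a checkpointer concrete, here is a toy sketch of per-thread state persistence. `ToyCheckpointer` and its methods are hypothetical names invented for this illustration. It is not LangGraph's checkpointer interface and is not how `MemorySaver` is implemented. It only shows the core idea: snapshots are saved under a thread ID and can be loaded again later.

```python
from collections import defaultdict
from copy import deepcopy


class ToyCheckpointer:
    """Illustrative only: stores state snapshots per thread in memory."""

    def __init__(self):
        # thread_id -> list of snapshots, oldest first
        self._threads = defaultdict(list)

    def put(self, thread_id: str, state: dict) -> int:
        """Save a copy of `state` and return its index within the thread."""
        self._threads[thread_id].append(deepcopy(state))
        return len(self._threads[thread_id]) - 1

    def get(self, thread_id: str, checkpoint_index: int = -1) -> dict:
        """Load a snapshot; defaults to the latest one for the thread."""
        return deepcopy(self._threads[thread_id][checkpoint_index])

    def history(self, thread_id: str) -> list:
        """Return all snapshots for the thread, newest first."""
        return [deepcopy(s) for s in reversed(self._threads[thread_id])]


saver = ToyCheckpointer()
saver.put("thread-a", {"messages": ["hi"]})
saver.put("thread-a", {"messages": ["hi", "hello!"]})
saver.put("thread-b", {"messages": ["other conversation"]})

latest = saver.get("thread-a")    # most recent snapshot of thread-a
first = saver.get("thread-a", 0)  # an earlier snapshot of the same thread
print(latest, first)
```

Keeping snapshots per thread is what lets the later sections pass a `thread_id` to resume a conversation and a specific checkpoint to revisit an earlier state.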

### Define the Human-in-the-Loop LangGraph Application

Now, you'll integrate all the components to define your Human-in-the-Loop LangGraph application within Reasoning Engine.

This application will utilize the tools and checkpointer you've defined. LangGraph offers a powerful framework for structuring these interactions and leveraging the capabilities of LLMs.

In [6]:
agent = reasoning_engines.LanggraphAgent(
    model="gemini-1.5-pro",
    tools=[get_exchange_rate],
    model_kwargs={"temperature": 0, "max_retries": 6},
    checkpointer_kwargs=checkpointer_kwargs,
    checkpointer_builder=checkpointer_builder,
)

### Local Testing

This section covers local testing of your LangGraph application before deployment to ensure it behaves as expected.

In [7]:
agent.set_up()

In [8]:
inputs = {
    "messages": [
        ("user", "What is the exchange rate from US dollars to Swedish currency?")
    ]
}

In [10]:
response = agent.query(
    input=inputs,
    config={"configurable": {"thread_id": "synchronous-thread-id"}},
)

response["messages"][-1]["kwargs"]["content"]

You can also utilize [streaming](https://langchain-ai.github.io/langgraph/how-tos/stream-values/) mode to stream back the `values` of the graph, representing the full state after each node execution.

In [11]:
for state_values in agent.stream_query(
    input=inputs,
    stream_mode="values",
    config={"configurable": {"thread_id": "streaming-thread-values"}},
):
    print(state_values)

Alternatively, you can stream back `updates` to the graph. These represent the changes to the state after each node is executed.

In [12]:
for state_updates in agent.stream_query(
    input=inputs,
    stream_mode="updates",
    config={"configurable": {"thread_id": "streaming-thread-updates"}},
):
    print(state_updates)
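
The difference between the two stream modes can be pictured with a toy runner. This is a simplified sketch, not LangGraph's implementation: `run_graph` and the two lambda "nodes" are hypothetical stand-ins, and the state is reduced to a plain list of strings.

```python
def run_graph(nodes, state, stream_mode="values"):
    """Toy runner: each node returns a partial update that is merged into state."""
    for name, node in nodes:
        update = node(state)
        # Merge by appending messages, mimicking an append-only message list.
        state = {"messages": state["messages"] + update["messages"]}
        if stream_mode == "values":
            yield state  # the full state after this node ran
        elif stream_mode == "updates":
            yield {name: update}  # only what this node changed
        else:
            raise ValueError(f"Unsupported stream_mode: {stream_mode!r}")


# Hypothetical nodes standing in for the LLM call and the tool execution.
toy_nodes = [
    ("agent", lambda s: {"messages": ["tool call"]}),
    ("tools", lambda s: {"messages": ["tool result"]}),
]
initial = {"messages": ["question"]}

values = list(run_graph(toy_nodes, initial, "values"))
updates = list(run_graph(toy_nodes, initial, "updates"))
print(values)
print(updates)
```

In this toy, each `values` item repeats the whole message list, while each `updates` item is keyed by the node name and carries only the new message.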

## Human-in-the-loop

### Reviewing Tool Calls

LangGraph's Human-in-the-Loop functionality supports several [use cases](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/#use-cases) for incorporating human intervention and oversight into agent workflows (state machines). This notebook focuses on the [Reviewing Tool Calls](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/review-tool-calls/) use case.

To achieve this, the agent needs to [interrupt](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/#interrupt) execution in the following scenarios:

* Before invoking the tool (when the LLM generates a tool call AI Message).
* After receiving a tool response.
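
The two pause points listed above can be pictured with a toy runner. This is a conceptual sketch only: `run_until_interrupt` is a hypothetical function, the nodes are string-producing stand-ins, and LangGraph tracks its resume position through checkpoints rather than an explicit index.

```python
def run_until_interrupt(nodes, state, start=0, interrupt_before=(), interrupt_after=()):
    """Toy runner: execute nodes from `start`, pausing at the first interrupt.

    Returns (state, next_index). next_index == len(nodes) means the run finished.
    """
    i = start
    while i < len(nodes):
        name, node = nodes[i]
        # Pause before this node, except when the run starts here
        # (which is the case when resuming from that same pause).
        if name in interrupt_before and i != start:
            return state, i
        state = state + [node(state)]
        i += 1
        if name in interrupt_after:
            return state, i
    return state, i


# Hypothetical stand-ins for the agent -> tools -> agent flow.
toy_nodes = [
    ("agent", lambda s: "AI: call get_exchange_rate"),
    ("tools", lambda s: "Tool: exchange rate payload"),
    ("agent", lambda s: "AI: final answer"),
]
pause_on = ["tools"]

# First call: stops before the tool runs, so a human can review the tool call.
state_1, next_1 = run_until_interrupt(toy_nodes, ["user question"], 0, pause_on, pause_on)
# Second call (resume): runs the tool, then stops so the result can be reviewed.
state_2, next_2 = run_until_interrupt(toy_nodes, state_1, next_1, pause_on, pause_on)
# Third call (resume): the agent produces the final answer.
state_3, next_3 = run_until_interrupt(toy_nodes, state_2, next_2, pause_on, pause_on)
print(state_3)
```

The three calls in this toy mirror the three `agent.query` calls in this section: one initial query and two resumes.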

In [56]:
response = agent.query(
    input=inputs,
    interrupt_before=["tools"],  # Before invoking the tool.
    interrupt_after=["tools"],  # After getting a tool message.
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
)
langchain_load(response["messages"][-1]).pretty_print()

The process was interrupted *before invoking the tool*.

After review, we assume the LLM-generated tool call (`AI Message`) is correct and proceed to resume execution.

In [57]:
response = agent.query(
    input=None,  # Resume (continue with the tool call AI Message).
    interrupt_before=["tools"],
    interrupt_after=["tools"],
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
)
langchain_load(response["messages"][-1]).pretty_print()

The process is interrupted again *after receiving the tool message*.

Upon review, if the `Tool Message` returned by the tool appears correct, we can resume execution so that the LLM can produce the final answer.

In [15]:
response = agent.query(
    input=None,  # Resume (continue with the Tool Message).
    interrupt_before=["tools"],
    interrupt_after=["tools"],
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
)
langchain_load(response["messages"][-1]).pretty_print()

### Fetching State History

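You can fetch the state history by calling `.get_state_history`. Snapshots are returned in reverse chronological order (most recent first). The `step >= 0` filter below skips the initial input checkpoint, which LangGraph records as step `-1`.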

In [16]:
for state_snapshot in agent.get_state_history(
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
):
    if state_snapshot["metadata"]["step"] >= 0:
        print(f'step {state_snapshot["metadata"]["step"]}: {state_snapshot["config"]}')
        state_snapshot["values"]["messages"][-1].pretty_print()
        print("\n")

### Time Travel

LangGraph's [Time Travel](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/time-travel/) guide shows how persistent memory enables human intervention to correct past actions. Essentially, you "rewind" the conversation to a previous state, correct the mistake, and let the agent continue from that corrected point.

You can "time travel" by calling `.get_state`. By default, it returns the latest state of the thread.

In [17]:
state = agent.get_state(
    config={
        "configurable": {
            "thread_id": "human-in-the-loop-deepdive",
        }
    }
)

print(f'step {state["metadata"]["step"]}: {state["config"]}')
state["values"]["messages"][-1].pretty_print()

To retrieve an earlier state, you need to specify the `checkpoint_id` (and `checkpoint_ns`).

In [18]:
snapshot_config = {}
for state_snapshot in agent.get_state_history(
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
):
    if state_snapshot["metadata"]["step"] == 1:
        snapshot_config = state_snapshot["config"]
        break

snapshot_config
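
The lookup above can be wrapped in a small helper that returns `None` when no snapshot matches, instead of silently leaving an empty config. This is a sketch under assumptions: `find_snapshot_config` and `make_snapshot` are hypothetical helpers, the snapshot dictionaries only mimic the keys used in this notebook (`metadata`, `config`), and the checkpoint IDs are made up.

```python
from typing import Iterable, Optional


def find_snapshot_config(history: Iterable[dict], step: int) -> Optional[dict]:
    """Return the config of the first snapshot whose metadata step matches."""
    for snapshot in history:
        if snapshot["metadata"]["step"] == step:
            return snapshot["config"]
    return None


def make_snapshot(step: int, checkpoint_id: str) -> dict:
    """Build a fake snapshot with made-up IDs, for illustration only."""
    return {
        "metadata": {"step": step},
        "config": {
            "configurable": {
                "thread_id": "human-in-the-loop-deepdive",
                "checkpoint_ns": "",
                "checkpoint_id": checkpoint_id,
            }
        },
    }


# Newest first, matching the order in which state history is returned.
fake_history = [
    make_snapshot(3, "ckpt-3"),
    make_snapshot(2, "ckpt-2"),
    make_snapshot(1, "ckpt-1"),
    make_snapshot(0, "ckpt-0"),
    make_snapshot(-1, "ckpt-input"),
]

found = find_snapshot_config(fake_history, step=1)
missing = find_snapshot_config(fake_history, step=99)
print(found)
print(missing)
```

With the real agent, you would pass `agent.get_state_history(config=...)` as `history` and check for `None` before calling `.get_state`.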

In [19]:
state = agent.get_state(config=snapshot_config)
print(f'step {state["metadata"]["step"]}: {state["config"]}')
state["values"]["messages"][-1].pretty_print()

In [20]:
state

### Replay

LangGraph's [Replay](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/time-travel/#replay-a-state) feature allows you to resume or replay a conversation from any specific point in its history.

You can initiate a replay by passing the `state["config"]` back to the agent. Because this checkpoint was saved just before the tool ran, execution resumes exactly where it left off, by executing the tool call.

In [21]:
state["config"]

In [22]:
for state_values in agent.stream_query(
    input=None,  # resume
    stream_mode="values",
    config=state["config"],
):
    langchain_load(state_values["messages"][-1]).pretty_print()

### Branching

LangGraph's [Branching](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/time-travel/#branch-off-a-past-state) feature allows you to modify and re-run a LangGraph conversation from a specific point in its history (rather than just from the latest state).  This enables the agent to explore alternate trajectories or allows a user to "version control" changes in a workflow.

In this example, you will:
* Update the tool calls from a previous step.
* Call `.update_state` to save the edited message as a new checkpoint, then resume from the returned config to rerun the step with the updated tool call.

In [23]:
last_message = state["values"]["messages"][-1]
print(last_message)
print(last_message.tool_calls)

Update the tool calls from the previous step.

In [24]:
last_message.tool_calls[0]["args"]["currency_date"] = "2024-09-01"
last_message.tool_calls
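
The cell above edits the message in place. If you want to keep the original tool call for comparison, one option is to edit a copy instead. The sketch below works on plain dictionaries that mimic the `name` / `args` / `id` shape of a tool call. `with_updated_args` is a hypothetical helper, and the call ID and argument values are made up.

```python
from copy import deepcopy


def with_updated_args(tool_calls: list, index: int = 0, **overrides) -> list:
    """Return a copy of `tool_calls` with some args of one call replaced."""
    updated = deepcopy(tool_calls)
    updated[index]["args"].update(overrides)
    return updated


# Made-up tool call for illustration.
original_calls = [
    {
        "name": "get_exchange_rate",
        "args": {
            "currency_from": "USD",
            "currency_to": "SEK",
            "currency_date": "latest",
        },
        "id": "call-1",
    }
]

edited_calls = with_updated_args(original_calls, currency_date="2024-09-01")
print(original_calls[0]["args"]["currency_date"])
print(edited_calls[0]["args"]["currency_date"])
```

The edited copy keeps the same `id` and `name`, so only the arguments differ between the original path and the branch.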

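Call `.update_state` to save the edited message as a new checkpoint. It returns the config of the new branch, which you then pass to the agent to rerun the step with the updated tool call.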

In [25]:
branch_config = agent.update_state(
    config=state["config"],
    values={"messages": [last_message]},  # the update we want to make
)
branch_config

In [26]:
for state_values in agent.stream_query(
    input=None,  # resume
    stream_mode="values",
    config=branch_config,
):
    langchain_load(state_values["messages"][-1]).pretty_print()

## Deploying the Agent

In [27]:
remote_agent = reasoning_engines.ReasoningEngine.create(
    reasoning_engines.LanggraphAgent(
        model="gemini-1.5-pro",
        tools=[get_exchange_rate],
        model_kwargs={"temperature": 0, "max_retries": 6},
        checkpointer_kwargs=checkpointer_kwargs,
        checkpointer_builder=checkpointer_builder,
    ),
    requirements=[
        "google-cloud-aiplatform[reasoningengine,langchain]",
        "requests",
    ],
)

remote_agent

## Querying the Remote Agent

### Remote testing

In [28]:
for state_updates in remote_agent.stream_query(
    input=inputs,
    stream_mode="updates",
    config={"configurable": {"thread_id": "remote-streaming-thread-updates"}},
):
    print(state_updates)

In [29]:
for state_values in remote_agent.stream_query(
    input=inputs,
    stream_mode="values",
    config={"configurable": {"thread_id": "remote-human-in-the-loop-overall"}},
):
    print(state_values)

### Reviewing Tool Calls

In [33]:
response = remote_agent.query(
    input=inputs,
    interrupt_before=["tools"],  # Before invoking the tool.
    interrupt_after=["tools"],  # After getting a tool message.
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
)
langchain_load(response["messages"][-1]).pretty_print()

In [34]:
response = remote_agent.query(
    input=None,  # Resume (continue with the tool call AI Message).
    interrupt_before=["tools"],
    interrupt_after=["tools"],
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
)
langchain_load(response["messages"][-1]).pretty_print()

In [35]:
response = remote_agent.query(
    input=None,  # Resume (continue with the Tool Message).
    interrupt_before=["tools"],
    interrupt_after=["tools"],
    config={"configurable": {"thread_id": "human-in-the-loop-deepdive"}},
)
langchain_load(response["messages"][-1]).pretty_print()

## Cleaning up

After you've finished experimenting, it's a good practice to clean up your cloud resources. You can delete the deployed Reasoning Engine instance to avoid any unexpected charges on your Google Cloud account.

In [54]:
remote_agent.delete()