# Installation

In [None]:
!pip install langchain-google-vertexai "langchain-google-community[search]" langgraph

# Prompt design

Let's start with a simple math problem:

In [1]:
math_problem1 = (
    "John had 1097 candies. They ate 14 yesterday, 18 today and shared 341 more with "
    "their classmates. How many candies do they have left? "
)

Let's see how an LLM handles it:

In [12]:
from langchain_google_vertexai import ChatVertexAI
from langchain_core.messages import SystemMessage, HumanMessage

llm = ChatVertexAI(
   model_name="gemini-1.5-pro-001",
   temperature=0.)
result = llm.invoke([
    SystemMessage(content="Give a short answer (a single number only)."),
    HumanMessage(content=math_problem1)
])
print(result.content)

704 



It's not the right answer, since we expected *1097-14-18-341=724*. Let's try a **sampling** (or self-consistency) technique. We're going to request an answer from the LLM multiple times at a higher temperature, and then look at the distribution of the answers:

In [18]:
from collections import Counter

answers = Counter()
llm_high_temperature = ChatVertexAI(
   model_name="gemini-1.5-pro-001",
   temperature=0.7)

for _ in range(20):
    answer = llm_high_temperature.invoke([
        SystemMessage(content="Give a short answer (a single number only)."),
        HumanMessage(content=math_problem1)
    ]).content
    answers[answer] += 1

print(answers)

Counter({'704 \n': 17, '724 \n': 3})


The model found the right answer in 3 of 20 samples, but that is not enough: 724 is not the most frequent answer, so a majority vote over the sample would still return the wrong result.
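Self-consistency typically ends with a majority vote over the normalized answers. The following is a minimal sketch of that step, using the counts printed above as hard-coded sample data. The helper name `majority_vote` is hypothetical and not part of LangChain.

```python
from collections import Counter
from typing import Tuple


def majority_vote(answers: Counter) -> Tuple[str, float]:
    """Return the most frequent normalized answer and its share of the sample."""
    normalized = Counter()
    for raw_answer, count in answers.items():
        # Strip whitespace so that '724 \n' and '724' count as the same answer.
        normalized[raw_answer.strip()] += count
    winner, votes = normalized.most_common(1)[0]
    return winner, votes / sum(normalized.values())


# Sample data copied from the Counter printed above.
sample = Counter({"704 \n": 17, "724 \n": 3})
winner, share = majority_vote(sample)
print(winner, share)
```

Here the vote picks 704, which illustrates why sampling alone does not help when the model's most likely answer is wrong.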

## Controlled generation (naive)

Let's remove our system message:

In [4]:
print(llm.invoke([
    HumanMessage(content=math_problem1)
]).content)


Here's how to solve the problem step-by-step:

* **Step 1: Find the total eaten.** 
   John ate 14 + 18 = 32 candies.

* **Step 2: Find the total gone.**
   John ate 32 candies and gave away 341, for a total of 32 + 341 = 373 candies.

* **Step 3: Subtract to find the remaining candies.**
   John started with 1097 and lost 373, leaving 1097 - 373 = 724 candies.

**Answer:** John has 724 candies left. 



Now we get some reasoning traces, and the answer is right. How can we parse it reliably?

In [5]:
result = llm.invoke([
    SystemMessage(content="Always give a final answer in a form FINAL_ANSWER=..."),
    HumanMessage(content=math_problem1)
]).content
print(result)

Here's how to solve the problem:

* **Total eaten:** 14 + 18 = 32 candies
* **Total given away:** 341 candies
* **Total candies gone:** 32 + 341 = 373 candies
* **Candies left:** 1097 - 373 = 724 candies

FINAL_ANSWER=724 



In [6]:
from langchain.output_parsers.regex import RegexParser

# A raw string avoids Python's invalid-escape warning for \d.
parser = RegexParser(regex=r"FINAL_ANSWER=(\d+)", output_keys=["answer"])
print(parser.invoke(result))

{'answer': '724'}
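The parsing step is ordinary regular-expression matching. As a rough illustration of the same idea with only the standard library, here is a sketch; the function `extract_final_answer` is illustrative and not part of LangChain, and taking the last match (in case the marker appears more than once) is a design choice of this sketch.

```python
import re
from typing import Optional

# Optional whitespace after '=' tolerates outputs such as "FINAL_ANSWER= 724".
FINAL_ANSWER_RE = re.compile(r"FINAL_ANSWER=\s*(\d+)")


def extract_final_answer(text: str) -> Optional[str]:
    """Return the digits after the last FINAL_ANSWER= marker, or None if absent."""
    matches = FINAL_ANSWER_RE.findall(text)
    return matches[-1] if matches else None


sample_output = "* **Candies left:** 1097 - 373 = 724 candies\n\nFINAL_ANSWER=724 \n"
print(extract_final_answer(sample_output))
print(extract_final_answer("no marker here"))
```

Returning `None` instead of raising makes it easy to detect and retry responses that ignored the formatting instruction.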


In [7]:
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate(
    [("system", "Always give a final answer in a form FINAL_ANSWER=..."),
    ("human", "{user_input}")]
)
chain = prompt | llm | parser
chain.invoke(math_problem1)

{'answer': '724'}

In [9]:
assert prompt.invoke({"user_input": "test"}) == prompt.invoke("test")

We will use Gemma 2 27B as an example of a small model. First, go to the Vertex AI Model Garden, search for Gemma 2, and deploy the model (it will use one Nvidia A100 GPU). [This](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/deploy-and-inference-tutorial) guide has more instructions. After the deployment finishes (it might take 15-20 minutes), copy the endpoint ID from Vertex AI and update the parameters below:

If you don't want to do that, feel free to use Gemini-1.5-flash-001 as an example of a smaller model.

In [None]:
small_llm = ChatVertexAI(model_name="gemini-1.5-flash-001", location="us-central1")

In [8]:
GEMMA_ENDPOINT_ID = ""
PROJECT_ID = ""
LOCATION = "us-central1"

In [33]:
from langchain_google_vertexai import VertexAIModelGarden
small_llm = VertexAIModelGarden(
    endpoint_id=GEMMA_ENDPOINT_ID,
    project=PROJECT_ID,
    location=LOCATION,
    prompt_arg="inputs",
    allowed_model_args=["temperature", "max_tokens"]
).bind(max_tokens=512)



In [34]:
prompt = ChatPromptTemplate(
    [
    ("human", "Always give a final answer in a form FINAL_ANSWER=...\n{user_input}")]
)
parser = RegexParser(regex=r"FINAL_ANSWER=\s?(\d+)", output_keys=["answer"])

In [35]:

chain_small = prompt | small_llm | parser
chain_small.invoke(math_problem1)

{'answer': '724'}

## CoT

Let's try chain-of-thought (CoT) prompting:

In [36]:
cot_prompt = (
    "Always think step-by-step. Explain your reasoning."
    " Split a problem into sequence of reasoning steps and try to solve it."
    " Always give a final answer in a form FINAL_ANSWER=..."
)

In [37]:
cot_prompt_template = ChatPromptTemplate(
    [("system", cot_prompt),
     ("human", "{user_input}")]
)
chain_cot = cot_prompt_template | llm | parser
chain_cot.invoke(math_problem1)

{'answer': '724'}

Now we got it right!

# Calculator

In [38]:
from langchain_google_vertexai import ChatVertexAI
llm = ChatVertexAI(model_name="gemini-1.5-pro-001", temperature=1.0)

Let's define a more complex math problem:

In [39]:
math_problem2 = "How much is 23*2**2+156/4-18?"

In [40]:
print(small_llm.invoke(math_problem2))


First, we follow the order of operations (PEMDAS/BODMAS):

1. **Exponents:** 2**2 = 4

2. **Multiplication:** 23*4 = 92 and 156/4 = 39

3. **Addition and Subtraction** (from left to right):
   92 + 39 = 131
   131 - 18 = 113



Therefore, 23*2**2+156/4-18 = **113**.



In [41]:
calculator_prompt = (
    "You have access to a calculator that can solve mathematical problems. "
    "If you want to ask a calculator, start with CALCULATOR: and generate an expression "
    "to be evaluated by a calculator (it should have only numbers and mathematical operators). "
    "If you ask CALCULATOR, don't do anything else. "
    "If you think you have a final solution, start it with FINAL_ANSWER=.\n"
)

calculator_prompt_template = ChatPromptTemplate(
    [("system", calculator_prompt),
     MessagesPlaceholder(variable_name="messages")]
)

In [42]:
step1 = (calculator_prompt_template | llm).invoke(
    [HumanMessage(content=math_problem2)])
print(step1.content)



CALCULATOR: 23*2**2+156/4-18 



In [43]:
# eval() on model output is unsafe in general; it is used here only as a quick demo.
step2 = eval(step1.content.replace("CALCULATOR", "").strip(" \n:"))
print(step2)

113.0
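Calling `eval` on text produced by a model executes arbitrary Python, so outside of a demo a restricted evaluator is a safer choice. Below is a minimal sketch that walks the parsed expression tree and only allows numbers and basic arithmetic operators. The name `safe_eval` is hypothetical, and the sketch does not guard against very expensive expressions such as huge exponents.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, names, attributes) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str):
    """Evaluate an arithmetic expression without executing arbitrary code."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            # Accept only int/float literals (bool is a subclass of int, so exclude it).
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression.strip(), mode="eval"))


print(safe_eval("23*2**2+156/4-18"))
```

An input such as `__import__('os').getcwd()` raises `ValueError` here instead of being executed.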


Now we can pass the result back to the LLM and ask it to generate the final answer:

In [44]:
from langchain_core.messages import AIMessage

print((calculator_prompt_template | llm).invoke(
    [HumanMessage(content=math_problem2),
     step1,
     HumanMessage(content="113")
    ]).content)

FINAL_ANSWER=113 



### Inherit from BaseTool

In [45]:
from pydantic import BaseModel, Field
from langchain.tools import Tool
from langchain_core.tools import BaseTool
from typing import Optional, Type
from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)



class CalculatorInput(BaseModel):
    """Input to the Calculator."""

    expression: str = Field(
        description="evaluates mathematical expressions"
    )

class CalculatorTool(BaseTool):
    name: str = "Calculator"
    args_schema: Optional[Type[BaseModel]] = CalculatorInput
    description: str = (
        "Useful for when you need to evaluate a mathematical expression."
    )

    def _run(
        self, expression: str, run_manager: Optional[CallbackManagerForToolRun] = None
    ) -> str:
        """Run the Calculator tool."""
        # eval() is used for brevity; never call it on untrusted input.
        return str(eval(expression))


calculator_tool = CalculatorTool()


In [46]:
step2a = llm.invoke(math_problem2, tools=[calculator_tool])
print(step2a.tool_calls)

[{'name': 'Calculator', 'args': {'expression': '23*2**2+156/4-18'}, 'id': '3fa3e5a6-a343-489f-b730-ac3768aa4ae6', 'type': 'tool_call'}]


In [47]:
from langchain_core.messages import ToolMessage

print((calculator_prompt_template | llm).invoke(
    [HumanMessage(content=math_problem2),
     step2a,
     ToolMessage(content="113", tool_call_id=step2a.tool_calls[0]["id"])
    ]).content)

FINAL_ANSWER=113 



### Decorator

In [48]:
from langchain.tools import tool

@tool
def calculator(expression: str) -> str:
    """Evaluates mathematical expressions."""
    # Return a string to match the annotation; eval() is for demo purposes only.
    return str(eval(expression))

In [49]:
step2b = llm.invoke(math_problem2, tools=[calculator])
print(step2b.tool_calls)

[{'name': 'calculator', 'args': {'expression': '23*2**2+156/4-18'}, 'id': '5ab408a9-7f79-48d9-a90e-32bfe3ee2428', 'type': 'tool_call'}]


### OpenAPI spec

In [50]:
calculator_declaration = {
    "name": "Calculator",
    "description": "Useful for when you need to evaluate a mathematical expression.",
    "parameters": {
        "properties": {
            "expression": {"type": "string", "title": "expression"}
        },
        "title": 'CalculatorInput',
        "required": ["expression"],
        "description": 'Input to the Calculator tool.',
        "type": "object"
    }
}
step2c = llm.invoke(
    math_problem2,
    tools=[{"function_declarations": [calculator_declaration]}])
print(step2c.tool_calls[0])

{'name': 'Calculator', 'args': {'expression': '23*2**2+156/4-18'}, 'id': '76b66c7e-3a64-41f0-a16d-ff5f890a61ff', 'type': 'tool_call'}
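A declaration like the one above can also be derived mechanically from a function's signature and docstring. The sketch below shows the general idea with the standard library only. It is not how LangChain builds its schemas; `build_declaration` and `evaluate` are hypothetical names, and only a few primitive types are mapped.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Minimal mapping from Python annotations to JSON-schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_declaration(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a function-declaration dict from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Fall back to "string" for annotations this sketch does not know about.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def evaluate(expression: str) -> str:
    """Useful for when you need to evaluate a mathematical expression."""
    raise NotImplementedError  # the body is irrelevant for building the declaration


declaration = build_declaration(evaluate)
print(declaration)
```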


## Pydantic models

In [51]:
from typing import List
from pydantic import BaseModel, Field

class Plan(BaseModel):
    """Plan to execute a task."""

    steps: List[str] = Field(
        description="a plan with steps"
    )

output = llm.invoke("Prepare a plan how to solve the following task: Learn German as a foreign language. It should be an enumerated list of actions.")
output1 = llm.invoke(output.content, functions=[Plan])
print(output1.tool_calls[0])

{'name': 'Plan', 'args': {'steps': ['Set realistic goals: Define motivation, establish achievable goals, and create a timeline.', 'Choose your resources: Find a course or method that suits you and gather supplementary materials.', 'Establish a study routine: Schedule dedicated learning time, focus on all language skills, and review regularly.', 'Immerse yourself in the language: Surround yourself with German and connect with German speakers.', "Track your progress and stay motivated: Use a language learning journal, don\\'t be afraid to make mistakes, and reward your efforts."]}, 'id': '907994c1-350c-49b0-8935-2bcff6d649da', 'type': 'tool_call'}


Now we can pass the tool call arguments to the Pydantic model and get a validated instance:

In [52]:
plan = Plan(**output1.tool_calls[0]['args'])
print(type(plan))
print(plan.steps)

<class '__main__.Plan'>
['Set realistic goals: Define motivation, establish achievable goals, and create a timeline.', 'Choose your resources: Find a course or method that suits you and gather supplementary materials.', 'Establish a study routine: Schedule dedicated learning time, focus on all language skills, and review regularly.', 'Immerse yourself in the language: Surround yourself with German and connect with German speakers.', "Track your progress and stay motivated: Use a language learning journal, don\\'t be afraid to make mistakes, and reward your efforts."]


# ReAct pattern

In [82]:
react_prompt = ChatPromptTemplate(
    [("system", (
        "You are a helpful assistant. Try to use available tools "
        "when appropriate to better answer the question."
      )),
      MessagesPlaceholder(variable_name="messages"),
    ]
)
llm_with_calculator = llm.bind_tools(tools=[calculator])
chain = react_prompt | llm_with_calculator


In [84]:
input_message = HumanMessage(content="how much is 45546*123213")
message1 = chain.invoke([input_message])

if message1.tool_calls and message1.tool_calls[0]["name"] == "calculator":
    calculator_result = calculator.invoke(message1.tool_calls[0]["args"])
    tool_message = ToolMessage(content=calculator_result, tool_call_id=message1.tool_calls[0]["id"])
    final_message = llm_with_calculator.invoke([input_message, message1, tool_message])
else:
    final_message = message1

print(final_message.content)

The answer is 5611859298.
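The branch above handles exactly one call to one known tool. A slightly more general version looks up each requested tool in a registry and returns one result per call. The following sketch uses plain dictionaries in the same shape as the entries of `tool_calls`; the names `run_tool_calls` and `multiply` and the id `call-1` are made up for illustration.

```python
from typing import Any, Callable, Dict, List


def run_tool_calls(
    tool_calls: List[Dict[str, Any]],
    registry: Dict[str, Callable[..., Any]],
) -> List[Dict[str, str]]:
    """Execute each requested tool call and collect results keyed by call id."""
    results = []
    for call in tool_calls:
        tool_fn = registry.get(call["name"])
        if tool_fn is None:
            # Report unknown tools back to the model instead of crashing.
            content = f"Unknown tool: {call['name']}"
        else:
            content = str(tool_fn(**call["args"]))
        results.append({"tool_call_id": call["id"], "content": content})
    return results


def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


# A hand-written tool call in the same dict shape as an entry of `tool_calls`.
calls = [{"name": "multiply", "args": {"a": 45546, "b": 123213}, "id": "call-1"}]
results = run_tool_calls(calls, {"multiply": multiply})
print(results)
```

Each result dictionary carries the fields needed to construct a `ToolMessage` for the next model call.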


This example is very naive. What if our flow is more complicated, with several tools or multiple rounds of tool calls? Instead of writing everything ourselves, we can use the prebuilt ReAct agent available in LangGraph:

In [87]:
from langgraph.prebuilt import create_react_agent

react_agent = create_react_agent(llm, [calculator], messages_modifier=react_prompt)
result = react_agent.invoke(
    {"messages": [("user", "how much is 45546*123213-2")]})
print(result["messages"][-1].content)



The answer is 5611859296.


Let's inspect the output:

In [88]:
for message in result["messages"]:
    print(type(message))
    # AIMessage always has a `tool_calls` attribute, so also check that it is non-empty.
    if hasattr(message, "tool_calls") and message.tool_calls:
        print(message.tool_calls)
    else:
        print(message.content)

<class 'langchain_core.messages.human.HumanMessage'>
how much is 45546*123213-2
<class 'langchain_core.messages.ai.AIMessage'>
[{'name': 'calculator', 'args': {'expression': '45546*123213-2'}, 'id': '04a3c482-6e04-4ef7-b829-027f497e8c2e', 'type': 'tool_call'}]
<class 'langchain_core.messages.tool.ToolMessage'>
5611859296
<class 'langchain_core.messages.ai.AIMessage'>
The answer is 5611859296.


In [None]:
# Note: `search_tool` is created in the "Google Search" section below; run that cell first.
llm = ChatVertexAI(model_name="gemini-1.0-pro-001")
response = llm.invoke("What is the capital of Germany?", functions=[search_tool])
response.tool_calls[0]

We can create a tool from a function using a LangChain decorator:

In [None]:
from langchain.tools import tool

@tool
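def search(query: str) -> str:
    """Runs a search for the query and returns the results."""
    # `s` is not defined in this notebook; it is assumed to be a search wrapper
    # with a `.run()` method, such as the GoogleSearchAPIWrapper created below.
    return s.run(query)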

llm = ChatVertexAI(model_name="gemini-1.0-pro-001")
response = llm.invoke("What is the capital of Germany?", functions=[search])
response.tool_calls[0]

In [None]:
@tool
def multiply(a: int, b: int) -> int:
    """Multiplies a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

In [None]:
multiply.invoke(input={"a": 1, "b": 2})

# ToolConfig

Let's define two tools:



In [100]:
search_declaration = {
    "name": "Search",
    "description": "Useful when you need to answer questions about current and future events. You should ask targeted questions.",
    "parameters": {
        "properties": {
            "query": {"type": "string", "title": "query"}
        },
        "title": 'SearchInput',
        "required": ["query"],
        "description": 'Input to the Google Search tool.',
        "type": "object"
    }
}

maps_declaration = {
    "name": "MapSearch",
    "description": "Useful to answer questions about maps and locations. You can ask targeted questions.",
    "parameters": {
        "properties": {
            "query": {"type": "string", "title": "query"},
            "country": {"type": "string", "title": "country", "description": "A country used to restrict geo area to answer the query."}
        },
        "required": ["query"],
        "description": 'A query to Google Maps tool.',
        "type": "object"
    }
}

If we invoke our model, it calls the *Search* tool:

In [101]:
llm = ChatVertexAI(model_name="gemini-1.5-pro-001")
response = llm.invoke(
    "What is the capital of Germany?",
    tools=[{"function_declarations": [search_declaration, maps_declaration]}])

print(response.tool_calls[0])




{'name': 'Search', 'args': {'query': 'What is the capital of Germany?'}, 'id': '6e2c5eca-0f2e-44b7-9405-997e8199f81e', 'type': 'tool_call'}


But what if we want to make sure the model **always** calls the *Search* tool? We can use the *tool_config* parameter:

In [102]:
response1 = llm.invoke(
    "What is the capital of Germany?", tools=[{"function_declarations": [search_declaration, maps_declaration]}],
    tool_config={"function_calling_config": {"mode": "ANY", "allowed_function_names": ["Search"]}})
print(response1.tool_calls[0])

{'name': 'Search', 'args': {'query': 'What is the capital of Germany?'}, 'id': '28cfa14a-f80e-4d29-9ac4-6c32bd4c9b72', 'type': 'tool_call'}


It's equivalent to:

In [103]:
response1 = llm.invoke(
    "What is the capital of Germany?", tools=[{"function_declarations": [search_declaration, maps_declaration]}],
    tool_choice={"mode": "ANY", "allowed_function_names": ["Search"]})
print(response1.tool_calls[0])

{'name': 'Search', 'args': {'query': 'What is the capital of Germany?'}, 'id': '2e5b4d87-3ec5-438a-9726-060acf215023', 'type': 'tool_call'}


We can also tell the model never to call any tools:

In [104]:
response2 = llm.invoke("What is the capital of Germany?", tools=[{"function_declarations": [search_declaration, maps_declaration]}],
                       tool_config={"function_calling_config": {"mode": "NONE"}})
print(response2.tool_calls)
print(response2.content)

[]
The capital of Germany is **Berlin**. 



It's equivalent to:

In [105]:
response2 = llm.invoke("What is the capital of Germany?", tools=[{"function_declarations": [search_declaration, maps_declaration]}],
                       tool_choice="none")
print(response2.tool_calls)
print(response2.content)

[]
The capital of Germany is **Berlin**. 



Or we can force the model to call the *MapSearch* tool only, even though the question is not about maps:

In [106]:
response2 = llm.invoke("What is the capital of Germany?", tools=[{"function_declarations": [search_declaration, maps_declaration]}],
                       tool_config={"function_calling_config": {"mode": "ANY", "allowed_function_names": ["MapSearch"]}})
print(response2.tool_calls)
print(response2.content)

[{'name': 'MapSearch', 'args': {'query': 'What is the capital of Germany?'}, 'id': '00da797b-a802-44fb-9912-f58f5c5940ff', 'type': 'tool_call'}]
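The `tool_config` dictionaries used in this section all share one shape, so a small builder can keep them consistent. This is a sketch under two assumptions: that the supported modes are `AUTO`, `ANY` and `NONE`, and that `allowed_function_names` is only meaningful with `ANY`. The helper `make_tool_config` is hypothetical and not part of any library.

```python
from typing import Any, Dict, List, Optional

# Assumed set of function-calling modes.
VALID_MODES = {"AUTO", "ANY", "NONE"}


def make_tool_config(mode: str, allowed: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a tool_config dict in the shape used by the examples above."""
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown mode: {mode}")
    if allowed and mode != "ANY":
        # Restricting function names only makes sense when a call is forced.
        raise ValueError("allowed_function_names requires mode ANY")
    config: Dict[str, Any] = {"mode": mode}
    if allowed:
        config["allowed_function_names"] = list(allowed)
    return {"function_calling_config": config}


print(make_tool_config("ANY", ["Search"]))
print(make_tool_config("none"))
```

The resulting dictionaries can be passed as `tool_config=...` in the same way as the literal dictionaries above.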



# Tools provided by Google

## Google Search

You need to follow the instructions and create your search engine as described [here](https://). Then, set up the variables below:



In [None]:
google_search_api_key = "PUT YOUR API KEY HERE"
google_cse_id = "PUT YOUR CSE ID HERE"


In [90]:
from langchain_google_community import GoogleSearchAPIWrapper
search = GoogleSearchAPIWrapper(
    k=10, google_api_key=google_search_api_key, google_cse_id=google_cse_id
)

In [91]:
result = search.run("What is LangChain")
print(result)

Build context-aware, reasoning applications with LangChain's flexible framework that leverages your company's data and APIs. LangChain provides AI developers with tools to connect language models with external data sources. It is open-source and supported by an active community. LangChain is a framework for developing applications powered by large language models (LLMs). LangChain is a framework for developing applications powered by large language models (LLMs). For these applications, LangChain simplifies the entire ... LangChain is a framework that simplifies the process of creating generative AI application interfaces. Developers working on these types of interfaces use ... Nov 9, 2023 ... LangChain is a sophisticated framework comprising several key components that work in synergy to enhance natural language processing tasks. LangChain is an open source orchestration framework for the development of applications using large language models (LLMs), like chatbots and virtual ... Jun

In [92]:
result = search.results("What is the weather in Munich tomorrow?", num_results=3)
print(len(result))
print(result[0])

3
{'title': 'Munich, Bavaria, Germany Weather Forecast | AccuWeather', 'link': 'https://www.accuweather.com/en/de/munich/80331/weather-forecast/178086', 'snippet': 'Hourly Weather · 1 PM 55°. rain drop 2% · 2 PM 56°. rain drop 0% · 3 PM 58°. rain drop 0% · 4 PM 57°. rain drop 0% · 5 PM 56°.'}


In [93]:
from langchain_google_community import GoogleSearchRun

search_tool = GoogleSearchRun(api_wrapper=search)
agent_executor = create_react_agent(llm, [calculator, search_tool], messages_modifier=react_prompt)
result = agent_executor.invoke(
    {"messages": [("user", "how much is distance from Earth to Moon multiplied by 2?")]})



In [94]:
for message in result["messages"]:
  print(type(message))
  if hasattr(message, "tool_calls") and message.tool_calls:
    print(message.tool_calls)
  else:
    print(message.content)

<class 'langchain_core.messages.human.HumanMessage'>
how much is distance from Earth to Moon multiplied by 2?
<class 'langchain_core.messages.ai.AIMessage'>
[{'name': 'google_search', 'args': {'query': 'distance from Earth to Moon'}, 'id': '72a558cb-d9e9-4cf7-a402-d15148a668bd', 'type': 'tool_call'}]
<class 'langchain_core.messages.tool.ToolMessage'>
The average distance between the Earth and the Moon is 384 400 km (238 855 miles). How far is that in light-seconds? Light travels at 300,000 kilometres per ... Well, the Moon is not always the same distance away from Earth. The orbit is not a perfect circle. When the Moon is the farthest away, it's 252,088 miles away. Mar 26, 2023 ... This image showing real size between Moon and Earth with real distance + Jupiter (Just for the sake of comparison) The average distance to the Moon is 382,500 km. The distance varies because the Moon travels around Earth in an elliptical orbit. At perigee, the point at which ... A lunar distance, 384,399 km 

## Grounding with Google Search

If you're using Gemini, you can use its built-in capability to ground responses in Google Search results:

In [95]:
from vertexai.generative_models import grounding
from vertexai.generative_models import Tool as VertexTool

tool = VertexTool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

response = llm.invoke("How far is moon from the Earth?", tools=[tool])

In [96]:
print(response.content)

The average distance between the Moon and Earth is around 384,400 kilometers (238,855 miles).

It's interesting to note that the Moon doesn't orbit Earth in a perfect circle, but rather in an elliptical path. This means the distance between Earth and the Moon is constantly changing.

- At its farthest point, called apogee, the Moon is about 405,696 kilometers (252,088 miles) from Earth.
- At its closest point, called perigee, the Moon is about 363,104 kilometers (225,623 miles) from Earth. 

To put this distance into perspective, if you were to line up 30 Earths, that would roughly equal the distance between Earth and the Moon. 



Let's explore grounding metadata:

In [98]:
response.response_metadata["grounding_metadata"]["web_search_queries"]

['how far is the moon from the earth']

In [99]:
response.response_metadata["grounding_metadata"]["grounding_supports"]

[{'segment': {'end_index': 93,
   'text': 'The average distance between the Moon and Earth is around 384,400 kilometers (238,855 miles).',
   'part_index': 0,
   'start_index': 0},
  'grounding_chunk_indices': [0, 1],
  'confidence_scores': [0.95101345, 0.95101345]},
 {'segment': {'start_index': 285,
   'end_index': 389,
   'text': '- At its farthest point, called apogee, the Moon is about 405,696 kilometers (252,088 miles) from Earth.',
   'part_index': 0},
  'grounding_chunk_indices': [0],
  'confidence_scores': [0.99039346]},
 {'segment': {'start_index': 390,
   'end_index': 494,
   'text': '- At its closest point, called perigee, the Moon is about 363,104 kilometers (225,623 miles) from Earth.',
   'part_index': 0},
  'grounding_chunk_indices': [0],
  'confidence_scores': [0.96773213]}]
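Each entry in `grounding_supports` ties a span of the response to one or more retrieved chunks, with a confidence score per chunk. A small sketch of how such a list could be filtered and summarized follows; the sample data is copied from the first two entries above, while the helper `summarize_supports` and its `threshold` parameter are hypothetical.

```python
from typing import Any, Dict, List


def summarize_supports(supports: List[Dict[str, Any]], threshold: float = 0.9) -> List[str]:
    """Return one line per grounded segment whose best score meets the threshold."""
    lines = []
    for support in supports:
        best = max(support["confidence_scores"])
        if best >= threshold:
            text = support["segment"]["text"]
            chunks = support["grounding_chunk_indices"]
            lines.append(f"{best:.2f} chunks={chunks} {text}")
    return lines


# The first two entries from the output above.
sample_supports = [
    {
        "segment": {
            "start_index": 0,
            "end_index": 93,
            "part_index": 0,
            "text": "The average distance between the Moon and Earth is around 384,400 kilometers (238,855 miles).",
        },
        "grounding_chunk_indices": [0, 1],
        "confidence_scores": [0.95101345, 0.95101345],
    },
    {
        "segment": {
            "start_index": 285,
            "end_index": 389,
            "part_index": 0,
            "text": "- At its farthest point, called apogee, the Moon is about 405,696 kilometers (252,088 miles) from Earth.",
        },
        "grounding_chunk_indices": [0],
        "confidence_scores": [0.99039346],
    },
]

for line in summarize_supports(sample_supports):
    print(line)
```

Sentences of the response that do not appear in any segment, such as the "30 Earths" comparison, have no grounding support in the metadata shown and may deserve extra scrutiny.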