## Installation

In [None]:
%%bash
pip install --upgrade langchain-core
pip install --upgrade "langchain_google_vertexai[anthropic,all]"

## Model invocation

Let's invoke a model with the default settings. Note that this cell no longer works: the default model, `text-bison`, has been deprecated.

In [None]:
from langchain_google_vertexai import VertexAI
llm = VertexAI()
llm.invoke("Which question can you answer?")

We can see that the default model is `text-bison`:

In [None]:
print(llm.model_name)

Now let's change the model name and use Gemini 1.5 Pro running in Europe:

In [None]:
llm_gemini = VertexAI(model_name="gemini-1.5-pro-001", location="europe-west1")
print(llm_gemini.invoke("Which question can you answer?"))

Let's stream the results:

In [None]:
for chunk in llm_gemini.stream("Write a poem about Google Cloud and LangChain"):
  print(chunk)
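
Printing each chunk with a bare `print(chunk)` puts every chunk on its own line. A minimal sketch of joining chunks as they arrive, in plain Python: `fake_stream` is a hypothetical stand-in for `llm_gemini.stream` (it is not part of LangChain), so the cell runs without credentials.

```python
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Hypothetical stand-in for llm.stream(): yields the text in small pieces."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def collect_stream(chunks: Iterator[str]) -> str:
    """Print chunks as they arrive (no extra newlines) and return the full text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect_stream(fake_stream("Clouds of Google, chains of Lang."))
```

The same `collect_stream` helper should work with a real stream of string chunks, assuming the model yields plain strings.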

Now let's override the default safety settings and limit the length of the output:

In [None]:
from langchain_google_vertexai import HarmBlockThreshold, HarmCategory


for chunk in llm_gemini.stream("Write a poem about Google Cloud and LangChain", 
                               temperature=0.9, 
                               max_output_tokens=200, 
#                               stop=["."], 
                               safety_settings={HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE}
                               ):
  print(chunk)

## LangChain interfaces: PromptTemplate and Parsers

Let's use a PromptTemplate and build our first chain (a sequence of steps we'd like to orchestrate):

In [None]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser


prompt_template = PromptTemplate.from_template(
    "Extract {entities} entities from the item description:\n{description}\n"
    "Answer with valid JSON as the output."
)

chain = prompt_template | llm_gemini | JsonOutputParser()
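
The `|` operator composes steps so that each step's output becomes the next step's input. A toy sketch of that idea in plain Python follows; the `Step` class is hypothetical and far simpler than LangChain's runnables, and `fake_model` merely stands in for the LLM.

```python
import json
from typing import Any, Callable


class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # Run this step first, then feed its output to the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Step 1: substitute parameters into a prompt string.
fill_prompt = Step(lambda params: "Extract {entities} from: {description}".format(**params))
# Step 2: stand-in for an LLM; it just echoes the prompt back as a JSON string.
fake_model = Step(lambda prompt: json.dumps({"prompt": prompt}))
# Step 3: parse the JSON string into a Python object.
parse_json = Step(json.loads)

toy_chain = fill_prompt | fake_model | parse_json
toy_result = toy_chain.invoke({"entities": "price", "description": "Pixel 7a"})
print(toy_result)
```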

A prompt template is a runnable that substitutes parameters into the prompt:

In [None]:
s = prompt_template.invoke({"description": "A", "entities": "B"})
print(s)

Let's take a description of a Pixel 7a phone from this [website](https://store.google.com/product/pixel_7a?hl=de) (the first few paragraphs) and pass it to the model:

In [None]:
description = """Meet Google Pixel 7a, our latest A-Series phone that delivers all the helpfulness of Google for less. It’s built with Google Tensor G2, our flagship processor, and Titan M2, our dedicated security chip, making it faster, more efficient and more secure.

Pixel 7a is packed with many of the must-have features of our premium phones that are now available on an A-series phone for the first time — like Face Unlock, 8GB of RAM, an up to 90Hz Smooth Display and wireless charging. Pixel 7a provides the core Pixel experience, starting at $499."""

result = chain.invoke({"entities": "price, RAM", "description": description})
print(result)

As we can see, the model extracted the attributes we asked for, and the parser converted the JSON output into a Python object. We can check its type:

In [None]:
type(result)
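
To make the parsing step less of a black box, here is a simplified sketch of what a JSON output parser has to do: find the JSON payload in the model's reply (which may be wrapped in a Markdown code fence) and load it. This is an illustration only, not LangChain's implementation; `parse_json_reply` is a hypothetical helper.

```python
import json
import re

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this snippet


def parse_json_reply(reply: str) -> dict:
    """Extract and parse a JSON object from a model reply.

    Handles a bare JSON object or one wrapped in a Markdown code fence.
    """
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE)
    match = re.search(pattern, reply, flags=re.DOTALL)
    payload = match.group(1) if match else reply
    return json.loads(payload.strip())


reply = FENCE + 'json\n{"price": "$499", "RAM": "8GB"}\n' + FENCE
parsed = parse_json_reply(reply)
print(type(parsed).__name__, parsed)
```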

## Chat models

In [19]:
from langchain_core.messages import BaseMessage, HumanMessage


Now let's create our first message. In practice, we use classes that inherit from `BaseMessage`, such as `HumanMessage`, which already define the type (or role):

In [20]:
message = BaseMessage(content="Hi, how are you?", type="human", additional_kwargs={"chapter": 2})

In [None]:
from langchain_google_vertexai import ChatVertexAI

chat_model = ChatVertexAI(model_name="gemini-1.5-pro-001")
message = HumanMessage(content="Hi, how are you?")
answer = chat_model.invoke([message])
print(answer.content)

In [None]:
message2 = HumanMessage(content="Can you tell me how much is 2+2?")
answer2 = chat_model.invoke([message, answer, message2], temperature=0.9)
print(answer2.content)
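
Note that the second call passes the earlier message and answer again: the conversation history lives on the client side and is resent with every request. A minimal plain-Python sketch of that bookkeeping, where `echo_model` is a hypothetical stand-in for `chat_model.invoke` and messages are simple dictionaries rather than LangChain message objects:

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def echo_model(history: List[Message]) -> Message:
    """Hypothetical stand-in for a chat model: reports how many messages it saw."""
    return {"role": "ai", "content": f"I received {len(history)} message(s)."}


def ask(history: List[Message], question: str,
        model: Callable[[List[Message]], Message] = echo_model) -> Message:
    """Append the question, call the model with the full history, store the answer."""
    history.append({"role": "human", "content": question})
    answer = model(history)
    history.append(answer)
    return answer


history: List[Message] = []
print(ask(history, "Hi, how are you?")["content"])
print(ask(history, "Can you tell me how much is 2+2?")["content"])
```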

In [None]:
print(answer.response_metadata["usage_metadata"])

In [None]:
type(answer)

Now let's use a chat PromptTemplate:

In [25]:
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate
)
from langchain_core.messages import SystemMessage


chat_template = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content=(
                "You are a helpful assistant that helps extract entities from product descriptions. "
                "You always respond in a json format."
            )
        ),
        HumanMessagePromptTemplate.from_template("Extract the following entities:\n{entities}\n from the item's description:\n{description}."),
    ]
)
chat_model = ChatVertexAI(model_name="gemini-1.5-pro-001")

In [26]:
chain = chat_template | chat_model | JsonOutputParser()
result = chain.invoke({"entities": "price, RAM", "description": description})

In [None]:
print(result)

## Callbacks

Let's use a predefined callback that tracks the number of tokens consumed:

In [28]:
from langchain_google_vertexai.callbacks import VertexAICallbackHandler

handler = VertexAICallbackHandler()

config = {
    'callbacks' : [handler]
}
result = chain.invoke(
    {"entities": "price, RAM", "description": description},
    config=config,
)

In [None]:
print(handler.prompt_tokens)
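
Conceptually, a token-counting callback is an object that gets notified after each model call and adds up the reported usage. A toy sketch of that pattern in plain Python follows. The `TokenCounter` class, the `on_llm_end` signature, and the usage keys (`prompt_token_count`, `candidates_token_count`) are assumptions chosen for illustration and do not reproduce the real handler's internals.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TokenCounter:
    """Toy callback: accumulates token counts reported after each model call."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    successful_requests: int = 0

    def on_llm_end(self, usage: Dict[str, int]) -> None:
        # Key names are illustrative assumptions, not a documented schema.
        self.prompt_tokens += usage.get("prompt_token_count", 0)
        self.completion_tokens += usage.get("candidates_token_count", 0)
        self.successful_requests += 1


def run_with_callbacks(usages: List[Dict[str, int]], callbacks: List[TokenCounter]) -> None:
    """Simulate several model calls, notifying every callback after each one."""
    for usage in usages:
        for callback in callbacks:
            callback.on_llm_end(usage)


counter = TokenCounter()
run_with_callbacks(
    [
        {"prompt_token_count": 120, "candidates_token_count": 30},
        {"prompt_token_count": 80, "candidates_token_count": 25},
    ],
    callbacks=[counter],
)
print(counter.prompt_tokens, counter.completion_tokens, counter.successful_requests)
```

Because the handler is passed through `config`, the same instance keeps accumulating across every invocation that receives it.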

## Use Codey model

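Codey models help you write code. The cells below first initialize the Vertex AI SDK and send a multimodal request to Gemini, then use `gemini-2.0-flash-exp` to generate code: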

In [None]:
import vertexai
from vertexai.generative_models import GenerativeModel, Part
from vertexai.vision_models import Image

PROJECT_ID = "ds-on-gsp"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

IMAGE_FILE = "gs://ds-on-gsp-aiml-sa-bucket/chapter-18-data/func_generated_image.png"
image = Image.load_from_file(IMAGE_FILE)

generative_multimodal_model = GenerativeModel("gemini-2.0-flash-exp")
response = generative_multimodal_model.generate_content(
    [
        Part.from_text("What is shown in this image?"),
        Part.from_uri(uri=IMAGE_FILE, mime_type="image/png"),
    ]
)

print(response)

In [60]:
# as of Jan 23, 2025, gemini-2.0 is available in certain regions, not including asia-northeast3
codey_llm = VertexAI(model_name="gemini-2.0-flash-exp", max_output_tokens=2048, location=REGION)

In [None]:
print(codey_llm.invoke("Generate a python script to sort a list of integer numbers."))

## Try OSS models

You can also use open-source models with Vertex Model Garden. First, you need to deploy a model (e.g., Llama, as described in its model card in the Google Cloud console). After that, add your values:

In [None]:
llama_endpoint_id = "8520345401566429184"
projects = !gcloud config get project
project = "ds-on-gsp"
location = "us-central1"

In [None]:
from langchain_google_vertexai import VertexAIModelGarden

llama_model = VertexAIModelGarden(
    endpoint_id=llama_endpoint_id,
    project=project,
    location=location,
)
output = llama_model.invoke(["How much is 2+2"])
print(output)

In [None]:
output = llama_model.invoke(["Write a poem about LangChain and Google Cloud"])
print(output)

With Model Garden, you can use additional arguments that the model supports, but you need to provide them during model initialization (so that they're passed to the request):

In [None]:
llama_model1 = VertexAIModelGarden(
    endpoint_id=llama_endpoint_id,
    project=project,
    location=location,
    allowed_model_args=["max_tokens", "top_k"]
)
output = llama_model1.invoke(["Write a poem about LangChain and Google Cloud"], max_tokens=300)
print(output)
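
A rough mental model of `allowed_model_args` is an allow-list: only the keyword arguments named there are forwarded to the endpoint, and the rest are dropped. The sketch below shows that filtering in plain Python; `build_request` and the payload layout are hypothetical and only illustrate the idea.

```python
from typing import Any, Dict, List


def build_request(prompt: str, allowed_model_args: List[str], **kwargs: Any) -> Dict[str, Any]:
    """Build a request payload, forwarding only allow-listed keyword arguments."""
    request: Dict[str, Any] = {"prompt": prompt}
    for name, value in kwargs.items():
        if name in allowed_model_args:
            request[name] = value
    return request


request = build_request(
    "Write a poem",
    allowed_model_args=["max_tokens", "top_k"],
    max_tokens=300,
    temperature=0.9,  # not allow-listed, so it is dropped
)
print(request)
```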

Let's use another open-source model, Falcon Instruct 40B, deployed on Model Garden:

In [81]:
falcon_endpoint_id = "6259005125486968832"
project = "ds-on-gsp"
location = "asia-northeast3"

In [None]:
from langchain_google_vertexai import VertexAIModelGarden


falcon_model = VertexAIModelGarden(
    endpoint_id=falcon_endpoint_id,
    project=project,
    location=location,
    result_arg="generated_text",  # key that holds the generated text in the endpoint's response
)
output = falcon_model.invoke(["How old are you?"])
print(output)

You can also use third-party models, such as Claude from Anthropic, which don't require any deployment on Model Garden:

In [62]:
project = "ds-on-gsp"
location = "us-central1"

In [None]:
from langchain_google_vertexai.model_garden import ChatAnthropicVertex

model = ChatAnthropicVertex(
    project=project,
    location=location,
)
raw_system_message = (
    "You're a useful assistant that helps with math problems. "
    "Think step by step and provide reasoning for each step."
)
question = "Hello, how much is 2+2?"
system_message = SystemMessage(content=raw_system_message)
message = HumanMessage(content=question)
response = model.invoke([system_message, message], model_name="claude-3-sonnet@20240229")

In [None]:
print(response.content)

## Prompt engineering

Let's look at an example of how we can improve our prompt using LangChain interfaces:

In [75]:
instruction = (
  "---INSTRUCTION--- \nYou are an intelligent assistant that helps marketers write great copy for campaigns on our website, "
  "which sells premium ceiling fans to design-conscious customers. Please create campaign copy (a slogan, a tagline, a short "
  "description, and three calls-to-action) based on keywords. Use the information from your context to choose the right products "
  "to advertise. Follow the examples below to ensure that you follow company branding standards.\n"
)

In [76]:
examples = [
    {
        "keywords": "best fan for hot summer days, powerful, cozy, wood tone, enjoy cold drink",
        "response": (
         "Slogan:  Breeze 4000: Feel the Difference.\n"
          "Tagline: Design, Comfort, Performance – The Ultimate Summer Upgrade.\n"
          "Short Description:  Beat the heat in style with the Breeze 4000. Its sleek wood-tone design and "
          "whisper-quiet operation create the perfect oasis for enjoying a cool drink on those hot summer days.\n"
          "Call to action: 1/ Experience the Breeze 4000 difference today.  (Emphasizes the unique qualities)\n"
          "2/ Upgrade your summer. Shop the Breeze 4000 now. (Creates a sense of urgency)\n"
          "3/ Find your perfect Breeze 4000 style. (Focus on design and personalization)"
        )
    },
]

In [77]:
prompt_template = "---CONTEXT---\n{context}\n---KEYWORDS FOR CREATING COPY---\n{keywords}\n---EXAMPLES---\n{examples}"
context = [
  {
    "name": "Whirlwind BreezeMaster 3000",
    "performanceRating": "high",
    "outdoor": True,
    "powerSource": "electric",
    "price": 249.99
  }
]
keywords = "best fan for dry heat, powerful, outdoor, porch, affordable"

In [None]:
from langchain_core.prompts.few_shot import FewShotPromptTemplate
from langchain_core.prompts.prompt import PromptTemplate

example_prompt = PromptTemplate(
    input_variables=["keywords", "response"], template="Example keywords:\n{keywords}\nExample response:\n{response}"
)

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=instruction,
    suffix="---CONTEXT---\n{context}\n---KEYWORDS FOR CREATING COPY---\n{keywords}\n",
    input_variables=["context", "keywords"],
)
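
To see how the pieces fit together, here is a simplified plain-Python sketch of few-shot prompt assembly: the prefix, each formatted example, and the filled-in suffix are joined with a separator. The helper `build_few_shot_prompt` is hypothetical, and the blank-line separator is an assumption; the real template may differ in details such as how it escapes braces.

```python
from typing import Dict, List


def build_few_shot_prompt(prefix: str, examples: List[Dict[str, str]], example_template: str,
                          suffix: str, separator: str = "\n\n", **variables: str) -> str:
    """Join prefix, formatted examples, and the filled-in suffix into one prompt."""
    formatted_examples = [example_template.format(**example) for example in examples]
    pieces = [prefix, *formatted_examples, suffix.format(**variables)]
    return separator.join(pieces)


toy_prompt = build_few_shot_prompt(
    prefix="---INSTRUCTION---\nWrite campaign copy.",
    examples=[{"keywords": "quiet, wood tone", "response": "Slogan: Feel the Difference."}],
    example_template="Example keywords:\n{keywords}\nExample response:\n{response}",
    suffix="---CONTEXT---\n{context}\n---KEYWORDS FOR CREATING COPY---\n{keywords}\n",
    context="BreezeMaster 3000, outdoor",
    keywords="porch, affordable",
)
print(toy_prompt)
```

Printing the assembled prompt like this is a quick way to check that instructions, examples, context, and keywords appear in the intended order before sending anything to a model.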

In [None]:
llm = VertexAI(model_name="gemini-2.0-flash-exp", location="us-central1")

response = (prompt | llm).invoke({"context": context, "keywords": keywords})
print(response)