##### Copyright 2024 Google LLC.

In [62]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Get started with the Gemini API: Python

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://ai.google.dev/gemini-api/docs/get-started/python"><img src="https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on Google AI</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemini-api/docs/get-started/python.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/gemini-api/docs/get-started/python.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

This quickstart demonstrates how to use the Python SDK for the Gemini API, which gives you access to Google's Gemini large language models. In this quickstart, you will learn how to:

1. Set up your development environment and API access to use Gemini.
2. Generate text responses from text inputs.
3. Generate text responses from multimodal inputs (text and images).
4. Use Gemini for multi-turn conversations (chat).
5. Use embeddings for large language models.

## Prerequisites

You can run this quickstart in [Google Colab](https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemini-api/docs/get-started/python.ipynb), which runs this notebook directly in the browser and does not require additional environment configuration.

Alternatively, to complete this quickstart locally, ensure that your development environment meets the following requirements:

-  Python 3.9+
-  An installation of `jupyter` to run the notebook.

## Setup

### Install the Python SDK

The Python SDK for the Gemini API is contained in the [`google-generativeai`](https://pypi.org/project/google-generativeai/) package. This notebook also uses the **langchain-google-genai** package, LangChain's integration for Gemini models. Install both dependencies using pip:

In [8]:
!pip install -q -U google-generativeai
!pip install -qU langchain-google-genai

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-google-genai 2.0.0 requires google-generativeai<0.8.0,>=0.7.0, but you have google-generativeai 0.8.1 which is incompatible.

### Import packages

Import the necessary packages.

In [13]:
import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown


def to_markdown(text):
    text = text.replace("•", "  *")
    return Markdown(textwrap.indent(text, "> ", predicate=lambda _: True))

In [14]:
# Used to securely store your API key
from google.colab import userdata

### Set up your API key

Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.

<a class="button button-primary" href="https://makersuite.google.com/app/apikey" target="_blank" rel="noopener noreferrer">Get an API key</a>

Note that depending on where you are located, you might have to [enable billing](https://ai.google.dev/gemini-api/docs/billing#enable-cloud-billing), since the free tier is not available in the [EEA (including EU), the UK, and CH](https://ai.google.dev/gemini-api/docs/billing#is-Gemini-free-in-EEA-UK-CH).

In Colab, add the key to the secrets manager under the "🔑" in the left panel. Give it the name `GEMINI_API_KEY`.

Once you have the API key, pass it to the SDK. You can do this in two ways:

* Put the key in the `GEMINI_API_KEY` environment variable (the SDK will automatically pick it up from there).
* Pass the key to `genai.configure(api_key=...)`
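
The two options above can be combined into one lookup. The following is a minimal sketch, not part of either SDK: `resolve_api_key` is a hypothetical helper, and checking `GOOGLE_API_KEY` as a second variable name is an assumption added for illustration.

```python
import os


def resolve_api_key(explicit_key=None, env=None,
                    names=("GEMINI_API_KEY", "GOOGLE_API_KEY")):
    """Return an explicitly passed key, else the first non-empty variable in `names`.

    `names` is an illustrative default; adjust it to the variables you actually set.
    """
    if explicit_key:
        return explicit_key
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name)
        if value:
            return value
    raise RuntimeError(f"No API key found; set one of: {', '.join(names)}")


# Demo with a plain dict standing in for os.environ.
demo_env = {"GEMINI_API_KEY": "demo-key"}
print(resolve_api_key(env=demo_env))
```

The returned value can then be passed to `genai.configure(api_key=...)` as in the next cell.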

In [15]:
# Or use `os.getenv('GEMINI_API_KEY')` to fetch an environment variable.
GEMINI_API_KEY = userdata.get("GEMINI_API_KEY")

# print(GEMINI_API_KEY )

genai.configure(api_key=GEMINI_API_KEY)

In [16]:
from langchain_google_genai import  llms
# `genai` is the `google.generativeai` module imported earlier.
list(genai.list_models())

[Model(name='models/chat-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 Chat (Legacy)',
       description='A legacy text-only model optimized for chat conversations',
       input_token_limit=4096,
       output_token_limit=1024,
       supported_generation_methods=['generateMessage', 'countMessageTokens'],
       temperature=0.25,
       max_temperature=None,
       top_p=0.95,
       top_k=40),
 Model(name='models/text-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 (Legacy)',
       description='A legacy model that understands text and generates text as an output',
       input_token_limit=8196,
       output_token_limit=1024,
       supported_generation_methods=['generateText', 'countTextTokens', 'createTunedTextModel'],
       temperature=0.7,
       max_temperature=None,
       top_p=0.95,
       top_k=40),
 Model(name='models/embedding-gecko-001',
       base_model_id='',
       version='001',
      

## List models

Now you're ready to call the Gemini API. Use `list_models` to see the available Gemini models:

* `gemini-1.5-flash`: optimized for multi-modal use-cases where speed and cost are important. This should be your go-to model.
* `gemini-1.5-pro`: optimized for high intelligence tasks, the most powerful Gemini model.

In [17]:
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)

models/gemini-1.0-pro-latest
models/gemini-1.0-pro
models/gemini-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro
models/gemini-1.5-pro-exp-0801
models/gemini-1.5-pro-exp-0827
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-exp-0827
models/gemini-1.5-flash-8b-exp-0827


Note: For detailed information about the available models, including their capabilities and rate limits, see [Gemini models](https://ai.google.dev/models/gemini). There are options for requesting [rate limit increases](https://ai.google.dev/docs/increase_quota). The rate limit for Gemini-Flash models is 15 requests per minute (RPM) for free ([in supported countries](https://ai.google.dev/gemini-api/docs/billing#is-Gemini-free-in-EEA-UK-CH)).

The `genai` package also supports the PaLM family of models, but only the Gemini models support the generic, multimodal capabilities of the `generateContent` method.
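
The filtering done in the loop above can be sketched without calling the API. In this sketch, `ModelInfo` is a hypothetical stand-in for the records returned by `list_models`, and the sample entries are illustrative rather than a real API response.

```python
from dataclasses import dataclass, field


@dataclass
class ModelInfo:
    """Hypothetical stand-in for a model record returned by list_models."""
    name: str
    supported_generation_methods: list = field(default_factory=list)


def models_supporting(models, method):
    """Return the names of the models that advertise the given generation method."""
    return [m.name for m in models if method in m.supported_generation_methods]


# Illustrative sample data, not fetched from the API.
sample_models = [
    ModelInfo("models/text-bison-001", ["generateText", "countTextTokens"]),
    ModelInfo("models/gemini-1.5-flash", ["generateContent", "countTokens"]),
    ModelInfo("models/gemini-1.5-pro", ["generateContent", "countTokens"]),
]

print(models_supporting(sample_models, "generateContent"))
```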

## Generate text from text inputs

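Start with the `gemini-1.5-flash` model, which should be sufficient for most tasks. The cell below creates the model through LangChain's `ChatGoogleGenerativeAI` wrapper; it was run with `gemini-pro`, so pass `model="gemini-1.5-flash"` instead to follow this recommendation: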

In [18]:
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-pro", api_key=GEMINI_API_KEY)

The `invoke` method can handle a wide variety of use cases, including multi-turn chat and multimodal input, depending on what the underlying model supports. At the moment, the available models support text, images and videos as input, and text as output.

In the simplest case, you can pass a prompt string to `invoke`, LangChain's counterpart of the SDK's <a href="https://ai.google.dev/api/generate-content#v1beta.models.generateContent"><code>GenerativeModel.generate_content</code></a> method:

In [21]:
%%time
response = model.invoke("What is the meaning of life?")

CPU times: user 45.1 ms, sys: 7.22 ms, total: 52.4 ms
Wall time: 4.73 s


In simple cases, the `response.content` accessor is all you need. To display formatted Markdown text, use the `to_markdown` function:

In [22]:
to_markdown(response.content)

> The meaning of life is a philosophical question that has been pondered by humans for centuries. There is no one definitive answer, as the meaning of life is subjective and can vary from person to person. However, there are some common themes that emerge when people discuss the meaning of life, such as:
> 
> * **Finding purpose:** Many people believe that the meaning of life is to find a purpose or goal that gives them direction and motivation. This purpose can be anything from raising a family to pursuing a career to making a difference in the world.
> * **Living in the present moment:** Some people believe that the meaning of life is to live in the present moment and appreciate the simple things in life. This means savoring each experience, both good and bad, and not dwelling on the past or worrying about the future.
> * **Making a difference:** Others believe that the meaning of life is to make a difference in the world. This can be done through acts of kindness, volunteering, or simply being a good person.
> * **Finding happiness:** Some people believe that the meaning of life is to find happiness. This can be achieved through a variety of means, such as spending time with loved ones, pursuing hobbies, or traveling.
> 
> Ultimately, the meaning of life is something that each individual must discover for themselves. There is no right or wrong answer, and the meaning of life can change over time as we grow and change. However, by exploring our values, beliefs, and experiences, we can each come to a better understanding of what makes life meaningful to us.

If the API failed to return a result, check the `prompt_feedback` entry of `response.response_metadata` (LangChain's view of `GenerateContentResponse.prompt_feedback`) to see whether the prompt was blocked due to safety concerns.

In [24]:
response.response_metadata

{'prompt_feedback': {'block_reason': 0, 'safety_ratings': []},
 'finish_reason': 'STOP',
 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
   'probability': 'NEGLIGIBLE',
   'blocked': False},
  {'category': 'HARM_CATEGORY_HATE_SPEECH',
   'probability': 'NEGLIGIBLE',
   'blocked': False},
  {'category': 'HARM_CATEGORY_HARASSMENT',
   'probability': 'NEGLIGIBLE',
   'blocked': False},
  {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT',
   'probability': 'NEGLIGIBLE',
   'blocked': False}]}
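
A dictionary shaped like the output above can be condensed into a short report. This is a minimal sketch: `summarize_safety` is a hypothetical helper, and it assumes, as the output above suggests, that a `block_reason` of `0` means the prompt was not blocked.

```python
def summarize_safety(metadata):
    """Summarize a response_metadata-style dict (hypothetical helper).

    Assumes block_reason == 0 means "not blocked", matching the output above.
    """
    feedback = metadata.get("prompt_feedback", {})
    flagged = [
        rating["category"]
        for rating in metadata.get("safety_ratings", [])
        if rating.get("blocked") or rating.get("probability") != "NEGLIGIBLE"
    ]
    return {
        "prompt_blocked": feedback.get("block_reason", 0) != 0,
        "finish_reason": metadata.get("finish_reason"),
        "flagged_categories": flagged,
    }


# Sample copied from the structure of the output above (shortened to two ratings).
sample_metadata = {
    "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
    "finish_reason": "STOP",
    "safety_ratings": [
        {"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE", "blocked": False},
        {"category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE", "blocked": False},
    ],
}

print(summarize_safety(sample_metadata))
```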

Gemini can generate multiple possible responses for a single prompt. These possible responses are called `candidates`, and you can review them to select the most suitable one as the response.

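The SDK exposes the response candidates as <a href="https://ai.google.dev/api/python/google/generativeai/protos/GenerateContentResponse#candidates"><code>GenerateContentResponse.candidates</code></a>. LangChain's `invoke` returns a single `AIMessage` instead, which you can inspect directly: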

In [25]:
response

AIMessage(content='The meaning of life is a philosophical question that has been pondered by humans for centuries. There is no one definitive answer, as the meaning of life is subjective and can vary from person to person. However, there are some common themes that emerge when people discuss the meaning of life, such as:\n\n* **Finding purpose:** Many people believe that the meaning of life is to find a purpose or goal that gives them direction and motivation. This purpose can be anything from raising a family to pursuing a career to making a difference in the world.\n* **Living in the present moment:** Some people believe that the meaning of life is to live in the present moment and appreciate the simple things in life. This means savoring each experience, both good and bad, and not dwelling on the past or worrying about the future.\n* **Making a difference:** Others believe that the meaning of life is to make a difference in the world. This can be done through acts of kindness, volun

By default, the model returns a response after completing the entire generation process. You can also stream the response as it is being generated, and the model will return chunks of the response as soon as they are generated.

To stream responses, use `model.stream(...)`, LangChain's counterpart of <a href="https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content"><code>GenerativeModel.generate_content(..., stream=True)</code></a>.

In [26]:
%%time
response = model.stream("What is the meaning of life?")

# Print each chunk as it arrives, followed by a separator line.
for chunk in response:
    print(chunk.content)
    print("_" * 80)

**Philosophical Perspectives:**

* **Existentialism:** There is no inherent
________________________________________________________________________________
 meaning; each individual must create their own.
* **Absolutism:** Meaning is objective and derived from a higher power or universal truth.
* **
________________________________________________________________________________
Naturalism:** Meaning is found in the physical world and human nature.
* **Hedonism:** The pursuit of pleasure is the ultimate goal.
* **Utilitarianism:** Actions that promote happiness and reduce suffering have the most meaning.
* **Nihilism:** There is no objective meaning or purpose to existence
________________________________________________________________________________
.

**Religious Perspectives:**

* **Monotheism:** God created the world and humans with a specific purpose.
* **Polytheism:** Different deities have different purposes for humanity.
* **Atheism:** There is no supernatural being or inher

In [27]:
for chunk in model.stream("What is the meaning of life?"):
    print(chunk.content)
    print("_" * 80)

The meaning of life is a philosophical question that has occupied the minds of thinkers for
________________________________________________________________________________
 centuries. There is no one definitive answer, as the meaning of life is likely to be different for each individual. However, some common themes that emerge from different
________________________________________________________________________________
 philosophical perspectives include:

* **Finding purpose and fulfillment in life.** This could involve pursuing your passions, making a difference in the world, or simply enjoying the experiences that life has to offer.
* **Connecting with others and forming meaningful relationships.** Humans are social beings, and we need to have strong connections with others
________________________________________________________________________________
 in order to feel happy and fulfilled.
* **Learning and growing throughout your life.** The world is constantly changing, and we

When streaming, `model.stream` returns a generator rather than a complete message, so attributes of the full response are not available until you've iterated through all the response chunks. This is demonstrated below:

In [29]:
response = model.stream("What is the meaning of life?")

Attributes such as <code>content</code> are not available on the generator itself:

In [28]:
try:
    response.content
except Exception as e:
    print(f"{type(e).__name__}: {e}")

AttributeError: 'generator' object has no attribute 'content'
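
To obtain the full text from a stream, iterate over the chunks and join their `content`. The sketch below is illustrative only: `collect_stream` is a hypothetical helper, and `fake_stream` is a stand-in generator used so the example runs without calling the API.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_chunk=None):
    """Consume an iterable of chunks and return the concatenated text.

    Each chunk only needs a `.content` string attribute.
    """
    parts = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk.content)  # e.g. print partial output as it arrives
        parts.append(chunk.content)
    return "".join(parts)


def fake_stream():
    """Stand-in for a streaming call; yields objects with a `.content` attribute."""
    for piece in ["The meaning ", "of life ", "is subjective."]:
        yield SimpleNamespace(content=piece)


full_text = collect_stream(fake_stream())
print(full_text)
```

With a real model, the same helper could be given the generator returned by `model.stream(...)` in place of `fake_stream()`.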


## Generate text from image and text inputs

The `GenerativeModel.generate_content` API is designed to handle multimodal prompts and returns a text output.

Let's include an image:

In [30]:
!curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw



In [31]:
import PIL.Image

img = PIL.Image.open("image.jpg")
img

Use the `gemini-1.5-flash` model and pass the image to the model as an `image_url` part inside a `HumanMessage`.


In [33]:
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", api_key=GEMINI_API_KEY)

message = HumanMessage(
    content=[
        # You can optionally provide text parts
        {"type": "image_url", "image_url": "/content/image.jpg"},
    ]
)
llm.invoke([message])

AIMessage(content='This image shows two glass containers filled with rice, chicken, broccoli, carrots, and peppers. The containers are on a grey background, and there are chopsticks in the foreground. This looks like a healthy and delicious meal.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-2c2d94ef-8839-46b8-9240-048fdfbb26c4-0', usage_metadata={'input_tokens': 259, 'output_tokens': 44, 'total_tokens': 303})

To provide both text and images in a prompt, pass a list containing a text part and an image part as the message `content`:

In [34]:


message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.",
        },  # You can optionally provide text parts
        {"type": "image_url", "image_url": "/content/image.jpg"},
    ]
)
response = llm.invoke([message])

In [35]:
to_markdown(response.content)

> ##  My Meal Prep Journey: From Chaos to Calm 
> 
> This picture? It's not just a delicious-looking meal, it's a symbol of my journey into the world of meal prepping. For years, I was that person who grabbed whatever was quickest and easiest for lunch, often ending up with a sad sandwich or a greasy takeout. 
> 
> Then, I decided to take control of my health and my wallet. I dove headfirst into meal prepping, and let me tell you, it was a game-changer! 
> 
> This particular meal is my go-to: teriyaki chicken with rice, broccoli, carrots, and red peppers. It's packed with flavor, nutrients, and enough to keep me full until dinner. 
> 
> The best part? I can whip up a week's worth of these in an hour or two on Sunday, ensuring I have healthy, delicious meals ready to go for work or on the run. No more rushing for lunch, no more unhealthy choices, just pure satisfaction! 
> 
> If you're thinking about starting your own meal prep journey, I highly recommend it. It's not just about the food, it's about taking control of your life and making healthy choices easier. Trust me, your future self will thank you! 
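
The cells above reference the image by a Colab file path. When the image is only available as bytes, one option is to embed it as a base64 data URL. This is a sketch under an assumption: that the LangChain integration accepts `data:` URLs in the `image_url` field. `build_multimodal_content` is a hypothetical helper, not part of any library.

```python
import base64


def build_multimodal_content(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a text-plus-image content list in the shape used by the cells above.

    Assumption: the integration accepts base64 data URLs as `image_url` values.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {"type": "text", "text": prompt},
        {"type": "image_url", "image_url": f"data:{mime_type};base64,{encoded}"},
    ]


# Placeholder bytes standing in for the contents of image.jpg.
parts = build_multimodal_content("Describe this image.", b"\xff\xd8\xff\xe0")
print(parts[1]["image_url"][:23])
```

The resulting list would be passed as `HumanMessage(content=parts)`, in the same way as the hand-written lists above.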


## Chat conversations

Gemini enables you to have freeform conversations across multiple turns. The SDK's `ChatSession` class simplifies the process by managing the state of the conversation, so unlike with `generate_content`, you do not have to store the conversation history as a list. This notebook gets the same behavior from LangChain's `ConversationChain` combined with a `ConversationBufferMemory`.

Initialize the chat:

In [36]:
!pip install -qU langchain


In [44]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro",
    api_key=GEMINI_API_KEY,
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # other params...
)

In [45]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# Set up memory to maintain chat history
memory = ConversationBufferMemory()

# Start the conversation chain with the LLM and memory
chat = ConversationChain(llm=llm, memory=memory)

# You can now use the 'chat' object to engage in a conversation
response = chat.run("Hello, how can I assist you today?")
print(response)

Hello! It's wonderful to be connected with you. I don't have any pressing needs at the moment, but I'm always eager to learn and chat. What's on your mind today? 😊  Perhaps you have a question I can try to answer, or maybe you'd like to tell me about your day? 



In [46]:
chat.run("Who is the founder of pakistan?")



'AI:  Ah, a history question!  The founder of Pakistan was Muhammad Ali Jinnah. He is considered the "Father of the Nation" for his role in leading the Pakistan Movement and advocating for a separate Muslim state from British India.  He became the first Governor-General of Pakistan in 1947. \n'

In [47]:
print(chat.memory.buffer)

Human: Hello, how can I assist you today?
AI: Hello! It's wonderful to be connected with you. I don't have any pressing needs at the moment, but I'm always eager to learn and chat. What's on your mind today? 😊  Perhaps you have a question I can try to answer, or maybe you'd like to tell me about your day? 

Human: Who is the founder of pakistan?
AI: AI:  Ah, a history question!  The founder of Pakistan was Muhammad Ali Jinnah. He is considered the "Father of the Nation" for his role in leading the Pakistan Movement and advocating for a separate Muslim state from British India.  He became the first Governor-General of Pakistan in 1947. 



Each call to `chat.run` returns the model's reply as a string and appends your message and the reply to the chat history, which `chat.memory.buffer_as_messages` exposes as a list of message objects:

In [52]:
chat.memory.buffer_as_messages

[HumanMessage(content='Hello, how can I assist you today?', additional_kwargs={}, response_metadata={}),
 AIMessage(content="Hello! It's wonderful to be connected with you. I don't have any pressing needs at the moment, but I'm always eager to learn and chat. What's on your mind today? 😊  Perhaps you have a question I can try to answer, or maybe you'd like to tell me about your day? \n", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Who is the founder of pakistan?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='AI:  Ah, a history question!  The founder of Pakistan was Muhammad Ali Jinnah. He is considered the "Father of the Nation" for his role in leading the Pakistan Movement and advocating for a separate Muslim state from British India.  He became the first Governor-General of Pakistan in 1947. \n', additional_kwargs={}, response_metadata={})]
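
The bookkeeping that a buffer memory performs can be sketched in a few lines. This is a simplified illustration, not LangChain's implementation: `SimpleChatMemory` and `chat_once` are hypothetical names, and `reply_fn` is a stand-in for the model call.

```python
class SimpleChatMemory:
    """Toy conversation buffer: stores (role, text) pairs in order."""

    def __init__(self):
        self.messages = []

    def add_turn(self, human_text, ai_text):
        self.messages.append(("Human", human_text))
        self.messages.append(("AI", ai_text))

    @property
    def buffer(self):
        """Render the history as 'Role: text' lines."""
        return "\n".join(f"{role}: {text}" for role, text in self.messages)


def chat_once(memory, user_text, reply_fn):
    """Send one message: prepend the stored history, then record the new turn."""
    history = memory.buffer
    prompt = f"{history}\nHuman: {user_text}\nAI:" if history else f"Human: {user_text}\nAI:"
    reply = reply_fn(prompt)
    memory.add_turn(user_text, reply)
    return reply


def fake_reply(prompt):
    """Stand-in for the model: reports how many human turns the prompt contains."""
    return f"(saw {prompt.count('Human:')} human turn(s))"


memory = SimpleChatMemory()
first = chat_once(memory, "Hello", fake_reply)
second = chat_once(memory, "Who is the founder of Pakistan?", fake_reply)
print(memory.buffer)
```

Because every prompt includes the stored history, the second call sees both human turns, which is what lets a model refer back to earlier messages.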

You can keep sending messages to continue the conversation. To stream a reply, call the `stream` method, which returns a generator, or `astream` in asynchronous code:

In [49]:
response = chat.stream("what is 2+5?")
response

<generator object Runnable.stream at 0x7e1f357ba260>

In [50]:
chunks = []
async for chunk in llm.astream("what color is the sky?"):
    chunks.append(chunk)
    print(chunk.content, end="|", flush=True)

The| sky itself is actually **colorless**. 

We perceive it as blue most| of the time due to a phenomenon called **Rayleigh scattering**. This is how| it works:

* **Sunlight** is made up of all the colors of the rainbow.
* When sunlight enters the Earth's atmosphere, it col|lides with tiny air molecules.
* **Blue light** is scattered more than other colors because it travels as shorter, smaller waves.
* This scattered blue| light is what we see, making the sky appear blue.

However, the sky can appear different colors at different times:

* **Sunrise/Sunset:** The sky can appear red, orange, or pink because the sunlight has to travel| through more of the atmosphere. This scatters away the blue light, leaving the longer wavelengths like red and orange.
* **Cloudy:** Clouds appear white or gray because they are made up of water droplets that scatter all colors of light equally|.
* **Night:**  Without sunlight, the sky appears dark, revealing the blackness of space. 
|

[`genai.protos.Content`](https://github.com/google-gemini/generative-ai-python/blob/main/docs/api/google/generativeai/protos/Content.md) objects contain a list of [`genai.protos.Part`](https://github.com/google-gemini/generative-ai-python/blob/main/docs/api/google/generativeai/protos/Part.md) objects that each contain either a text (string) or inline_data ([`genai.protos.Blob`](https://github.com/google-gemini/generative-ai-python/blob/main/docs/api/google/generativeai/protos/Blob.md)), where a blob contains binary data and a `mime_type`. In the SDK, the chat history is available as a list of `genai.protos.Content` objects in `ChatSession.history`. With the LangChain memory used here, the history is a list of `HumanMessage` and `AIMessage` objects in `chat.memory.buffer_as_messages`:

In [51]:
for message in chat.memory.buffer_as_messages:
    display(to_markdown(f"**{message.__class__.__name__}**: {message.content}"))
    # display(to_markdown(f"**{message.role}**: {message.parts[0].text}"))

> **HumanMessage**: Hello, how can I assist you today?

> **AIMessage**: Hello! It's wonderful to be connected with you. I don't have any pressing needs at the moment, but I'm always eager to learn and chat. What's on your mind today? 😊  Perhaps you have a question I can try to answer, or maybe you'd like to tell me about your day? 


> **HumanMessage**: Who is the founder of pakistan?

> **AIMessage**: AI:  Ah, a history question!  The founder of Pakistan was Muhammad Ali Jinnah. He is considered the "Father of the Nation" for his role in leading the Pakistan Movement and advocating for a separate Muslim state from British India.  He became the first Governor-General of Pakistan in 1947. 


## Count tokens

Large language models have a context window, and the context length is often measured in terms of the **number of tokens**. With the Gemini API, you can determine the number of tokens in any `genai.protos.Content` object using the `GenerativeModel.count_tokens` method. With LangChain, the simplest case is to pass a query string to the model's `get_num_tokens` method as follows:

In [195]:
llm

ChatGoogleGenerativeAI(model='models/gemini-1.5-pro', google_api_key=SecretStr('**********'), temperature=0.0, max_retries=2, client=<google.ai.generativelanguage_v1beta.services.generative_service.client.GenerativeServiceClient object at 0x7def4a896140>, async_client=<google.ai.generativelanguage_v1beta.services.generative_service.async_client.GenerativeServiceAsyncClient object at 0x7def4a896890>, default_metadata=())

In [53]:
!pip install -qU langchain_community


In [54]:

def count_tokens(text: str) -> int:
    tokens = llm.get_num_tokens(text)
    return tokens

# Input message
message = "How is the founder of Pakistan?"

# Count tokens in the input message
token_count = count_tokens(message)
print(f"Input Token count for gemini-1.5-pro: {token_count}")

Input Token count for gemini-1.5-pro: 7


Similarly, you can count the tokens in the chat history stored in memory:

In [55]:
count_tokens(chat.memory.buffer)

165
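
A token count becomes useful once it is compared against a budget. The sketch below keeps only the most recent messages that fit a limit. `trim_to_budget` is a hypothetical helper, and `word_count` is a crude stand-in for a real tokenizer; in the notebook you would pass the `count_tokens` function defined above instead.

```python
def trim_to_budget(messages, max_tokens, count_tokens):
    """Keep the most recent messages whose combined token count fits max_tokens."""
    kept = []
    total = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept


def word_count(text):
    """Illustrative stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


history = [
    "Hello there",
    "Hi, how can I help?",
    "Who founded Pakistan?",
    "Muhammad Ali Jinnah.",
]
print(trim_to_budget(history, max_tokens=7, count_tokens=word_count))
```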


## Advanced use cases

The following sections discuss advanced use cases and lower-level details of the Python SDK for the Gemini API.

### Use embeddings

[Embedding](https://developers.google.com/machine-learning/glossary#embedding-vector) is a technique used to represent information as a list of floating point numbers in an array. With Gemini, you can represent text (words, sentences, and blocks of text) in a vectorized form, making it easier to compare and contrast embeddings. For example, two texts that share a similar subject matter or sentiment should have similar embeddings, which can be identified through mathematical comparison techniques such as cosine similarity. For more on how and why you should use embeddings, refer to the [Embeddings guide](https://ai.google.dev/docs/embeddings_guide).
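
Cosine similarity, mentioned above, can be computed with the standard library alone. The following is a minimal sketch using short toy vectors rather than real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors: the first two point in the same direction, the third is orthogonal.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))
```

Values close to 1 indicate vectors pointing in nearly the same direction, which for embeddings corresponds to similar texts.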

Use the `embed_content` method to generate embeddings. The method handles embedding for the following tasks (`task_type`):

Task Type | Description
--- | ---
RETRIEVAL_QUERY | Specifies the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT | Specifies the given text is a document in a search/retrieval setting. Using this task type requires a `title`.
SEMANTIC_SIMILARITY | Specifies the given text will be used for Semantic Textual Similarity (STS).
CLASSIFICATION | Specifies that the embeddings will be used for classification.
CLUSTERING | Specifies that the embeddings will be used for clustering.

The following generates an embedding for a single string for document retrieval:

In [56]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# The text to embed is passed to `embed_query`, not to the constructor,
# so the `content` and `title` keyword arguments are dropped here.
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/text-embedding-004",
    task_type="retrieval_document",
    google_api_key=GEMINI_API_KEY,
)
vector = embeddings.embed_query("What is the meaning of life?")


# 1 input > 1 vector output
print(str(vector[:5]), "... TRIMMED]")

[-0.006754839792847633, 0.015582558698952198, -0.014844288118183613, -0.01970197632908821, -0.03747989609837532] ... TRIMMED]


Note: The `retrieval_document` task type is the only task that accepts a title.

To handle batches of strings, pass a list of strings in `content`:

In [57]:
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        "What is the meaning of life?",
        "How much wood would a woodchuck chuck?",
        "How does the brain work?",
    ],
    task_type="retrieval_document",
    title="Embedding of list of strings",
)

# A list of inputs > A list of vectors output
for v in result["embedding"]:
    print(str(v)[:50], "... TRIMMED ...")

[-0.036453035, 0.03325499, -0.03970925, -0.0026286 ... TRIMMED ...
[-0.01591948, 0.032582667, -0.081024624, -0.011298 ... TRIMMED ...
[0.00037063262, 0.03763057, -0.12269569, -0.009518 ... TRIMMED ...


In [58]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# The documents are passed to `embed_documents`; the constructor only needs
# the model name, task type, and API key.
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/text-embedding-004",
    task_type="retrieval_document",
    google_api_key=GEMINI_API_KEY,
)
vector = embeddings.embed_documents(
    [
        "What is the meaning of life?",
        "How much wood would a woodchuck chuck?",
        "How does the brain work?",
    ]
)


for v in vector:
    print(str(v)[:50], "... TRIMMED ...")


[-0.006754839792847633, 0.015582558698952198, -0.0 ... TRIMMED ...
[0.004502654541283846, 0.017211750149726868, -0.04 ... TRIMMED ...
[0.03738485276699066, 0.03177155181765556, -0.0869 ... TRIMMED ...
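
Embedding vectors are usually compared rather than read directly. The following is a minimal sketch of one common comparison, cosine similarity, written in plain Python. The three short vectors are hypothetical stand-ins for the real embeddings returned above, and `cosine_similarity` is an illustrative helper, not part of either SDK.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional stand-ins for real embedding vectors.
vectors = [
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]

# Compare every pair of vectors once.
for i in range(len(vectors)):
    for j in range(i + 1, len(vectors)):
        score = cosine_similarity(vectors[i], vectors[j])
        print(f"vectors {i} and {j}: {score:.3f}")
```

With real embeddings, you would pass the lists returned by `embed_documents` (or `result["embedding"]`) in place of `vectors`.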


While the `genai.embed_content` function accepts simple strings or lists of strings, it is actually built around the `genai.protos.Content` type (like <a href="https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content"><code>GenerativeModel.generate_content</code></a>). `genai.protos.Content` objects are the primary units of conversation in the API.

While the `genai.protos.Content` object is multimodal, the `embed_content` method only supports text embeddings. This design gives the API the *possibility* to expand to multimodal embeddings.

In [59]:
chat.memory.buffer_as_messages

[HumanMessage(content='Hello, how can I assist you today?', additional_kwargs={}, response_metadata={}),
 AIMessage(content="Hello! It's wonderful to be connected with you. I don't have any pressing needs at the moment, but I'm always eager to learn and chat. What's on your mind today? 😊  Perhaps you have a question I can try to answer, or maybe you'd like to tell me about your day? \n", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Who is the founder of pakistan?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='AI:  Ah, a history question!  The founder of Pakistan was Muhammad Ali Jinnah. He is considered the "Father of the Nation" for his role in leading the Pakistan Movement and advocating for a separate Muslim state from British India.  He became the first Governor-General of Pakistan in 1947. \n', additional_kwargs={}, response_metadata={})]

In [60]:
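# `response` must be a `GenerateContentResponse` from the `google.generativeai` SDK.
# A LangChain streaming generator has no `candidates` attribute and raises `AttributeError`.
response = genai.GenerativeModel("gemini-1.5-flash").generate_content(
    "What is the meaning of life?"
)
result = genai.embed_content(
    model="models/text-embedding-004", content=response.candidates[0].content
)

# 1 input > 1 vector output
print(str(result["embedding"])[:50], "... TRIMMED ...")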

Similarly, the chat history contains a list of `genai.protos.Content` objects, which you can pass directly to the `embed_content` function:

In [61]:
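# `chat` must be a `genai.ChatSession` (for example, from `GenerativeModel.start_chat()`).
# A LangChain `ConversationChain` has no `history`; use `chat.memory.buffer_as_messages` instead.
chat.history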

In [None]:
result = genai.embed_content(model="models/text-embedding-004", content=chat.history)

# 1 input > 1 vector output
for i, v in enumerate(result["embedding"]):
    print(str(v)[:50], "... TRIMMED...")

[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...


### Safety settings

The `safety_settings` argument lets you configure what the model blocks and allows in both prompts and responses. By default, safety settings block content that has a medium or high probability of being unsafe across all dimensions. Learn more about [Safety settings](https://ai.google.dev/docs/safety_setting).

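Enter a questionable prompt and run the model with the default safety settings. A blocked prompt returns no candidates. In the run below, the prompt was not blocked, and the model declined the request in its reply instead: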

In [245]:
response = model.invoke("write post againt India.")
response

AIMessage(content="I'm sorry, but I can't do that. India is a beautiful country with a rich history and culture. I don't have anything negative to say about it.", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-2bf8ad4e-ab6a-489d-8c76-f04e0fc92a8e-0', usage_metadata={'input_tokens': 7, 'output_tokens': 37, 'total_tokens': 44})

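If a prompt is blocked, `prompt_feedback` tells you which safety filter blocked it. Here, a `block_reason` of `0` (unspecified) and an empty `safety_ratings` list indicate that the prompt was not blocked: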

In [246]:
response.response_metadata['prompt_feedback']

{'block_reason': 0, 'safety_ratings': []}
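
The safety fields in `response_metadata` are plain dictionaries, so they can be checked programmatically. The sketch below works on an abbreviated copy of the metadata printed above. `summarize_safety` is a hypothetical helper, and it assumes that a falsy `block_reason` (`0` or missing) means the prompt passed the filters; other library versions may encode this field differently.

```python
def summarize_safety(response_metadata):
    """Summarize the safety-related fields of a `response_metadata` dictionary."""
    feedback = response_metadata.get("prompt_feedback", {})
    # Assumption: a falsy block_reason (0 or missing) means the prompt was not blocked.
    prompt_blocked = bool(feedback.get("block_reason"))

    ratings = response_metadata.get("safety_ratings", [])
    # Collect categories that were blocked or rated above NEGLIGIBLE.
    flagged = [
        rating["category"]
        for rating in ratings
        if rating.get("blocked") or rating.get("probability") != "NEGLIGIBLE"
    ]
    return {
        "prompt_blocked": prompt_blocked,
        "finish_reason": response_metadata.get("finish_reason"),
        "flagged_categories": flagged,
    }


# Abbreviated copy of the metadata shown in the output above.
metadata = {
    "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
    "finish_reason": "STOP",
    "safety_ratings": [
        {"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE", "blocked": False},
        {"category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE", "blocked": False},
    ],
}

print(summarize_safety(metadata))
```

In a notebook, you could call `summarize_safety(response.response_metadata)` on the `AIMessage` returned by `model.invoke`.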

Note: The LangChain examples end here. The remaining code uses the `google.generativeai` SDK directly; you can adapt it to LangChain in the same way.

Now provide the same prompt to the model with newly configured safety settings, and you may get a response.

In [None]:
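# Use a `google.generativeai` model here; the LangChain chat model has no `generate_content` method.
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "[Questionable prompt here]", safety_settings={"HARASSMENT": "block_none"}
)
response.text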

Also note that each candidate has its own `safety_ratings`, in case the prompt passes but the individual responses fail the safety checks.

### Encode messages

The previous sections relied on the SDK to make it easy for you to send prompts to the API. This section offers a fully-typed equivalent to the previous example, so you can better understand the lower-level details regarding how the SDK encodes messages.

The [`google.generativeai.protos`](https://ai.google.dev/api/python/google/generativeai/protos) submodule provides access to the low-level classes used by the API behind the scenes.

The SDK attempts to convert your message to a `genai.protos.Content` object, which contains a list of `genai.protos.Part` objects that each contain either:

1. `text` (string)
2. `inline_data` (`genai.protos.Blob`), where a blob contains binary `data` and a `mime_type`.

You can also pass any of these classes as an equivalent dictionary.

Note: The only accepted mime types are some image types, `image/*`.

So, the fully-typed equivalent to the previous example is:  

In [None]:
import pathlib

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    genai.protos.Content(
        parts=[
            genai.protos.Part(
                text="Write a short, engaging blog post based on this picture."
            ),
            genai.protos.Part(
                inline_data=genai.protos.Blob(
                    mime_type="image/jpeg", data=pathlib.Path("image.jpg").read_bytes()
                )
            ),
        ],
    ),
    stream=True,
)

In [None]:
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")

>  Meal prepping is a great way to save time and money, and it can also help you to eat healthier. By ... [TRIMMED] ...
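
Since the SDK also accepts equivalent dictionaries, the same message can be sketched without the `genai.protos` classes. The helper below only builds the dictionary and does not call the API. `build_image_message` is a hypothetical name, the key layout mirrors the `Content`, `Part`, and `Blob` fields described above, and the three placeholder bytes stand in for a real image file.

```python
def build_image_message(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a dictionary shaped like a Content with a text Part and an image Part."""
    # The note above says only image MIME types are accepted.
    if not mime_type.startswith("image/"):
        raise ValueError(f"expected an image/* MIME type, got {mime_type!r}")
    return {
        "role": "user",
        "parts": [
            {"text": prompt},
            {"inline_data": {"mime_type": mime_type, "data": image_bytes}},
        ],
    }


# Placeholder bytes; in practice, read them with pathlib.Path("image.jpg").read_bytes().
message = build_image_message(
    "Write a short, engaging blog post based on this picture.",
    b"\xff\xd8\xff",
)

print(message["parts"][0]["text"])
print(message["parts"][1]["inline_data"]["mime_type"])
```

A dictionary built this way could then be passed to `model.generate_content(message)` in place of the `genai.protos.Content` object.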

### Multi-turn conversations

While the `genai.ChatSession` class shown earlier can handle many use cases, it does make some assumptions. If your use case doesn't fit this chat implementation, remember that `genai.ChatSession` is just a wrapper around <a href="https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content"><code>GenerativeModel.generate_content</code></a>. In addition to single requests, `generate_content` can handle multi-turn conversations.

The individual messages are `genai.protos.Content` objects or compatible dictionaries, as seen in previous sections. As a dictionary, the message requires `role` and `parts` keys. The `role` in a conversation can either be the `user`, which provides the prompts, or `model`, which provides the responses.

Pass a list of `genai.protos.Content` objects and it will be treated as multi-turn chat:

In [None]:
model = genai.GenerativeModel("gemini-1.5-flash")

messages = [
    {
        "role": "user",
        "parts": ["Briefly explain how a computer works to a young child."],
    }
]
response = model.generate_content(messages)

to_markdown(response.text)

> Imagine a computer as a really smart friend who can help you with many things. Just like you have a brain to think and learn, a computer has a brain too, called a processor. It's like the boss of the computer, telling it what to do.
> 
> Inside the computer, there's a special place called memory, which is like a big storage box. It remembers all the things you tell it to do, like opening games or playing videos.
> 
> When you press buttons on the keyboard or click things on the screen with the mouse, you're sending messages to the computer. These messages travel through special wires, called cables, to the processor.
> 
> The processor reads the messages and tells the computer what to do. It can open programs, show you pictures, or even play music for you.
> 
> All the things you see on the screen are created by the graphics card, which is like a magic artist inside the computer. It takes the processor's instructions and turns them into colorful pictures and videos.
> 
> To save your favorite games, videos, or pictures, the computer uses a special storage space called a hard drive. It's like a giant library where the computer can keep all your precious things safe.
> 
> And when you want to connect to the internet to play games with friends or watch funny videos, the computer uses something called a network card to send and receive messages through the internet cables or Wi-Fi signals.
> 
> So, just like your brain helps you learn and play, the computer's processor, memory, graphics card, hard drive, and network card all work together to make your computer a super-smart friend that can help you do amazing things!

To continue the conversation, add the response and another message.

Note: For multi-turn conversations, you need to send the whole conversation history with each request. The API is **stateless**.

In [None]:
messages.append({"role": "model", "parts": [response.text]})

messages.append(
    {
        "role": "user",
        "parts": [
            "Okay, how about a more detailed explanation to a high school student?"
        ],
    }
)

response = model.generate_content(messages)

to_markdown(response.text)

> At its core, a computer is a machine that can be programmed to carry out a set of instructions. It consists of several essential components that work together to process, store, and display information:
> 
> **1. Processor (CPU):**
>    - The brain of the computer.
>    - Executes instructions and performs calculations.
>    - Speed measured in gigahertz (GHz).
>    - More GHz generally means faster processing.
> 
> **2. Memory (RAM):**
>    - Temporary storage for data being processed.
>    - Holds instructions and data while the program is running.
>    - Measured in gigabytes (GB).
>    - More GB of RAM allows for more programs to run simultaneously.
> 
> **3. Storage (HDD/SSD):**
>    - Permanent storage for data.
>    - Stores operating system, programs, and user files.
>    - Measured in gigabytes (GB) or terabytes (TB).
>    - Hard disk drives (HDDs) are traditional, slower, and cheaper.
>    - Solid-state drives (SSDs) are newer, faster, and more expensive.
> 
> **4. Graphics Card (GPU):**
>    - Processes and displays images.
>    - Essential for gaming, video editing, and other graphics-intensive tasks.
>    - Measured in video RAM (VRAM) and clock speed.
> 
> **5. Motherboard:**
>    - Connects all the components.
>    - Provides power and communication pathways.
> 
> **6. Input/Output (I/O) Devices:**
>    - Allow the user to interact with the computer.
>    - Examples: keyboard, mouse, monitor, printer.
> 
> **7. Operating System (OS):**
>    - Software that manages the computer's resources.
>    - Provides a user interface and basic functionality.
>    - Examples: Windows, macOS, Linux.
> 
> When you run a program on your computer, the following happens:
> 
> 1. The program instructions are loaded from storage into memory.
> 2. The processor reads the instructions from memory and executes them one by one.
> 3. If the instruction involves calculations, the processor performs them using its arithmetic logic unit (ALU).
> 4. If the instruction involves data, the processor reads or writes to memory.
> 5. The results of the calculations or data manipulation are stored in memory.
> 6. If the program needs to display something on the screen, it sends the necessary data to the graphics card.
> 7. The graphics card processes the data and sends it to the monitor, which displays it.
> 
> This process continues until the program has completed its task or the user terminates it.
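
Because the API is stateless, the caller owns the conversation history. The sketch below shows one way to keep that history in the `role`/`parts` dictionary format used above. `append_turn` is a hypothetical helper, and `fake_generate` is a stand-in for `model.generate_content(messages)` so that the example runs without an API key.

```python
VALID_ROLES = ("user", "model")


def append_turn(messages, role, text):
    """Append one turn to `messages` in the role/parts dictionary format."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {VALID_ROLES}, got {role!r}")
    messages.append({"role": role, "parts": [text]})
    return messages


def fake_generate(messages):
    """Hypothetical stand-in for `model.generate_content(messages).text`."""
    return f"(reply after {len(messages)} message(s) of history)"


messages = []
prompts = [
    "Briefly explain how a computer works to a young child.",
    "Okay, how about a more detailed explanation to a high school student?",
]

for prompt in prompts:
    append_turn(messages, "user", prompt)
    # The whole history is sent with every request.
    reply = fake_generate(messages)
    append_turn(messages, "model", reply)

for message in messages:
    print(message["role"], "->", message["parts"][0])
```

Replacing `fake_generate(messages)` with `model.generate_content(messages).text` would turn the loop into a real multi-turn exchange.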

### Generation configuration

The `generation_config` argument allows you to modify the generation parameters. Every prompt you send to the model includes parameter values that control how the model generates responses.

In [None]:
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Tell me a story about a magic backpack.",
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=["x"],
        max_output_tokens=20,
        temperature=1.0,
    ),
)

In [None]:
text = response.text

if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += "..."

to_markdown(text)

> Once upon a time, in a small town nestled amidst lush green hills, lived a young girl named...
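
To build intuition for `stop_sequences`, the following local sketch imitates its effect on a finished string. It assumes that generation stops at the first occurrence of any stop sequence and that the sequence itself is left out of the output. `truncate_at_stop` is a hypothetical helper for illustration only; the real truncation happens on the server during generation.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # Ignore empty stop sequences.
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    # The stop sequence itself is not included in the result.
    return text[:cut]


print(truncate_at_stop("The fox jumped over the box.", ["x"]))
print(truncate_at_stop("No stop sequence appears here.", ["x"]))
```

This is consistent with the output above: the story contains no `x`, so generation ended at the `max_output_tokens` limit instead.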

## What's next

-   Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well-structured prompts is an essential part of ensuring accurate, high-quality responses from a language model. Learn about best practices for [prompt writing](https://ai.google.dev/docs/prompt_best_practices).
-   Gemini offers several model variations to meet the needs of different use cases, such as input types and complexity, implementations for chat or other dialog language tasks, and size constraints. Learn about the available [Gemini models](https://ai.google.dev/models/gemini).
-   Gemini offers options for requesting [rate limit increases](https://ai.google.dev/docs/increase_quota). The rate limit for Gemini-Pro models is 60 requests per minute (RPM).