# Getting Started with the Gemini API in Vertex AI with cURL / REST API

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_curl.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Run in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fgetting-started%2Fintro_gemini_curl.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Run in Colab Enterprise
    </a>
  </td>       
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_curl.ipynb">
      <img width="32px" src="https://www.svgrepo.com/download/217753/github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/getting-started/intro_gemini_curl.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
   <td style="text-align: center">
    <a href="https://goo.gle/4jeQxSk">
      <img width="32px" src="https://cdn.qwiklabs.com/assets/gcp_cloud-e3a77215f0b8bfa9b3f611c0d2208c7e8708ed31.svg" alt="Google Cloud logo"><br> Open in Skills
    </a>
  </td>
</table>

<div style="clear: both;"></div>

| Authors |
| --- |
| [Eric Dong](https://github.com/gericdong) |
| [Polong Lin](https://github.com/polong-lin) |

## Overview

**YouTube Video: Introduction to Gemini on Vertex AI**

<a href="https://www.youtube.com/watch?v=YfiLUpNejpE&list=PLIivdWyY5sqJio2yeg1dlfILOUO2FoFRx" target="_blank">
  <img src="https://img.youtube.com/vi/YfiLUpNejpE/maxresdefault.jpg" alt="Introduction to Gemini on Vertex AI" width="500">
</a>

In this tutorial, you learn how to use the Vertex AI REST API with cURL commands to interact with the Gemini 2.5 Flash model.

You will complete the following tasks:

- Text generation
- Streaming text generation
- Chat
- Function calling
- Multimodal input
- Controlled generation
- Search as a tool
- Code execution

### Costs

This tutorial uses billable components of Google Cloud:

- Vertex AI

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.

## Getting Started

### Install the required tools

In [1]:
%%capture

!sudo apt-get install -y -q jq

### Set up the Google Cloud project

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [2]:
import os
from google import genai

PROJECT_ID = "qwiklabs-gcp-03-c5def5360c93"
LOCATION = "global"
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

## Use the Gemini 2.5 Flash model

In [3]:
MODEL_ID = "gemini-2.5-flash"

api_host = "aiplatform.googleapis.com"
if LOCATION != "global":
    api_host = f"{LOCATION}-aiplatform.googleapis.com"

os.environ["API_ENDPOINT"] = (
    f"{api_host}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}"
)
API_ENDPOINT = os.environ["API_ENDPOINT"]

## Text generation

The `generateContent` method can handle a wide variety of use cases, including multi-turn chat and multimodal input, depending on what the underlying model supports. In this example, you send a text prompt and ask the model to respond in text.

In [4]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Why is the sky blue?" }
    },
    "generation_config": {
      "response_modalities": ["TEXT"]
    }
  }' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json

The sky is blue primarily due to a phenomenon called **Rayleigh Scattering**. Here's a breakdown of how it works:

1.  **Sunlight is White Light:** Sunlight, which appears white to us, is actually made up of all the colors of the rainbow (red, orange, yellow, green, blue, indigo, violet). Each of these colors has a different wavelength.
    *   **Shorter Wavelengths:** Blue and violet light have shorter, smaller wavelengths.
    *   **Longer Wavelengths:** Red and orange light have longer, larger wavelengths.

2.  **Earth's Atmosphere:** Our atmosphere is composed mainly of tiny nitrogen (N₂) and oxygen (O₂) molecules. These molecules are much smaller than the wavelengths of visible light.

3.  **Rayleigh Scattering:** When sunlight enters the Earth's atmosphere, it interacts with these tiny air molecules.
    *   **Preferential Scattering:** Because the air molecules are so small, they are much more effective at scattering shorter-wavelength light (blue and violet) than longer-wavelen
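
The same request can also be assembled in Python so that the JSON body is always strictly valid. The sketch below is an illustration, not part of the original notebook: `build_text_request` and `extract_text` are hypothetical helper names, and the sample response is a hand-written stand-in that only contains the fields the `jq` filter above reads.

```python
import json


def build_text_request(prompt: str) -> dict:
    """Build a generateContent body with one user turn (hypothetical helper)."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generation_config": {"response_modalities": ["TEXT"]},
    }


def extract_text(response: dict) -> str:
    """Concatenate the text parts that the jq filter would print."""
    return "".join(
        part.get("text", "")
        for candidate in response.get("candidates", [])
        for part in candidate.get("content", {}).get("parts", [])
    )


# json.dumps never emits trailing commas, so the payload is strict JSON.
payload = json.dumps(build_text_request("Why is the sky blue?"))

# Hand-written stand-in for response.json; a real response has more fields.
sample_response = {
    "candidates": [
        {"content": {"role": "model", "parts": [{"text": "Rayleigh scattering."}]}}
    ]
}
print(extract_text(sample_response))
```

Building the body as a dictionary and serializing it at the end keeps quoting and comma mistakes out of the request.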

### Streaming

The Gemini API provides a streaming response mechanism. With this approach, you don't need to wait for the complete response; you can start processing fragments as soon as they are available.

In [5]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:streamGenerateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Why is the sky blue?" }
    }
  }' 2>/dev/null >response.json

jq -r ".[] | .candidates[] | .content.parts[].text" response.json

The sky appears blue due to a phenomenon called **Rayleigh scattering**, which describes
 how light interacts with particles much smaller than its wavelength. It's all about how sunlight interacts with Earth's atmosphere.

Here's a breakdown:

1.  **Sunlight is White Light:** Sunlight, which appears white to us, is actually made up of all the colors of the rainbow (red
, orange, yellow, green, blue, indigo, violet). Each of these colors has a different **wavelength**.
    *   **Blue and violet light** have shorter, smaller wavelengths.
    *   **Red and orange light** have longer, larger wavelengths.

2.  **Earth's Atmosphere:** Our atmosphere is made
 mostly of tiny nitrogen (N2) and oxygen (O2) molecules. These molecules are much smaller than the wavelengths of visible light.

3.  **Rayleigh Scattering:** When sunlight enters the atmosphere, these tiny air molecules scatter the light. However, they don't scatter all colors equally:
    *   **S
horter wavelengths (like blue and violet
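
The `jq` filter above iterates over a JSON array, and each element carries one fragment of the answer, which is why the printed output breaks in the middle of words. As a rough sketch (the helper name `join_stream_text` and the sample chunks are illustrative assumptions, not part of the notebook), the fragments can be stitched back together like this:

```python
from typing import Iterable


def join_stream_text(chunks: Iterable[dict]) -> str:
    """Concatenate text fragments from streamed chunks in arrival order."""
    pieces = []
    for chunk in chunks:
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
    return "".join(pieces)


# Hand-written stand-in for the array saved in response.json.
sample_chunks = [
    {"candidates": [{"content": {"parts": [{"text": "The sky appears blue due to "}]}}]},
    {"candidates": [{"content": {"parts": [{"text": "Rayleigh scattering."}]}}]},
]
print(join_stream_text(sample_chunks))
```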

### Model parameters

Every prompt you send to the model includes parameter values that control how the model generates a response. The model can generate different results for different parameter values. You can experiment with different model parameters to see how the results change.

In [6]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {"text": "Tell me a story."}
      ]
    },
    "generation_config": {
      "temperature": 0.2,
      "top_p": 0.1,
      "top_k": 16,
      "max_output_tokens": 2048,
      "candidate_count": 1,
      "stop_sequences": []
    },
    "safety_settings": {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_LOW_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json

In a kingdom nestled between towering, snow-capped peaks and a sea that whispered ancient secrets, lived an Emperor who loved beauty above all else. His palace was a marvel of jade and gold, his gardens bloomed with flowers from every corner of the world, and his halls echoed with the finest music. Yet, his greatest treasure was not a jewel or a painting, but a small, exquisite clockwork nightingale.

Crafted by the most ingenious artisan in the land, this bird was a masterpiece of polished brass, tiny gears, and sapphire eyes. When wound, it would perch on a velvet cushion and sing. Its song was perfect, a cascade of notes so precise, so flawless, that it brought tears to the Emperor's eyes. Every trill, every warble, was exactly the same, day after day, year after year. The Emperor would sit for hours, listening, convinced he possessed the most beautiful sound in existence.

One day, a traveler from a distant land arrived at the palace, bearing tales of wonders. Among them, he spoke 
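
To build intuition for `temperature`, `top_k`, and `top_p`, here is a toy sketch of how such parameters are commonly applied to a next-token distribution. It is a simplified illustration under stated assumptions (a made-up four-token vocabulary, top-k filtering followed by top-p filtering), not Gemini's actual decoding code.

```python
import math


def apply_temperature(logits: dict, temperature: float) -> dict:
    """Softmax over logits / temperature; lower values sharpen the distribution."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def filter_top_k_top_p(probs: dict, top_k: int, top_p: float) -> dict:
    """Keep the top_k most likely tokens, then the shortest prefix whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Made-up logits for a four-token vocabulary (assumption for illustration).
logits = {"once": 2.0, "in": 1.0, "the": 0.5, "long": 0.1}
sharp = apply_temperature(logits, temperature=0.2)
flat = apply_temperature(logits, temperature=1.0)

# With top_p=0.1 only the single most likely token survives the filter.
print(filter_top_k_top_p(flat, top_k=16, top_p=0.1))
```

In this toy model, a low `top_p` such as the 0.1 used in the request above leaves very few candidate tokens, which would make the output nearly deterministic regardless of `top_k`.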

### Chat

The Gemini API supports natural multi-turn conversations and is well suited to text tasks that require back-and-forth interactions.

Specify the `role` field only if the content represents a turn in a conversation. You can set `role` to one of the following values: `user` or `model`.

In [16]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          { "text": "Hello" }
        ]
      },
      {
        "role": "model",
        "parts": [
          { "text": "Hello! I am glad you could both make it." }
        ]
      },
      {
        "role": "user",
        "parts": [
          { "text": "So what is the first order of business?" }
        ]
      }
    ]
  }' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json

That's a great question! Since we're just getting started and haven't established a specific topic or goal yet, I'm ready to follow your lead.

What would you like to discuss or accomplish first? What brought you here today? Just let me know, and we can dive right in!
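
As the request above shows, the full conversation is sent with each call, so the client has to keep the history itself. A minimal sketch of that bookkeeping, using hypothetical helper names (`add_turn`, `build_chat_request`) that are not part of the API:

```python
ALLOWED_ROLES = {"user", "model"}


def add_turn(history: list, role: str, text: str) -> list:
    """Return a new history with one more turn; reject roles the tutorial does not list."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}, got {role!r}")
    return history + [{"role": role, "parts": [{"text": text}]}]


def build_chat_request(history: list) -> dict:
    """Wrap the accumulated turns in a generateContent body."""
    return {"contents": history}


history = []
history = add_turn(history, "user", "Hello")
history = add_turn(history, "model", "Hello! I am glad you could both make it.")
history = add_turn(history, "user", "So what is the first order of business?")
request_body = build_chat_request(history)
print(len(request_body["contents"]))
```

After each model reply, you would append it as a `model` turn before adding the next `user` turn.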


### Function calling

Function calling lets you create a description of a function in your code and then pass that description to a language model in a request. This example passes descriptions of functions that return information about where a movie is playing. The request includes several function declarations, such as `find_movies` and `find_theaters`.

Learn more about [function calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling).

In [17]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
  "contents": {
    "role": "user",
    "parts": {
      "text": "Which theaters in Mountain View show Barbie movie?"
    }
  },
  "tools": [
    {
      "function_declarations": [
        {
          "name": "find_movies",
          "description": "find movie titles currently playing in theaters based on any description, genre, title words, etc.",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA or a zip code e.g. 95616"
              },
              "description": {
                "type": "string",
                "description": "Any kind of description including category or genre, title words, attributes, etc."
              }
            },
            "required": [
              "description"
            ]
          }
        },
        {
          "name": "find_theaters",
          "description": "find theaters based on location and optionally a movie title that is currently playing in theaters",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA or a zip code e.g. 95616"
              },
              "movie": {
                "type": "string",
                "description": "Any movie title"
              }
            },
            "required": [
              "location"
            ]
          }
        },
        {
          "name": "get_showtimes",
          "description": "Find the start times for movies playing in a specific theater",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA or a zip code e.g. 95616"
              },
              "movie": {
                "type": "string",
                "description": "Any movie title"
              },
              "theater": {
                "type": "string",
                "description": "Name of theater"
              },
              "date": {
                "type": "string",
                "description": "Date for requested showtime"
              }
            },
            "required": [
              "location",
              "movie",
              "theater",
              "date"
            ]
          }
        }
      ]
    }
  ]
}' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].functionCall" response.json

{
  "name": "find_theaters",
  "args": {
    "movie": "Barbie",
    "location": "Mountain View"
  }
}
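
The model does not run `find_theaters` itself; it only returns the function name and arguments, and the calling code is expected to execute the function. The sketch below shows one way to dispatch such a call. The `find_theaters` stub and its return value are invented for illustration, and the shape of the `functionResponse` part is an assumption to verify against the function-calling documentation linked above.

```python
from typing import Optional


def find_theaters(location: str, movie: Optional[str] = None) -> dict:
    """Hypothetical stub; a real implementation would query a listings service."""
    return {"location": location, "movie": movie, "theaters": ["Example Cinema"]}


# Map declared function names to local callables.
REGISTRY = {"find_theaters": find_theaters}


def dispatch(function_call: dict) -> dict:
    """Run the function named by the model and wrap the result as a response part."""
    name = function_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"model requested an undeclared function: {name}")
    result = REGISTRY[name](**function_call.get("args", {}))
    return {"functionResponse": {"name": name, "response": result}}


# The functionCall shown in the output above.
call = {"name": "find_theaters", "args": {"movie": "Barbie", "location": "Mountain View"}}
print(dispatch(call))
```

The returned part would then be sent back to the model in a follow-up request so it can phrase a final answer.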


## Multimodal input

Gemini is a multimodal model that supports adding images and videos to text or chat prompts to get a text response.

### Download an image from Google Cloud Storage

In [18]:
! gsutil cp "gs://cloud-samples-data/generative-ai/image/320px-Felis_catus-cat_on_snow.jpg" ./image.jpg

Copying gs://cloud-samples-data/generative-ai/image/320px-Felis_catus-cat_on_snow.jpg...
/ [1 files][ 17.4 KiB/ 17.4 KiB]                                                
Operation completed over 1 objects/17.4 KiB.                                     


### Generate text from a local image

To include an image or video in the prompt, specify its [base64](https://en.wikipedia.org/wiki/Base64) encoding and the `mime_type` field. Supported MIME types for images include `image/png` and `image/jpeg`.

In [19]:
%%bash

# Encode image data in base64
image_file="image.jpg"
if [[ -f "$image_file" ]]; then
  if command -v base64 &> /dev/null; then
    # base64 is available
    if [[ "$(uname -s)" == "Darwin" ]]; then
      # macOS -b 0 to avoid line wrapping
      data=$(base64 -b 0 -i "$image_file")
    else
      # Linux -w 0 to avoid line wrapping
      data=$(base64 -w 0 "$image_file")
    fi
  else
    echo "Error: base64 command not found."
    exit 1
  fi
else
  echo "Error: Image file '$image_file' not found."
  exit 1
fi

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d @- 2>/dev/null >response.json <<EOF
{
  "contents": {
    "role": "USER",
    "parts": [
      { "text": "Is it a cat?" },
      {
        "inline_data": {
          "data": "${data}",
          "mime_type": "image/jpeg"
        }
      }
    ]
  }
}
EOF

jq -r ".candidates[].content.parts[].text" response.json

Yes, that is a cat. It's a tabby cat, likely a domestic shorthair, walking in the snow.
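
For comparison, the base64 step can also be done in Python, which avoids the macOS and Linux flag differences handled in the shell script. This is a sketch with a hypothetical helper name (`inline_image_part`); the bytes used in the example are a placeholder, not a real image.

```python
import base64
import json


def inline_image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Encode raw bytes as an inline_data part; b64encode does not insert line breaks."""
    return {
        "inline_data": {
            "data": base64.b64encode(image_bytes).decode("ascii"),
            "mime_type": mime_type,
        }
    }


# Placeholder bytes; in the notebook you would use open("image.jpg", "rb").read().
fake_image = b"\xff\xd8\xff\xe0 not a real JPEG"
body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Is it a cat?"}, inline_image_part(fake_image)]}
    ]
}
payload = json.dumps(body)
```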


### Generate text from an image in Google Cloud Storage

Specify the Cloud Storage URI of the image to include it in the prompt. The bucket that stores the file must be in the same Google Cloud project that sends the request. You must also specify the `mime_type` field. Supported MIME types for images include `image/png` and `image/jpeg`.

In [20]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "text": "Describe this image"
        },
        {
          "file_data": {
            "mime_type": "image/jpeg",
            "file_uri": "gs://cloud-samples-data/generative-ai/image/320px-Felis_catus-cat_on_snow.jpg"
          }
        }
      ]
    },
    "generation_config": {
      "temperature": 0.2,
      "top_p": 0.1,
      "top_k": 16,
      "max_output_tokens": 2048,
      "candidate_count": 1,
      "stop_sequences": []
    },
    "safety_settings": {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_LOW_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json

This image features a **domestic tabby cat** standing in a **snowy landscape**.

Here's a detailed description:

*   **Subject:** A medium-sized cat with a classic **brown and black striped tabby coat**. Its fur appears thick and well-suited for cold weather.
*   **Eyes:** The cat has striking **amber or yellowish-green eyes** that are looking directly at the viewer, giving it an alert and curious expression.
*   **Pose:** It is standing with its body slightly turned to the right, but its head is swiveled to face forward. One of its front paws (the left one from its perspective) is delicately lifted off the snow, suggesting it has just paused or is about to take another step. Its tail is striped and held somewhat low.
*   **Setting:** The ground is covered in a thick, pristine layer of **white snow**. The snow appears soft and powdery.
*   **Background:** The background is also entirely snow, but it is softly blurred, creating a shallow depth of field that keeps the cat sharply in focu

### Generate text from a video file

Specify the Cloud Storage URI of the video to include it in the prompt. The bucket that stores the file must be in the same Google Cloud project that sends the request. You must also specify the `mime_type` field. Supported MIME types for videos include `video/mp4`.

In [21]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d \
'{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "text": "Answer the following questions using the video only. What is the profession of the main person? What are the main features of the phone highlighted? Which city was this recorded in?"
        },
        {
          "file_data": {
            "mime_type": "video/mp4",
            "file_uri": "gs://github-repo/img/gemini/multimodality_usecases_overview/pixel8.mp4"
          }
        }
      ]
    }
  }' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json

Based on the video:

*   The profession of the main person is a **photographer**.
*   The main features of the phone highlighted are **Video Boost** and **Night Sight** (which activates in low light to improve video quality).
*   This was recorded in **Tokyo**.
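
Since `mime_type` has to match the actual file, one option is to derive it from the file extension instead of typing it by hand. A small sketch using Python's standard `mimetypes` module; `file_data_part` is a hypothetical helper, not part of the API:

```python
import mimetypes


def file_data_part(gcs_uri: str) -> dict:
    """Build a file_data part, guessing the MIME type from the object name."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError(f"expected a gs:// URI, got {gcs_uri!r}")
    object_name = gcs_uri.rsplit("/", 1)[-1]
    mime_type, _ = mimetypes.guess_type(object_name)
    if mime_type is None:
        raise ValueError(f"cannot infer a MIME type for {object_name!r}")
    return {"file_data": {"mime_type": mime_type, "file_uri": gcs_uri}}


image_part = file_data_part(
    "gs://cloud-samples-data/generative-ai/image/320px-Felis_catus-cat_on_snow.jpg"
)
video_part = file_data_part(
    "gs://github-repo/img/gemini/multimodality_usecases_overview/pixel8.mp4"
)
print(image_part["file_data"]["mime_type"], video_part["file_data"]["mime_type"])
```

Guessing from the extension is only a convenience; it does not inspect the file contents, so a misnamed object would still get the wrong type.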


### Controlled generation

Controlled generation lets you define a response schema that specifies the structure of the model's output, the field names, and the expected data type for each field.

In [13]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "user",
      "parts": {
        "text": "List a few popular cookie recipes."
      }
    },
    "generationConfig": {
        "response_mime_type": "application/json",
        "response_schema": {"type": "object", "properties": {"recipe_name": {"type": "string"}}}
    }
}' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json

{
  "recipe_name": "Chocolate Chip Cookies, Oatmeal Raisin Cookies, Sugar Cookies, Peanut Butter Cookies"
}
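
With `response_mime_type` set to `application/json`, the text part is itself a JSON document, so it can be parsed and checked before use. A minimal sketch, assuming a hand-rolled check (`check_against_schema` is hypothetical and covers only flat objects with string fields, not the full schema language):

```python
import json

SCHEMA = {"type": "object", "properties": {"recipe_name": {"type": "string"}}}


def check_against_schema(text: str, schema: dict) -> dict:
    """Parse model output and verify that declared string properties hold strings."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object")
    for name, spec in schema.get("properties", {}).items():
        if name in data and spec.get("type") == "string" and not isinstance(data[name], str):
            raise TypeError(f"field {name!r} should be a string")
    return data


# The text part printed above.
model_text = (
    '{"recipe_name": "Chocolate Chip Cookies, Oatmeal Raisin Cookies, '
    'Sugar Cookies, Peanut Butter Cookies"}'
)
recipes = check_against_schema(model_text, SCHEMA)
print(recipes["recipe_name"].split(", "))
```

Note that the schema in the request asks for a single string, which is why the model packs every recipe into one comma-separated value.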


## Search as a tool

Grounding with Google Search can improve the accuracy and recency of the model's responses. In Gemini 2.5 Flash, Google Search is available as a tool, which means the model can decide when to use it.

In [14]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          { "text": "What is the weather today in San Jose CA?" }
        ]
      }
    ],
    "tools": [
      { "google_search": {} }
    ],
    "generationConfig": {
      "response_modalities": ["TEXT"]
    }
}' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[].text" response.json
jq -r ".candidates[].groundingMetadata.groundingChunks" response.json

The weather in San Jose, CA today, Saturday, February 21, 2026, is currently partly sunny with a temperature of 56°F (13°C), feeling like 53°F (11°C). The humidity is around 57%, and there is a 3% chance of rain.

For the rest of the day, the forecast indicates partly sunny conditions during the day, transitioning to cloudy at night, with a 10% chance of rain. The temperature is expected to range between 39°F (4°C) and 57°F (14°C), with humidity around 64%.
[
  {
    "web": {
      "uri": "https://www.google.com/search?q=weather+in+San Jose, CA,+US",
      "title": "Weather information for San Jose, CA, US",
      "domain": "google.com"
    }
  }
]
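
The `groundingChunks` array is what you would use to show sources next to an answer. A short sketch, with a hypothetical helper name (`list_sources`) and sample data copied from the output above:

```python
def list_sources(response: dict) -> list:
    """Return (title, uri) pairs for every web grounding chunk in a response."""
    sources = []
    for candidate in response.get("candidates", []):
        chunks = candidate.get("groundingMetadata", {}).get("groundingChunks", [])
        for chunk in chunks:
            web = chunk.get("web")
            if web:
                sources.append((web.get("title", ""), web.get("uri", "")))
    return sources


# Trimmed stand-in for response.json, using the chunk printed above.
sample_response = {
    "candidates": [
        {
            "groundingMetadata": {
                "groundingChunks": [
                    {
                        "web": {
                            "uri": "https://www.google.com/search?q=weather+in+San Jose, CA,+US",
                            "title": "Weather information for San Jose, CA, US",
                            "domain": "google.com",
                        }
                    }
                ]
            }
        }
    ]
}
for title, uri in list_sources(sample_response):
    print(f"{title}: {uri}")
```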


### Code execution

The Gemini API code execution feature lets the model generate and run Python code and learn iteratively from the results until it arrives at a final output.
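
A response produced with code execution interleaves three kinds of parts: plain `text`, the generated `executableCode`, and the `codeExecutionResult`. The sketch below labels each part so they can be handled separately; `describe_parts` is a hypothetical helper and the sample parts are abbreviated stand-ins, not real API output.

```python
def describe_parts(parts: list) -> list:
    """Summarize each part of a code-execution response as a one-line label."""
    labels = []
    for part in parts:
        if "executableCode" in part:
            language = part["executableCode"].get("language", "UNKNOWN")
            labels.append(f"code ({language})")
        elif "codeExecutionResult" in part:
            outcome = part["codeExecutionResult"].get("outcome", "UNKNOWN")
            labels.append(f"result ({outcome})")
        elif "text" in part:
            labels.append("text")
        else:
            labels.append("other")
    return labels


# Abbreviated stand-ins for the parts the API returns.
sample_parts = [
    {"text": "I'll use a Python function to calculate the Fibonacci sequence."},
    {"executableCode": {"language": "PYTHON", "code": "print(6765)"}},
    {"codeExecutionResult": {"outcome": "OUTCOME_OK", "output": "6765\n"}},
]
print(describe_parts(sample_parts))
```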

In [15]:
%%bash

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
  "contents": {
    "role": "user",
    "parts": {
      "text": "Calculate 20th fibonacci number. Then find the nearest palindrome to it."
    }
  },
  "tools": [
      {"code_execution": {}}
  ]
}' 2>/dev/null >response.json

jq -r ".candidates[].content.parts[]" response.json

{
  "text": "Let's break this down into two parts:\n\n### 1. Calculate the 20th Fibonacci Number\n\nI'll use a Python function to calculate the Fibonacci sequence.\n\n"
}
{
  "executableCode": {
    "language": "PYTHON",
    "code": "def fibonacci(n):\n    if n <= 0:\n        return 0\n    elif n == 1:\n        return 1\n    else:\n        a, b = 0, 1\n        for _ in range(2, n + 1):\n            a, b = b, a + b\n        return b\n\nfib_20 = fibonacci(20)\nprint(f\"The 20th Fibonacci number is: {fib_20}\")"
  }
}
{
  "codeExecutionResult": {
    "outcome": "OUTCOME_OK",
    "output": "The 20th Fibonacci number is: 6765\n"
  }
}
{
  "text": "The 20th Fibonacci number is **6765**.\n\n### 2. Find the Nearest Palindrome to 6765\n\nNow, I need to find the nearest palindrome to 6765. I'll define a function to check if a number is a palindrome and then search outwards from 6765.\n\n"
}
{
  "executableCode": {
    "language": "PYTHON",
    "code": "def is_palindrome(n):\n    return str(n) ==