## Getting Started

### Install Vertex AI SDK for Python


In [1]:
%pip install --upgrade --user --quiet google-cloud-aiplatform

### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. The cell below restarts the current kernel when the notebook is running in Google Colab; in other environments, restart the kernel manually.

In [2]:
import sys

if "google.colab" in sys.modules:
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️</b>
</div>
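
Once the kernel is back, it can help to confirm that the package is actually visible to the new runtime. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the Vertex AI SDK.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("google-cloud-aiplatform"))
```

If this prints `None` after a restart, the install cell most likely ran in a different environment than the current kernel.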


### Authenticate your notebook environment (Colab only)

If you are running this notebook on Google Colab, run the cell below to authenticate your environment.


In [1]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()



### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [2]:
# Use the environment variable if the user doesn't provide Project ID.
import os

import vertexai

PROJECT_ID = "[your-project-id]"  # @param {type: "string", placeholder: "[your-project-id]" isTemplate: true}
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    # Leave PROJECT_ID as None when the variable is unset, instead of the string "None".
    PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT")

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

vertexai.init(project=PROJECT_ID, location=LOCATION)
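
The fallback logic in the cell above can also be written as a small function that fails early when neither the cell parameter nor the environment provides a project ID. This is a sketch under that assumption; `resolve_project_id` and `PLACEHOLDER` are hypothetical names introduced here for illustration.

```python
import os

PLACEHOLDER = "[your-project-id]"


def resolve_project_id(explicit, env=None):
    """Pick the project ID: the explicit value first, then GOOGLE_CLOUD_PROJECT."""
    env = os.environ if env is None else env
    if explicit and explicit != PLACEHOLDER:
        return explicit
    from_env = env.get("GOOGLE_CLOUD_PROJECT")
    if not from_env:
        raise ValueError(
            "Set PROJECT_ID in the cell or export GOOGLE_CLOUD_PROJECT."
        )
    return from_env
```

The result could then be passed as `vertexai.init(project=resolve_project_id(PROJECT_ID), location=LOCATION)`, so a missing project surfaces before the first model call.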

### Import libraries


In [3]:
import IPython.display
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
    Part,
)

### Load the Gemini 1.5 Flash model

Learn more about all [Gemini API models on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models).


In [4]:
MODEL_ID = "gemini-1.5-flash-002"  # @param {type:"string"}

model = GenerativeModel(MODEL_ID)

## Audio understanding

Gemini 1.5 Flash can directly process audio for long-context understanding.


In [8]:
# Define bucket name and file path
bucket_name = "bucket_for_ai"  # Replace with your actual bucket name
file_name = "sample_audio.mp3"

# GCS URI and Public URL
audio_file_uri = f"gs://{bucket_name}/{file_name}"
audio_file_url = f"https://storage.googleapis.com/{bucket_name}/{file_name}"

IPython.display.Audio(audio_file_url)
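
The two strings built above point at the same object: the `gs://` form is what gets passed to the model, while the `https://storage.googleapis.com/...` form is used for playback and only works unauthenticated if the object is publicly readable. Below is a minimal sketch of a helper that builds both forms; the name `gcs_locations` is hypothetical, and percent-encoding the object name in the HTTPS form is an addition not present in the original cell.

```python
from urllib.parse import quote


def gcs_locations(bucket, object_name):
    """Return (gs_uri, https_url) for one Cloud Storage object."""
    gs_uri = f"gs://{bucket}/{object_name}"
    # Percent-encode the object name for the HTTPS form; "/" is kept so
    # nested object names keep their path-like appearance.
    encoded = quote(object_name, safe="/")
    https_url = f"https://storage.googleapis.com/{bucket}/{encoded}"
    return gs_uri, https_url


uri, url = gcs_locations("bucket_for_ai", "sample_audio.mp3")
print(uri)
print(url)
```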

### Transcription

In [9]:
prompt = """
    Can you transcribe the audio content?
"""

audio_file = Part.from_uri(audio_file_uri, mime_type="audio/mpeg")
contents = [audio_file, prompt]

responses = model.generate_content(contents, stream=True)

# Print chunks as they arrive without adding a line break between them
for response in responses:
    print(response.text, end="")

The stale smell of old beer lingers. It takes heat to bring out the odor. A cold dip restores health and zest. A salt pickle tastes fine with ham. Tacos al pastor are my favorite. A zestful food is the hot cross bun.
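
When the transcript is needed as a single string rather than as printed output, the streamed chunks can be collected first. This is a minimal sketch that assumes each chunk exposes a `.text` attribute, as in the loop above; `collect_text` is a hypothetical helper, and the stand-in chunks exist only so the sketch runs without calling the model.

```python
from types import SimpleNamespace


def collect_text(chunks):
    """Join the .text of each streamed chunk into one whitespace-normalized string."""
    pieces = [chunk.text for chunk in chunks]
    return " ".join("".join(pieces).split())


# Stand-in chunks so the sketch runs without calling the model.
fake_stream = [SimpleNamespace(text="The"), SimpleNamespace(text=" stale smell\n")]
print(collect_text(fake_stream))
```

With a real response this might be used as `transcript = collect_text(model.generate_content(contents, stream=True))`.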



## All modalities (images, video, audio, text) at once

Gemini 1.5 Flash is natively multimodal and supports interleaving data from different modalities: a single input sequence can mix audio, visual, text, and code inputs.

In [None]:

# Define bucket name and file path
image_file_name = "kids_playing_soccer.png"  # Change with your own file name

# GCS URI and Public URL
image_file_uri = f"gs://{bucket_name}/{image_file_name}"
image_file_url = f"https://storage.googleapis.com/{bucket_name}/{image_file_name}"

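# Call display() explicitly: a bare expression renders only when it is the last line of a cell
IPython.display.display(IPython.display.Image(image_file_url, width=450))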
image_file = Part.from_uri(image_file_uri, mime_type="image/png")

# Adjust your prompt as needed for your image
prompt = """
  How many kids are in the image? How many soccer balls? How many agility soccer cones?
"""

contents = [image_file, prompt]

response = model.generate_content(contents)
print(response.text)
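
The cell above sends a single image with a prompt, but the same `contents` list can interleave several files. The sketch below guesses each MIME type from the file name with the standard-library `mimetypes` module and imports `Part` only when the list is actually built. The helper names `media_specs` and `build_contents` are hypothetical, and it is an assumption that both files live in the same bucket.

```python
import mimetypes


def media_specs(bucket, file_names):
    """Return (gs_uri, mime_type) pairs, guessing each MIME type from the file name."""
    specs = []
    for name in file_names:
        mime_type, _ = mimetypes.guess_type(name)
        if mime_type is None:
            raise ValueError(f"Cannot guess a MIME type for {name!r}")
        specs.append((f"gs://{bucket}/{name}", mime_type))
    return specs


def build_contents(specs, prompt):
    """Turn the pairs into Part objects followed by the text prompt."""
    # Imported lazily so media_specs can be used without the SDK installed.
    from vertexai.generative_models import Part

    return [Part.from_uri(uri, mime_type=mime) for uri, mime in specs] + [prompt]


specs = media_specs("bucket_for_ai", ["kids_playing_soccer.png", "sample_audio.mp3"])
print(specs)
```

A combined request could then look like `model.generate_content(build_contents(specs, prompt))`, with the prompt adjusted to refer to both the image and the audio.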