<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>

# Lab 3.2.3
# *Google BigQuery and Gemini API*

## Introduction

The Google BigQuery UI provides access to Google's extensive collection of public data sets via an SQL-based query engine.

The BigQuery API provides programmatic access to the data sets.

The Google Gemini API provides programmatic access to Google's Generative AI models.

Each of these is explored in this lab.

## BigQuery Web UI

The Google BigQuery UI can be used to discover interesting data before writing Python code to access it. We can then reproduce the query in an API request, so that large amounts of data are aggregated on Google's infrastructure before the results are pulled into our application.

Work through the Quickstart at https://cloud.google.com/bigquery/docs/quickstarts/quickstart-web-ui.

You will need to set up a Google Cloud Platform account if you don't already have one. (This should not cost anything during the trial period unless you perform a large amount of querying. Afterwards, costs are based on actual resource usage, but most offerings have a free tier.)

## BigQuery API

- Open Google Cloud Console (https://console.cloud.google.com/home/) and create a project. A project is required to enable access to Google Cloud services such as BigQuery and Gemini.

- Check that the BigQuery API is enabled in your project by visiting https://console.cloud.google.com/apis/library/bigquery.googleapis.com.

### Authentication

Create a **service account** at https://console.cloud.google.com/iam-admin/serviceaccounts/create. A service account is used by an application to access Google Cloud Platform's services and has an associated email address (different from your own).


- Give the account an appropriate name, and under step 2 (Grant this service account access to project (optional)), choose "Owner" under the "Select a Role" dropdown.

- Ignore step 3 and click "Done".

Go to https://console.cloud.google.com/iam-admin/serviceaccounts to create a **service account key**. This will be downloaded to your computer so that you can connect to the BigQuery API via this Jupyter notebook.

- Select your recently created project.
- Click the email address of the service account.
- Click the Keys tab.
- Click the Add key drop-down menu, then select Create new key.
- Select JSON as the Key type and click Create.
- The key file is saved to your computer.

Note the location of the JSON file and keep a copy of its file path somewhere safe, for future reference.
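Before using the key, it can help to confirm that the downloaded file is the one you expect. The following is a minimal sketch, assuming the standard layout of a service account key file (a JSON object with `type`, `project_id`, and `client_email` fields). The helper name `describe_key_file` is hypothetical and is not part of any Google library.

```python
import json


def describe_key_file(path):
    """Return the non-secret identifying fields of a service account key file.

    Hypothetical helper for this lab; it only reads the JSON file locally.
    """
    with open(path, "r") as f:
        info = json.load(f)
    if info.get("type") != "service_account":
        raise ValueError("{} does not look like a service account key".format(path))
    # Return only fields that are safe to display; never print `private_key`.
    return {
        "project_id": info.get("project_id"),
        "client_email": info.get("client_email"),
    }
```

Calling `describe_key_file('path/to/your-key.json')` should show the project you created and the service account's email address.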

See here for more information:

Service Account creation: https://cloud.google.com/iam/docs/service-accounts-create#creating (under Console)

Service Account key creation: https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console (under Console)


### Using the Python API

Google provides Python libraries for wrapping the Google APIs.

Installing the "google-cloud-bigquery", "google-cloud-storage", and "google-cloud-bigquery-storage" libraries should cover all the dependencies for the BigQuery section of this lab.

In [1]:
!pip install google-cloud-storage



In [2]:
!pip install google-cloud-bigquery



In [3]:
!pip install google-cloud-bigquery-storage # has additional capabilities for reading data from BigQuery using the BigQuery Storage API



In [30]:
from google.cloud import bigquery
from google.cloud import storage
from google.cloud import bigquery_storage

We will invoke a method of the `Client` class that takes the path to your key file as a string argument. In Colab, first upload the key file to the session:

In [5]:
from google.colab import files
uploaded = files.upload()


Saving my-project-iod-ff-a2efdb087076.json to my-project-iod-ff-a2efdb087076.json


In [12]:
key_path = 'my-project-iod-ff-a2efdb087076.json'          #: Change this to match your key filename

This should not throw an error if key retrieval / assignment worked:

In [13]:
storage_client = storage.Client.from_service_account_json(key_path)

*Nb. The `storage` module was used in the above example, but other modules of interest provide a `Client` class that is used in the same way, such as `bigquery`, which is used below.*

Next, execute this:

In [14]:
client = bigquery.Client.from_service_account_json(key_path)

This client is associated with the default project (which was set or defaulted in the BigQuery UI):

In [32]:
client.project

'bigquery-public-data'

A BigQuery project contains datasets. Datasets contain tables. To get at the data in a table we need to create a reference that covers this hierarchy; in the `bigquery` library this looks like `project.dataset.table`.  

(Nb. Queries can be performed on projects and datasets, but most queries are performed on tables.)
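To make the hierarchy concrete, here is a small sketch that splits a fully qualified table ID into its three parts. The helper `split_table_id` and the `TableId` tuple are hypothetical names used only for illustration, and the sketch assumes the dataset and table names contain no dots.

```python
from collections import namedtuple

# Hypothetical container for the three levels of the hierarchy.
TableId = namedtuple("TableId", ["project", "dataset", "table"])


def split_table_id(full_id):
    """Split 'project.dataset.table' into its three components."""
    # Split from the right so that only the last two dots are treated as separators.
    parts = full_id.rsplit(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected 'project.dataset.table', got {!r}".format(full_id))
    return TableId(*parts)


parts = split_table_id("bigquery-public-data.samples.shakespeare")
print(parts.project, parts.dataset, parts.table)
```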

To explore the public datasets, we start by reassigning our `client` variable using the optional `project` parameter (set to `bigquery-public-data`):

In [16]:
#project = 'bigquery-public-data'
client = bigquery.Client.from_service_account_json(key_path, project = 'bigquery-public-data')
print(client.project)

bigquery-public-data


Here is how to get a list of the datasets in the current project:

In [17]:
datasets = list(client.list_datasets())
print(datasets)

[<google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e82a10>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9c850>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9cd90>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9f950>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9f7d0>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9f8d0>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9d690>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9cf50>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9e350>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9cfd0>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9f3d0>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e9e950>, <google.cloud.bigquery.dataset.DatasetListItem object at 0x795c14e8d4d0>, <google.cloud.bigquery.dataset.Datase

That wasn't helpful. We need to go deeper into the object structure to get at something meaningful. Below is a function that prints the `dataset_id` of each dataset (using the string `format` method), providing an easy way to list datasets:

In [18]:
# function for listing datasets in a project:
def printDatasetList(client):
    project = client.project    #: only one project can be associated with a client instance
    datasets = list(client.list_datasets())
    if datasets:
        print('Datasets in project {}:'.format(project))
        for dataset in datasets:
            print('\t{}'.format(dataset.dataset_id))
        found = True
    else:
        print('{} project does not contain any datasets.'.format(project))
        found = False
    return found

In [19]:
# list datasets in the default project:
flag = printDatasetList(client)  #: assigning to `flag` suppresses printing the return value (normally `True`)

Datasets in project bigquery-public-data:
	america_health_rankings
	austin_311
	austin_bikeshare
	austin_crime
	austin_incidents
	austin_waste
	baseball
	bbc_news
	bigqueryml_ncaa
	bitcoin_blockchain
	blackhole_database
	blockchain_analytics_ethereum_mainnet_us
	bls
	bls_qcew
	breathe
	broadstreet_adi
	catalonian_mobile_coverage
	catalonian_mobile_coverage_eu
	census_bureau_acs
	census_bureau_construction
	census_bureau_international
	census_bureau_usa
	census_opportunity_atlas
	census_utility
	cfpb_complaints
	chicago_crime
	chicago_taxi_trips
	clemson_dice
	cloud_storage_geo_index
	cms_codes
	cms_medicare
	cms_synthetic_patient_data_omop
	country_codes
	covid19_aha
	covid19_covidtracking
	covid19_ecdc
	covid19_ecdc_eu
	covid19_genome_sequence
	covid19_geotab_mobility_impact
	covid19_geotab_mobility_impact_eu
	covid19_google_mobility
	covid19_google_mobility_eu
	covid19_govt_response
	covid19_italy
	covid19_italy_eu
	covid19_jhu_csse
	covid19_jhu_csse_eu
	covid19_nyt
	covid19_open_dat

This list should correspond to what is shown here https://bigquery.cloud.google.com/publicdatasets under the **bigquery-public-data** item.

Here is how to create a dataset reference object by assigning a project and a dataset name:

In [20]:
dataset_id = 'samples'
dataset_ref = client.dataset(dataset_id)

If our current project was something other than `bigquery-public-data`, we could still create this reference by specifying the project that contains the dataset:

In [21]:
dataset_id = 'samples'
dataset_ref = client.dataset(dataset_id, project = 'bigquery-public-data')

How can we get the path of the dataset?

In [22]:
#ANSWER:
dataset_ref.path

'/projects/bigquery-public-data/datasets/samples'

Explore more of this object's members:

*(HINT: You can type `dataset_ref.` in a new line, then hit the [Tab] key to see the available members for the object.)*

In [34]:
#?
dataset_ref.project

'bigquery-public-data'

Here is a function for listing the tables in a dataset:

In [24]:
# function for listing tables in a dataset:
def printTableList(client, dataset_id):
    project = client.project
    dataset_ref = client.dataset(dataset_id, project = project)
    tables = list(client.list_tables(dataset_ref))
    if tables:
        print('Tables in dataset {}:'.format(dataset_id))
        for table in tables:
            print('\t{}'.format(table.table_id))
        found = True
    else:
        print('{} dataset does not contain any tables.'.format(dataset_id))
        found = False
    return found

Use this function to list the tables in the current dataset:

In [19]:
#ANSWER
printTableList(client, dataset_id)

Tables in dataset samples:
	github_nested
	github_timeline
	gsod
	natality
	shakespeare
	trigrams
	wikipedia


True

To create a reference to a table within the dataset, we pass the table's ID to the `table` method of the dataset reference:

In [25]:
table_id = 'shakespeare'
table_ref = dataset_ref.table(table_id)

To retrieve the table object itself, we use the `get_table()` method:

In [26]:
table = client.get_table(table_ref)  # API Request

NOTE: The contents of the table are not actually in our memory after this call; `get_table` retrieves only the table's metadata (such as its schema). We are working with a Big Data platform now, and we could easily end up pulling GBs or TBs of data by accident.

To minimise data bandwidth, memory consumption, and processing time, many Big Data platforms employ ***lazy evaluation***. This means that computation or data transfer is deferred until we *realise* (use) the data. Even if we execute subsequent code that describes calculations on the data, the work is put off until we request output (e.g. by executing a print to stdout or writing to a file).
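The idea of lazy evaluation can be illustrated in plain Python with generators. This is only an analogy, not how BigQuery is implemented: the function `read_rows` is a hypothetical stand-in for a remote table, and the `log` list records which rows were actually produced.

```python
def read_rows(n, log):
    """Generator standing in for a remote table: yields one row at a time."""
    for i in range(n):
        log.append(i)  # record that row i was actually produced
        yield {"id": i, "value": i * i}


log = []
rows = read_rows(1_000_000, log)  # nothing has been produced yet
big_values = (r["value"] for r in rows if r["value"] > 10)  # still nothing

assert log == []  # no rows have been touched so far

first = next(big_values)  # rows are now pulled until one passes the filter
print(first, len(log))
```

Only the first five rows are produced to find the first matching value; the remaining rows are never generated.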

What kind of object is returned by `client.get_table`?

In [22]:
#ANSWER:
type(table)

How can we view the design of the table (column names and types)? The name of the object attribute we need is the same term we learned in the module on databases:

In [23]:
#ANSWER
print(table.schema)

[SchemaField('word', 'STRING', 'REQUIRED', None, 'A single unique word (where whitespace is the delimiter) extracted from a corpus.', (), None), SchemaField('word_count', 'INTEGER', 'REQUIRED', None, 'The number of times this word appears in this corpus.', (), None), SchemaField('corpus', 'STRING', 'REQUIRED', None, 'The work from which this word was extracted.', (), None), SchemaField('corpus_date', 'INTEGER', 'REQUIRED', None, 'The year in which this corpus was published.', (), None)]


Again, this is messy. If we wanted to refer to the column names and types in code, we might use something like this (which we could then parse into a dict):

In [27]:
result = ["{0} {1}".format(schema.name,schema.field_type) for schema in table.schema]
print(result)

['word STRING', 'word_count INTEGER', 'corpus STRING', 'corpus_date INTEGER']
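One way to build the dict mentioned above is a dict comprehension over the schema fields. The sketch below uses a `Field` namedtuple as a stand-in for the library's schema field objects (an assumption made so the example runs without a BigQuery connection); it relies only on the `name` and `field_type` attributes already used in this lab.

```python
from collections import namedtuple

# Stand-in for the schema field objects; only the two attributes used here.
Field = namedtuple("Field", ["name", "field_type"])


def schema_to_dict(schema):
    """Map each column name to its type."""
    return {field.name: field.field_type for field in schema}


example_schema = [
    Field("word", "STRING"),
    Field("word_count", "INTEGER"),
    Field("corpus", "STRING"),
    Field("corpus_date", "INTEGER"),
]
columns = schema_to_dict(example_schema)
print(columns["word_count"])
```

In the notebook, `schema_to_dict(table.schema)` should give the same mapping for the `shakespeare` table.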


But if we just want to print them, here is another neat function for that:

In [28]:
# function to print a table schema:
def printTableSchema(aTable):
    schemas = list(aTable.schema)
    if schemas:
        print('Table schema for {}:'.format(aTable.table_id))
        for aSchema in schemas:
            print('\t{0} {1}'.format(aSchema.name, aSchema.field_type))
        found = True
    else:
        found = False
    return found

Use this function to print the table schema:

In [26]:
#ANSWER:
printTableSchema(table)

Table schema for shakespeare:
	word STRING
	word_count INTEGER
	corpus STRING
	corpus_date INTEGER


True

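Now that we know what the columns are, we can write queries. We construct a query job by passing an SQL statement to the `query` method of the `client` object. The next three cells show an alternative setup for Colab, which authenticates with your Google user account instead of the service account key and uses your own project as the billing project: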

In [35]:
# Step 2: Authenticate
from google.colab import auth
auth.authenticate_user()


In [36]:
# Step 3: Import libraries and set project ID
from google.cloud import bigquery
from google.cloud import bigquery_storage

project_id = "my-project-iod-ff"  # your GCP project
client = bigquery.Client(project=project_id)
bqstorage_client = bigquery_storage.BigQueryReadClient()


In [39]:
# Step 4: Run a public dataset query using your project as the billing project
query_job = client.query("""
    SELECT name, gender, number
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    LIMIT 10
""", project=project_id)

# Step 5: Convert to DataFrame
df = query_job.to_dataframe(bqstorage_client=bqstorage_client)
df


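|   | name | gender | number |
|---|------|--------|--------|
| 0 | Willie | F | 260 |
| 1 | Flora | F | 40 |
| 2 | Stella | F | 69 |
| 3 | Esther | F | 69 |
| 4 | Erma | F | 38 |
| 5 | Katie | F | 58 |
| 6 | Ethel | F | 154 |
| 7 | Cora | F | 61 |
| 8 | Margarita | F | 47 |
| 9 | Mae | F | 63 |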


In [40]:
sql = "SELECT COUNT(1) FROM bigquery-public-data.samples.shakespeare"
query_job = client.query(sql)

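If the client's project is still `bigquery-public-data`, this will throw an error, since we don't have permission to run query jobs inside that project. Instead, we set the client's project to our own BigQuery project ID, which is then used as the billing project for the query: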

In [41]:
client = bigquery.Client.from_service_account_json(key_path, project = 'my-project-iod-ff') #<<< your BigQuery project ID here!
query_job = client.query(sql)

If that worked, show what query_job is:

In [42]:
# ANSWER
type(query_job)

The query job is submitted when `client.query` is called, but no result rows are transferred to our session until we request output:

In [43]:
for row in query_job:  # API request - fetches results
    print(row)

Row((164656,), {'f0_': 0})


And, again, we need to manipulate this to make it neat. Each row behaves like a tuple, and we only want the value, which is its first member:

In [31]:
print(row[0])

164656


So, we now know that this table has 164,656 rows. (We would not want to print it!)

Write, execute, and print the results of a query that fetches 10 rows from the table, each containing the "word", "word_count", and "corpus" fields:

In [44]:
#ANSWER
sql = "SELECT word, word_count, corpus FROM bigquery-public-data.samples.shakespeare LIMIT 10"
query_job = client.query(sql)  # runs in our own project, which was set on the client above

# print these as above:
for row in query_job:  # API request - fetches results
    # Now have 3 fields to test (Nb. this approach may be overkill for non-production code):
    assert row[0] == row.word == row['word']
    assert row[1] == row.word_count == row['word_count']
    assert row[2] == row.corpus == row['corpus']
    print(row['word'], row['word_count'], row['corpus'])


LVII 1 sonnets
augurs 1 sonnets
dimm'd 1 sonnets
plagues 1 sonnets
treason 1 sonnets
surmise 1 sonnets
heed 1 sonnets
Unthrifty 1 sonnets
quality 1 sonnets
wherever 1 sonnets


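Whenever you catch yourself writing a swag of code to do something that seems rudimentary or low-level, there is a very good chance that you don't need to. A much easier way to handle the above requirement is to use the `to_dataframe` method of the `QueryJob` object. It depends on the `pandas` and `db-dtypes` packages, so the next few cells install or upgrade the required packages and check their versions: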

In [33]:
!pip uninstall numpy -y
!pip install --no-cache-dir numpy

Found existing installation: numpy 2.0.2
Uninstalling numpy-2.0.2:
  Successfully uninstalled numpy-2.0.2
Collecting numpy
  Downloading numpy-2.3.1-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (62 kB)
Downloading numpy-2.3.1-cp311-cp311-manylinux_2_28_x86_64.whl (16.9 MB)
Installing collected packages: numpy
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
opencv-python-headless 4.12.0.88 requires numpy<2.3.0,>=2; python_version >= "3.9", but you have numpy 2.3.1 which is incompatible.
cupy-cuda12x 13.3.0 requires numpy<2.3,>=1.22, but you have numpy 2.3.1 which is incompatible.
numba 0.60.0 requires numpy<

In [1]:
import numpy
print(numpy.__version__)


2.3.1


In [2]:
pip install --upgrade numpy pandas pyarrow db-dtypes google-cloud-bigquery


Collecting pandas
  Downloading pandas-2.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (91 kB)
Collecting pyarrow
  Downloading pyarrow-21.0.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (3.3 kB)
Collecting google-cloud-bigquery
  Downloading google_cloud_bigquery-3.35.0-py3-none-any.whl.metadata (8.0 kB)
Downloading pandas-2.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.4 MB)
Downloading pyarrow-21.0.0-cp311-cp311-manylinux_2_28_x86_64.whl (42.8 MB)
Downloading google_cloud_bigquery-3.35.0-py3-none-any.whl (256 kB)

In [1]:
import pandas as pd
print(pd.__version__)


2.3.1


In [2]:
!pip install db-dtypes




In [3]:
import db_dtypes
print("db-dtypes is installed!")


db-dtypes is installed!


In [6]:
!pip install --quiet db-dtypes google-cloud-bigquery google-cloud-bigquery-storage


In [45]:
df = query_job.to_dataframe()
df

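|   | word | word_count | corpus |
|---|------|------------|--------|
| 0 | LVII | 1 | sonnets |
| 1 | augurs | 1 | sonnets |
| 2 | dimm'd | 1 | sonnets |
| 3 | plagues | 1 | sonnets |
| 4 | treason | 1 | sonnets |
| 5 | surmise | 1 | sonnets |
| 6 | heed | 1 | sonnets |
| 7 | Unthrifty | 1 | sonnets |
| 8 | quality | 1 | sonnets |
| 9 | wherever | 1 | sonnets |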


#### Additional Notes

1. Here is a readable way to code long SQL statements:

In [None]:
sql = """
    SELECT word, word_count, corpus
    FROM bigquery-public-data.samples.shakespeare
    LIMIT 10
    """

2. If you had an application that needed to modify the tables or datasets in the `bigquery-public-data` project, you could copy them to your own project, where you would have the permissions to do as you please with the data (subject to Google's terms of use).

3. We aren't limited to the datasets that are already in BigQuery. We can upload tables from our computer, and we can pull data in from other online sources.
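As a follow-up to note 1, an indented triple-quoted statement carries its leading whitespace into the string. That is harmless to BigQuery, but it can make logged or printed SQL harder to read. A minimal sketch using the standard library's `textwrap.dedent` is shown below; the helper name `tidy_sql` is hypothetical, and the backticks around the table name are optional quoting.

```python
import textwrap


def tidy_sql(sql):
    """Remove the common indentation and surrounding blank lines from a SQL string."""
    return textwrap.dedent(sql).strip()


sql = tidy_sql("""
    SELECT word, word_count, corpus
    FROM `bigquery-public-data.samples.shakespeare`
    LIMIT 10
    """)
print(sql)
```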

## Google Gemini

Google Gemini (formerly Bard) is a multimodal generative AI chatbot. It can process text, audio, images and video.

Create an API key at https://aistudio.google.com/app/apikey and paste it into a text file called `gemini_key.txt`.

## Google Gemini UI
While signed in to Google, experiment with some prompts at https://aistudio.google.com/app/prompts/new_chat. A prompt gallery is available at https://aistudio.google.com/app/gallery.

## Google Gemini API

The library `google-generativeai` gives access to Gemini models. For this section download the following two files from the DATA folder:

* `equation.jpg`
* `JFK.mp3`

In [46]:
# -U gives the latest version
!pip install -U google-generativeai



In [47]:
import google.generativeai as genai
from IPython.display import Markdown # allows Markdown text to be displayed in the notebook

Firstly we read our API key from `gemini_key.txt`:

In [48]:
from google.colab import files
uploaded = files.upload()


Saving gemini_key.txt to gemini_key.txt


In [49]:
filename = 'gemini_key.txt'
try:
    with open(filename, 'r') as f:
        # It's assumed your file contains a single line containing your API key only
        key = f.read().strip()
except FileNotFoundError:
    print("'%s' file not found" % filename)

In [50]:
genai.configure(api_key=key)

A list of methods in `google.generativeai` can be seen at https://github.com/google-gemini/generative-ai-python/blob/main/docs/api/google/generativeai.md. We shall use the following:

* `configure()`: sets the API key that the library uses for subsequent requests
* `GenerativeModel()`: used to access the model
  * `generate_content()`: used to generate responses from the model
* `list_models()`: used to see available models
* `upload_file()`: used to upload image/audio files

The following code lists the available models for text generation:

In [51]:
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)

models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash
models/gemini-2.5-flash-lite-preview-06-17
models/gemini-2.5-pro-preview-05-06
models/gemini-2.5-pro-preview-06-05
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemin

As suggested by the name, Gemini Flash is designed for faster responses, while Gemini Pro works better on more challenging tasks. Pro also has a lower rate limit (2 requests per minute at the time of writing; see https://ai.google.dev/pricing for current figures).
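Because the model listing is long, it can be handy to narrow it to one family. The sketch below filters a list of model names in plain Python. The helper `filter_models` is hypothetical, and treating names containing `preview`, `exp`, or `latest` as non-pinned variants is an assumption read off the listing above, not a documented rule. The sample names are copied from that listing.

```python
def filter_models(names, family, pinned_only=True):
    """Return short model names that belong to `family` (e.g. 'flash' or 'pro')."""
    selected = []
    for name in names:
        short = name.split("/", 1)[-1]  # drop the leading "models/"
        if family not in short:
            continue
        # Assumption: these tags mark preview, experimental, or moving aliases.
        if pinned_only and any(tag in short for tag in ("preview", "exp", "latest")):
            continue
        selected.append(short)
    return selected


sample = [
    "models/gemini-1.5-pro-latest",
    "models/gemini-1.5-flash",
    "models/gemini-2.5-flash",
    "models/gemini-2.5-flash-lite-preview-06-17",
    "models/gemini-2.5-pro",
    "models/gemini-2.0-flash-exp",
]
print(filter_models(sample, "flash"))
```

In the notebook, the same helper could be applied to `[m.name for m in genai.list_models()]`.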

In [52]:
model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content("What is an API?")

The `text` attribute holds the model's response to the question as Markdown-formatted text.

In [53]:
Markdown(response.text)

API stands for **Application Programming Interface**.  It's essentially a messenger that allows different software systems to talk to each other.  Think of it as a menu in a restaurant.  You (the application) don't need to know how the kitchen (the other software system) prepares the food (data), you just need to know what options are available (the API's functions) and how to order them (the API's requests).

More technically, an API is a set of rules and specifications that software programs can follow to communicate and exchange information. It defines how one application can request services or data from another application.  This communication typically involves sending requests and receiving responses, often in formats like JSON or XML.

Here's a breakdown of key aspects:

* **Requests:**  An application sends a request to the API, specifying what it needs.
* **Responses:** The API processes the request and sends back a response containing the requested data or information.
* **Functions/Endpoints:** These are specific tasks or data points the API can provide access to. For example, a weather API might have endpoints for "get current temperature," "get forecast," etc.
* **Documentation:**  Good APIs come with detailed documentation explaining how to use them.

**Examples:**

* **Google Maps API:** Allows developers to embed maps and location-based services into their applications.
* **Twitter API:** Allows developers to access and interact with Twitter data, such as tweets and user profiles.
* **Payment gateway APIs:**  Allow online businesses to process payments securely.

In short, APIs are crucial for modern software development, enabling modularity, reusability, and integration between different systems.


Study the response object and identify the total token count.

In [54]:
# ANSWER
response.usage_metadata.total_token_count

356

Next, we have Gemini process a mathematical equation in an image.

In [56]:
from google.colab import files
uploaded = files.upload()

Saving equation.jpg to equation.jpg


In [57]:
sample_image = genai.upload_file(path="equation.jpg")

In [58]:
prompt = "What is in this image?"
Markdown(model.generate_content([prompt, sample_image]).text)

That's a quadratic equation:

0 = x² - 5x + 6

It's set equal to zero, meaning it's ready to be solved for the values of *x* that make the equation true.  These values are called the roots or solutions of the equation.


Use the Gemini 1.5 Pro model to solve the equation in the image.

In [59]:
# ANSWER
model = genai.GenerativeModel(model_name="gemini-1.5-pro")

prompt = "Solve the equation in this image."

response = model.generate_content([prompt, sample_image])

Markdown(">" + response.text)

>The equation is:

0 = x² - 5x + 6

This is a quadratic equation, and we can solve it by factoring:

0 = (x - 2)(x - 3)

This equation is true if either (x - 2) = 0 or (x - 3) = 0.  Therefore, the solutions are:

x = 2  or  x = 3
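Model output is worth checking independently. As a quick sketch, the roots reported above can be verified with the quadratic formula and by substituting them back into the equation; the variable names here are chosen only for illustration.

```python
import math


def f(x):
    """Right-hand side of the equation in the image: x^2 - 5x + 6."""
    return x**2 - 5 * x + 6


# Coefficients of a*x^2 + b*x + c = 0
a, b, c = 1, -5, 6
discriminant = b * b - 4 * a * c  # 25 - 24 = 1

roots = sorted([
    (-b - math.sqrt(discriminant)) / (2 * a),
    (-b + math.sqrt(discriminant)) / (2 * a),
])
print(roots)

# Substituting each root back in should give zero.
assert all(f(r) == 0 for r in roots)
```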

Finally we transcribe a short audio clip.

In [60]:
from google.colab import files
uploaded = files.upload()

Saving JFK.mp3 to JFK.mp3


In [61]:
audio_file = genai.upload_file(path='JFK.mp3')

model = genai.GenerativeModel(model_name="gemini-1.5-flash")

# Create a prompt.
prompt = "Transcribe the audio."

# Pass the prompt and the audio file to Gemini using generate_content
response = model.generate_content([prompt, audio_file])

# Print the response.
print(response.text)

And so, my fellow Americans, ask not what your country can do for you, ask what you can do for your country.



## Further reading

If you wish to pick up a few more skills in BigQuery you can go to https://cloud.google.com/bigquery/create-simple-app-api and https://cloud.google.com/bigquery/docs/samples.

Alternatively, you can take a deeper dive into the API here: https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/usage.html.

The Google Gemini API documentation is at https://ai.google.dev/gemini-api/docs

## - END -



---



> > > > > > > > > © 2025 Institute of Data


---


