<a href="https://colab.research.google.com/github/suryasari/MLPlateNumber/blob/main/plateNumber.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2024 Google LLC.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Explore vision capabilities with the Gemini API

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://ai.google.dev/gemini-api/docs/vision"><img src="https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on ai.google.dev</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemini-api/docs/vision.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/gemini-api/docs/vision.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

The Gemini API is able to process images and videos, enabling a multitude of
 exciting developer use cases. Some of Gemini's vision capabilities include
 the ability to:

*   Caption and answer questions about images
*   Transcribe and reason over PDFs, including long documents that fit within the 2 million token context window
*   Describe, segment, and extract information from videos,
including both visual frames and audio, up to 90 minutes long
*   Detect objects in an image and return bounding box coordinates for them

This tutorial demonstrates some possible ways to prompt the Gemini API with
images and video input, provides code examples,
and outlines prompting best practices with multimodal vision capabilities.
All output is text-only.

## Setup

Before you use the File API, you need to install the Gemini API SDK package and configure an API key. This section describes how to complete these setup steps.

### Install the Python SDK and import packages

The Python SDK for the Gemini API is contained in the [google-generativeai](https://pypi.org/project/google-generativeai/) package. Install the dependency using pip.

In [1]:
!pip install -q -U google-generativeai

Import the necessary packages.

In [2]:
import google.generativeai as genai
from IPython.display import Markdown

### Set up your API key

The File API uses API keys for authentication and access. Uploaded files are associated with the project linked to the API key. Unlike other Gemini APIs that use API keys, your API key also grants access to data you've uploaded to the File API, so take extra care in keeping your API key secure. For more on keeping your keys
secure, see [Best practices for using API
keys](https://support.google.com/googleapi/answer/6310037).

Store your API key in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or are unfamiliar with Colab Secrets, refer to the [Authentication quickstart](https://github.com/google-gemini/gemini-api-cookbook/blob/main/quickstarts/Authentication.ipynb).

In [4]:
from google.colab import userdata
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

## Prompting with images

In this tutorial, you will upload images using the File API or as inline data and generate content based on those images.

### Technical details (images)
Gemini 1.5 Pro and Flash support a maximum of 3,600 image files.

Images must be in one of the following image data [MIME types](https://developers.google.com/drive/api/guides/ref-export-formats):

-   PNG - `image/png`
-   JPEG - `image/jpeg`
-   WEBP - `image/webp`
-   HEIC - `image/heic`
-   HEIF - `image/heif`

Each image is equivalent to 258 tokens.
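
As a rough pre-flight check, the sketch below maps file extensions to the MIME types listed above and estimates the token cost of a batch at 258 tokens per image. The helper names (`SUPPORTED_IMAGE_TYPES`, `image_mime_type`, `estimate_image_tokens`) and the extension-to-type mapping are illustrative assumptions, not part of the SDK.

```python
from pathlib import Path

# Extension -> MIME type, limited to the formats listed above.
# The extension mapping itself is an assumption made for illustration.
SUPPORTED_IMAGE_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".heic": "image/heic",
    ".heif": "image/heif",
}

TOKENS_PER_IMAGE = 258  # figure quoted in this section


def image_mime_type(path: str) -> str:
    """Return the MIME type for a supported image path, or raise ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or path}")
    return SUPPORTED_IMAGE_TYPES[suffix]


def estimate_image_tokens(paths: list[str]) -> int:
    """Estimate the token cost of a batch of images at 258 tokens each."""
    for path in paths:
        image_mime_type(path)  # validate every file before counting
    return TOKENS_PER_IMAGE * len(paths)


print(image_mime_type("piranha.jpg"))                              # image/jpeg
print(estimate_image_tokens(["piranha.jpg", "firefighter.jpg"]))   # 516
```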

While there are no specific limits to the number of pixels in an image besides the model’s context window, larger images are scaled down to a maximum resolution of 3072x3072 while preserving their original aspect ratio, and smaller images are scaled up to 768x768 pixels. Lower-resolution images bring no cost reduction other than bandwidth, and higher-resolution images bring no performance improvement.

For best results:

*   Rotate images to the correct orientation before uploading.
*   Avoid blurry images.
*   If using a single image, place the text prompt after the image.

## Image input

For total image payload size less than 20MB, it's recommended to either upload
base64 encoded images or directly upload locally stored image files.
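
One way to apply this guideline is to total the payload before deciding how to send it. The sketch below is a minimal illustration: `choose_upload_method` is a hypothetical helper, the file sizes are made up, and reading "20MB" as 20 × 1024 × 1024 raw bytes is an assumption (the text does not say whether the limit is measured before or after Base64 encoding).

```python
# "20 MB" read as 20 MiB of raw bytes; this interpretation is an assumption.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def choose_upload_method(file_sizes: list[int], prompt_bytes: int = 0) -> str:
    """Return "inline" if the total payload is under the limit, else "file_api"."""
    total = sum(file_sizes) + prompt_bytes
    return "inline" if total < INLINE_LIMIT_BYTES else "file_api"


# Hypothetical sizes in bytes.
print(choose_upload_method([350_000, 120_000]))        # inline
print(choose_upload_method([15_000_000, 9_000_000]))   # file_api
```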

### Base64 encoded images

You can send images from public URLs by fetching them and encoding their bytes as Base64 payloads.
The following code example uses the httpx library to fetch the image:

In [5]:
import httpx
import base64

# Retrieve an image
image_path = "https://upload.wikimedia.org/wikipedia/commons/thumb/8/87/Palace_of_Westminster_from_the_dome_on_Methodist_Central_Hall.jpg/2560px-Palace_of_Westminster_from_the_dome_on_Methodist_Central_Hall.jpg"
image = httpx.get(image_path)

# Choose a Gemini model
model = genai.GenerativeModel(model_name="gemini-1.5-pro")

# Create a prompt
prompt = "Caption this image."
response = model.generate_content(
    [
        {
            "mime_type": "image/jpeg",
            "data": base64.b64encode(image.content).decode("utf-8"),
        },
        prompt,
    ]
)

Markdown(">" + response.text)

>A panoramic view of London showcases the city's iconic landmarks under a dramatic, cloud-filled sky. The London Eye Ferris wheel stands tall to the left, while the Houses of Parliament and its famous clock tower, Big Ben (now officially the Elizabeth Tower), take center stage.  The Shard skyscraper peeks out from behind the London skyline, offering a glimpse into the city's modern architecture. In the foreground, rooftops and a glimpse of St. Margaret's Church lead the eye towards the bustling streets below. The overcast weather adds a moody atmosphere to the vibrant cityscape.

### Multiple images

To prompt with multiple images in Base64 encoded format, you can do the
following:

In [6]:
import httpx
import base64

# Retrieve two images
image_path_1 = "https://upload.wikimedia.org/wikipedia/commons/thumb/8/87/Palace_of_Westminster_from_the_dome_on_Methodist_Central_Hall.jpg/2560px-Palace_of_Westminster_from_the_dome_on_Methodist_Central_Hall.jpg"
image_path_2 = "https://storage.googleapis.com/generativeai-downloads/images/jetpack.jpg"

image_1 = httpx.get(image_path_1)
image_2 = httpx.get(image_path_2)

# Create a prompt
prompt = "Generate a list of all the objects contained in both images."

response = model.generate_content(
    [
        {"mime_type": "image/jpeg", "data": base64.b64encode(image_1.content).decode("utf-8")},
        {"mime_type": "image/jpeg", "data": base64.b64encode(image_2.content).decode("utf-8")},
        prompt,
    ]
)

Markdown(response.text)

**Image 1 (London Skyline):**

* Houses of Parliament (including Big Ben)
* London Eye (ferris wheel)
* The Shard (skyscraper)
* River Thames
* Various other buildings, including residential, commercial, and governmental structures
* Cranes (indicating construction)
* Trees and green spaces


**Image 2 (Jetpack Backpack Sketch):**

* Backpack shape
* Padded strap support
* USB-C charging port
* Retractible boosters with flames/steam
* Annotations indicating:
    * Fits 18" laptop
    * Lightweight
    * Looks like a normal backpack
    * 15-minute battery life
    * Steam-powered/Green/Clean propulsion

### Upload one or more locally stored image files

Alternatively, you can upload one or more locally stored image files.

You can download and use our drawings of [piranha-infested waters](https://storage.googleapis.com/generativeai-downloads/images/piranha.jpg) and a [firefighter with a cat](https://storage.googleapis.com/generativeai-downloads/images/firefighter.jpg). First, save these files to your local directory.

Then click **Files** on the left sidebar. For each file, click the **Upload** button, then navigate to that file's location and upload it:

<img width=400 src="https://ai.google.dev/tutorials/images/colab_upload.png">

When the combination of files and system instructions that you intend to send is larger than 20 MB in size, use the File API to upload those files. Smaller files can instead be passed directly to the Gemini API as inline data:


In [8]:
import PIL.Image

sample_file_2 = PIL.Image.open('piranha.jpg')
sample_file_3 = PIL.Image.open('firefighter.jpg')  # file name as downloaded from the link above

In [9]:
import google.generativeai as genai

# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Create a prompt.
prompt = "Write an advertising jingle based on the items in both images."

response = model.generate_content([sample_file_2, sample_file_3, prompt])

Markdown(response.text)

(Upbeat, slightly quirky music)

Got a piranha, feeling glum?
Belly red, fins feeling numb?
Fireman Fred's got the cure,
For fishy woes, he'll reassure!

(Music becomes slightly more heroic)

From burning buildings, cats he saves,
But even piranhas, he bravely waves 
His magic net, no need to fear,
Fred's Aqua-Rescue is here!

(Music returns to upbeat, ends with a splash sound)

So call on Fred, day or night,
For finned or furred, he'll make things right!
Fred's Aqua-Rescue, give us a call,
He'll save them all, big and small!
*Splash!*


Note that these inline data calls don't include many of the features available
through the File API, such as getting file metadata,
[listing](https://ai.google.dev/gemini-api/docs/vision?lang=python#list-files),
or [deleting files](https://ai.google.dev/gemini-api/docs/vision?lang=python#delete-files).

### Large image payloads

#### Upload an image file using the File API

When the combination of files and system instructions that you intend to send is larger than 20 MB in size, use the File API to upload those files.

**NOTE**: The File API lets you store up to 20 GB of files per project, with a per-file maximum size of 2 GB. Files are stored for 48 hours. They can be accessed in that period with your API key, but cannot be downloaded from the API. It is available at no cost in all regions where the Gemini API is available.

Upload the image using [`media.upload`](https://ai.google.dev/api/rest/v1beta/media/upload) and print the URI, which is used as a reference in Gemini API calls.

In [10]:
!curl -o jetpack.jpg https://storage.googleapis.com/generativeai-downloads/images/jetpack.jpg

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  349k  100  349k    0     0   319k      0  0:00:01  0:00:01 --:--:--  319k


In [11]:
# Upload the file and print a confirmation.
sample_file = genai.upload_file(path="jetpack.jpg",
                            display_name="Jetpack drawing")

print(f"Uploaded file '{sample_file.display_name}' as: {sample_file.uri}")

Uploaded file 'Jetpack drawing' as: https://generativelanguage.googleapis.com/v1beta/files/or2e2oavr4h7


The returned `sample_file` object shows that the File API stored the specified `display_name` for the uploaded file and assigned a `uri` to reference the file in Gemini API calls. Use `sample_file` to track how uploaded files are mapped to URIs.

Depending on your use case, you can also store the URIs in structures such as a `dict` or a database.
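
A minimal sketch of the `dict` approach follows. It keys the registry by the file's `name`, which the File API treats as unique. The `register_file` helper and the `files/example123` identifier are hypothetical, and a `SimpleNamespace` stands in for the object returned by the upload call so the snippet runs without an API key.

```python
from types import SimpleNamespace

# Stand-in for the object returned by the upload call. Only the three
# attributes used in this tutorial are modeled; the values are made up.
uploaded = SimpleNamespace(
    name="files/example123",
    display_name="Jetpack drawing",
    uri="https://generativelanguage.googleapis.com/v1beta/files/example123",
)

file_registry: dict[str, dict[str, str]] = {}


def register_file(registry: dict, file) -> None:
    """Record an uploaded file under its unique `name`."""
    registry[file.name] = {"display_name": file.display_name, "uri": file.uri}


register_file(file_registry, uploaded)
print(file_registry["files/example123"]["display_name"])  # Jetpack drawing
```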

#### Verify image file upload and get metadata

You can verify the API successfully stored the uploaded file and get its metadata by calling [`files.get`](https://ai.google.dev/api/rest/v1beta/files/get) through the SDK. Only the `name` (and by extension, the `uri`) are unique. Use `display_name` to identify files only if you manage uniqueness yourself.

In [12]:
file = genai.get_file(name=sample_file.name)
print(f"Retrieved file '{file.display_name}' as: {file.uri}")

Retrieved file 'Jetpack drawing' as: https://generativelanguage.googleapis.com/v1beta/files/or2e2oavr4h7


#### Prompt with the uploaded image and text

After uploading the file, you can make GenerateContent requests that reference the File API URI. Select the generative model and provide it with a text prompt and the uploaded image.

In [13]:
# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Prompt the model with text and the previously uploaded image.
response = model.generate_content([sample_file, "Describe how this product might be manufactured."])

Markdown(response.text)

Manufacturing a jetpack backpack, even a conceptual steam-powered one, would be a complex process involving several stages and specialized manufacturers:

1. **Backpack Construction:**  A traditional backpack manufacturer would handle this part.  They'd use durable, lightweight fabrics like ripstop nylon or Cordura, potentially incorporating reinforced areas for the jetpack components. Padded straps and back support would be crucial for comfort and weight distribution.

2. **Booster Engine Manufacturing:** This would require a specialized engineering firm.  
    * **Steam Generation:** A miniaturized boiler and burner system would be needed, possibly using a high-energy, compact fuel source. Precise machining and welding of heat-resistant materials (like stainless steel or titanium) would be essential. Safety features like pressure relief valves would be critical.
    * **Nozzles:**  Precision-engineered nozzles would direct the steam for thrust, likely requiring 3D printing or CNC machining to achieve the required tolerances.
    * **Retractable Mechanism:**  A compact and reliable system for deploying and retracting the boosters would be needed.  This could involve hydraulics, pneumatics, or electric motors, along with intricate linkages.

3. **Power System Development:**  
    * **Battery Production:** A high-capacity, fast-charging battery (likely lithium-ion) would power the ignition system, control circuits, and potentially assist the steam generation. This would involve collaboration with a battery manufacturer.
    * **USB-C Charging Port Integration:**  Standard USB-C port components would need to be integrated into the backpack design.

4. **Control System Design and Integration:** An electronics manufacturer would develop the circuitry and software to control the jetpack. This would involve:
    * **Sensors:** Altitude, speed, and fuel level sensors would be crucial.
    * **Microcontroller:** A powerful microcontroller would manage all systems and respond to user input.
    * **User Interface:** A simple interface (perhaps a small display and buttons) would allow users to control the jetpack.

5. **Assembly and Testing:** A dedicated facility would handle the final assembly and rigorous testing.  This would include:
    * **Integration of Components:** Carefully fitting the boosters, power system, and control system within the backpack structure.
    * **Quality Control:** Thoroughly testing all components and systems to ensure safety and functionality.  This might involve wind tunnel tests, thrust measurements, and flight simulations.
    * **Safety Certification:** Meeting all relevant safety standards and obtaining necessary certifications before the product can be sold.

6. **Distribution and Sales:**  Traditional retail channels or online sales platforms could handle distribution to consumers.

This hypothetical jetpack backpack would require collaboration between various specialized manufacturers, meticulous engineering, and rigorous testing.  While the steam-powered aspect is likely more science fiction than reality with current technology, the manufacturing process described above highlights the complexity of such a device.


## Capabilities

This section outlines specific vision capabilities of the Gemini model, including object detection and bounding box coordinates.

### Get bounding boxes

Gemini models are trained to return bounding box coordinates as relative widths or heights in the range of [0, 1]. These values are then scaled by 1000 and converted to integers. Effectively, the coordinates represent the bounding box on a 1000x1000 pixel version of the image. Therefore, you'll need to convert these coordinates back to the dimensions of your original image to accurately map the bounding boxes.

In [14]:
# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Create a prompt to detect bounding boxes.
prompt = "Return a bounding box for each of the objects in this image in [ymin, xmin, ymax, xmax] format."
response = model.generate_content([sample_file_2, prompt])

Markdown(response.text)

The image contains one piranha, and its bounding box is: [269, 156, 736, 843].

The model returns bounding box coordinates in the format
`[ymin, xmin, ymax, xmax]`. To convert these normalized coordinates
to the pixel coordinates of your original image, follow these steps:

1.    Divide each output coordinate by 1000.
1.    Multiply the x-coordinates by the original image width.
1.    Multiply the y-coordinates by the original image height.
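
The steps above can be sketched as a small helper. `to_pixel_box` is a hypothetical name, and the 2000x1000 image size is a made-up example; in practice you would read the real dimensions from the image itself (for instance, the `size` attribute of a PIL image).

```python
def to_pixel_box(box, image_width, image_height):
    """Convert [ymin, xmin, ymax, xmax] on a 0-1000 scale to pixel coordinates.

    Returns (left, top, right, bottom).
    """
    ymin, xmin, ymax, xmax = box
    return (
        round(xmin / 1000 * image_width),   # left
        round(ymin / 1000 * image_height),  # top
        round(xmax / 1000 * image_width),   # right
        round(ymax / 1000 * image_height),  # bottom
    )


# Box returned for the piranha above, applied to an assumed 2000x1000 image.
print(to_pixel_box([269, 156, 736, 843], image_width=2000, image_height=1000))
# (312, 269, 1686, 736)
```

The result is ordered left, top, right, bottom, which is the order most drawing libraries expect for rectangles.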

To explore more detailed examples of generating bounding box coordinates and
visualizing them on images, review our [Object Detection cookbook example](https://github.com/google-gemini/cookbook/blob/main/examples/Object_detection.ipynb).

## Prompting with video

In this tutorial, you will upload a video using the File API and generate content based on that video.

### Technical details (video)

Gemini 1.5 Pro and Flash support up to approximately an hour of video data.

Video must be in one of the following video format [MIME types](https://developers.google.com/drive/api/guides/ref-export-formats):
  -   `video/mp4`
  -   `video/mpeg`
  -   `video/mov`
  -   `video/avi`
  -   `video/x-flv`
  -   `video/mpg`
  -   `video/webm`
  -   `video/wmv`
  -   `video/3gpp`

The File API service currently extracts image frames from videos at 1 frame per second (FPS) and audio at 1 Kbps, single channel, adding timestamps every second. These rates are subject to change in the future for improvements in inference.

**NOTE:** The finer details of fast action sequences may be lost at the 1 FPS frame sampling rate. Consider slowing down high-speed clips for improved inference quality.

Individual frames are 258 tokens, and audio is 32 tokens per second. With metadata, each second of video becomes ~300 tokens, which means a 1M context window can fit slightly less than an hour of video.
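
The arithmetic behind that estimate can be checked with a few lines. The constants come from the paragraph above; `video_seconds_that_fit` is a hypothetical helper, and the 300 tokens per second figure is the approximate value quoted in the text, not an exact rate.

```python
FRAME_TOKENS_PER_SECOND = 258          # one sampled frame per second
AUDIO_TOKENS_PER_SECOND = 32
TOKENS_PER_SECOND_WITH_METADATA = 300  # approximate figure from the text


def video_seconds_that_fit(context_window: int) -> int:
    """Whole seconds of video that fit in a context window at ~300 tokens/s."""
    return context_window // TOKENS_PER_SECOND_WITH_METADATA


raw_per_second = FRAME_TOKENS_PER_SECOND + AUDIO_TOKENS_PER_SECOND
print(raw_per_second)  # 290 tokens per second before metadata

minutes, seconds = divmod(video_seconds_that_fit(1_000_000), 60)
print(f"{minutes} min {seconds} s")  # 55 min 33 s
```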

To ask questions about time-stamped locations, use the format `MM:SS`, where the first two digits represent minutes and the last two digits represent seconds.
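
If your timestamps start out as second offsets, a small formatter keeps prompts consistent with the `MM:SS` convention. `to_mmss` is a hypothetical helper written for this illustration.

```python
def to_mmss(total_seconds: int) -> str:
    """Format a non-negative offset in seconds as MM:SS."""
    if total_seconds < 0:
        raise ValueError("offset must be non-negative")
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}"


prompt = f"What happens between {to_mmss(35)} and {to_mmss(110)}?"
print(prompt)  # What happens between 00:35 and 01:50?
```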

For best results:

*   Use one video per prompt.
*   If using a single video, place the text prompt after the video.

### Upload a video file to the File API

**NOTE**: The File API lets you store up to 20 GB of files per project, with a per-file maximum size of 2 GB. Files are stored for 48 hours. They can be accessed in that period with your API key, but they cannot be downloaded using any API. It is available at no cost in all regions where the Gemini API is available.

The File API accepts video file formats directly. This example uses the short NASA film ["Jupiter's Great Red Spot Shrinks and Grows"](https://www.youtube.com/watch?v=JDi4IdtvDVE0). Credit: Goddard Space Flight Center (GSFC)/David Ladd (2018).

> "Jupiter's Great Red Spot Shrinks and Grows" is in the public domain and does not show identifiable people. ([NASA image and media usage guidelines.](https://www.nasa.gov/nasa-brand-center/images-and-media/))

Start by retrieving the short video:

In [15]:
!wget https://storage.googleapis.com/generativeai-downloads/images/GreatRedSpot.mp4

--2024-12-07 13:01:26--  https://storage.googleapis.com/generativeai-downloads/images/GreatRedSpot.mp4
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.175.207, 142.251.10.207, 142.251.12.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.175.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 238090979 (227M) [video/mp4]
Saving to: ‘GreatRedSpot.mp4’


2024-12-07 13:01:38 (19.3 MB/s) - ‘GreatRedSpot.mp4’ saved [238090979/238090979]



Upload the video to the File API and print the URI.

In [16]:
video_file_name = "GreatRedSpot.mp4"

print("Uploading file...")
video_file = genai.upload_file(path=video_file_name)
print(f"Completed upload: {video_file.uri}")

Uploading file...
Completed upload: https://generativelanguage.googleapis.com/v1beta/files/hbhacw2ab69s


### Verify file upload and check state

Verify the API has successfully received the files by calling the [`files.get`](https://ai.google.dev/api/rest/v1beta/files/get) method.

**NOTE**: Video files have a `State` field in the File API. When a video is uploaded, it will be in the `PROCESSING` state until it is ready for inference. Only `ACTIVE` files can be used for model inference.

In [17]:
import time

# Check whether the file is ready to be used.
while video_file.state.name == "PROCESSING":
    print('.', end='')
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

if video_file.state.name == "FAILED":
    raise ValueError(video_file.state.name)

..
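
The loop above polls for as long as the file stays in `PROCESSING`. A bounded variant might look like the sketch below. `wait_until_active` is a hypothetical helper, the 600-second limit is an arbitrary assumption, and the lookup function is passed in as a parameter so the demo can run against a fake instead of the real API.

```python
import time
from types import SimpleNamespace


def wait_until_active(file, get_file, poll_seconds=10, timeout_seconds=600, sleep=time.sleep):
    """Poll until `file` leaves PROCESSING; raise on timeout or a non-ACTIVE end state."""
    waited = 0
    while file.state.name == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError(f"{file.name} still PROCESSING after {waited} s")
        sleep(poll_seconds)
        waited += poll_seconds
        file = get_file(file.name)
    if file.state.name != "ACTIVE":
        raise ValueError(f"{file.name} ended in state {file.state.name}")
    return file


# Demo with a fake lookup that reports PROCESSING once, then ACTIVE.
_states = iter(["PROCESSING", "ACTIVE"])


def fake_get_file(name):
    return SimpleNamespace(name=name, state=SimpleNamespace(name=next(_states)))


pending = SimpleNamespace(name="files/demo", state=SimpleNamespace(name="PROCESSING"))
ready = wait_until_active(pending, fake_get_file, poll_seconds=1, sleep=lambda s: None)
print(ready.state.name)  # ACTIVE
```

In the notebook, the equivalent call would be `video_file = wait_until_active(video_file, genai.get_file)`.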

### Prompt with a video and text

Once the uploaded video is in the `ACTIVE` state, you can make `GenerateContent` requests that specify the File API URI for that video. Select the generative model and provide it with the uploaded video and a text prompt.

In [18]:
# Create the prompt.
prompt = "Summarize this video. Then create a quiz with answer key based on the information in the video."

# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Make the LLM request.
print("Making LLM inference request...")
response = model.generate_content([video_file, prompt],
                                  request_options={"timeout": 600})

# Print the response, rendering any Markdown
Markdown(response.text)

Making LLM inference request...


This video is about Jupiter’s Great Red Spot and the changes it has undergone. The narrator explains that Jupiter is the largest and oldest planet in our solar system, and is made up of the same elements as a star, but it didn’t become massive enough to ignite. Jupiter's appearance is caused by a swirling interior of gasses and liquids which create colored cloud bands and the Great Red Spot. The Great Red Spot is a giant storm, and because there is no landmass to slow it down, it has been active for more than a century. In more recent years, the Great Red Spot’s color has deepened, it’s shrinking and getting rounder. Scientists theorized that as it shrinks, the wind speed increases like an ice skater spinning faster by pulling in their arms, but the data collected by various NASA missions including Voyager, Hubble, and Juno shows the storm isn't spinning faster, it’s getting taller. It used to be large enough to fit three Earths, now it’s only slightly bigger than one. Scientists hope further investigation will shed light on its continued changes. 

Here is a quiz on Jupiter's Great Red Spot:

1. Jupiter is the largest and _____ planet in our solar system.
A. second-oldest
B. oldest
C. newest
D. third-oldest

2. Jupiter’s appearance is the result of the swirling interior of _____ and _____.
A. gases and lava
B. liquids and crystals
C. gases and liquids
D. gasses and crystals 

3. The Great Red Spot is a _____.
A. volcano
B. canyon
C. storm
D. sinkhole

4. As the Great Red Spot shrinks, the storm is getting _____.
A. wider
B. shorter
C. taller
D. cooler

5. Which three NASA missions have collected data on Jupiter’s Great Red Spot?
A. Voyager, Hubble, and Juno
B. Voyager, Cassini, and Juno
C. Pioneer, Hubble, and Cassini
D. Pioneer, Hubble, and Juno


Answer Key:
1. B
2. C
3. C
4. C
5. A

### Refer to timestamps in the content

You can use timestamps of the form `MM:SS` to refer to specific moments in the video.

In [20]:
# Create the prompt.
prompt = "What are the examples given at 00:35 and 00:50 supposed to show us?"

# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Make the LLM request.
print("Making LLM inference request...")
response = model.generate_content([prompt, video_file],
                                  request_options={"timeout": 600})
Markdown(response.text)

Making LLM inference request...


The example at [00:00:35] illustrates Jupiter's Great Red Spot and the movement of a smaller moon across it. The example at [00:00:50] shows how the Great Red Spot on Jupiter has changed over time, getting smaller and rounder. 

### Transcribe video and provide visual descriptions

The Gemini models can transcribe and provide visual descriptions of video content
by processing both the audio track and visual frames.
For visual descriptions, the model samples the video at a rate of **1 frame
per second**. This sampling rate may affect the level of detail in the
descriptions, particularly for videos with rapidly changing visuals.

In [22]:
# Create the prompt.
prompt = "Transcribe the audio from this video, giving timestamps for salient events in the video. Also provide visual descriptions."

# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Make the LLM request.
print("Making LLM inference request...")
response = model.generate_content([video_file, prompt],
                                  request_options={"timeout": 600})
Markdown(response.text)

Making LLM inference request...


Sure, here’s the transcript of the video with timestamps and visual descriptions. 

[00:00:00] In the black of space, Jupiter glows softly, bands of brown, white, and rust-orange swirling across its surface, the Milky Way shimmering faintly in the distance.

[00:00:16] Jupiter’s thin rings become visible as it grows larger in the frame, the swirling surface and the Milky Way growing dimmer in the background.

[00:00:21] With the background fully darkened, Jupiter rotates slowly, its reddish-brown Great Red Spot visible as it approaches the left edge of the planet.

[00:00:27] Zooming in closely on the Great Red Spot, the swirls of its surface become clearer, and Jupiter shrinks in the background as the spot grows.

[00:00:32] As it fills the screen, the moon, Io, passes to its left, and a moon’s shadow briefly passes across the top.

[00:00:42] Pulling away from the Great Red Spot, the view pans out and returns to its place on the left edge of Jupiter.

[00:00:47] The Great Red Spot again approaches the left edge, a close-up image inset over the planet itself.

[00:00:52] The inset image shrinks to show the Great Red Spot as it looked in 1995. The view shifts to show how it looked in 2009, noticeably smaller. The image then shifts to 2015, smaller still and now a darker orange hue.

[00:01:00] As the inset image disappears, the Great Red Spot appears to open up, revealing its swirling interior of orange, red, and rust-colored bands.

[00:01:05] As the ice skater demonstrates how spinning faster pulls her arms in, two graphs of data points show the Great Red Spot’s latitude and its relative height above the Jovian surface. A black line shows the data from 2014. A blue line shows the data from 2015, and a green line shows the data from 2016. The lines rise and fall similarly, with the 2015 line consistently rising a bit higher than the 2014 line, and the 2016 line rising slightly above the 2015 line. A red line showing data from 2017 then appears, again following a similar pattern but with a higher overall rise than the other lines.

[00:01:18] A potter’s hands shape a lump of clay on a spinning wheel, raising it from a rounded lump into a tall, narrow form.

[00:01:25] On the face of Jupiter, the inset image from 1995, 2009, and 2015 are shown again, showing the changing size and shape of the Great Red Spot.

[00:01:32] Earth is superimposed on Jupiter’s face, illustrating that what was once large enough to fit three Earths can now only hold just over one.

[00:01:36] A Voyager space probe, and then the Juno probe, both make their appearance in front of Jupiter.

[00:01:54] The Great Red Spot again fills the screen, its roiling surface a dark orange.

[00:01:59] The Goddard Space Flight Center logo fills the screen, the Earth visible at the edge of the screen.

Hope this helps!
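
To work with a response like this programmatically, you could extract the timestamped lines with a regular expression. The sketch below assumes the model keeps the `[HH:MM:SS]` prefix seen in the output above, which is not guaranteed; `parse_timestamped_lines` is a hypothetical helper and the sample text is abridged.

```python
import re

# Matches lines such as "[00:01:05] description".
TIMESTAMP_LINE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")


def parse_timestamped_lines(text: str) -> list[tuple[int, str]]:
    """Return (offset_in_seconds, description) for each "[HH:MM:SS] ..." line."""
    events = []
    for line in text.splitlines():
        match = TIMESTAMP_LINE.match(line.strip())
        if match:
            hours, minutes, seconds, description = match.groups()
            offset = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            events.append((offset, description))
    return events


sample = """Sure, here is the transcript.

[00:00:16] Jupiter's thin rings become visible.
[00:01:05] Two graphs of data points appear.

Hope this helps!
"""

for offset, description in parse_timestamped_lines(sample):
    print(offset, description)
# 16 Jupiter's thin rings become visible.
# 65 Two graphs of data points appear.
```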

## List files

You can list all uploaded files and their URIs using `genai.list_files()`.

In [23]:
# List all files
for file in genai.list_files():
    print(f"{file.display_name}, URI: {file.uri}")

GreatRedSpot.mp4, URI: https://generativelanguage.googleapis.com/v1beta/files/hbhacw2ab69s
Jetpack drawing, URI: https://generativelanguage.googleapis.com/v1beta/files/or2e2oavr4h7


## Delete files

Files are automatically deleted after 2 days. You can also manually delete them using `genai.delete_file()`.

In [24]:
genai.delete_file(video_file.name)
print(f'Deleted file {video_file.uri}')

Deleted file https://generativelanguage.googleapis.com/v1beta/files/hbhacw2ab69s


## ML Assignment II: Identify a car's license plate number from an uploaded image

In [32]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
         name=fn, length=len(uploaded[fn])))

Saving alzacar.jpg to alzacar (1).jpg
User uploaded file "alzacar (1).jpg" with length 120414 bytes


In [33]:
from google.colab import files
import google.generativeai as genai
import os

# Upload the file
uploaded = files.upload()

# Get the filename and content of the uploaded file
filename = next(iter(uploaded))
file_content = uploaded[filename]

# Save the file content to a temporary file
with open(filename, 'wb') as f:
    f.write(file_content)

# Upload the temporary file using genai.upload_file
license_plate = genai.upload_file(
    path=filename,  # Pass the file path
    mime_type='image/jpeg',  # Adjust the MIME type if necessary
    display_name="License Plate"
)

print(f"Uploaded file '{license_plate.display_name}' as: {license_plate.uri}")

# Remove the temporary file
os.remove(filename)

Saving alzacar.jpg to alzacar (2).jpg
Uploaded file 'License Plate' as: https://generativelanguage.googleapis.com/v1beta/files/o5xyxcfev6nd


In [34]:
# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Prompt the model with text and the previously uploaded image.
prompt = ("read the plate number of the image with output example "
          "(‘plat_no’: ‘B 1234 ABC’, ‘vehicle’: ‘car’, ‘vehicle_type’: ‘sedan’, ‘color’: ‘red’, "
          "‘gate_open’: ‘2024-12-02 18.15.01’, ‘gate_closed’: ‘N/A’,), "
          "use current date time for gate_open value")
response = model.generate_content([license_plate, prompt])

Markdown(response.text)

```json
{
	‘plat_no’: ‘SAB 1633 C’,
	‘vehicle’: ‘car’,
	‘vehicle_type’: ‘hatchback’,
	‘color’: ‘white’,
	‘gate_open’: ‘2024-07-28 13:59:27’,
	‘gate_closed’: ‘N/A’
}
```
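
Two details of this output are worth handling in code. The model has no access to a clock, so any `gate_open` value it produces on its own is a guess. The response also reuses the typographic quotes from the prompt, which `json.loads` rejects. The sketch below is one possible approach, not a definitive implementation: it puts the real timestamp into the prompt, shows a strict JSON example, and parses the reply. `build_plate_prompt` and `parse_plate_response` are hypothetical helpers, and the sample reply is made up for the demo.

```python
import json
from datetime import datetime

FENCE = "`" * 3  # Markdown code fence, built here to keep this example readable


def build_plate_prompt(now: datetime) -> str:
    """Build a prompt that embeds the real gate-open time and asks for strict JSON."""
    example = {
        "plat_no": "B 1234 ABC",
        "vehicle": "car",
        "vehicle_type": "sedan",
        "color": "red",
        "gate_open": now.strftime("%Y-%m-%d %H:%M:%S"),
        "gate_closed": "N/A",
    }
    return (
        "Read the plate number in the image and describe the vehicle. "
        "Reply with JSON only, using double-quoted keys and values as in this "
        "example, and copy the gate_open value unchanged:\n"
        + json.dumps(example, indent=2)
    )


def parse_plate_response(text: str) -> dict:
    """Strip an optional Markdown code fence and parse the remaining JSON object."""
    body = text.strip()
    if body.startswith(FENCE):
        _, _, body = body.partition("\n")  # drop the opening fence line
        body = body.rsplit(FENCE, 1)[0]    # drop the closing fence
    return json.loads(body)


now = datetime(2024, 12, 7, 13, 1, 26)  # fixed time so the demo is repeatable
prompt = build_plate_prompt(now)

# Made-up reply in the shape the prompt asks for.
sample_reply = (
    FENCE + "json\n"
    '{"plat_no": "SAB 1633 C", "gate_open": "2024-12-07 13:01:26"}\n'
    + FENCE
)
print(parse_plate_response(sample_reply)["plat_no"])  # SAB 1633 C
```

In the notebook, the request would become `model.generate_content([license_plate, build_plate_prompt(datetime.now())])`.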