# AISG7: Day3 - MultiModal AI with Gemini

This guide will walk you through using Google AI Studio with Gemini. You'll learn how to set up and interact with this powerful AI model, understand its capabilities and learn best practices for using it effectively.

## Goals

By the end of this guide, you should:

- Have working access to Gemini in Google AI Studio
- Understand Gemini's capabilities
- Be able to make basic API calls
- Test out one multimodal aspect of Gemini

**Please remember to add your own AI Studio API key to the .env file (as `GOOGLE_API_KEY`)!**

Note: You might need a Google Cloud account; each new user gets $300 in free credits, and there is a generous free tier.  
The API calls below use the new experimental Gemini 2.0 Flash model (`gemini-2.0-flash-exp`), which is not billed.

In [6]:
%pip install -r requirements.txt

Collecting google-generativeai (from -r requirements.txt (line 1))
  Downloading google_generativeai-0.8.3-py3-none-any.whl.metadata (3.9 kB)
Collecting google-ai-generativelanguage==0.6.10 (from google-generativeai->-r requirements.txt (line 1))
  Downloading google_ai_generativelanguage-0.6.10-py3-none-any.whl.metadata (5.6 kB)
Collecting google-api-core (from google-generativeai->-r requirements.txt (line 1))
  Downloading google_api_core-2.24.0-py3-none-any.whl.metadata (3.0 kB)
Collecting google-api-python-client (from google-generativeai->-r requirements.txt (line 1))
  Downloading google_api_python_client-2.156.0-py2.py3-none-any.whl.metadata (6.7 kB)
Collecting protobuf (from google-generativeai->-r requirements.txt (line 1))
  Downloading protobuf-5.29.2-cp38-abi3-manylinux2014_x86_64.whl.metadata (592 bytes)
Collecting tqdm (from google-generativeai->-r requirements.txt (line 1))
  Downloading tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)

In [7]:
import google.generativeai as genai
from dotenv import load_dotenv
import os



In [8]:
# Load environment variables from .env file
load_dotenv()

# Access the API key from the environment variable
api_key = os.getenv("GOOGLE_API_KEY")

# Initialize the generativeAI client using AI Studio key
genai.configure(api_key=api_key)
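
If the key is missing from the `.env` file, `os.getenv` returns `None` and the problem only shows up later, at the first API call. A minimal sketch of an early check follows; the helper name `require_api_key` and its error message are illustrative assumptions, not part of the Gemini SDK.

```python
import os


def require_api_key(name="GOOGLE_API_KEY", env=None):
    """Return the API key stored under `name`, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return key
```

In the cell above, this could stand in for the `os.getenv` line: `api_key = require_api_key()`.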

In [9]:
# Create a model handle for the experimental Gemini 2.0 Flash model
model1 = genai.GenerativeModel("gemini-2.0-flash-exp")

print("Your message to Gemini:")
msg = input()
print("Sending message to Gemini...")

# Generate text using the Gemini model

response = model1.generate_content(msg)

print(response.text)


Your message to Gemini:
Sending message to Gemini...
Hi! 😊 How can I help you today?
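
Pressing Enter without typing anything sends an empty prompt. One way to guard against that is a small wrapper around the call, sketched here. The name `ask` is hypothetical, and the only assumption about `model` is what the cell above already relies on: it has a `generate_content` method whose result exposes `.text`.

```python
def ask(model, prompt):
    """Send a non-empty prompt to `model` and return the reply text."""
    text = prompt.strip()
    if not text:
        # Fail early instead of spending an API call on an empty message
        raise ValueError("Prompt is empty; type a message before sending.")
    response = model.generate_content(text)
    return response.text
```

With the objects defined above, the call would look like `print(ask(model1, msg))`.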



In [10]:
# Gemini to tell a joke

response = model1.generate_content(
"""
Tell me a joke, but do not explain why it is funny. 
Please place a carriage return after each sentence and ensure readability.
Use this as a starting point:
OpenAI, Gemini and Claude are in a plane ..."""
)

print(response.text)

OpenAI, Gemini and Claude are in a plane.

The pilot announces they are losing altitude.

OpenAI says, "We need to calculate the optimal trajectory for a controlled descent."

Gemini suggests, "Let's brainstorm some creative solutions to repurpose the oxygen masks."

Claude calmly states, "I've already drafted a detailed apology letter to the passengers' families."



## Exploring Multimodal Capabilities with Gemini
Gemini is not just a text-based model; it can also process and generate images. Here's how you can explore its multimodal capabilities:

**Image Classification / Captioning**

You can provide an image to Gemini and ask it to generate a caption describing the image. This showcases Gemini's ability to understand visual content.

**Image Generation**

You can ask Gemini to generate images based on a text description. This demonstrates its ability to translate textual concepts into visual representations.  
Please note that the Imagen 3 API for image generation is still in beta and not publicly available.

**Code execution**

You can ask Gemini to generate and execute code.

There are more capabilities, including audio understanding and video understanding.
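
For the code execution capability, it can be worth keeping an independent local calculation to compare against what the model reports. The sketch below does this for the "sum of the first 50 prime numbers" prompt used in the code execution cell of this guide. The function name `first_primes` is illustrative, and this is plain trial division rather than the code Gemini generates.

```python
def first_primes(count):
    """Return the first `count` primes using trial division by earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # Only primes up to sqrt(candidate) need to be tested as divisors
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


primes = first_primes(50)
print(len(primes), primes[-1], sum(primes))  # prints: 50 229 5117
```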

In [11]:
import httpx
import os
import base64

# image captioning
model = genai.GenerativeModel(model_name = "gemini-2.0-flash-exp")
image_path = "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Felis_catus-cat_on_snow.jpg/1024px-Felis_catus-cat_on_snow.jpg"

image = httpx.get(image_path)

prompt = "Caption this image."
response = model.generate_content([{'mime_type':'image/jpeg', 'data': base64.b64encode(image.content).decode('utf-8')}, prompt])

print(response.text)

Certainly! Here are a few caption options for the image of the cat in the snow:

**Short & Sweet:**

* "Winter wanderer."
* "Snowy stroll."
* "Curious kitty in the cold."
* "Paws in the snow."

**Descriptive:**

* "A tabby cat explores the winter landscape."
* "A young feline ventures out on a snowy day."
* "Striped beauty against a white backdrop."
* "The cat's amber eyes contrast with the frosty scene."

**Playful:**

* "Is it playtime yet?"
* "Making tracks in the snow."
* "Just a cat enjoying a winter walk."
* "Winter adventures with my feline friend."

**More evocative:**

* "A moment of quiet in a snowy world."
* "The delicate beauty of nature and an animal."
* "This little cat is ready for winter!"

If you have any preferences or want other options, just let me know!
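
The captioning cell hard-codes `image/jpeg` for the image part. As a sketch of how the same dictionary could be built for other image types, the helper below guesses the MIME type from the file name or URL using the standard library. The name `make_image_part` and the JPEG fallback are assumptions for illustration; the dictionary keys mirror the ones used in the cell above.

```python
import base64
import mimetypes


def make_image_part(data, source_name, default_mime="image/jpeg"):
    """Build an inline image part from raw bytes.

    The MIME type is guessed from `source_name` (a file name or URL);
    if no image type can be guessed, `default_mime` is used instead.
    """
    mime_type, _ = mimetypes.guess_type(source_name)
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = default_mime
    return {
        "mime_type": mime_type,
        "data": base64.b64encode(data).decode("utf-8"),
    }
```

With the variables from the cell above, the request would become `model.generate_content([make_image_part(image.content, image_path), prompt])`.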


In [12]:
# code generation and execution

response = model.generate_content(
    ('What is the sum of the first 50 prime numbers? '
    'Generate and run code for the calculation, and make sure you get all 50.'),
    tools='code_execution')

print(response.text)

Okay, I understand. You want me to calculate the sum of the first 50 prime numbers. I'll generate Python code to identify the first 50 primes and then sum them. Here's my plan:

1.  **Prime Number Identification:** I'll use a function to check if a number is prime. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself.
2.  **Generating the First 50 Primes:** I will loop through numbers, starting with 2, and check if they are prime. If they are, I will add them to a list. I'll stop when the list contains 50 prime numbers.
3.  **Summation:** Finally, I'll sum the list of prime numbers and report the result.

Here is the code:


```python
def is_prime(n):
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

primes = []
num = 2
while len(p
```

## And that's it, folks 👏

You have successfully:
- used an API key from AI Studio and sent Gemini a handful of prompts
- utilised multimodal capabilities of Gemini 2.0

To find out more, see the docs for the Gemini Python SDK:
<https://ai.google.dev/>

Now the world is your oyster - get building and show us what you come up with!!!