# Overview

Imagen on Vertex AI (image Generative AI) offers a variety of features:

* Image generation
* Image editing
* Visual captioning
* Visual question answering

This notebook focuses on **Visual question answering** only.



**Objectives**

In this notebook, you will learn how to use the Vertex AI Python SDK to:

* Make a request to https://fakestoreapi.com/products to retrieve the image URLs of 20 products
* Ask questions about the last image
* Experiment with different parameters, such as:
  * the number of results to generate

Resources:
* https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
* https://cloud.google.com/vertex-ai/generative-ai/docs/image/visual-question-answering
* https://github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/visual_question_answering.ipynb

**Costs**
* This notebook uses billable components of Google Cloud:
  * Vertex AI (Imagen)
* Learn about Vertex AI pricing and use the Pricing Calculator to generate a cost estimate based on your projected usage.

In [None]:
%pip install --upgrade --user google-cloud-aiplatform

In [None]:
# Restart kernel after installs so that your environment can access the new packages
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

# Authenticate your notebook environment (Colab only)

If you are running this notebook on Google Colab, you need to authenticate your environment. To do this, run the cell below. This step is not required if you are using Vertex AI Workbench.

In [1]:
import sys

# Additional authentication is required for Google Colab
if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth

    auth.authenticate_user()

In [2]:
# Define project information
from google.colab import userdata
PROJECT_ID = userdata.get('VERTEX_AI_PROJECT_ID')
LOCATION = "us-central1"  # @param {type:"string"}

# Initialize Vertex AI
import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)
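
The cell above reads the project ID from Colab secrets, which is only available inside Colab. The following is a minimal sketch of a fallback, assuming the ID may instead be supplied through an environment variable. The helper name `resolve_project_id` and the use of `GOOGLE_CLOUD_PROJECT` as the variable name are assumptions made for illustration and are not part of the original notebook.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_project_id(
    secret_getter: Optional[Callable[[str], Optional[str]]] = None,
    env: Optional[Mapping[str, str]] = None,
    secret_name: str = "VERTEX_AI_PROJECT_ID",
    env_name: str = "GOOGLE_CLOUD_PROJECT",  # assumed variable name
) -> str:
    """Return a project ID from a secret store if possible, else from the environment."""
    env = os.environ if env is None else env

    # Try the secret store first (for example, Colab's userdata.get)
    if secret_getter is not None:
        try:
            value = secret_getter(secret_name)
        except Exception:
            # A missing or inaccessible secret falls through to the environment
            value = None
        if value:
            return value

    # Fall back to the environment variable
    value = env.get(env_name)
    if value:
        return value

    raise RuntimeError(
        f"No project ID found in secret '{secret_name}' or variable '{env_name}'"
    )
```

In Colab this could be called as `resolve_project_id(secret_getter=userdata.get)`, and elsewhere as `resolve_project_id()` after exporting the variable.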

* Make a request to https://fakestoreapi.com/products to retrieve the image URLs.

In [None]:
import requests

# Fetch the product catalog and fail early on an HTTP error
response = requests.get("https://fakestoreapi.com/products")
response.raise_for_status()
api_data = response.json()

# Collect the image URL of every product
images = [product.get("image") for product in api_data]

num_of_images = len(images)
print(images[-1])
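
The loop above keeps one entry per product, including `None` when a product has no `image` field. A small sketch of a more defensive selection follows, assuming each product is a dictionary with optional `image` and `category` keys. The helper `select_image_urls` and the sample records are hypothetical and only stand in for the API response.

```python
from typing import Any, Iterable, List, Mapping, Optional


def select_image_urls(
    products: Iterable[Mapping[str, Any]], category: Optional[str] = None
) -> List[str]:
    """Return image URLs, skipping products without one and optionally filtering by category."""
    urls = []
    for product in products:
        url = product.get("image")
        if not url:
            continue  # no image URL for this product
        if category is not None and product.get("category") != category:
            continue  # product is outside the requested category
        urls.append(url)
    return urls


# Hypothetical sample data shaped like the API response
sample_products = [
    {"title": "Backpack", "category": "men's clothing", "image": "https://example.com/backpack.jpg"},
    {"title": "No picture", "category": "electronics"},
    {"title": "T-shirt", "category": "women's clothing", "image": "https://example.com/tshirt.jpg"},
]

print(select_image_urls(sample_products))
print(select_image_urls(sample_products, category="women's clothing"))
```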

* Add a function to download an image
* Load the last image, a women's shirt

In [53]:
import os

import requests

def download_image(url: str) -> str:
    """Downloads an image from the specified URL."""

    # Send a get request to the url
    response = requests.get(url)

    if response.status_code != 200:
        raise Exception(f"Failed to download image from {url}")

    # Define image related variables
    image_path = os.path.basename(url)
    image_bytes = response.content
    image_type = response.headers["Content-Type"]

    # Check for image type, currently only PNG or JPEG format are supported
    if image_type not in {"image/png", "image/jpeg"}:
        raise ValueError("Image can only be in PNG or JPEG format")

    # Write image data to a file
    with open(image_path, "wb") as f:
        f.write(image_bytes)
    return image_path
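
The format check in `download_image` compares the raw `Content-Type` header against two exact strings, so a header that carries parameters or different casing would be rejected. The sketch below shows one way to normalize the header first, assuming the server may send values such as `image/jpeg; charset=binary`. The helper names are hypothetical.

```python
from typing import Optional

SUPPORTED_IMAGE_TYPES = {"image/png", "image/jpeg"}


def normalize_content_type(header_value: str) -> str:
    """Drop any parameters after ';' and lowercase the media type."""
    return header_value.split(";", 1)[0].strip().lower()


def is_supported_image(header_value: Optional[str]) -> bool:
    """Return True when the header names a PNG or JPEG image."""
    return normalize_content_type(header_value or "") in SUPPORTED_IMAGE_TYPES


print(is_supported_image("image/jpeg"))
print(is_supported_image("Image/JPEG; charset=binary"))
print(is_supported_image("text/html"))
```

Inside `download_image`, the header could be read with `response.headers.get("Content-Type")` so that a missing header is reported as an unsupported format rather than a `KeyError`.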

In [None]:
from vertexai.preview.vision_models import Image as VertexImage

image_path = download_image(images[num_of_images - 1])
print(image_path)

# Load the newly downloaded image
shirt_image = VertexImage.load_from_file(image_path)
shirt_image.show()

# Load the model

Vertex AI Imagen model names have two components: a model name and a version number. The naming convention follows the format `<model-name>@<version-number>`. For example, `imagetext@001` represents version 001 of the imagetext model.

In [56]:
from vertexai.preview.vision_models import ImageTextModel

model_name = "imagetext@001"
model = ImageTextModel.from_pretrained(model_name)
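
To make the naming convention concrete, here is a minimal sketch that splits an identifier into its two components. `ModelId` and `parse_model_id` are hypothetical helpers written for this illustration; they are not part of the Vertex AI SDK.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    name: str
    version: str


def parse_model_id(model_id: str) -> ModelId:
    """Split an identifier of the form name@version into its two parts."""
    name, separator, version = model_id.partition("@")
    if not separator or not name or not version:
        raise ValueError(f"Expected an identifier like 'name@version', got {model_id!r}")
    return ModelId(name=name, version=version)


parsed = parse_model_id("imagetext@001")
print(parsed.name, parsed.version)
```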

# Ask questions about the image



In [3]:
def askAQuestion(image, question, number_of_results=1):
    """Ask the loaded model a question about an image and return its answers."""
    answers = model.ask_question(
        image=image, question=question, number_of_results=number_of_results
    )
    return answers

In [None]:
answers = askAQuestion(image=shirt_image, question="What is this image about?", number_of_results=3)
print(answers)
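
When `number_of_results` is greater than 1, the returned list may contain answers that differ only in casing or whitespace. Whether the model actually returns such near-duplicates is an assumption here. The sketch below, with the hypothetical helper `unique_answers`, shows one way to collapse them while keeping the original order.

```python
from typing import Iterable, List


def unique_answers(answers: Iterable[str]) -> List[str]:
    """Return answers stripped and lowercased, with duplicates removed in first-seen order."""
    seen = set()
    unique = []
    for answer in answers:
        key = answer.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical answers standing in for a model response
print(unique_answers(["Blue", "blue ", "Navy blue"]))
```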

# Ask about the existence of items

* Ask the model whether there is a pair of jeans.
* Ask the model whether there is a pair of shoes.
* Ask the model whether there is a backpack.
* Ask the model whether there is a shirt.

In [None]:
answers = askAQuestion(image=shirt_image, question="Is there a pair of jeans?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="Is there a pair of shoes?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="Is there a backpack?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="Is there a shirt?")
print(answers[0])
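
The repeated ask-and-print pattern above can also be written as a loop over a list of questions. The sketch below assumes a callable that takes a question and returns a list of answer strings. `ask_all` and `fake_ask` are hypothetical: `fake_ask` is a stand-in so the example runs without a model and says nothing about how the real model answers.

```python
from typing import Callable, Dict, Sequence


def ask_all(
    ask: Callable[[str], Sequence[str]], questions: Sequence[str]
) -> Dict[str, str]:
    """Ask each question once and keep the first answer, or an empty string if none."""
    results = {}
    for question in questions:
        answers = ask(question)
        results[question] = answers[0] if answers else ""
    return results


def fake_ask(question: str) -> Sequence[str]:
    """Stand-in for the model call, used only to make this example runnable."""
    return ["yes"] if "shirt" in question else ["no"]


existence_questions = [
    "Is there a pair of jeans?",
    "Is there a pair of shoes?",
    "Is there a backpack?",
    "Is there a shirt?",
]

for question, answer in ask_all(fake_ask, existence_questions).items():
    print(f"{question} -> {answer}")
```

In the notebook, the stand-in could be replaced with `lambda q: askAQuestion(image=shirt_image, question=q)`.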

# Ask about the attributes of the items

* Ask for the color of the pair of jeans.
* Ask the model whether the jeans are ripped.

In [None]:
answers = askAQuestion(image=shirt_image, question="What is the color of the pair of jeans?", number_of_results=2)
print(answers)

answers = askAQuestion(image=shirt_image, question="Is the pair of jeans ripped?")
print(answers[0])
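
Yes/no questions such as the ones above return text, which is awkward to use in later logic. The following sketch converts an answer into a boolean, assuming the model replies with short strings such as "yes" or "no"; that reply format is an assumption, and `answer_to_bool` is a hypothetical helper.

```python
from typing import Optional


def answer_to_bool(answer: str) -> Optional[bool]:
    """Map a yes/no style answer to True or False, or None when it is neither."""
    normalized = answer.strip().lower().rstrip(".")
    if normalized in {"yes", "true"}:
        return True
    if normalized in {"no", "false"}:
        return False
    return None  # free-form answer, for example a color


print(answer_to_bool("yes"))
print(answer_to_bool(" No. "))
print(answer_to_bool("blue"))
```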

# Ask some questions about the shoes

* What is the color of the shoes?
* How many shoes are in the image?
* Do the shoes have shoelaces?

In [None]:
answers = askAQuestion(image=shirt_image, question="What is the color of the shoes?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="How many shoes are on the image?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="Do the shoes have shoe laces?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="Do you see the white shoe laces on the shoes?")
print(answers[0])

# Ask some questions about the t-shirt

* What is the color of the t-shirt?
* Does the t-shirt have a heart?
* What are the words on the t-shirt?
* What is the color of the heart?
* Is the shirt long-sleeved or short-sleeved?


In [None]:
answers = askAQuestion(image=shirt_image, question="What is the color of the t-shirt?", number_of_results=3)
print(answers)

answers = askAQuestion(image=shirt_image, question="Does the t-shirt have a heart?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="What are the words sewn on the t-shirt?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="What is the color of the heart on the shirt?")
print(answers[0])

answers = askAQuestion(image=shirt_image, question="Is the shirt long-sleeved or short-sleeved?")
print(answers[0])

* Remove the temporary image file

In [65]:
os.remove(image_path)
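
The cleanup above only happens if this last cell is run. As an alternative, the sketch below writes image bytes to a temporary file and removes it automatically, even if an error occurs while the file is in use. It assumes the image bytes are already in memory; `temporary_image_file` is a hypothetical helper and is not used elsewhere in this notebook.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temporary_image_file(image_bytes: bytes, suffix: str = ".jpg") -> Iterator[str]:
    """Write bytes to a temporary file, yield its path, and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Close the descriptor after writing so other code can reopen the file
        with os.fdopen(fd, "wb") as f:
            f.write(image_bytes)
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


# Placeholder bytes standing in for downloaded image content
with temporary_image_file(b"placeholder image bytes") as path:
    print(os.path.exists(path))
print(os.path.exists(path))
```

Within the `with` block, the path could be passed to `VertexImage.load_from_file` in the same way as `image_path` above.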