# How To Use Vertex Text-to-Image Generative AI To Inspect Image Details

This notebook outlines how to interact with Vertex AI's Imagen GenAI models to inspect images and generate detailed information about their content. Visual Question Answering (VQA) lets you provide an image to the model and ask a question about the image's contents. In response, you get one or more natural language answers.

More information about the vertexai.vision_models API can be found at https://cloud.google.com/python/docs/reference/aiplatform/latest/vertexai.vision_models

## Prepare the Python development environment

First, let's define the project-specific variables that customize this notebook for your GCP environment. Replace YOUR_PROJECT_ID with your own GCP project ID.

In [None]:
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "us-central1"
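If you would rather not hard-code these values, one option is to read them from environment variables. The sketch below assumes variables named `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_REGION` are set in your environment; those names and the `resolve_setting` helper are illustrative assumptions, not something this notebook requires. It falls back to the defaults shown above.

```python
import os

def resolve_setting(env_var, default):
    """Return the environment variable's value if set and non-empty, else the default."""
    value = os.environ.get(env_var, "").strip()
    return value or default

# Assumed variable names; adjust them to whatever your environment uses.
PROJECT_ID = resolve_setting("GOOGLE_CLOUD_PROJECT", "YOUR_PROJECT_ID")
LOCATION = resolve_setting("GOOGLE_CLOUD_REGION", "us-central1")
```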

Install any needed Python modules from our requirements.txt file. Most Vertex Workbench environments include all the packages we'll be using, but if you are using an external Jupyter Notebook or require additional packages for your own needs, you can add them to the included requirements.txt file and run the following command.

In [None]:
#pip install -r requirements.txt
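To decide whether the install step is needed at all, a quick check like the following may help. It is only a sketch: `missing_modules` is a hypothetical helper, and it looks up top-level module names without importing them.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# The modules this notebook imports.
required = ["vertexai", "urllib", "os"]
missing = missing_modules(required)
if missing:
    print("Missing modules, consider running pip install:", ", ".join(missing))
else:
    print("All required modules are available.")
```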

Now we will import all required modules. For our purpose, we will be utilizing the following:

- vertexai - Provides access to the Vertex AI modules
- urllib - Download an image from a url to pass to the model for Q&A
- os - Remove the downloaded file once the Image class has loaded it for sending to the model

In [None]:
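import vertexai
from vertexai.vision_models import ImageTextModel, Image
# Import the submodule explicitly; "import urllib" alone does not guarantee urllib.request is loaded
import urllib.request
import os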

## Connect to the Vertex AI API

Specify the project ID and location using the variables defined at the start of this notebook.

In [None]:
vertexai.init(project=PROJECT_ID, location=LOCATION)
model = ImageTextModel.from_pretrained("imagetext@001")

Define the Q&A prompt that will be passed to the model.

In [None]:
vqa_prompt = "does this sink have a dryingboard"

Next, we will download an image from a URL. This image will be passed to the Imagen Q&A service for inspection.

In [None]:
image_url = ('https://mobileimages.lowes.com/productimages/79fc44bb-c9ef-453e-9000-d7cae1372431/49607177.jpg')
downloaded_image = 'qa_image.jpg'
urllib.request.urlretrieve(image_url, downloaded_image) 
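If you plan to try several image URLs, it can help to validate the URL and derive the local filename from it rather than hard-coding one. The following is a minimal sketch; `local_name_for` is a hypothetical helper, and the restriction to http and https is an assumption made for illustration.

```python
import posixpath
from urllib.parse import urlparse

def local_name_for(url, default="qa_image.jpg"):
    """Derive a local filename from an http(s) image URL, or fall back to a default."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")
    # URL paths always use forward slashes, so posixpath is used on every platform.
    name = posixpath.basename(parsed.path)
    return name or default
```

The result could then be used in place of the fixed name, for example `downloaded_image = local_name_for(image_url)`.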

Use the Image class from vertexai.vision_models to load the downloaded image so it can be sent to the model, then delete the file.

In [None]:
source_image = Image.load_from_file(location='./'+downloaded_image)
os.remove(downloaded_image)
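If loading fails, the cell above leaves the downloaded file behind. One way to guarantee cleanup is a `try`/`finally` wrapper, sketched below. `load_then_delete` is a hypothetical helper, and the demo uses a plain byte reader on a temporary file instead of the Vertex AI `Image` class so that it runs without any cloud setup.

```python
import os
import tempfile

def load_then_delete(path, loader):
    """Call loader(path), then remove the file even if loading raises."""
    try:
        return loader(path)
    finally:
        if os.path.exists(path):
            os.remove(path)

def read_bytes(path):
    """Stand-in loader for the demo: read the whole file into memory."""
    with open(path, "rb") as f:
        return f.read()

# Demo with a throwaway file standing in for the downloaded image.
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(b"fake image bytes")
    demo_path = tmp.name

demo_data = load_then_delete(demo_path, read_bytes)
```

In this notebook the equivalent call would presumably be `load_then_delete(downloaded_image, Image.load_from_file)`.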

Alternatively, uncomment the following line if you would like to use an existing local file instead of downloading an image.

In [None]:
#source_image = Image.load_from_file(location='./sink1.jpg')

We will now send the image and the question to the model. The SDK builds the request and passes it to the Vertex AI API on our behalf.

In [None]:
answers = model.ask_question(
    image=source_image,
    question=vqa_prompt,
    # Optional: request more than one answer
    #number_of_results=2,
)
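When the question or the number of results comes from user input, a thin wrapper that checks the arguments before calling the model can catch mistakes early. This is a sketch only: `ask_with_validation` is a hypothetical helper, and the 1 to 3 range for `number_of_results` is an assumption that you should confirm against the API reference.

```python
def ask_with_validation(model, image, question, number_of_results=1):
    """Validate the inputs, then forward them to model.ask_question."""
    if not question or not question.strip():
        raise ValueError("question must be a non-empty string")
    # Assumed limit; check the vertexai.vision_models reference for the real range.
    if not 1 <= number_of_results <= 3:
        raise ValueError("number_of_results is assumed to be between 1 and 3")
    return model.ask_question(
        image=image,
        question=question.strip(),
        number_of_results=number_of_results,
    )
```

With the objects defined in this notebook, a call might look like `ask_with_validation(model, source_image, vqa_prompt, number_of_results=2)`.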

Print the answers returned by the model to verify the result or to troubleshoot.

In [None]:
print(answers)
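For yes/no questions such as the one above, you may want to reduce the returned answers to a single verdict. The sketch below assumes `answers` is a list of short strings (for example `["yes"]`); `summarize_answers` is a hypothetical helper, and it only looks at the first word of each answer.

```python
def first_word(text):
    """Return the first word of text, lowercased, without trailing punctuation."""
    words = text.strip().lower().split()
    return words[0].strip(".,!?") if words else ""

def summarize_answers(answers):
    """Map a list of free-text answers to 'yes', 'no', or 'unclear'."""
    firsts = {first_word(a) for a in answers}
    says_yes = "yes" in firsts
    says_no = "no" in firsts
    if says_yes and not says_no:
        return "yes"
    if says_no and not says_yes:
        return "no"
    # Mixed, empty, or unrecognized answers are reported as unclear.
    return "unclear"
```

Calling `summarize_answers(answers)` after the cell above would give one label that is easy to use in later logic.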

That's it! Congratulations on defining your first visual Q&A with Imagen!