## Image Captioning with Google Imagen

Mustafa Ozcicek ozcicekmustafa@gmail.com

I have been testing many different captioning models. Altough they all of them work pretty well, most of them fail when they describe simplistic logos, graphic design works etc. If you are (like me) trying to create a training dataset with captions and do not want to write captions manually, I found that Google's Imagen works pretty well. 

Before you start, make sure that your images have proper names starting from 1 to whatever. And you should adjust the range value in the loop manually based on the number of images you have.

### Import Libraries

In [16]:
import requests
import json
import base64
import os
import pandas as pd
from tqdm import tqdm
from PIL import Image
from io import BytesIO

### Define Useful Functions

In [17]:
def get_image_base64_encoding(image_path: str) -> str:
    """
    Function to return the base64 string representation of an image
    """
    with open(image_path, 'rb') as file:
        image_data = file.read()
    image_extension = os.path.splitext(image_path)[1]
    base64_encoded = base64.b64encode(image_data).decode('utf-8')
    return f"data:image/{image_extension[1:]};base64,{base64_encoded}"


def get_image_base64_encoding2(image_path: str) -> str:
    """
    
    Function to return the base64 string representation of an image
    This function does not include data:image/png;base64, before the encoding
    I am using for the iterating http requests
    
    """
    with open(image_path, 'rb') as file:
        image_data = file.read()
    image_extension = os.path.splitext(image_path)[1]
    base64_encoded = base64.b64encode(image_data).decode('utf-8')
    return f"{base64_encoded}"

def image_to_base64_PNG(image, format="PNG"):
    
    """
    Process PNG Images
    
    """

    buffer = BytesIO()
    image.save(buffer, format=format)
    image_str = base64.b64encode(buffer.getvalue()).decode("utf-8")
    return image_str


### Create a Dataframe to Keep the Captions and Other Info

In [18]:
captiondb = pd.DataFrame({"Image ID": [], "Image": [], "Description": []})
print(captiondb)

Empty DataFrame
Columns: [Image ID, Image, Description]
Index: []


### Start the Request

**Enter Google Cloud Project Credentials**

In order to get an access token, enable Vertex API on Google Cloud and open google cloud terminal
enter this line:

```!gcloud auth application-default print-access-token ``` *without the exclamation mark*

In [21]:
project_id = "visionforphdproject"

locations = "us-central1"

# Use google cloud terminal to get an access token !gcloud auth application-default print-access-token
access_token = "ya29.a0AfB_byD8X0LUPQ6ZroeSUUEWFZjXpw34DaqQ_Dg9DU7bI2vUYNZuzhZxIUiYXsoszLXaztYggoELgO2iFOVpvdtLvc65rZqVUM1HsGKiDRWMy9lmElALyN0oQpDlOenaWJIdvUIXCsSS3qw8cLpB6xiHs5O_OTOGCcpzHgqlqW1AWl-z3b7Em_yAlw_dd8B791ZIDVrW-F5bCk4EDR6WGRxpZx9WYZHM80F_U-BAlZDDauFtceCm2E4JrDuSIofKiCCW_Pz8_Z4729Uf_tgivbZ3WOp_cdiwS4M8QXl66nGA9dXG4m7n6i8MbPAQQz1BjzHn2hfgRjdYXAdLD_ziYy6goCLUxy5rQ4gGHCk8z7K3Q1RWBHlx6CYJp6SeZkLIOCUb69S0NRiUE0fXzmTyGjL94pwKMzRUOO8qaCgYKASASARMSFQHsvYlsc9Y4NSAkj0851jOBC8OnBw0427"

lang = "en" #caption language en, fr, es, de, it

caption_count = 1 #how many caption alternatives do you want max: 3

url = "https://us-central1-aiplatform.googleapis.com/v1/projects/visionforphdproject/locations/us-central1/publishers/google/models/imagetext:predict"

headers = {
    "Authorization": "Bearer " + access_token,
    "Content-Type": "application/json; charset=utf-8"
}


#### Single Image

Skip this line if you are trying to caption multiple images in a directory

In [28]:
data = f'''{{
        "instances": [
            {{
                "image": {{
                    "bytesBase64Encoded": "{image_to_base64_PNG(Image.open("../../ImageSimilarity/data/"+ str(i) + ".png"))}"
                }}
            }}
        ],
        "parameters": {{
            "sampleCount": 1,
            "language": "en"
        }}
    }}'''

response = requests.post(url, headers=headers, data=data)

if response.status_code == 200:
    result = response.json()
    print(result)
else:
    print("Request failed with status code:", response.status_code)
    print(response.text)


{'predictions': ['the letter h is in the shape of a house .'], 'deployedModelId': '6747203681382301696'}


#### Batch Captioning with for loop

In [24]:
for i in tqdm(range(1, 50)):
    
    i += 1
    
    encode_img = image_to_base64_PNG(Image.open("../../ImageSimilarity/data/"+ str(i) + ".png"))
    

    data = f'''{{
        "instances": [
            {{
                "image": {{
                    "bytesBase64Encoded": "{encode_img}"
                }}
            }}
        ],
        "parameters": {{
            "sampleCount": 1,
            "language": "en"
        }}
    }}'''

    response = requests.post(url, headers=headers, data=data)

    if response.status_code == 200:
        result = response.json()
        # Append captions to the DataFrame
        captiondb.loc[i] = [str(i) + ".png", encode_img,  result['predictions'][0]]
    else:
        print("Request failed with status code:", response.status_code)
        print(response.text)
        print(f"The process failed while captioning {i}.png")


 22%|██████████████████▍                                                               | 11/49 [00:13<00:39,  1.04s/it]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 12.png


 24%|████████████████████                                                              | 12/49 [00:13<00:29,  1.24it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 13.png


 31%|█████████████████████████                                                         | 15/49 [00:16<00:29,  1.16it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 16.png


 33%|██████████████████████████▊                                                       | 16/49 [00:17<00:22,  1.44it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 17.png


 35%|████████████████████████████▍                                                     | 17/49 [00:17<00:18,  1.72it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 18.png


 37%|██████████████████████████████                                                    | 18/49 [00:17<00:15,  1.99it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 19.png


 41%|█████████████████████████████████▍                                                | 20/49 [00:19<00:16,  1.73it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 21.png


 43%|███████████████████████████████████▏                                              | 21/49 [00:19<00:14,  1.97it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 22.png


 45%|████████████████████████████████████▊                                             | 22/49 [00:20<00:12,  2.12it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 23.png


 47%|██████████████████████████████████████▍                                           | 23/49 [00:20<00:11,  2.36it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 24.png


 49%|████████████████████████████████████████▏                                         | 24/49 [00:20<00:09,  2.61it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 25.png


 51%|█████████████████████████████████████████▊                                        | 25/49 [00:20<00:08,  2.75it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 26.png


 53%|███████████████████████████████████████████▌                                      | 26/49 [00:21<00:07,  2.97it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 27.png


 55%|█████████████████████████████████████████████▏                                    | 27/49 [00:21<00:07,  3.12it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 28.png


 57%|██████████████████████████████████████████████▊                                   | 28/49 [00:21<00:06,  3.27it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 29.png


 59%|████████████████████████████████████████████████▌                                 | 29/49 [00:22<00:06,  3.26it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 30.png


 61%|██████████████████████████████████████████████████▏                               | 30/49 [00:22<00:05,  3.38it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 31.png


 63%|███████████████████████████████████████████████████▉                              | 31/49 [00:22<00:05,  3.47it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 32.png


 65%|█████████████████████████████████████████████████████▌                            | 32/49 [00:23<00:05,  3.31it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 33.png


 67%|███████████████████████████████████████████████████████▏                          | 33/49 [00:23<00:04,  3.30it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 34.png


 69%|████████████████████████████████████████████████████████▉                         | 34/49 [00:23<00:04,  3.34it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 35.png


 71%|██████████████████████████████████████████████████████████▌                       | 35/49 [00:23<00:04,  3.43it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 36.png


 73%|████████████████████████████████████████████████████████████▏                     | 36/49 [00:24<00:03,  3.49it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 37.png


 76%|█████████████████████████████████████████████████████████████▉                    | 37/49 [00:24<00:03,  3.52it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 38.png


 78%|███████████████████████████████████████████████████████████████▌                  | 38/49 [00:24<00:03,  3.47it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 39.png


 80%|█████████████████████████████████████████████████████████████████▎                | 39/49 [00:25<00:02,  3.37it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 40.png


 82%|██████████████████████████████████████████████████████████████████▉               | 40/49 [00:25<00:02,  3.33it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 41.png


 84%|████████████████████████████████████████████████████████████████████▌             | 41/49 [00:25<00:02,  3.41it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 42.png


 86%|██████████████████████████████████████████████████████████████████████▎           | 42/49 [00:25<00:01,  3.53it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 43.png


 88%|███████████████████████████████████████████████████████████████████████▉          | 43/49 [00:26<00:01,  3.49it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 44.png


 90%|█████████████████████████████████████████████████████████████████████████▋        | 44/49 [00:26<00:01,  3.52it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 45.png


 92%|███████████████████████████████████████████████████████████████████████████▎      | 45/49 [00:26<00:01,  3.53it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 46.png


 94%|████████████████████████████████████████████████████████████████████████████▉     | 46/49 [00:27<00:00,  3.48it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 47.png


 96%|██████████████████████████████████████████████████████████████████████████████▋   | 47/49 [00:27<00:00,  3.52it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 48.png


 98%|████████████████████████████████████████████████████████████████████████████████▎ | 48/49 [00:27<00:00,  3.52it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 49.png


100%|██████████████████████████████████████████████████████████████████████████████████| 49/49 [00:27<00:00,  1.76it/s]

Request failed with status code: 429
{
  "error": {
    "code": 429,
    "message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: imagetext. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

The process failed while captioning 50.png





### Check the DB and Export

In [25]:
captiondb.head()

Unnamed: 0,Image ID,Image,Description
1,1.png,iVBORw0KGgoAAAANSUhEUgAAAQAAAAEACAIAAADTED8xAA...,a cross with the number 2 and 1 on it
2,2.png,iVBORw0KGgoAAAANSUhEUgAAAQAAAAEACAIAAADTED8xAA...,a black and white icon of a house with a squar...
3,3.png,iVBORw0KGgoAAAANSUhEUgAAAQAAAAEACAIAAADTED8xAA...,the letter s is in a circle on a white backgro...
4,4.png,iVBORw0KGgoAAAANSUhEUgAAAQAAAAEACAIAAADTED8xAA...,the 3m logo is black and white on a white back...
5,5.png,iVBORw0KGgoAAAANSUhEUgAAAQAAAAEACAIAAADTED8xAA...,a black and white logo with a cross and the nu...


In [28]:
print(captiondb.info())

<class 'pandas.core.frame.DataFrame'>
Index: 14 entries, 1 to 20
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Image ID     14 non-null     object
 1   Image        14 non-null     object
 2   Description  14 non-null     object
dtypes: object(3)
memory usage: 448.0+ bytes
None


#### Export DB to a csv file

In [30]:
captiondb.to_csv("captiondb.cvs", index=False)

### Closing Marks

Google Cloud does not allow you to run the request forever. It comes to a halt after ~10-20 iteration. You need to get in touch with them and ask to increase the quota. I have not done it yet so I cannot provide the information about that yet.