In [None]:
# Upgrade Oracle ADS to pick up latest features and maintain compatibility with Oracle Cloud Infrastructure.

!pip install -U oracle-ads

Oracle Data Science service sample notebook.

Copyright (c) 2022 Oracle, Inc. All rights reserved. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.

***

# <font color=red>Visual Genome Repository</font>
<p style="margin-left:10%; margin-right:10%;">by <font color="teal">Oracle for Research</font></p>

---

## Overview:

Visual Genome is a dataset, a knowledge base, and an ongoing effort to connect structured image concepts to language, providing a multi-layered understanding of pictures. The dataset contains 15.13 GB of images: 108,077 images along with annotations of the objects, attributes, and relationships within each image.

This notebook demonstrates how to download images and objects from Oracle Cloud Infrastructure (OCI) Object Storage, build a dataframe from the JSON metadata files, access the image data, define a region, and visualize regions along with their descriptions on a chosen image.

Developed on [General Machine Learning](https://docs.oracle.com/iaas/data-science/using/conda-gml-fam.htm) for CPU on Python 3.8 (version 1.0)

---

## Contents:

- <a href="#intro">Introduction</a>
  - <a href='#data'>Dataset</a>
  - <a href="#open_data">What is Oracle Open Data?</a>
- <a href='#df'>Building Dataframe for Visual Genome Metadata</a>
   - <a href='#build'>Downloading Metadata and Building a Dataframe</a>
- <a href='#visualize'>Visualizing Images and Regions</a>
   - <a href='#image'>Getting Image Data</a>
   - <a href='#region'>Region Data</a>
   - <a href='#vze'>Visualize Regions</a>
- <a href='#ref'>References</a>

---


Datasets are provided as a convenience. Datasets are considered third-party content and are not considered materials under your agreement with Oracle.
    
You can access the `Visual Genome` dataset license [here](https://creativecommons.org/licenses/by/4.0/).

---

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import requests

from io import BytesIO
from matplotlib.patches import Rectangle
from PIL import Image as PIL_Image

<a id="intro"></a>
# Introduction

<a id='data'></a>
## Dataset

**Data**: Visual Genome is a large, formalized knowledge representation for visual understanding. It has a complete set of descriptions and question-answer pairs that ground visual concepts to language. Each image in the dataset has an average of 35 objects, 26 attributes, and 21 pairwise relationships between objects. The objects, attributes, relationships, and noun phrases in region descriptions and question-answer pairs are canonicalized to [WordNet](https://wordnet.princeton.edu/) synsets. Together, these annotations represent a large and dense dataset of image descriptions, objects, attributes, relationships, and question-answer pairs. The paper [A Hierarchical Approach for Generating Descriptive Image Paragraphs](https://arxiv.org/pdf/1611.06607v1.pdf) provides details on how the data was generated.

**Directory Structure**: The repository consists of two top-level directories, `VG_100K` and `VG_100K_2`, which contain all of the JPEG images. The JSON and text metadata files are available in the top-level directory.

**Template**: `https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcxvbiyd8fn/b/visual_genome/o/<filename>`. For example:
* Image: https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcxvbiyd8fn/b/visual_genome/o/VG_100K/1592363.jpg
* JSON: https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcxvbiyd8fn/b/visual_genome/o/qa_objects.json.zip
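
The template above can be wrapped in a small helper. This is a minimal sketch, not part of the original notebook: the names `BASE_URL` and `build_object_url` are hypothetical, and leaving `/` unencoded is an assumption based on the example URLs shown above.

```python
from urllib.parse import quote

# Base URL copied from the template above.
BASE_URL = (
    "https://objectstorage.us-ashburn-1.oraclecloud.com"
    "/n/idcxvbiyd8fn/b/visual_genome/o/"
)


def build_object_url(filename: str) -> str:
    """Return the Object Storage URL for a file in the visual_genome bucket.

    Hypothetical helper. "/" is left unencoded so that directory prefixes
    such as "VG_100K/" match the example URLs (an assumption).
    """
    return BASE_URL + quote(filename.lstrip("/"), safe="/")


print(build_object_url("VG_100K/1592363.jpg"))
print(build_object_url("qa_objects.json.zip"))
```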

**Data Availability**: All data is available from the [Visual Genome repository](https://opendata.oraclecloud.com/ords/r/opendata/opendata/details?data_set_id=1), which is part of [Oracle Open Data](https://opendata.oraclecloud.com/ords/r/opendata/opendata/home).

<a id="open_data"></a>
## What is Oracle Open Data?

Oracle Open Data is a free service that curates information, such as spatial images, protein sequences, and annotated text files, from the world's leading scientific databases. The repository connects researchers, developers, students, and educators with petabytes of open data from trusted resources. Use Oracle Open Data to view important metadata and sample code for each data set, which simplifies technical complexities and makes the data easy for researchers to use.

<a id='df'></a>
# Building Dataframe for Visual Genome Metadata

<a id='build'></a>
## Downloading Metadata and Building a Dataframe

OCI Object Storage enables customers to securely store any type of data in its native format. With built-in redundancy, OCI Object Storage is ideal for building modern applications that require scale and flexibility, as it can be used to consolidate multiple data sources for analytics, backup, or archive purposes.

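The Visual Genome metadata for images and objects is stored in OCI Object Storage as zipped JSON files. Pandas can read JSON directly from a URL and convert it to a tabular format; with `compression="infer"`, it detects the zip compression from the file extension.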

In [None]:
img_url = "https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcxvbiyd8fn/b/visual_genome/o/image_data_v1.json.zip"
print("Downloading image metadata...")
image_df = pd.read_json(img_url, compression="infer")
print("Download complete")

object_url = "https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcxvbiyd8fn/b/visual_genome/o/objects_v1_2.json.zip"
print("Downloading object metadata...")
object_df = pd.read_json(object_url, compression="infer")
print("Download complete")

For ease of use, you can merge these two dataframes into one and preview the result.

In [None]:
full_df = image_df.merge(object_df, left_on="id", right_on="image_id")
full_df.head()
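
To make the join behavior concrete, here is a minimal sketch on toy dataframes. All values are invented for illustration, and the names `toy_images`, `toy_objects`, `toy_full`, and `toy_left` are hypothetical. It shows that `merge` performs an inner join by default, so an image without a matching object record is dropped, while `how="left"` keeps it.

```python
import pandas as pd

# Toy stand-ins for image_df and object_df; all values are invented.
toy_images = pd.DataFrame(
    {"id": [1, 2, 3], "oci_url": ["a.jpg", "b.jpg", "c.jpg"]}
)
toy_objects = pd.DataFrame(
    {
        "image_id": [1, 2],
        "objects": [
            [{"object_id": 10, "x": 5, "y": 7, "w": 20, "h": 30, "names": ["tree"]}],
            [],
        ],
    }
)

# Inner join (the default): image 3 has no object record, so it is dropped.
toy_full = toy_images.merge(toy_objects, left_on="id", right_on="image_id")
print(toy_full[["id", "image_id", "oci_url"]])

# how="left" keeps every image; unmatched rows get NaN in the object columns.
toy_left = toy_images.merge(
    toy_objects, left_on="id", right_on="image_id", how="left"
)
print(len(toy_full), len(toy_left))
```

Whether the real metadata contains images without object records is not stated here, so treat the left join as an option to consider rather than a required change.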

<a id='visualize'></a>
# Visualizing Images and Regions

<a id='image'></a>
## Getting Image Data

The following cell filters the `full_df` dataframe for `image_id = 107995`. It extracts the URL of the image and information about the objects in it. This object information includes the size of each bounding box, its location, the label for the object, and its ID.

In [None]:
image_id = 107995
image_url, objects = full_df[full_df["id"] == image_id][["oci_url", "objects"]].values[
    0
]
print("The url of the image is {}".format(image_url))

You can display the image using PIL's `Image` module, imported here as `PIL_Image`.

In [None]:
# if the image is at a URL, fetch it over HTTP and read it through BytesIO;
# otherwise, open it directly from disk
if image_url.startswith("http://") or image_url.startswith("https://"):
    response = requests.get(image_url)
    image = PIL_Image.open(BytesIO(response.content))
else:
    image = PIL_Image.open(image_url)
image

<a id='region'></a>
## Region Data

The object metadata contains the names of the objects, an object ID and information about the bounding box. In particular, `x` and `y` parameters give the pixel location of the **top left** corner of the region. The size of the region is determined by the width, `w`, and height, `h`, parameters.
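
As a minimal sketch of this geometry, the helper below converts an object's `x`, `y`, `w`, and `h` into corner coordinates and optionally clips them to the image bounds. The function name `box_corners` and the sample values are hypothetical and not part of the dataset tooling.

```python
def box_corners(obj, image_width=None, image_height=None):
    """Convert x/y/w/h into (left, top, right, bottom) pixel coordinates.

    Hypothetical helper. If image dimensions are given, the box is clipped
    so that it does not extend past the image edges.
    """
    left, top = obj["x"], obj["y"]
    right, bottom = left + obj["w"], top + obj["h"]
    if image_width is not None:
        left, right = max(0, left), min(image_width, right)
    if image_height is not None:
        top, bottom = max(0, top), min(image_height, bottom)
    return left, top, right, bottom


sample = {"x": 10, "y": 20, "w": 100, "h": 50}  # invented values
print(box_corners(sample))  # (10, 20, 110, 70)
print(box_corners(sample, image_width=80, image_height=60))  # (10, 20, 80, 60)
```

The returned tuple uses the same (left, upper, right, lower) order that PIL's `Image.crop` accepts, so it could be passed to `image.crop(...)` to extract a single region.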

In [None]:
objects[0]

The next cell defines a `Region` class. It is used to manage information about a region.

In [None]:
class Region:
    def __init__(self, obj):
        self.x = obj["x"]
        self.y = obj["y"]
        self.width = obj["w"]
        self.height = obj["h"]
        self.names = obj["names"]
        self.object_id = obj["object_id"]
        self.synsets = obj["synsets"]
        self.phrase = "N/A"
        if len(obj["names"]) > 0:
            self.phrase = obj["names"][0]

<a id='vze'></a>
## Visualize Regions

You can visualize regions by overlaying them on the image. The next cell defines a user-defined function, `visualize_regions`. This function accepts two parameters, `img` and `objects`. The `img` parameter is a `PIL.JpegImagePlugin.JpegImageFile` object containing the background image. The `objects` parameter is a Python list that contains the metadata of the regions.

In [None]:
def visualize_regions(img, objects):
    plt.imshow(img)
    ax = plt.gca()  # get current axes
    for obj in objects:
        region = Region(obj)  # construct each region object
        ax.add_patch(
            Rectangle(
                (region.x, region.y),  # top-left corner of the bounding box
                region.width,
                region.height,
                fill=False,
                edgecolor="red",
                linewidth=3,
            )
        )
        ax.text(
            region.x,
            region.y,
            region.phrase,
            style="italic",
            bbox={"facecolor": "white", "alpha": 0.7, "pad": 10},
        )
    fig = plt.gcf()  # get current figure
    fig.set_size_inches(18, 10)  # resize the image with width 18 and height 10
    plt.tick_params(labelbottom=False, labelleft=False)  # hide axis tick labels
    plt.show()

Since there are many regions, only the first 8 are displayed here.

In [None]:
visualize_regions(image, objects[:8])
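
Taking the first 8 entries is one way to limit the overlay. As an alternative, the sketch below picks the regions with the largest bounding-box area. The helper name `largest_objects` and the toy values are hypothetical, and it assumes each object dictionary has `w` and `h` keys, as described in the Region Data section.

```python
def largest_objects(objects, n=8):
    """Return the n objects with the largest bounding-box area, largest first."""
    return sorted(objects, key=lambda obj: obj["w"] * obj["h"], reverse=True)[:n]


# Invented values for illustration.
toy = [
    {"object_id": 1, "w": 10, "h": 10},
    {"object_id": 2, "w": 50, "h": 40},
    {"object_id": 3, "w": 5, "h": 100},
]
print([obj["object_id"] for obj in largest_objects(toy, n=2)])  # [2, 3]
```

With the notebook's variables, this could be used as `visualize_regions(image, largest_objects(objects))`. Large boxes may overlap heavily, so this is not necessarily a more readable selection than the first 8.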

<a id='ref'></a>
# References

- [ADS Library Documentation](https://accelerated-data-science.readthedocs.io/en/latest/index.html)
- [Data Science YouTube Videos](https://www.youtube.com/playlist?list=PLKCk3OyNwIzv6CWMhvqSB_8MLJIZdO80L)
- [OCI Data Science Documentation](https://docs.cloud.oracle.com/en-us/iaas/data-science/using/data-science.htm)
- [Oracle Data & AI Blog](https://blogs.oracle.com/datascience/)
- [Understanding Conda Environments](https://docs.cloud.oracle.com/en-us/iaas/data-science/using/use-notebook-sessions.htm#conda_understand_environments)
- [Use Resource Manager to Configure Your Tenancy for Data Science](https://docs.cloud.oracle.com/en-us/iaas/data-science/using/orm-configure-tenancy.htm)
- [Visual Genome at Oracle Open Data](https://opendata.oraclecloud.com/ords/r/opendata/opendata/details?data_set_id=1&clear=RP,13&session=516912761537082)
- [Visual Genome Tutorial](http://visualgenome.org/api/v0/api_beginners_tutorial.html)
- [Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations](http://visualgenome.org/static/paper/Visual_Genome.pdf)