# PASCAL VOC Dataset

* [TensorFlow Datasets - voc](https://www.tensorflow.org/datasets/catalog/voc)

> PASCAL Visual Object Classes Challenge, corresponding to the Classification and Detection competitions.

<img src="image/pascal_voc_xml_example.png" align="left" width=350/>

In [2]:
from typing import (
    Tuple,
)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds

In [3]:
import sys
sys.path.append("../../lib")

In [19]:
%load_ext autoreload
%autoreload 2

from util_tf.tfds import (
    convert_pascal_voc_bndbox_to_yolo_bbox
)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
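
The `convert_pascal_voc_bndbox_to_yolo_bbox` helper imported above lives in the local `util_tf.tfds` module, and its implementation is not shown in this notebook. As a rough sketch only, assuming the target is the common YOLO convention of `(x_center, y_center, width, height)` normalized to the range 0 to 1, a conversion from Pascal VOC pixel corners might look like the following. The function name `voc_corners_to_yolo` and its signature are hypothetical and may differ from the actual helper.

```python
from typing import Tuple


def voc_corners_to_yolo(
    xmin: float,
    ymin: float,
    xmax: float,
    ymax: float,
    image_width: float,
    image_height: float,
) -> Tuple[float, float, float, float]:
    """Hypothetical helper: VOC pixel corners -> normalized (cx, cy, w, h)."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    # Box centre, divided by the image size along the matching axis.
    cx = (xmin + xmax) / 2.0 / image_width
    cy = (ymin + ymax) / 2.0 / image_height
    # Box width and height, normalized the same way.
    w = (xmax - xmin) / image_width
    h = (ymax - ymin) / image_height
    return cx, cy, w, h


yolo_box = voc_corners_to_yolo(
    100, 50, 300, 250, image_width=400, image_height=500
)
print(yolo_box)
```

Note that the argument order here follows the VOC XML (`xmin, ymin, xmax, ymax`), which differs from the `(ymin, xmin, ymax, xmax)` order TFDS uses.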


# TensorFlow Datasets 

TFDS has its own bounding box class, `tfds.features.BBox`, which stores coordinates in the order ```(ymin, xmin, ymax, xmax)```. Bounding boxes in TFDS datasets are converted to this representation.

* [tfds.features.BBox](https://www.tensorflow.org/datasets/api_docs/python/tfds/features/BBox)

In [8]:
bbox = tfds.features.BBox(
    ymin=0.0, xmin=1.0, ymax=1.1, xmax=2.1
)
list(bbox)

[0.0, 1.0, 1.1, 2.1]

# TensorFlow Datasets Pascal VOC

The TFDS BBox format is ```(y_min, x_min, y_max, x_max)```, with each coordinate normalized by the image size: y values are divided by the image height and x values by the image width.

```
tfds.features.BBox(
    ymin, xmin, ymax, xmax
)
```
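
Because the coordinates are normalized, recovering pixel positions requires the image height and width. The following is a minimal sketch that uses a plain `NamedTuple` with the same field order as `tfds.features.BBox`, so it runs without TFDS installed. The names `NormalizedBBox` and `to_pixel_corners` are illustrative and not part of TFDS.

```python
from typing import NamedTuple, Tuple


class NormalizedBBox(NamedTuple):
    """Stand-in with the same field order as tfds.features.BBox."""
    ymin: float
    xmin: float
    ymax: float
    xmax: float


def to_pixel_corners(
    bbox: NormalizedBBox, image_height: int, image_width: int
) -> Tuple[int, int, int, int]:
    """Return (xmin, ymin, xmax, ymax) in pixels, rounded to integers."""
    return (
        round(bbox.xmin * image_width),   # x scales with the width
        round(bbox.ymin * image_height),  # y scales with the height
        round(bbox.xmax * image_width),
        round(bbox.ymax * image_height),
    )


bbox = NormalizedBBox(ymin=0.1, xmin=0.25, ymax=0.5, xmax=0.75)
pixels = to_pixel_corners(bbox, image_height=500, image_width=400)
print(pixels)  # (100, 50, 300, 250)
```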

* [datasets/tensorflow_datasets/object_detection/voc.py](https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/object_detection/voc.py#L89-121)

```
def _get_example_objects(annon_filepath):
  """Function to get all the objects from the annotation XML file."""
  with tf.io.gfile.GFile(annon_filepath, "r") as f:
    root = xml.etree.ElementTree.parse(f).getroot()

    # Disable pytype to avoid attribute-error due to find returning
    # Optional[Element]
    # pytype: disable=attribute-error
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)

    for obj in root.findall("object"):
      # Get object's label name.
      label = obj.find("name").text.lower()
      # Get objects' pose name.
      pose = obj.find("pose").text.lower()
      is_truncated = obj.find("truncated").text == "1"
      is_difficult = obj.find("difficult").text == "1"
      bndbox = obj.find("bndbox")
      xmax = float(bndbox.find("xmax").text)
      xmin = float(bndbox.find("xmin").text)
      ymax = float(bndbox.find("ymax").text)
      ymin = float(bndbox.find("ymin").text)
      yield {
          "label": label,
          "pose": pose,
          "bbox": tfds.features.BBox(
              ymin / height, xmin / width, ymax / height, xmax / width
          ),
          "is_truncated": is_truncated,
          "is_difficult": is_difficult,
      }
```
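
To see the normalization in `_get_example_objects` on concrete numbers, the sketch below applies the same arithmetic to a small hand-written annotation using only the standard library. The XML string is a made-up example in the Pascal VOC layout, not a file from the dataset, and plain tuples stand in for `tfds.features.BBox`. The function name `parse_objects` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the Pascal VOC layout (illustration only).
ANNOTATION = """
<annotation>
  <size><width>400</width><height>500</height><depth>3</depth></size>
  <object>
    <name>Dog</name>
    <pose>Left</pose>
    <truncated>0</truncated>
    <difficult>1</difficult>
    <bndbox>
      <xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>250</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_objects(xml_text):
    """Parse VOC-style XML into dicts with (ymin, xmin, ymax, xmax) boxes."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)

    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(tag).text)
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({
            "label": obj.find("name").text.lower(),
            "pose": obj.find("pose").text.lower(),
            # Same order and normalization as the voc.py excerpt.
            "bbox": (ymin / height, xmin / width, ymax / height, xmax / width),
            "is_truncated": obj.find("truncated").text == "1",
            "is_difficult": obj.find("difficult").text == "1",
        })
    return objects


objects = parse_objects(ANNOTATION)
print(objects[0]["label"], objects[0]["bbox"])
```

For this 400 by 500 image, the pixel box `(xmin=100, ymin=50, xmax=300, ymax=250)` becomes `(0.1, 0.25, 0.5, 0.75)` in `(ymin, xmin, ymax, xmax)` order.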

In [9]:
ds, info = tfds.load(
    name='voc', 
    split='train',
    data_dir="/Volumes/SSD/data/yolov1",
    with_info=True,
)
fig = tfds.show_examples(ds, info)

2023-03-03 11:37:27.469939: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "NOT_FOUND: Could not locate the credentials file.". Retrieving token from GCE failed with "FAILED_PRECONDITION: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".


Downloading and preparing dataset 868.85 MiB (download: 868.85 MiB, generated: Unknown size, total: 868.85 MiB) to /Volumes/SSD/data/yolov1/voc/2007/4.0.0...


KeyboardInterrupt: 

In [None]:
info