In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# **ImageNet Dataset**

## 1. Introduction

**ImageNet** is one of the most influential datasets in computer vision. It was created to support large-scale visual recognition research by providing millions of labeled images across thousands of object categories.

It became widely known after the **ImageNet Large Scale Visual Recognition Challenge (ILSVRC)**, which ran annually from 2010 to 2017 and served as the benchmark for image classification, detection, and localization tasks.



## 2. Key Facts

| Property              | Description                                                |
| --------------------- | ---------------------------------------------------------- |
| **Dataset Name**      | ImageNet                                                   |
| **Introduced By**     | Fei-Fei Li et al., 2009                                    |
| **Number of Images**  | ~14 million labeled images                                 |
| **Number of Classes** | 21,841 total (1,000 commonly used for classification)      |
| **Image Size**        | Varies (commonly resized to 224×224 for models)            |
| **Challenge**         | ILSVRC (ImageNet Large Scale Visual Recognition Challenge) |
| **Common Split**      | 1.2M train / 50K validation / 100K test images             |



## 3. Data Organization

The dataset is organized based on the **WordNet** hierarchy — a lexical database that groups English words into sets of synonyms called *synsets*.
Each synset in ImageNet corresponds to a single object category.

**Example WordNet Hierarchy:**

```
Entity
 ├── Animal
 │    ├── Dog
 │    │    ├── Golden Retriever
 │    │    ├── German Shepherd
 │    │    └── Bulldog
 │    ├── Cat
 │    └── Horse
 └── Vehicle
      ├── Car
      └── Airplane
```

Each class synset (e.g., "Golden Retriever") comes with its own set of labeled images, on the order of several hundred to a little over a thousand per class rather than a fixed number.
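As a rough illustration of how classes sit at the leaves of such a hierarchy, the sketch below encodes the toy tree above as a nested dictionary and collects its leaf nodes. The `hierarchy` dictionary and the `leaf_classes` helper are hypothetical names made up for this example; they are not part of any ImageNet or WordNet tooling.

```python
# Toy version of the hierarchy shown above (illustrative only, not real WordNet data).
hierarchy = {
    "Entity": {
        "Animal": {
            "Dog": {"Golden Retriever": {}, "German Shepherd": {}, "Bulldog": {}},
            "Cat": {},
            "Horse": {},
        },
        "Vehicle": {"Car": {}, "Airplane": {}},
    }
}

def leaf_classes(tree):
    """Return the names of all nodes without children, in depth-first order."""
    leaves = []
    for name, children in tree.items():
        if children:
            leaves.extend(leaf_classes(children))
        else:
            leaves.append(name)
    return leaves

classes = leaf_classes(hierarchy)
print(classes)
```

In the real dataset the nodes are WordNet synsets rather than plain strings, but the traversal idea is the same.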



## 4. ILSVRC Subset (Commonly Used)

The **ILSVRC subset** is the most popular and standardized version of ImageNet used in deep learning research.

| Split              | Number of Images | Description                   |
| ------------------ | ---------------- | ----------------------------- |
| **Training Set**   | ~1.2 million     | 1,000 categories              |
| **Validation Set** | 50,000           | 50 images per class           |
| **Test Set**       | 100,000          | Labels not publicly available |

This subset is used to train and benchmark most major CNN architectures.



## 5. Importance in Deep Learning

ImageNet played a central role in the **deep learning revolution**.

* In **2012**, the **AlexNet** model (Krizhevsky et al.) won ILSVRC with a top-5 error of 15.3%, compared with 26.2% for the runner-up, by training a deep CNN on ImageNet using GPUs.
* This milestone marked the beginning of deep learning’s dominance in computer vision.

**Impact:**

* Led to the creation of architectures like **VGG**, **GoogLeNet (Inception)**, **ResNet**, **DenseNet**, **EfficientNet**, and others.
* Became a universal benchmark for feature extraction and transfer learning.



## 6. Example: Using an ImageNet-Pretrained Model

Since ImageNet is huge, researchers often use **pretrained models** (trained on ImageNet) and **fine-tune** them on smaller datasets.



In [None]:
# Example using PyTorch:

import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# Load a pretrained model (the `pretrained=True` flag is deprecated in torchvision >= 0.13)
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.eval()

# Preprocess the image
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

img = Image.open("example.jpg").convert("RGB")  # force 3 channels (handles grayscale/RGBA files)
img_t = preprocess(img).unsqueeze(0)


In [None]:
# Predict
with torch.no_grad():
    output = model(img_t)

# Get top-5 predictions (class indices and their softmax probabilities)
probs = torch.softmax(output, dim=1)
top5_prob, top5_idx = torch.topk(probs, k=5, dim=1)
print(top5_idx[0], top5_prob[0])

## 7. Transfer Learning with ImageNet

ImageNet-pretrained models are widely used as **feature extractors** for other computer vision tasks (e.g., object detection, segmentation, classification).

Two main approaches:

1. **Feature Extraction** — Freeze convolutional layers, use pretrained features, train only the final classifier.
2. **Fine-Tuning** — Start with pretrained weights and train the entire model (with a smaller learning rate).


In [3]:

# Example (Keras):

from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

base_model = ResNet50(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)  # 10 classes in new dataset

model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False  # freeze base model

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.summary()





## 8. Preprocessing Standards

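Most ImageNet-pretrained classifiers share a common input convention, although the details depend on the architecture and the library:

* **Image size:** 224×224 pixels for ResNet, VGG, and similar models (some architectures, such as Inception and the larger EfficientNets, expect larger inputs)
* **Color channels:** RGB
* **Normalization (torchvision convention, applied after scaling pixels to [0, 1]):**

  * Mean = [0.485, 0.456, 0.406]
  * Std = [0.229, 0.224, 0.225]

The PyTorch example above uses exactly these values. Keras models ship their own `preprocess_input` function, which may apply a different scheme.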
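To make the normalization step concrete, here is a minimal NumPy sketch of what `ToTensor` followed by `Normalize` does to an image. The `normalize` helper and the random stand-in image are assumptions for illustration; this is not the torchvision implementation itself.

```python
import numpy as np

# Channel statistics quoted above (RGB order, for inputs scaled to [0, 1]).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(image_uint8):
    """Convert an HxWx3 uint8 RGB image to a normalized 3xHxW float32 array."""
    scaled = image_uint8.astype(np.float32) / 255.0   # [0, 255] -> [0, 1]
    normalized = (scaled - MEAN) / STD                # broadcasts over the channel axis
    return normalized.transpose(2, 0, 1)              # HWC -> CHW, the layout PyTorch expects

# Random stand-in for a 224x224 crop (hypothetical data, not a real image).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
x = normalize(fake_image)
print(x.shape, x.dtype)
```

After this step each channel is roughly zero-centered with unit scale, which is the input distribution the pretrained weights were fitted to.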


## 9. Evaluation Metrics

* **Top-1 Accuracy:** The fraction of images where the top prediction matches the ground truth.
* **Top-5 Accuracy:** The fraction of images where the correct label is within the top 5 predictions.

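**Example (approximate single-model accuracy on the ImageNet-1k validation set; exact figures vary with implementation, input resolution, and evaluation protocol):**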

| Model                  | Top-1 | Top-5 |
| ---------------------- | ----- | ----- |
| AlexNet (2012)         | 63.3% | 84.6% |
| VGG-16 (2014)          | 71.5% | 89.8% |
| ResNet-50 (2015)       | 76.2% | 92.8% |
| EfficientNet-B7 (2019) | 84.4% | 97.1% |
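The Top-1 and Top-5 definitions above can be sketched in a few lines of NumPy. The `topk_accuracy` helper and the scores below are made up for illustration (6 classes instead of 1,000), so the resulting numbers say nothing about any real model.

```python
import numpy as np

def topk_accuracy(logits, labels, k=1):
    """Fraction of rows whose true label is among the k highest-scoring classes."""
    # Indices of the k largest scores per row (order within the top k does not matter).
    topk = np.argpartition(-logits, kth=k - 1, axis=1)[:, :k]
    hits = (topk == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Made-up scores for 4 images over 6 classes.
logits = np.array([
    [0.1, 0.9, 0.3, 0.2, 0.0, 0.5],   # true class 1: ranked 1st
    [0.8, 0.1, 0.7, 0.2, 0.3, 0.4],   # true class 2: ranked 2nd
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],   # true class 5: ranked last
    [0.2, 0.3, 0.1, 0.9, 0.4, 0.0],   # true class 3: ranked 1st
])
labels = np.array([1, 2, 5, 3])

top1 = topk_accuracy(logits, labels, k=1)
top5 = topk_accuracy(logits, labels, k=5)
print(top1, top5)
```

Top-5 is always at least as high as Top-1, since any image counted as a Top-1 hit is also a Top-5 hit.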



## 10. Variants and Extensions

| Variant                                     | Description                                              |
| ------------------------------------------- | -------------------------------------------------------- |
| **ImageNet-21k**                            | Extended dataset with 21,841 categories.                 |
| **Tiny ImageNet**                           | Simplified version (200 classes × 500 images per class). |
| **ImageNet-Sketch, ImageNet-A, ImageNet-R** | Used for robustness and domain adaptation research.      |



## 11. How to Access ImageNet

* **Official Website:** [http://image-net.org/](http://image-net.org/)
* **Tiny ImageNet:** Available via [Stanford Tiny ImageNet Challenge](https://www.kaggle.com/c/tiny-imagenet)
* Pretrained weights are already included in major deep learning libraries:

  * PyTorch (`torchvision.models`)
  * TensorFlow / Keras (`keras.applications`)
  * HuggingFace models (`transformers`)


## 12. Summary

| Aspect           | Description                                                      |
| ---------------- | ---------------------------------------------------------------- |
| **Purpose**      | Large-scale image dataset for training visual recognition models |
| **Classes**      | 1,000 (ILSVRC subset)                                            |
| **Images**       | 1.2M training, 50K validation                                    |
| **Input Size**   | 224×224×3                                                        |
| **Output**       | One of 1,000 categories                                          |
| **Applications** | Pretraining, benchmarking, transfer learning                     |
| **Key Models**   | AlexNet, VGG, ResNet, Inception, EfficientNet                    |

# **COCO Dataset (Common Objects in Context)**

## 1. Introduction

**COCO (Common Objects in Context)** is a large-scale dataset designed for **object detection**, **segmentation**, **keypoint detection**, and **captioning** tasks.
It was introduced by Microsoft in 2014 to promote **context-aware object recognition** — not just detecting objects, but understanding them in realistic, cluttered scenes.

Unlike datasets such as ImageNet, COCO focuses on **complex everyday scenes** containing multiple objects that overlap and interact.



## 2. Key Facts

| Property                 | Description                                               |
| ------------------------ | --------------------------------------------------------- |
| **Dataset Name**         | COCO (Common Objects in Context)                          |
| **Introduced By**        | Microsoft Research (2014)                                 |
| **Total Images**         | ~330,000 images                                           |
| **Labeled Images**       | ~200,000 images                                           |
| **Object Instances**     | 1.5 million+ labeled objects                              |
| **Number of Categories** | 80 object classes                                         |
| **Annotations**          | Object detection, segmentation masks, keypoints, captions |



## 3. Dataset Organization

The COCO dataset is divided into several task-specific subsets:

| Task                      | Annotation Type   | Example Output              |
| ------------------------- | ----------------- | --------------------------- |
| **Object Detection**      | Bounding boxes    | “person”, “car”, “dog”      |
| **Instance Segmentation** | Pixel masks       | Object outline masks        |
| **Keypoint Detection**    | 17 body keypoints | Pose estimation             |
| **Image Captioning**      | Text captions     | “A dog running in a field.” |



## 4. Dataset Splits

| Split         | Description                 | Number of Images |
| ------------- | --------------------------- | ---------------- |
| **Train2017** | Training set                | 118,000          |
| **Val2017**   | Validation set              | 5,000            |
| **Test2017**  | Testing (no public labels)  | 41,000           |
| **Unlabeled** | Additional unlabeled images | 123,000          |

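The four 2017 splits above sum to roughly **287,000** images; the ~330,000 total quoted under Key Facts is the figure reported for the dataset as a whole.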



## 5. Object Categories

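COCO has **80 object classes** organized into 12 supercategories:

| Supercategory | Example Classes                              |
| ------------- | -------------------------------------------- |
| Person        | person                                       |
| Vehicle       | car, bus, truck, bicycle, motorcycle         |
| Outdoor       | traffic light, fire hydrant, stop sign       |
| Animal        | dog, cat, bird, horse, cow, elephant         |
| Accessory     | backpack, umbrella, handbag, tie, suitcase   |
| Sports        | frisbee, skis, snowboard, sports ball, kite  |
| Kitchen       | bottle, wine glass, cup, fork, knife         |
| Food          | banana, apple, sandwich, pizza, cake         |
| Furniture     | chair, bed, couch, dining table              |
| Electronic    | tv, laptop, mouse, keyboard, cell phone      |
| Appliance     | microwave, oven, toaster, sink, refrigerator |
| Indoor        | book, clock, vase, scissors, teddy bear      |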



## 6. Example Annotation Format

COCO uses **JSON-based annotations** for flexibility.
Each entry contains bounding boxes, segmentation masks, and category labels.

Example (`instances_train2017.json`):

```json
{
  "annotations": [
    {
      "id": 42,
      "image_id": 12345,
      "category_id": 18,
      "bbox": [x_min, y_min, width, height],
      "segmentation": [[x1, y1, x2, y2, ...]],
      "area": 4215.0,
      "iscrowd": 0
    }
  ],
  "categories": [
    {"id": 18, "name": "dog", "supercategory": "animal"}
  ]
}
```
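As a small sketch of how this layout is typically consumed, the snippet below parses a tiny made-up annotation file, maps category IDs to names, and converts the `[x_min, y_min, width, height]` boxes to corner format. The numeric values and the `xywh_to_xyxy` helper are hypothetical; for real files, `pycocotools` is the usual tool.

```python
import json
from collections import defaultdict

# A tiny, made-up annotation file in the COCO layout shown above.
coco_json = """
{
  "annotations": [
    {"id": 42, "image_id": 12345, "category_id": 18,
     "bbox": [100.0, 50.0, 80.0, 60.0], "area": 4215.0, "iscrowd": 0},
    {"id": 43, "image_id": 12345, "category_id": 1,
     "bbox": [10.0, 20.0, 30.0, 90.0], "area": 2100.0, "iscrowd": 0}
  ],
  "categories": [
    {"id": 18, "name": "dog", "supercategory": "animal"},
    {"id": 1, "name": "person", "supercategory": "person"}
  ]
}
"""
data = json.loads(coco_json)

def xywh_to_xyxy(bbox):
    """COCO stores [x_min, y_min, width, height]; return [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

# Look up category names by ID, then group converted boxes by image.
id_to_name = {c["id"]: c["name"] for c in data["categories"]}

boxes_by_image = defaultdict(list)
for ann in data["annotations"]:
    boxes_by_image[ann["image_id"]].append(
        (id_to_name[ann["category_id"]], xywh_to_xyxy(ann["bbox"]))
    )

print(boxes_by_image[12345])
```

Many detection libraries expect corner-format boxes, so this width/height conversion is a common first step when loading COCO annotations by hand.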



## 7. Visualization Example

You can visualize COCO annotations using the official **pycocotools** library.

In [None]:
from pycocotools.coco import COCO
import matplotlib.pyplot as plt
import requests
from PIL import Image

dataDir = 'path_to_coco'
dataType = 'val2017'
annFile = f'{dataDir}/annotations/instances_{dataType}.json'

coco = COCO(annFile)

# Load one image and its annotations
img_id = coco.getImgIds()[0]
img_data = coco.loadImgs(img_id)[0]

# Load and display image
image = Image.open(requests.get(img_data['coco_url'], stream=True).raw)
plt.imshow(image)
plt.axis('off')

# Load annotations and overlay
ann_ids = coco.getAnnIds(imgIds=img_data['id'])
anns = coco.loadAnns(ann_ids)
coco.showAnns(anns)


## 8. COCO Evaluation Metrics

COCO uses **mean Average Precision (mAP)** as the main metric — but in a more detailed way than PASCAL VOC.

| Metric                           | Description                                                                  |
| -------------------------------- | ---------------------------------------------------------------------------- |
| **AP (IoU=0.50:0.95)**           | Mean Average Precision averaged over IoU thresholds (0.5 to 0.95, step 0.05) |
| **AP50**                         | AP at IoU=0.50 (loose overlap)                                               |
| **AP75**                         | AP at IoU=0.75 (strict overlap)                                              |
| **APsmall / APmedium / APlarge** | AP for small, medium, and large objects                                      |
| **AR (Average Recall)**          | Measures detection recall across thresholds                                  |

These metrics encourage models to perform well not just at detecting objects, but doing so **accurately and consistently** across sizes and IoU thresholds.
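The IoU thresholds in the table are easier to picture with a worked example. The sketch below computes IoU for one ground-truth/prediction pair and counts how many of the ten thresholds (0.50 to 0.95) the prediction would pass. It illustrates only the overlap test: the full COCO evaluation also ranks detections by score and builds per-category precision/recall curves, which is omitted here. The `iou_xyxy` helper and the box values are made up for illustration.

```python
import numpy as np

def iou_xyxy(box_a, box_b):
    """Intersection over Union for two boxes in [x_min, y_min, x_max, y_max] form."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# The ten IoU thresholds 0.50, 0.55, ..., 0.95 used for the primary AP metric.
thresholds = np.linspace(0.50, 0.95, 10)

ground_truth = [0, 0, 100, 100]
prediction = [10, 10, 110, 110]   # same size, shifted by 10 px in x and y

iou = iou_xyxy(ground_truth, prediction)
passed = int((iou >= thresholds).sum())
print(round(iou, 3), passed)
```

Here a 10-pixel shift on a 100-pixel box already drops the IoU to about 0.68, so the prediction counts as a match for AP50 but not for AP75.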



## 9. Example Architectures Trained on COCO

| Model                  | Task                  | Description                                  |
| ---------------------- | --------------------- | -------------------------------------------- |
| **Faster R-CNN**       | Object Detection      | Two-stage detector with region proposals     |
| **YOLOv5 / YOLOv8**    | Object Detection      | Single-stage real-time detector              |
| **Mask R-CNN**         | Instance Segmentation | Extends Faster R-CNN with pixel masks        |
| **DETR (Transformer)** | Object Detection      | Uses attention mechanisms instead of anchors |
| **DeepLabV3+**         | Semantic Segmentation | High-accuracy dense segmentation             |

All these models are typically **pretrained on COCO**, then fine-tuned for specific applications.



## 10. Example: Using COCO with Detectron2 (PyTorch)

Detectron2 makes it easy to train on COCO or load pretrained weights.

In [None]:

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2 import model_zoo
import cv2
import matplotlib.pyplot as plt

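# Load pretrained COCO model
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # drop low-confidence detections

predictor = DefaultPredictor(cfg)

# Inference (DefaultPredictor expects a BGR image, which is what cv2.imread returns)
image = cv2.imread("example.jpg")
outputs = predictor(image)

# Visualize (Visualizer expects RGB, hence the channel flip; metadata supplies class names)
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer
v = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=0.8)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
plt.imshow(out.get_image())
plt.axis("off")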

## 11. Applications

* **Object Detection:** Identifying and localizing objects in scenes
* **Instance Segmentation:** Outlining each object pixel-by-pixel
* **Pose Estimation:** Detecting human keypoints (e.g., arms, legs)
* **Image Captioning:** Generating natural language descriptions of scenes
* **Panoptic Segmentation:** Combining instance + semantic segmentation



## 12. Access and Download

* **Official Site:** [https://cocodataset.org](https://cocodataset.org)
* **GitHub Repository:** [https://github.com/cocodataset/cocoapi](https://github.com/cocodataset/cocoapi)
* **Direct Downloads:**

  * [Images](https://cocodataset.org/#download)
  * [Annotations](https://cocodataset.org/#download)
* **Python Tools:** `pycocotools` (`pip install pycocotools`)



## 13. Summary

| Aspect            | Description                                          |
| ----------------- | ---------------------------------------------------- |
| **Goal**          | Real-world object recognition with context           |
| **Images**        | ~330K                                                |
| **Categories**    | 80                                                   |
| **Annotations**   | Bounding boxes, masks, keypoints, captions           |
| **Metrics**       | mAP, AP50, AP75, AR                                  |
| **Use Cases**     | Detection, segmentation, pose estimation, captioning |
| **Common Models** | Faster R-CNN, Mask R-CNN, YOLO, DETR                 |



## 14. Comparison: COCO vs ImageNet

| Feature              | ImageNet               | COCO                                |
| -------------------- | ---------------------- | ----------------------------------- |
| **Purpose**          | Classification         | Detection, segmentation, captioning |
| **Images**           | ~1.2M                  | ~330K                               |
| **Classes**          | 1,000                  | 80                                  |
| **Annotations**      | Single label per image | Bounding boxes, masks, keypoints    |
| **Scene Complexity** | Mostly single object   | Multiple objects, natural context   |
| **Output Type**      | Class label            | Boxes, masks, captions              |

# **Google Open Images Dataset**

## 1. Introduction

The **Google Open Images Dataset** is one of the largest and most diverse datasets for **object detection, classification, and visual relationship detection**.
It was released by **Google AI** to provide a more **realistic and large-scale benchmark** than earlier datasets such as ImageNet and COCO.

Unlike datasets focusing only on bounding boxes or segmentation, Open Images also contains **image-level labels, object bounding boxes, visual relationships, and segmentation masks** for millions of images.



## 2. Key Facts

| Property           | Description                                                                  |
| ------------------ | ---------------------------------------------------------------------------- |
| **Dataset Name**   | Open Images Dataset                                                          |
| **Created By**     | Google AI                                                                    |
| **Total Images**   | ~9 million                                                                   |
| **Labeled Images** | ~6 million                                                                   |
| **Object Classes** | 600+ categories                                                              |
| **Annotations**    | Bounding boxes, image-level labels, visual relationships, segmentation masks |
| **License**        | Annotations: CC BY 4.0; images: listed as CC BY 2.0 (verify the license of each image)  |



## 3. Dataset Organization

Open Images contains **four main types of annotations**:

| Type                            | Description                                                           | Example                      |
| ------------------------------- | --------------------------------------------------------------------- | ---------------------------- |
| **Image-level Labels**          | Each image can have multiple labels (multi-label classification).     | “Dog”, “Outdoor”, “Tree”     |
| **Object Bounding Boxes**       | Coordinates marking object locations.                                 | (x_min, y_min, x_max, y_max) |
| **Visual Relationships**        | Triplets like *(person, playing, guitar)* that describe interactions. | Object interactions          |
| **Instance Segmentation Masks** | Pixel-level masks for some classes.                                   | Outlined object areas        |



## 4. Dataset Splits

| Split          | Number of Images | Purpose          |
| -------------- | ---------------- | ---------------- |
| **Train**      | ~9,000,000       | Model training   |
| **Validation** | ~41,000          | Model tuning     |
| **Test**       | ~125,000         | Model evaluation |

Note: Only a subset of images (≈2 million) have detailed bounding box annotations.
The rest may have only image-level labels.



## 5. Object Categories

Open Images covers **600 object classes**, including a wide range of real-world categories such as:

* **Animals**: cat, dog, bird, horse
* **Vehicles**: car, airplane, train, bicycle
* **Food**: apple, pizza, sandwich, cake
* **Household items**: chair, laptop, refrigerator
* **People & Clothing**: person, t-shirt, backpack
* **Nature**: tree, flower, mountain, sky

The categories are organized in a **hierarchy**, meaning classes form a tree (e.g., “vehicle → car → sports car”).



## 6. Example Annotation Format

Bounding boxes are stored as CSV files instead of JSON.
Each row corresponds to one bounding box.

Example (`train-annotations-bbox.csv`):

| ImageID    | LabelName | XMin | XMax | YMin | YMax | IsGroupOf | IsOccluded | IsTruncated |
| ---------- | --------- | ---- | ---- | ---- | ---- | --------- | ---------- | ----------- |
| 000abcd123 | /m/01g317 | 0.12 | 0.34 | 0.23 | 0.45 | 0         | 0          | 0           |

* `LabelName` is a **Freebase ID** (e.g., `/m/01g317` → “Person”).
* Coordinates are normalized between 0 and 1.

Mapping between IDs and class names is stored in:

```
class-descriptions-boxable.csv
```

Example:

```
/m/01g317,Person
/m/0199g,Bicycle
/m/0k4j,Car
```
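Putting the two files together, a minimal sketch of decoding one annotation row might look like this. It uses in-memory stand-ins for the CSV files (rows copied from the examples above) and an assumed image size of 1000×500 pixels; the `to_pixels` helper is a hypothetical name, not part of any Open Images tooling.

```python
import csv
import io

# In-memory stand-ins for the two CSV files (rows copied from the examples above).
class_csv = "/m/01g317,Person\n"
bbox_csv = (
    "ImageID,LabelName,XMin,XMax,YMin,YMax,IsGroupOf,IsOccluded,IsTruncated\n"
    "000abcd123,/m/01g317,0.12,0.34,0.23,0.45,0,0,0\n"
)

# The class-description file has no header row: each line is "ID,name".
id_to_name = dict(csv.reader(io.StringIO(class_csv)))

def to_pixels(row, width, height):
    """Scale normalized [0, 1] coordinates to (x_min, y_min, x_max, y_max) in pixels."""
    return (
        round(float(row["XMin"]) * width),
        round(float(row["YMin"]) * height),
        round(float(row["XMax"]) * width),
        round(float(row["YMax"]) * height),
    )

for row in csv.DictReader(io.StringIO(bbox_csv)):
    label = id_to_name[row["LabelName"]]
    box = to_pixels(row, width=1000, height=500)   # assumed image size
    print(label, box)
```

Note that the CSV column order is XMin, XMax, YMin, YMax, so reading columns by name rather than by position avoids mixing up the axes.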


## 7. Download and Access

You can download parts of the dataset directly from Google Cloud or Kaggle.

### Option 1: Google Cloud Storage

Official site: [https://storage.googleapis.com/openimages/web/index.html](https://storage.googleapis.com/openimages/web/index.html)

You can use `gsutil` to download:

```bash
gsutil -m rsync gs://open-images-dataset/train/ ./train/
```

### Option 2: Kaggle

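Kaggle hosts the Open Images 2019 Object Detection competition: [https://www.kaggle.com/c/open-images-2019-object-detection](https://www.kaggle.com/c/open-images-2019-object-detection)

You can download the competition files with the Kaggle API (this requires accepting the competition rules first):

```bash
kaggle competitions download -c open-images-2019-object-detection
```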


## 8. Example Visualization

You can visualize bounding boxes using `matplotlib` and CSV annotations.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

# Load one image and its bounding boxes
df = pd.read_csv('train-annotations-bbox.csv')
image_id = df['ImageID'].iloc[0]
image_path = f'./train/{image_id}.jpg'
image = Image.open(image_path)
img_w, img_h = image.size  # needed because box coordinates are normalized to [0, 1]

# Draw bounding boxes
fig, ax = plt.subplots(1)
ax.imshow(image)

for _, row in df[df['ImageID'] == image_id].iterrows():
    # Scale normalized coordinates to pixel coordinates before drawing
    xmin, xmax = row['XMin'] * img_w, row['XMax'] * img_w
    ymin, ymax = row['YMin'] * img_h, row['YMax'] * img_h
    width = xmax - xmin
    height = ymax - ymin
    rect = patches.Rectangle((xmin, ymin), width, height, linewidth=2, edgecolor='r', facecolor='none')
    ax.add_patch(rect)

plt.axis('off')
plt.show()

## 9. Evaluation Metrics

Open Images uses **mean Average Precision (mAP)** like COCO but with some differences:

* Evaluates **image-level labels** and **bounding boxes** separately.
* Takes the **class hierarchy** into account: a correct detection of a subclass (e.g., “dog”) also counts toward its parent classes (e.g., “animal”).
* Includes **grouped instances** (`IsGroupOf` flag) for crowd-like regions.
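The hierarchy handling can be pictured as expanding each label with its ancestors before scoring. The sketch below does this for a toy child-to-parent map; the `parent` dictionary and the `expand_with_ancestors` helper are hypothetical and far smaller than the real class hierarchy, and the actual evaluation code is more involved.

```python
# Toy child -> parent map (hypothetical; much smaller than the real class hierarchy).
parent = {"Dog": "Animal", "Cat": "Animal", "Sports car": "Car", "Car": "Vehicle"}

def expand_with_ancestors(label):
    """Return the label followed by all of its ancestors, nearest first."""
    chain = [label]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

print(expand_with_ancestors("Dog"))
print(expand_with_ancestors("Sports car"))
```

With this expansion, a box labeled "Dog" is also treated as an "Animal" box, so a detector is not penalized for a correct detection at either level.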



## 10. Comparison: Open Images vs COCO vs ImageNet

| Feature                | ImageNet | COCO     | Open Images               |
| ---------------------- | -------- | -------- | ------------------------- |
| **Images**             | 1.2M     | 330K     | 9M                        |
| **Classes**            | 1,000    | 80       | 600                       |
| **Bounding Boxes**     | Subset (localization task) | Yes      | Yes                       |
| **Segmentation Masks** | No       | Yes      | Yes (subset)              |
| **Relationships**      | No       | No       | Yes                       |
| **Image Labels**       | Single   | Multiple | Multiple                  |
| **Scene Complexity**   | Simple   | Moderate | Real-world complex scenes |

**Conclusion:**
Open Images is the most **comprehensive dataset** for large-scale detection and visual relationship understanding, while COCO is smaller but **more precisely annotated**.



## 11. Models Trained on Open Images

| Model         | Task                  | Framework           |
| ------------- | --------------------- | ------------------- |
| EfficientDet  | Object Detection      | TensorFlow          |
| Faster R-CNN  | Detection             | PyTorch             |
| YOLOv5/YOLOv8 | Detection             | PyTorch             |
| DETR          | Detection             | Transformer-based   |
| Mask R-CNN    | Instance Segmentation | PyTorch, Detectron2 |

Many pretrained detection models (e.g., EfficientDet) are first trained on **COCO**, then fine-tuned on **Open Images** for broader coverage.



## 12. Example: Using Open Images with TensorFlow Object Detection API

```python
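# Note: the Object Detection API is normally installed from the tensorflow/models repository;
# the PyPI package below is assumed to provide the same `object_detection` module.
!pip install tensorflow-object-detection-api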

from object_detection.utils import dataset_util
from object_detection.builders import model_builder
from object_detection.utils import config_util

# Load model config and build detector
configs = config_util.get_configs_from_pipeline_file('pipeline.config')
model_config = configs['model']
detection_model = model_builder.build(model_config=model_config, is_training=False)
```

Training or inference on Open Images follows the same structure as COCO — you just change the dataset path and class label map.



## 13. Applications

* **General Object Detection** at scale
* **Multi-label Image Classification**
* **Scene Understanding and Relationships**
* **Transfer Learning** for custom datasets
* **Segmentation and Captioning** tasks


## 14. Summary

| Aspect          | Description                                                     |
| --------------- | --------------------------------------------------------------- |
| **Goal**        | Real-world large-scale detection and relationship understanding |
| **Images**      | ~9 million                                                      |
| **Classes**     | 600+                                                            |
| **Annotations** | Labels, boxes, masks, relationships                             |
| **Format**      | CSV (bounding boxes), hierarchical labels                       |
| **Main Uses**   | Object detection, segmentation, multi-label classification      |
| **Access**      | Google Cloud or Kaggle                                          |



**Official Resources:**

* Website: [https://storage.googleapis.com/openimages/web/index.html](https://storage.googleapis.com/openimages/web/index.html)
* GitHub tools: [https://github.com/cvdfoundation/open-images-dataset](https://github.com/cvdfoundation/open-images-dataset)
* Kaggle dataset: [https://www.kaggle.com/c/open-images-2019-object-detection](https://www.kaggle.com/c/open-images-2019-object-detection)