# Image datasets
This notebook takes a look at several open source image datasets, grouped into datasets used for object detection and datasets used for image segmentation. The datasets are loaded and viewed with FiftyOne; documentation for the functions used can be found at https://voxel51.com/.


In [1]:
# FiftyOne is an open source toolkit for building image datasets and computer vision models.
import fiftyone as fo
import fiftyone.zoo as foz

## **Object detection**

### **Open Images**
This dataset from Google contains approximately 9 million images annotated with image-level labels and object bounding boxes. 
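
To make the bounding-box annotations more concrete, here is a minimal sketch of turning a box into pixel coordinates. It assumes the convention described in the FiftyOne documentation, where a detection box is `[top-left-x, top-left-y, width, height]` with values relative to the image size in `[0, 1]`. The helper name `relative_to_pixel_box` and the example box are hypothetical and not part of FiftyOne.

```python
def relative_to_pixel_box(box, image_width, image_height):
    """Convert a relative [x, y, w, h] box to pixel corners (left, top, right, bottom)."""
    x, y, w, h = box
    left = round(x * image_width)
    top = round(y * image_height)
    right = round((x + w) * image_width)
    bottom = round((y + h) * image_height)
    return left, top, right, bottom


# Hypothetical box covering the central part of a 640x480 image
print(relative_to_pixel_box([0.25, 0.25, 0.5, 0.5], 640, 480))  # (160, 120, 480, 360)
```

Relative coordinates have the advantage that a box stays valid when the image is resized for display.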

In [2]:
# The dataset is constructed from the Open Images v6 dataset. It will contain at most 25 validation images in which glasses are detected.
dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="validation",
    label_types=["detections"],
    classes=["Glasses"],
    max_samples=25,
)
session = fo.launch_app(dataset)

Downloading split 'validation' to 'C:\Users\matth\fiftyone\open-images-v6\validation' if necessary
Necessary images already downloaded
Existing download of split 'validation' is sufficient
Loading existing dataset 'open-images-v6-validation-25'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use

Could not connect session, trying again in 10 seconds

RuntimeError: Client is not connected

### **ImageNet**
This dataset is organised according to the WordNet hierarchy, in which each node is depicted by hundreds and thousands of images. Object detection models can be trained on this dataset, but here it is previewed for image classification.
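
The idea of images attached to nodes of a hierarchy can be sketched with a toy tree. This is only an illustration: the node names, file names, and the `count_images` helper are hypothetical and do not reflect the actual WordNet identifiers or the ImageNet file layout.

```python
def count_images(node):
    """Count the images attached to a node and to all of its descendants."""
    own = len(node.get("images", []))
    below = sum(count_images(child) for child in node.get("children", {}).values())
    return own + below


# Toy hierarchy (hypothetical names and files)
toy_hierarchy = {
    "images": [],
    "children": {
        "animal": {
            "images": ["animal_1.jpg"],
            "children": {
                "dog": {"images": ["dog_1.jpg", "dog_2.jpg"], "children": {}},
                "cat": {"images": ["cat_1.jpg"], "children": {}},
            },
        },
        "vehicle": {"images": ["car_1.jpg"], "children": {}},
    },
}

print(count_images(toy_hierarchy))                        # 5
print(count_images(toy_hierarchy["children"]["animal"]))  # 4
```

A classifier trained on such data can predict a specific node ("dog") while the hierarchy still records that the image also belongs to the broader node ("animal").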

In [4]:
# The dataset is constructed from the ImageNet dataset. Only a 1,000-image sample is downloaded.
dataset = foz.load_zoo_dataset("imagenet-sample")
session = fo.launch_app(dataset)

Downloading dataset to 'C:\Users\matth\fiftyone\imagenet-sample'
Downloading dataset...
 100% |████|  762.4Mb/762.4Mb [11.3s elapsed, 0s remaining, 64.0Mb/s]      
Extracting dataset...
Parsing dataset metadata
Found 1000 samples
Dataset info written to 'C:\Users\matth\fiftyone\imagenet-sample\info.json'
Loading 'imagenet-sample'
 100% |███████████████| 1000/1000 [955.1ms elapsed, 0s remaining, 1.1K samples/s]       
Dataset 'imagenet-sample' created



Welcome to

███████╗██╗███████╗████████╗██╗   ██╗ ██████╗ ███╗   ██╗███████╗
██╔════╝██║██╔════╝╚══██╔══╝╚██╗ ██╔╝██╔═══██╗████╗  ██║██╔════╝
█████╗  ██║█████╗     ██║    ╚████╔╝ ██║   ██║██╔██╗ ██║█████╗
██╔══╝  ██║██╔══╝     ██║     ╚██╔╝  ██║   ██║██║╚██╗██║██╔══╝
██║     ██║██║        ██║      ██║   ╚██████╔╝██║ ╚████║███████╗
╚═╝     ╚═╝╚═╝        ╚═╝      ╚═╝    ╚═════╝ ╚═╝  ╚═══╝╚══════╝ v0.21.6


### **COCO**
This is a large-scale dataset for object detection and segmentation. It contains 330k images and 80 object categories. 

In [None]:
# The dataset is constructed from the COCO 2017 dataset. It will contain at most 25 validation images in which cats and/or dogs are detected.
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types=["detections"],
    classes=["cat", "dog"],
    max_samples=25,
)
session = fo.launch_app(dataset)
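
The effect of the `classes` argument above (keep images with at least one cat and/or dog) can be sketched in plain Python on a COCO-style annotation dictionary. This is a simplified illustration, not FiftyOne's implementation: the `images_with_classes` helper, the toy file names, and the category ids are hypothetical, and only the `images`, `annotations`, and `categories` keys of the COCO format are used.

```python
def images_with_classes(coco, wanted):
    """Return sorted file names of images having at least one annotation whose class is in `wanted`."""
    wanted_ids = {c["id"] for c in coco["categories"] if c["name"] in wanted}
    image_ids = {a["image_id"] for a in coco["annotations"] if a["category_id"] in wanted_ids}
    return sorted(img["file_name"] for img in coco["images"] if img["id"] in image_ids)


# Toy COCO-style annotations (hypothetical ids and file names)
toy_coco = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 2, "name": "cat"},
        {"id": 3, "name": "dog"},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 2, "bbox": [10, 20, 30, 40]},
        {"image_id": 1, "category_id": 1, "bbox": [50, 60, 70, 80]},
        {"image_id": 2, "category_id": 1, "bbox": [5, 5, 20, 20]},
        {"image_id": 3, "category_id": 3, "bbox": [15, 25, 35, 45]},
    ],
}

print(images_with_classes(toy_coco, {"cat", "dog"}))  # ['a.jpg', 'c.jpg']
```

Note that an image is kept as soon as one of its objects matches, so a kept image may still contain objects of other classes, such as the person in `a.jpg`.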

### **KITTI**
This dataset is used to develop and evaluate autonomous driving systems. It contains images of cars and pedestrians taken from the driver's point of view.

In [None]:
# The dataset is constructed from the KITTI dataset. Note: it takes approximately 25 minutes to download the images.
dataset = foz.load_zoo_dataset("kitti", split="train")
session = fo.launch_app(dataset)

### **PASCAL VOC**
This dataset contains around 12k images, where each image contains a set of objects. There are 20 object classes available.

In [None]:
# The dataset is constructed from the PASCAL VOC 2012 dataset.
dataset = foz.load_zoo_dataset("voc-2012", split="validation")
session = fo.launch_app(dataset)
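
PASCAL VOC stores the set of objects for each image in an XML annotation file, with one `object` element per object and a `bndbox` given as pixel corners. The following sketch parses such a file with the standard library. The XML string is a hand-written, shortened example (not taken from the dataset), and `parse_voc_objects` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written example in the VOC annotation layout (values are made up)
VOC_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>370</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_objects(xml_text):
    """Return a list of (class name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        corners = tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), corners))
    return objects


for name, corners in parse_voc_objects(VOC_XML):
    print(name, corners)
```

When loading through the FiftyOne zoo this parsing is handled for you; the sketch only shows what the underlying annotations look like.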

## **Image segmentation**

### **Open Images**
This dataset from Google contains approximately 9 million images annotated with image-level labels and object bounding boxes. Part of the images also comes with object segmentation masks, which are the labels loaded here.

In [None]:
# The dataset is constructed from the Open Images v6 dataset. It will contain 25 samples of segmented images.
dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="train",
    label_types=["segmentations"],
    max_samples=25,
)
session = fo.launch_app(dataset)
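
Where a detection label only gives a box, a segmentation label marks the individual pixels of an object. The sketch below shows how the two relate by deriving the pixel area and the tight bounding box from a binary mask. It assumes the mask is a nested list of 0/1 values, which is a simplification chosen for illustration; `mask_area_and_box` is a hypothetical helper.

```python
def mask_area_and_box(mask):
    """Return (pixel count, (left, top, right, bottom)) for a binary mask.

    The box uses inclusive pixel indices and is None when the mask is empty.
    """
    width = len(mask[0]) if mask else 0
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    area = sum(1 for row in mask for value in row if value)
    if not rows:
        return 0, None
    return area, (cols[0], rows[0], cols[-1], rows[-1])


# Toy 4x5 mask with a small object in the middle
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

print(mask_area_and_box(toy_mask))  # (5, (1, 1, 3, 2))
```

The box here spans 3 x 2 = 6 pixels while the object covers only 5, which is the extra detail a segmentation mask provides over a bounding box.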