Data loader for the ImageNet 2012 Classification Dataset (ILSVRC 2012-2017) in Julia.
The ImageNet dataset can be downloaded at image-net.org after signing up and accepting the terms of access. Because of these terms, you have to download the dataset manually.
Installation instructions can be found in the documentation.
Afterwards, add this package via
julia> ]add https://github.com/adrhill/ImageNetDataset.jl
By default, the ImageNet dataset is loaded with the CenterCropNormalize transformation.
This uses JpegTurbo.jl to open the image and takes a center-cropped view of it at a resolution of (224, 224).
Afterwards, the image is normalized over its color channels using normalization constants that are compatible with most pretrained models from Metalhead.jl.
The output is in WHC[N] format (width, height, color channels, batch size).
using ImageNetDataset
dataset = ImageNet(:val) # load validation set
X, y = dataset[1:5] # load features and targets
convert2image(dataset, X) # convert features back to images
class(dataset, y) # obtain class names
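The default preprocessing described above can be sketched in plain Julia without the package or the dataset. This is a minimal illustration, not the package's implementation: the random array `img` stands in for a decoded JPEG, and the normalization is assumed to be the usual per-channel `(x - mean) / std`.

```julia
# Stand-in for a decoded RGB image in WHC layout (width, height, channels),
# with values in [0, 1]. A real image would come from JpegTurbo.jl.
img = rand(Float32, 256, 320, 3)

output_size = (224, 224)
mean = (0.485f0, 0.456f0, 0.406f0)
std = (0.229f0, 0.224f0, 0.225f0)

# Offsets that center a window of `output_size` inside the image.
w, h, _ = size(img)
x0 = (w - output_size[1]) ÷ 2
y0 = (h - output_size[2]) ÷ 2

# A view crops without copying the pixel data.
crop = @view img[(x0 + 1):(x0 + output_size[1]), (y0 + 1):(y0 + output_size[2]), :]

# Normalize each color channel (assumed formula): (x - mean) / std,
# broadcast along the third dimension.
μ = reshape(collect(mean), 1, 1, 3)
σ = reshape(collect(std), 1, 1, 3)
X = (crop .- μ) ./ σ

size(X)  # (224, 224, 3)
```

Stacking several such arrays along a fourth dimension gives the batched WHCN layout.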
The dataset can also be loaded in a custom size with custom normalization parameters
by configuring the preprocessing transformations.
ImageNetDataset.jl currently provides CenterCropNormalize and RandomCropNormalize:
output_size = (224, 224)
mean = (0.485f0, 0.456f0, 0.406f0)
std = (0.229f0, 0.224f0, 0.225f0)
tfm = CenterCropNormalize(; output_size, mean, std)
dataset = ImageNet(:val; transform=tfm)
Custom transformations can be implemented by extending AbstractTransformation.
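The dispatch pattern behind a custom transformation might look like the sketch below. It defines local stand-ins for `AbstractTransformation` and `transform` so that it runs without the package. `ScalePixels` and `load_whc` are hypothetical names invented for this illustration, and it is an assumption that a `transform(tfm, path)` method is all the package requires; check the documentation for the actual interface.

```julia
# Local stand-in so the sketch runs without the package or the dataset.
# In real code you would presumably subtype ImageNetDataset.AbstractTransformation
# and extend ImageNetDataset.transform instead (assumption).
abstract type AbstractTransformation end

# Hypothetical transformation: scale pixel values by a constant factor.
struct ScalePixels <: AbstractTransformation
    factor::Float32
end

# Hypothetical loader standing in for JPEG decoding; returns a small WHC array.
load_whc(path::AbstractString) = fill(0.5f0, 4, 4, 3)

# One method per transformation type; dispatch selects it from the type of `tfm`.
transform(tfm::ScalePixels, path::AbstractString) = tfm.factor .* load_whc(path)

X = transform(ScalePixels(2.0f0), "example.JPEG")
```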
To apply a transformation outside of the ImageNet dataset, e.g. to preprocess a single image at a given path, run
transform(tfm, path)
Alternatively, ImageNetDataset is compatible with transformations from DataAugmentation.jl:
using ImageNetDataset, DataAugmentation
tfm = CenterResizeCrop((224, 224)) |> ImageToTensor() |> Normalize(mean, std)
dataset = ImageNet(:val; transform=tfm)
Warning
Note that DataAugmentation.jl returns features in HWC[N] format instead of WHC[N].
Transformations from DataAugmentation.jl are also slightly less performant and not compatible with convert2image.
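If downstream code expects WHC[N], the two layouts can be converted by swapping the first two dimensions. The sketch below uses a random array as a stand-in for a batch returned by DataAugmentation.jl, with a non-square size so the swap is visible.

```julia
# Stand-in batch in HWCN layout: height 224, width 160, 3 channels, 5 samples.
X_hwcn = rand(Float32, 224, 160, 3, 5)

# Swap the first two dimensions to obtain WHCN. This allocates a new array.
X_whcn = permutedims(X_hwcn, (2, 1, 3, 4))

size(X_whcn)  # (160, 224, 3, 5)
```

`PermutedDimsArray(X_hwcn, (2, 1, 3, 4))` gives the same result as a lazy wrapper when the copy should be avoided.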
- MLDatasets.jl: Utility package for accessing common Machine Learning datasets in Julia
Note
This repository is based on MLDatasets.jl PR #146 and mirrors the MLDatasets API.
Copyright (c) 2015 Hiroyuki Shindo and contributors.