# Image Data IO
This tutorial explains how to prepare, load, and train with image data in MXNet. All IO in MXNet is handled via IO.DataIter and its subclasses, which are explained [here](https://github.com/dmlc/mxnet-notebooks/blob/master/scala/basic/data.ipynb). In this tutorial we focus on how to use pre-built data iterators as well as custom iterators to process image data.

There are two main ways of loading image data in MXNet:
- IO.ImageRecordIter: implemented in the backend (C++); less customizable, but usable from all language bindings. It loads data from .rec files.
- A custom iterator that inherits from IO.DataIter

After setting up the Scala kernel, we first explain the record io file format used by MXNet.


## Jupyter Scala kernel
Add the MXNet Scala jar, which is created as part of the MXNet Scala package installation, to the classpath as follows:

**Note**: The process for adding this jar to your Scala kernel classpath can differ depending on the Scala kernel you are using.

We used the [jupyter-scala kernel](https://github.com/alexarchambault/jupyter-scala) to create this notebook.

```
// General form:
// classpath.addPath(<path_to_jar>)

// For example:
classpath.addPath("mxnet-full_2.11-osx-x86_64-cpu-0.1.2-SNAPSHOT.jar")
```

## RecordIO
Record IO is the main file format used by MXNet for data IO. It supports reading and writing on various file systems, including distributed file systems such as Hadoop HDFS and AWS S3. First, we download the Caltech 101 dataset, which contains 101 classes of objects, and convert it into record io format.

Download and unzip the Image Dataset as follows:

In [2]:
// change this to your mxnet location
val MXNET_HOME = "/home/ec2-user/src/mxnet"

MXNET_HOME: String = "/home/ec2-user/src/mxnet"

In [3]:
import sys.process._
"wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz -P data/ -q"!

import sys.process._
res2_1: Int = 0

In [4]:
"tar -xzf data/101_ObjectCategories.tar.gz -C data/"!

res3: Int = 0

Let's take a look at the data. Under the root folder, every category has its own subfolder.
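To check that layout programmatically, a minimal sketch follows. It uses only `java.io.File`; the helper name `categoryCounts` is hypothetical (not part of MXNet), and the root path assumes the archive was extracted into `data/` as in the cell above.

```scala
import java.io.File

// Count the files in each category subfolder of a dataset root.
// Returns (category name, file count) pairs sorted by name.
// `categoryCounts` is an illustrative helper, not an MXNet API.
def categoryCounts(root: File): Seq[(String, Int)] = {
  // listFiles returns null when root is missing or not a directory
  val children = Option(root.listFiles).map(_.toSeq).getOrElse(Seq.empty)
  children
    .filter(_.isDirectory)
    .map { dir =>
      val files = Option(dir.listFiles).map(_.count(_.isFile)).getOrElse(0)
      (dir.getName, files)
    }
    .sortBy(_._1)
}

// Assumed location, taken from the tar command above.
val counts = categoryCounts(new File("data/101_ObjectCategories"))
counts.take(5).foreach { case (name, n) => println(s"$name: $n files") }
```

If the folder does not exist, `counts` is simply empty, which is a quick sign that the download or extraction step did not run where you expected.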

Now let's convert them into record io format. First we need to make a list that contains all the image files and their categories:

In [5]:
"python "+MXNET_HOME+"/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 data/caltech data/101_ObjectCategories"!

BACKGROUND_Google 0
Faces 1
Faces_easy 2
Leopards 3
Motorbikes 4
accordion 5
airplanes 6
anchor 7
ant 8
barrel 9
bass 10
beaver 11
binocular 12
bonsai 13
brain 14
brontosaurus 15
buddha 16
butterfly 17
camera 18
cannon 19
car_side 20
ceiling_fan 21
cellphone 22
chair 23
chandelier 24
cougar_body 25
cougar_face 26
crab 27
crayfish 28
crocodile 29
crocodile_head 30
cup 31
dalmatian 32
dollar_bill 33
dolphin 34
dragonfly 35
electric_guitar 36
elephant 37
emu 38
euphonium 39
ewer 40
ferry 41
flamingo 42
flamingo_head 43
garfield 44
gerenuk 45
gramophone 46
grand_piano 47
hawksbill 48
headphone 49
hedgehog 50
helicopter 51
ibis 52
inline_skate 53
joshua_tree 54
kangaroo 55
ketch 56
lamp 57
laptop 58
llama 59
lobster 60
lotus 61
mandolin 62
mayfly 63
menorah 64
metronome 65
minaret 66
nautilus 67
octopus 68
okapi 69
pagoda 70
panda 71
pigeon 72
pizza 73
platypus 74
pyramid 75
revolver 76
rhino 77
rooster 78
saxophone 79
schooner 80
scissors 81
scorpion 82
sea_horse 83
snoopy 84
soccer_ball 8

res4: Int = 0

The resulting list file is in the format `index\t(one or more labels)\tpath`. In this case there is only one label for each image, but you can modify the list to add more labels for multi-label training.
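As an illustration of that format, the sketch below parses one tab-separated line into its three parts. The names `ListEntry` and `parseListLine` are hypothetical, and the sample line in the usage note is made up for illustration rather than copied from the generated file. It assumes the index is an integer and the labels are numeric.

```scala
// One entry of a list file: index, one or more labels, and the image path.
// Illustrative type, not part of MXNet.
case class ListEntry(index: Int, labels: Seq[Float], path: String)

// Parse a single tab-separated line; returns None for malformed lines.
def parseListLine(line: String): Option[ListEntry] = {
  val fields = line.trim.split("\t")
  if (fields.length < 3) None
  else scala.util.Try {
    ListEntry(
      index = fields.head.toInt,
      // everything between the index and the path is treated as a label
      labels = fields.slice(1, fields.length - 1).map(_.toFloat).toSeq,
      path = fields.last
    )
  }.toOption
}

// Made-up sample line with a single label
println(parseListLine("12\t3.000000\tLeopards/image_0001.jpg"))
```

A line with several label columns is handled the same way, so the parser also works after you edit the list for multi-label training.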
Then we can use this list to create our record io file:

In [6]:
"python "+MXNET_HOME+"/tools/im2rec.py --num-thread=4 --pass-through=1 data/caltech data/101_ObjectCategories"!

Creating .rec file from /home/ec2-user/mxnet-notebooks/scala/basic/data/caltech.lst in /home/ec2-user/mxnet-notebooks/scala/basic/data
time: 0.00183486938477  count: 0
time: 0.0827140808105  count: 1000
time: 0.0803511142731  count: 2000
time: 0.0845577716827  count: 3000
time: 0.0807552337646  count: 4000
time: 0.0814788341522  count: 5000
time: 0.0812129974365  count: 6000
time: 0.0810561180115  count: 7000
time: 0.0805099010468  count: 8000
time: 0.0889499187469  count: 9000

res5: Int = 0

The record io files are now generated in the data folder.
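The two `im2rec.py` calls above are built by string concatenation. One possible alternative, sketched here, is to build each command as a sequence of arguments, which avoids quoting problems if a path contains spaces. The helper names `im2recListCmd` and `im2recPackCmd` are hypothetical; the flags are exactly the ones used in the cells above.

```scala
// Build the list-generation command as an argument sequence.
def im2recListCmd(mxnetHome: String, prefix: String, root: String,
                  testRatio: Double): Seq[String] =
  Seq("python", s"$mxnetHome/tools/im2rec.py",
      "--list=1", "--recursive=1", "--shuffle=1",
      s"--test-ratio=$testRatio", prefix, root)

// Build the record-packing command as an argument sequence.
def im2recPackCmd(mxnetHome: String, prefix: String, root: String,
                  threads: Int): Seq[String] =
  Seq("python", s"$mxnetHome/tools/im2rec.py",
      s"--num-thread=$threads", "--pass-through=1", prefix, root)

val listCmd = im2recListCmd(
  "/home/ec2-user/src/mxnet", "data/caltech", "data/101_ObjectCategories", 0.2)
println(listCmd.mkString(" "))

// To actually run a command:
//   import scala.sys.process._
//   val exitCode = listCmd.!   // 0 indicates success, as in the outputs above
```

Checking the returned exit code before moving on is a cheap way to catch a wrong `MXNET_HOME` or a missing Python installation.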

## ImageRecordIter
IO.ImageRecordIter can be used for loading image data saved in record io format. It is available in all frontend languages, but as it's implemented in C++, it is less flexible.

To use ImageRecordIter, simply create an instance by loading your record file:

**Parameters**
- **path_imglist** (string, optional, default='') – Dataset Param: Path to image list.
- **path_imgrec** (string, optional, default='./data/imgrec.rec') – Dataset Param: Path to image record file.
- **aug_seq** (string, optional, default='aug_default') – Augmentation Param: the names of the augmenters to apply, in sequence, separated by commas. Additional keyword parameters will be seen by these augmenters.
- **label_width** (int, optional, default='1') – Dataset Param: How many labels for an image.
- **data_shape** (Shape(tuple), required) – Dataset Param: Shape of each instance generated by the DataIter.
- **preprocess_threads** (int, optional, default='4') – Backend Param: Number of threads used for preprocessing.
- **verbose** (boolean, optional, default=True) – Auxiliary Param: Whether to output parser information.
- **num_parts** (int, optional, default='1') – partition the data into multiple parts
- **part_index** (int, optional, default='0') – the index of the part to read
- **shuffle_chunk_size** (long (non-negative), optional, default=0) – the size (MB) of the shuffle chunk; used with shuffle=True, it can enable global shuffling
- **shuffle_chunk_seed** (int, optional, default='0') – the seed for chunk shuffling
- **shuffle** (boolean, optional, default=False) – Augmentation Param: Whether to shuffle data.
- **seed** (int, optional, default='0') – Augmentation Param: Random Seed.
- **batch_size** (int (non-negative), required) – Batch Param: Batch size.
- **round_batch** (boolean, optional, default=True) – Batch Param: Use round robin to handle overflow batch.
- **prefetch_buffer** (long (non-negative), optional, default=4) – Backend Param: Number of prefetched parameters
- **dtype** ({None, 'float16', 'float32', 'float64', 'int32', 'uint8'}, optional, default='None') – Output data type. Leave as None to use the internal data iterator’s output type
- **resize** (int, optional, default='-1') – Augmentation Param: scale shorter edge to size before applying other augmentations.
- **rand_crop** (boolean, optional, default=False) – Augmentation Param: Whether to randomly crop the image
- **crop_y_start** (int, optional, default='-1') – Augmentation Param: Where to start a non-random crop on y.
- **crop_x_start** (int, optional, default='-1') – Augmentation Param: Where to start a non-random crop on x.
- **max_rotate_angle** (int, optional, default='0') – Augmentation Param: rotated randomly in [-max_rotate_angle, max_rotate_angle].
- **max_aspect_ratio** (float, optional, default=0) – Augmentation Param: denotes the max ratio of random aspect ratio augmentation.
- **max_shear_ratio** (float, optional, default=0) – Augmentation Param: denotes the max random shearing ratio.
- **max_crop_size** (int, optional, default='-1') – Augmentation Param: Maximum crop size.
- **min_crop_size** (int, optional, default='-1') – Augmentation Param: Minimum crop size.
- **max_random_scale** (float, optional, default=1) – Augmentation Param: Maximum scale ratio.
- **min_random_scale** (float, optional, default=1) – Augmentation Param: Minimum scale ratio.
- **max_img_size** (float, optional, default=1e+10) – Augmentation Param: Maximum image size after resizing.
- **min_img_size** (float, optional, default=0) – Augmentation Param: Minimum image size after resizing.
- **random_h** (int, optional, default='0') – Augmentation Param: Maximum random value of H channel in HSL color space.
- **random_s** (int, optional, default='0') – Augmentation Param: Maximum random value of S channel in HSL color space.
- **random_l** (int, optional, default='0') – Augmentation Param: Maximum random value of L channel in HSL color space.
- **rotate** (int, optional, default='-1') – Augmentation Param: Rotate angle.
- **fill_value** (int, optional, default='255') – Augmentation Param: Filled color value while padding.
- **inter_method** (int, optional, default='1') – Augmentation Param: 0-NN 1-bilinear 2-cubic 3-area 4-lanczos4 9-auto 10-rand.
- **pad** (int, optional, default='0') – Augmentation Param: Padding size.
- **mirror** (boolean, optional, default=False) – Augmentation Param: Whether to mirror the image.
- **rand_mirror** (boolean, optional, default=False) – Augmentation Param: Whether to mirror the image randomly.
- **mean_img** (string, optional, default='') – Augmentation Param: Mean Image to be subtracted.
- **mean_r** (float, optional, default=0) – Augmentation Param: Mean value on R channel.
- **mean_g** (float, optional, default=0) – Augmentation Param: Mean value on G channel.
- **mean_b** (float, optional, default=0) – Augmentation Param: Mean value on B channel.
- **mean_a** (float, optional, default=0) – Augmentation Param: Mean value on Alpha channel.
- **scale** (float, optional, default=1) – Augmentation Param: Scale in color space.
- **max_random_contrast** (float, optional, default=0) – Augmentation Param: Maximum ratio of contrast variation.
- **max_random_illumination** (float, optional, default=0) – Augmentation Param: Maximum value of illumination variation.


In [7]:
import ml.dmlc.mxnet._

val dataIter = IO.ImageRecordIter(Map(
    "path_imgrec" -> "data/caltech.rec", // the target record file
    "data_shape" -> "(3, 227, 227)", // output data shape. A 227x227 region will be cropped from the original image.
    "batch_size" -> "4", // number of samples per batch
    "resize" -> "256" // resize the shorter edge to 256 before cropping
    // ... you can add more augmentation options here. See the parameter list above for all possible choices
    ))

dataIter.reset()
val batch = dataIter.next()
val data = batch.data(0)


log4j:WARN No appenders could be found for logger (MXNetJVM).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

import ml.dmlc.mxnet._
dataIter: ml.dmlc.mxnet.DataIter = non-empty iterator
batch: ml.dmlc.mxnet.DataBatch = ml.dmlc.mxnet.DataBatch@1dee5185
data: ml.dmlc.mxnet.NDArray = ml.dmlc.mxnet.NDArray@496baa2a

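Because `IO.ImageRecordIter` takes every option as a string, typos in values such as the shape are easy to make. The sketch below shows one way to build the parameter map from typed values. The helper `recordIterParams` is hypothetical, the keys come from the parameter list above, and writing booleans as `"True"` is an assumption based on the Python-style defaults shown in that list.

```scala
// Build the string-valued parameter map for IO.ImageRecordIter from typed
// values. Illustrative helper, not part of MXNet.
def recordIterParams(
    recPath: String,
    dataShape: (Int, Int, Int),               // (channels, height, width)
    batchSize: Int,
    extra: Map[String, String] = Map.empty    // further options, already as strings
): Map[String, String] = {
  require(batchSize > 0, "batch_size must be positive")
  val (c, h, w) = dataShape
  val base = Map(
    "path_imgrec" -> recPath,
    "data_shape" -> s"($c, $h, $w)",
    "batch_size" -> batchSize.toString
  )
  base ++ extra // entries in `extra` override the base entries on key clashes
}

val trainParams = recordIterParams(
  "data/caltech.rec", (3, 227, 227), 4,
  Map(
    "resize" -> "256",
    "rand_crop" -> "True",   // assumed boolean spelling
    "rand_mirror" -> "True"  // assumed boolean spelling
  ))
trainParams.foreach { case (k, v) => println(s"$k -> $v") }

// With the MXNet jar on the classpath, the map can be passed on directly:
//   val trainIter = IO.ImageRecordIter(trainParams)
```

Keeping the map construction separate from the iterator also makes it easy to reuse the same base settings for a validation iterator with the random augmentations left out.
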
## Next Steps
- [Record IO](https://github.com/dmlc/mxnet-notebooks/tree/master/scala/basic/record_io_scala.ipynb) Read and write RecordIO files with the Scala interface
- [Advanced Image IO](https://github.com/dmlc/mxnet-notebooks/tree/master/scala/basic/advanced_img_io_scala.ipynb) Advanced image IO for detection, segmentation, etc.