# Flux JL

## Flux Model Zoo Examples

# 2. Flux DataLoader Tutorial for MNIST 

**FluxML contributors**

**Source:** https://github.com/FluxML/model-zoo/tree/master/tutorials/dataloader

In this notebook we walk through a minimal example of how dataloaders work in Flux.


In [1]:
using MLDatasets: MNIST
using Flux.Data: DataLoader
using Flux: onehotbatch

### Step 1: Load MNIST data into memory

MNIST is small enough to fit into memory, so we load the whole train and test datasets at once using Float32 precision. Stochastic gradient descent only approximates the true gradient, and the noise this introduces is typically much larger than the rounding error of Float32, so we can give up Float64 precision in exchange for lower memory use and faster computation.

In total, the training set consists of 60000 grayscale images of 28 $\times$ 28 pixels. The labels are a vector holding the integer value (0 to 9) of the handwritten digit in each image.

In [2]:
train_x, train_y = MNIST.traindata(Float32)
test_x, test_y = MNIST.testdata(Float32)
;
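To put the precision trade-off in numbers, the sketch below estimates the memory needed for the training images in Float32 and Float64. It uses only Base Julia and computes the sizes from the element type rather than allocating anything; the variable names are illustrative and not part of the tutorial.

```julia
# Number of pixel values in the training set: 60000 images of 28×28 pixels.
n_values = 28 * 28 * 60000

# Bytes required for each element type, computed without allocating arrays.
bytes_f32 = n_values * sizeof(Float32)   # 4 bytes per value
bytes_f64 = n_values * sizeof(Float64)   # 8 bytes per value

println("Float32: ", bytes_f32 / 1e6, " MB")
println("Float64: ", bytes_f64 / 1e6, " MB")
```

Float32 needs roughly 188 MB for the training images, half of what Float64 would need.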

### Step 2: Pipe dataset through a DataLoader

Before we can use the MNIST data in a machine learning model, we need to reshape it into a layout that suits the operations of the neural network.
* The data has shape 28 $\times$ 28 $\times$ 60000, which follows a WHB format (width $\times$ height $\times$ batch size).
* Convolutional layers in Flux expect a WHCB format (width $\times$ height $\times$ no. of channels $\times$ batch size).
* Since the data is grayscale, the number of channels is just one.

In [3]:
size(train_x)

(28, 28, 60000)

In [5]:
train_x = reshape(train_x, 28, 28, 1, :)
test_x = reshape(test_x, 28, 28, 1, :)
;
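The effect of this reshape can be checked on a small stand-in array without downloading MNIST. The sketch below assumes five random "images" in place of the real data; it shows that `:` lets Julia infer the batch dimension and that `reshape` reuses the same memory rather than copying it.

```julia
# Stand-in for a small slice of MNIST: 5 random "images" in WHB layout.
x = rand(Float32, 28, 28, 5)

# Insert a singleton channel dimension; `:` lets Julia infer the batch size.
x4 = reshape(x, 28, 28, 1, :)

@show size(x)
@show size(x4)

# reshape does not copy: writing through x4 is visible in x.
x4[1, 1, 1, 1] = 42f0
@show x[1, 1, 1] == 42f0
```

Because no data is copied, reshaping the full 60000-image array is cheap.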

Next, we encode the labels as *one-hot* vectors to match the output dimensions of the convolutional neural network: each label becomes a 10-element column with a single 1 in the position of its digit.

In [6]:
train_y = onehotbatch(train_y, 0:9) 
test_y = onehotbatch(test_y, 0:9)
;
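To illustrate what one-hot encoding produces, here is a minimal sketch in Base Julia. It is not how `onehotbatch` is implemented (Flux returns its own compact one-hot type rather than a `BitMatrix`), and the label values are illustrative stand-ins.

```julia
# Illustrative labels standing in for a few entries of train_y.
labels = [5, 0, 4, 1, 9]
classes = 0:9

# Broadcasting a 10-element range against a 1×5 row gives a 10×5 BitMatrix:
# entry (i, j) is true exactly when labels[j] == classes[i].
onehot = classes .== permutedims(labels)

@show size(onehot)
@show onehot[:, 1]
```

Each column contains exactly one `true`, and each column corresponds to one sample, which matches the samples-last layout used for the images.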

To summarize what we have just done, let's inspect the first sample:

In [9]:
train_x[:,:,:,1]

28×28×1 Array{Float32,3}:
[:, :, 1] =
 0.0  0.0  0.0  0.0  0.0  0.0        …  0.0       0.0        0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.0       0.0        0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.0       0.0        0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.0       0.0        0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.215686  0.533333   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0        …  0.67451   0.992157   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.886275  0.992157   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.992157  0.992157   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.992157  0.831373   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.992157  0.529412   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0        …  0.992157  0.517647   0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0           0.956863  0.0627451  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0117647     0.521569  0.0        0

What number is this?

In [11]:
train_y[:,1]

10-element Flux.OneHotVector:
 0
 0
 0
 0
 0
 1
 0
 0
 0
 0

It is a $5$: the 1 sits in the sixth position, and the positions correspond to the digits 0 through 9.
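As a quick check of that reading, the sketch below decodes the printed column by hand in Base Julia. The vector `v` is typed out from the output above, and the names are illustrative; Flux also provides `onecold` for this purpose.

```julia
# The one-hot column printed above, written out as a plain vector.
v = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
classes = 0:9

# Julia indexes from 1, so the hot position 6 maps to classes[6].
digit = classes[argmax(v)]
@show digit
```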

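Now we create a data loader for each of the train and test sets. A data loader is an iterable that yields batches, which we can loop over during training and evaluation. The test loader is created without a `batchsize`, so it falls back to the default of one sample per batch.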

In [13]:
train_dataloader = DataLoader(train_x, train_y, batchsize=128, shuffle=true)
test_dataloader = DataLoader(test_x, test_y)
;
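Conceptually, `batchsize` and `shuffle=true` amount to permuting the sample indices and slicing the last dimension in chunks. The following is a hand-rolled sketch of that idea in Base Julia, using a small random stand-in dataset; it is not Flux's implementation, and all names are illustrative.

```julia
using Random

# Small stand-in dataset: 10 samples in WHCB layout with 10-row label columns.
x = rand(Float32, 28, 28, 1, 10)
y = rand(Float32, 10, 10)

batchsize = 4
idx = shuffle(1:size(x, 4))   # plays the role of shuffle=true

# Slice the last (sample) dimension of both arrays with the same indices.
batches = [(x[:, :, :, i], y[:, i]) for i in Iterators.partition(idx, batchsize)]

for (xb, yb) in batches
    println(size(xb), " ", size(yb))
end
```

With 10 samples and a batch size of 4, this gives two full batches and a final batch of 2 samples. Indexing `x` and `y` with the same shuffled indices keeps each image paired with its label.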

### Step 3: Iterating over the dataset

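A dataloader splits the dataset into batches of a fixed size, except for the last batch, which holds whatever samples remain. Here, 60000 samples with a batch size of 128 give 468 full batches and a final batch of 96 samples, which is why the assertions below accept both sizes. We can iterate over the dataloader as we would in a training loop: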

In [14]:
for (x,y) in train_dataloader
    @assert size(x) == (28, 28, 1, 128) || size(x) == (28, 28, 1, 96)
    @assert size(y) == (10, 128) || size(y) == (10, 96)
end
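The two batch sizes accepted by the assertions follow from simple division. The sketch below reproduces that arithmetic in Base Julia, using `Iterators.partition` as a stand-in for the way a loader walks through sample indices when shuffling is off; the variable names are illustrative.

```julia
n, batchsize = 60000, 128

# Ceiling division counts the full batches plus the final partial one.
n_batches = cld(n, batchsize)
last_size = n - (n_batches - 1) * batchsize

@show n_batches
@show last_size

# The same split expressed as index ranges over the samples.
ranges = collect(Iterators.partition(1:n, batchsize))
@show length(ranges)
@show last(ranges)
```

This gives 469 batches in total, of which the last covers samples 59905 to 60000.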