# Load and minibatch MNIST data
(c) Deniz Yuret, 2019

* Objective: Learn the structure of the [MNIST](http://yann.lecun.com/exdb/mnist) dataset and how to use the [Knet.Data](https://github.com/denizyuret/Knet.jl/blob/master/src/data.jl) struct.
* Knet: dir, minibatch, Data
* mnist.jl: mnist, mnistview

In [None]:
# Load packages, import symbols
using Pkg; for p in ("Knet","Images","ImageMagick"); Base.find_package(p) === nothing && Pkg.add(p); end  # Pkg.installed() is deprecated in newer Julia versions
using Knet: Knet, dir, minibatch, Data

In [None]:
# This loads the MNIST handwritten digit recognition dataset:
include(Knet.dir("data","mnist.jl")) # Knet.dir constructs a path relative to Knet root
xtrn,ytrn,xtst,ytst = mnist()
println.(summary.((xtrn,ytrn,xtst,ytst)));

In [None]:
# Here are the first five images from the test set:
using Images
hcat([mnistview(xtst,i) for i=1:5]...)

In [None]:
# Here are their labels (the label 10 is used to represent the digit 0):
println(Int.(ytst[1:5]));
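
The label convention above can be undone with a modulo operation. Here is a minimal sketch, assuming a hypothetical label vector `labels` that stands in for `ytst[1:5]` (the values are made up for illustration, not read from the dataset):

```julia
# Hypothetical labels in the 1..10 convention, where 10 stands for the digit 0
labels = UInt8[7, 2, 1, 10, 4]

# mod maps 10 back to 0 and leaves 1..9 unchanged
digitvals = Int.(mod.(labels, 10))
println(digitvals)   # [7, 2, 1, 0, 4]
```

Keeping labels in 1..10 lets them be used directly as indices into Julia's 1-based arrays, which is presumably why the convention is used here.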

In [None]:
# `minibatch` splits the data tensors into small chunks called minibatches.
# It returns an iterator which can be used in a for loop, e.g. `for (x,y) in dtrn`
dtrn = minibatch(xtrn,ytrn,100)
dtst = minibatch(xtst,ytst,100)
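
To make the idea of splitting concrete, here is a hand-rolled sketch in plain Julia. It is not Knet's implementation: the arrays `x` and `y` are small hypothetical stand-ins for the MNIST tensors, and dropping the incomplete final batch is an assumption about the default behavior.

```julia
# Hypothetical stand-in data: 10 "images" of size 2x2x1 and 10 labels
x = rand(Float32, 2, 2, 1, 10)
y = rand(1:10, 10)
batchsize = 4

# Slice along the last dimension of x and the matching range of y.
# The incomplete final batch (instances 9 and 10) is dropped.
batches = [(x[:, :, :, i:i+batchsize-1], y[i:i+batchsize-1])
           for i in 1:batchsize:(length(y) - batchsize + 1)]

println(length(batches))           # 2
println(size(first(batches)[1]))   # (2, 2, 1, 4)
```

Unlike this eager list, the object returned by `minibatch` is an iterator, so batches are produced one at a time as the `for` loop requests them.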

In [None]:
# Each minibatch is an (x,y) pair where x holds 100 images (a 28x28x1x100 array) and y holds the corresponding 100 labels.
# Here is the first minibatch in the test set:
(x,y) = first(dtst)
summary.((x,y))

In [None]:
# dtrn generates 600 minibatches of 100 images (total 60000)
# dtst generates 100 minibatches of 100 images (total 10000)
n = 0
for (x,y) in dtrn
    n += 1
end
n
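
The counts in the comments above follow from integer division of the dataset size by the batch size. A short sketch of the arithmetic (the batch size of 128 is a hypothetical value, chosen only to show what happens when the data does not divide evenly):

```julia
ntrn, ntst, batchsize = 60000, 10000, 100

println(ntrn ÷ batchsize)   # 600
println(ntst ÷ batchsize)   # 100

# With a batch size that does not divide the data evenly, the count depends on
# whether the leftover instances form a final, smaller batch.
println(ntrn ÷ 128)         # 468 full batches (96 images left over)
println(cld(ntrn, 128))     # 469 if the partial batch is kept
```

If `Knet.Data` defines `length` (an assumption worth checking against the linked source), `length(dtrn)` would give the same count without a loop.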