<a href="https://colab.research.google.com/github/dbolella/s4tf-lenet-mnist/blob/master/lenet_mnist_swift_models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LeNet-5 & MNIST using Swift for TensorFlow
by Danny Bolella

To learn more about how this Colab works, check out the associated Medium article at: 

This Colab is a reworking of the official S4TF Example found at: https://github.com/tensorflow/swift-models/tree/master/Examples/LeNet-MNIST.

## Installing and Importing Libraries
First, we pull in two libraries as Swift packages from the official S4TF models repo using `%install`. Once that completes, we import the libraries we'll be using. TensorFlow is already available on Colab, so there is no need to install it.

In [1]:
%install '.package(url: "https://github.com/tensorflow/swift-models.git", .branch("master"))' ImageClassificationModels Datasets

import TensorFlow
import Datasets
import ImageClassificationModels

Installing packages:
	.package(url: "https://github.com/tensorflow/swift-models.git", .branch("master"))
		ImageClassificationModels
		Datasets
With SwiftPM flags: []
Working in: /tmp/tmptn8w0m2q/swift-install
Fetching https://github.com/tensorflow/swift-models.git
Cloning https://github.com/tensorflow/swift-models.git
Resolving https://github.com/tensorflow/swift-models.git at master
[1/11] Compiling ImageClassificationModels DenseNet121.swift
[2/11] Compiling ImageClassificationModels LeNet-5.swift
[3/11] Compiling Datasets DatasetUtilities.swift
[4/11] Compiling Datasets MNIST.swift
[5/12] Merging module Datasets
[10/12] Compiling ImageClassificationModels SqueezeNet.swift
[11/12] Compiling ImageClassificationModels WideResNet.swift
[12/13] Merging module ImageClassificationModels
[13/14] Compiling jupyterInstalledPackages jupyterInstalledPackages.swift
[14/15] Merging module jupyterInstalledPackages
[15/15] Linking libjupyterInstalledPackages.so
Initializing Swift...
Installation c

## Model, Dataset, Optimizer... Oh My!
Next, we instantiate the dataset, model, and optimizer we will be using. We also set up `epochCount` (the number of full passes we'll make over the training data) and `batchSize` (how many samples we feed to the model at a time).

In [2]:
let batchSize = 128

let dataset = MNIST(batchSize: batchSize)

var model = LeNet()

let optimizer = SGD(for: model, learningRate: 0.1)

let epochCount = 12

Loading resource: train-images-idx3-ubyte
Loading local data at: /content/train-images-idx3-ubyte
Succesfully loaded resource: train-images-idx3-ubyte
Loading resource: train-labels-idx1-ubyte
Loading local data at: /content/train-labels-idx1-ubyte
Succesfully loaded resource: train-labels-idx1-ubyte
Loading resource: t10k-images-idx3-ubyte
Loading local data at: /content/t10k-images-idx3-ubyte
Succesfully loaded resource: t10k-images-idx3-ubyte
Loading resource: t10k-labels-idx1-ubyte
Loading local data at: /content/t10k-labels-idx1-ubyte
Succesfully loaded resource: t10k-labels-idx1-ubyte
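
To get a feel for what `batchSize` means for an epoch, here is a minimal sketch in plain Swift (no TensorFlow needed). It assumes the standard MNIST split sizes of 60,000 training and 10,000 test images, and that only full batches are used, which is how the loops later in this Colab are written. The names `fullBatches` and `exampleBatchSize` are made up for this illustration.

```swift
// Assumed split sizes (standard MNIST); not read from the `dataset` object.
let mnistTrainingSize = 60_000
let mnistTestSize = 10_000
let exampleBatchSize = 128

/// Returns how many full batches fit in `sampleCount`,
/// and how many samples are left over at the end.
func fullBatches(of sampleCount: Int, batchSize: Int) -> (batches: Int, leftover: Int) {
    return (sampleCount / batchSize, sampleCount % batchSize)
}

let trainSplit = fullBatches(of: mnistTrainingSize, batchSize: exampleBatchSize)
let testSplit = fullBatches(of: mnistTestSize, batchSize: exampleBatchSize)

// 468 batches cover 59904 training samples; 96 are skipped.
print("train: \(trainSplit.batches) batches, \(trainSplit.batches * exampleBatchSize) used, \(trainSplit.leftover) skipped")
// 78 batches cover 9984 test samples; 16 are skipped.
print("test: \(testSplit.batches) batches, \(testSplit.batches * exampleBatchSize) used, \(testSplit.leftover) skipped")
```

This is why the totals in the training log are 59904 and 9984 rather than the full split sizes.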


## Benchmarking Prep
Next, we create a `struct` to hold our training and testing statistics for each epoch. Note that the struct also has a method, `updateGuessCounts`, that updates its `correctGuessCount` and `totalGuessCount`. This eliminates duplicate code in our training and testing loops.

In [0]:

struct Statistics {
    var correctGuessCount: Int = 0
    var totalGuessCount: Int = 0
    var totalLoss: Float = 0

    // Adds one batch's results to the running counts.
    mutating func updateGuessCounts(logits: Tensor<Float>, labels: Tensor<Int32>, batchSize: Int) {
        // A guess is correct when the highest-scoring class matches the label.
        let correctPredictions = logits.argmax(squeezingAxis: 1) .== labels
        correctGuessCount += Int(Tensor<Int32>(correctPredictions).sum().scalarized())
        totalGuessCount += batchSize
    }
}
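
If the tensor operations in `updateGuessCounts` feel opaque, the same idea can be sketched with plain Swift arrays. This is only an illustration, not what the struct above runs: `argmax` and `countCorrect` are hypothetical helpers, and the logits and labels are made-up values.

```swift
/// Index of the largest score in a row (the first one wins on ties).
func argmax(_ row: [Float]) -> Int {
    var best = 0
    for (index, value) in row.enumerated() where value > row[best] {
        best = index
    }
    return best
}

/// Counts the rows whose highest-scoring class equals the label.
func countCorrect(logits: [[Float]], labels: [Int]) -> Int {
    precondition(logits.count == labels.count, "need one label per row of logits")
    var correct = 0
    for (row, label) in zip(logits, labels) where argmax(row) == label {
        correct += 1
    }
    return correct
}

// Made-up batch: three samples, three classes each.
let sampleLogits: [[Float]] = [
    [0.1, 2.0, -1.0],  // highest score at class 1
    [3.0, 0.2, 0.5],   // highest score at class 0
    [0.0, 0.1, 0.9],   // highest score at class 2
]
let sampleLabels = [1, 2, 2]

print(countCorrect(logits: sampleLogits, labels: sampleLabels))  // prints 2
```

The tensor version does the same thing for a whole batch at once: `argmax(squeezingAxis: 1)` picks the best class per row, `.==` compares element-wise with the labels, and the sum counts the matches.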

## Training Day
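Lastly, we run our training! We run the training loop `epochCount` times. In each epoch, we loop through the training data batch by batch, run each batch through our model, update our statistics, and optimize along the gradients. We then switch to inference mode and run the test data through the model to see how well it handles images it was not trained on.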
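
To make "optimize along the gradients" concrete, here is a rough sketch of a single gradient-descent step in plain Swift. It assumes plain SGD with no momentum or decay, and treats the parameters as a flat array, which is a simplification of what the optimizer does with the model's layers. `sgdStep` and the numbers are hypothetical.

```swift
/// One plain gradient-descent step: move each parameter a small
/// distance in the direction that reduces the loss.
func sgdStep(parameters: [Float], gradients: [Float], learningRate: Float) -> [Float] {
    precondition(parameters.count == gradients.count, "need one gradient per parameter")
    return zip(parameters, gradients).map { pair in
        pair.0 - learningRate * pair.1
    }
}

// Made-up parameters and gradients, with the same learning rate as above.
let updatedParameters = sgdStep(
    parameters: [1.0, -2.0],
    gradients: [0.5, -1.0],
    learningRate: 0.1)
print(updatedParameters)
```

A positive gradient nudges its parameter down and a negative gradient nudges it up, each scaled by the learning rate.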

At the end of each epoch, we print out our statistics. We should see our loss decrease and our accuracy increase with each epoch.
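
One thing to keep in mind when reading the output: `totalLoss` adds up one loss value per batch, so the printed loss grows with the number of batches. A small sketch of turning it into a per-batch average, assuming each batch's loss is itself already averaged over the batch (`meanLoss` is a hypothetical helper, not part of the loop below):

```swift
/// Divides a loss that was summed over batches by the number of batches.
func meanLoss(totalLoss: Float, batchCount: Int) -> Float {
    precondition(batchCount > 0, "need at least one batch")
    return totalLoss / Float(batchCount)
}

// Epoch 1 training loss from the log below, over 59904 / 128 = 468 batches.
print(meanLoss(totalLoss: 957.23724, batchCount: 59904 / 128))  // about 2.05
```

This also explains why the training loss looks so much larger than the test loss: it is summed over 468 batches instead of 78.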

In [0]:
print("Beginning training...")

// The training loop.
for epoch in 1...epochCount {
    var trainStats = Statistics()
    var testStats = Statistics()
    Context.local.learningPhase = .training
    for i in 0 ..< dataset.trainingSize / batchSize {
        let images = dataset.trainingImages.minibatch(at: i, batchSize: batchSize)
        let labels = dataset.trainingLabels.minibatch(at: i, batchSize: batchSize)
        // Compute the gradient with respect to the model.
        let (loss, gradients) = valueWithGradient(at: model) { model -> Tensor<Float> in
            let logits = model(images)
            trainStats.updateGuessCounts(logits: logits, labels: labels, batchSize: batchSize)
            return softmaxCrossEntropy(logits: logits, labels: labels)
        }
        trainStats.totalLoss += loss.scalarized()
        optimizer.update(&model, along: gradients)
    }

    Context.local.learningPhase = .inference
    for i in 0 ..< dataset.testSize / batchSize {
        let images = dataset.testImages.minibatch(at: i, batchSize: batchSize)
        let labels = dataset.testLabels.minibatch(at: i, batchSize: batchSize)
        // Compute loss on test set
        let logits = model(images)
        testStats.updateGuessCounts(logits: logits, labels: labels, batchSize: batchSize)
        let loss = softmaxCrossEntropy(logits: logits, labels: labels)
        testStats.totalLoss += loss.scalarized()
    }

    let trainAccuracy = Float(trainStats.correctGuessCount) / Float(trainStats.totalGuessCount)
    let testAccuracy = Float(testStats.correctGuessCount) / Float(testStats.totalGuessCount)
    print("""
          [Epoch \(epoch)] \
          Training Loss: \(trainStats.totalLoss), \
          Training Accuracy: \(trainStats.correctGuessCount)/\(trainStats.totalGuessCount) \
          (\(trainAccuracy)), \
          Test Loss: \(testStats.totalLoss), \
          Test Accuracy: \(testStats.correctGuessCount)/\(testStats.totalGuessCount) \
          (\(testAccuracy))
          """)
}

Beginning training...
[Epoch 1] Training Loss: 957.23724, Training Accuracy: 27154/59904 (0.45329192), Test Loss: 128.8437, Test Accuracy: 8156/9984 (0.81690705)
[Epoch 2] Training Loss: 768.2697, Training Accuracy: 49305/59904 (0.8230669), Test Loss: 126.20526, Test Accuracy: 8441/9984 (0.8454527)
[Epoch 3] Training Loss: 730.51227, Training Accuracy: 54220/59904 (0.9051148), Test Loss: 119.77396, Test Accuracy: 9282/9984 (0.9296875)
[Epoch 4] Training Loss: 715.5964, Training Accuracy: 56055/59904 (0.9357472), Test Loss: 118.913506, Test Accuracy: 9367/9984 (0.9382011)
[Epoch 5] Training Loss: 709.2892, Training Accuracy: 56851/59904 (0.9490351), Test Loss: 117.98292, Test Accuracy: 9490/9984 (0.9505208)
[Epoch 6] Training Loss: 705.69525, Training Accuracy: 57249/59904 (0.95567906), Test Loss: 117.355515, Test Accuracy: 9567/9984 (0.9582332)
[Epoch 7] Training Loss: 702.83966, Training Accuracy: 57592/59904 (0.9614049), Test Loss: 117.02219, Test Accuracy: 9603/9984 (0.96183896)
[Ep