# Machine Learning with Swift for TensorFlow and Plotly

[![GitHubBadge]][GitHubLink] [![ColabBadge]][ColabLink]


[ColabBadge]: https://colab.research.google.com/assets/colab-badge.svg "Run notebook in Google Colab"
[ColabLink]: https://colab.research.google.com/github/vojtamolda/Plotly.swift/blob/master/Examples/Notebooks/Machine%20Learning.ipynb

[GitHubBadge]: https://img.shields.io/badge/|-Edit_on_GitHub-green.svg?logo=github "Edit notebook's source code on GitHub"
[GitHubLink]: https://github.com/vojtamolda/Plotly.swift/blob/master/Examples/Notebooks/Machine%20Learning.ipynb


## Introduction

In this tutorial, we'll take a look at using Swift for TensorFlow in conjunction with Plotly to build a machine learning model and interactively visualize the resulting data in the `swift-jupyter` environment.

### Swift for TensorFlow

Swift for TensorFlow began development as a next-generation platform for machine learning, motivated by performance limitations and missing features in TensorFlow's Python API. It incorporates the latest research across machine learning, compilers, differentiable programming, systems design, and beyond. Read more about it on its [website](https://www.tensorflow.org/swift) and see the source code on [GitHub](https://github.com/tensorflow/swift).

### Plotly

The [Plotly.swift](https://github.com/vojtamolda/Plotly.swift) framework is an interactive plotting library that lets you plot data natively in Swift. It works by converting plots to an intermediate JSON representation that's later rendered by client-side JavaScript in the browser. Read more and inspect the source code on [GitHub](https://github.com/vojtamolda/Plotly.swift).
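To make the "intermediate JSON" idea concrete, here is a minimal sketch using only Foundation. The `SketchTrace` type is a hypothetical stand-in invented for this illustration; it is not the Plotly.swift API, and the real library's JSON schema is richer than this.

```swift
import Foundation

// Hypothetical stand-in for a plot trace; NOT the actual Plotly.swift type.
struct SketchTrace: Codable, Equatable {
    var type = "scatter"
    var name: String
    var x: [Double]
    var y: [Double]
}

let sketchTrace = SketchTrace(name: "Train Accuracy", x: [1, 2, 3], y: [0.5, 0.75, 1.0])

// Encode the trace to JSON text that a JavaScript renderer could consume.
let sketchEncoder = JSONEncoder()
sketchEncoder.outputFormatting = [.sortedKeys]
let sketchJSONData = try! sketchEncoder.encode(sketchTrace)
let sketchJSONText = String(data: sketchJSONData, encoding: .utf8)!
print(sketchJSONText)

// Decoding the JSON gives back an equal value, so nothing is lost on the way.
let roundTripped = try! JSONDecoder().decode(SketchTrace.self, from: sketchJSONData)
```

The point of the round trip is that the Swift side only has to describe the plot as data; drawing it is left to whatever consumes the JSON.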

### Significance

Previously, Swift for TensorFlow projects needed to fall back to Python libraries such as `matplotlib` and `numpy` for data analysis and visualization. However, the more features a project delegates to Python, the more it inherits some (though not all) of the performance limitations and issues that using Swift is supposed to avoid.

This tutorial creates a project to build, train, analyze, and visualize a machine learning model all natively in Swift, with no Python fallback. Hopefully, this will be a step toward S4TF machine learning researchers working in pure Swift in the future.

In [1]:
%install '.package(url: "https://github.com/tensorflow/swift-models", .branch("tensorflow-0.6"))' Datasets
%install '.package(url: "https://github.com/vojtamolda/Plotly.swift", .branch("master"))' Plotly
print("\u{001B}[2J") //removes installation output

%include "EnableIPythonDisplay.swift"




Next, we'll import all of the necessary modules.

In [0]:
import TensorFlow
import Datasets
import Plotly

// No PythonKit import! :D

## Initializing the Dataset

For our tutorial, we'll be using the MNIST dataset with a batch size of 500. There are 60,000 images in the training set and 10,000 in the test set, both of which are evenly divisible by 500.

In [3]:
let batchSize = 500
let mnist = Datasets.MNIST(batchSize: batchSize)

Loading resource: train-images-idx3-ubyte
Loading local data at: /content/train-images-idx3-ubyte
Succesfully loaded resource: train-images-idx3-ubyte
Loading resource: train-labels-idx1-ubyte
Loading local data at: /content/train-labels-idx1-ubyte
Succesfully loaded resource: train-labels-idx1-ubyte
Loading resource: t10k-images-idx3-ubyte
Loading local data at: /content/t10k-images-idx3-ubyte
Succesfully loaded resource: t10k-images-idx3-ubyte
Loading resource: t10k-labels-idx1-ubyte
Loading local data at: /content/t10k-labels-idx1-ubyte
Succesfully loaded resource: t10k-labels-idx1-ubyte
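As a quick check of the batch arithmetic, the sketch below works out how many batches each pass contains. The constant names are made up for this example; the sizes are the ones stated above.

```swift
// Dataset sizes and batch size as stated in this tutorial.
let sketchBatchSize = 500
let trainingExamples = 60_000
let testExamples = 10_000

// Integer division gives the number of full batches per pass.
let trainingBatches = trainingExamples / sketchBatchSize
let testBatches = testExamples / sketchBatchSize

// A zero remainder means no example is left out of a pass.
let leftover = (trainingExamples % sketchBatchSize, testExamples % sketchBatchSize)

print(trainingBatches, testBatches, leftover)  // prints "120 20 (0, 0)"
```

These two batch counts, 120 and 20, are the divisors used later when averaging the per-batch losses.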


## Building the Model

Next, let's build a very simple model.

First, we define a simple struct, `Model`, that conforms to the `Layer` protocol. This model will take in the input image and return the class output.

We'll use one `Flatten<Float>` layer followed by hidden and output `Dense<Float>` layers. Feel free to play around with different sizes, layers, etc. to see how the model changes. Note that the first input size must be `28 * 28` because that is the size of the MNIST images, and the last output size must be `10` because that is the number of classes in MNIST. Also, the `outputSize` of `hidden` and the `inputSize` of `output` must be the same.

The `Layer` protocol requires a function, `callAsFunction`, that is called to pass the `input` through our model.

In [0]:
struct Model: TensorFlow.Layer {
    var flatten = TensorFlow.Flatten<Float>()
    var hidden = TensorFlow.Dense<Float>(inputSize: 28 * 28, outputSize: 20, activation: relu)
    var output = TensorFlow.Dense<Float>(inputSize: 20, outputSize: 10, activation: softmax)
    
    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        return input.sequenced(through: flatten, hidden, output)
    }
}
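To get a feel for how small this model is, we can count its trainable values by hand. This is a back-of-the-envelope sketch that assumes each `Dense` layer stores one weight per input/output pair plus one bias per output; `denseParameterCount` is a hypothetical helper, not part of Swift for TensorFlow.

```swift
// Number of trainable values in a fully connected layer:
// one weight per input/output pair plus one bias per output.
func denseParameterCount(inputSize: Int, outputSize: Int) -> Int {
    return inputSize * outputSize + outputSize
}

let hiddenCount = denseParameterCount(inputSize: 28 * 28, outputSize: 20)
let outputCount = denseParameterCount(inputSize: 20, outputSize: 10)

print(hiddenCount, outputCount, hiddenCount + outputCount)  // prints "15700 210 15910"
```

Changing the hidden size from 20 to another value changes both counts, which is why the two sizes have to be edited together.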

Lastly, we can initialize an instance with:

In [0]:
var model = Model()

## Training the Model

To train the model, we first have to choose the number of epochs, that is, the number of times the model will "pass through" the entire training set. For our example, we'll set this number to 10.

In [0]:
let numEpochs = 10

We also need an optimizer, which updates the model's parameters from the gradients of the loss as we train. Let's use the Adam optimizer, an adaptive learning rate algorithm whose name comes from "adaptive moment estimation".

In [0]:
let optimizer = TensorFlow.Adam(for: model)
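For intuition, here is the textbook Adam update written for a single scalar parameter. It is a sketch of the published algorithm, not of the `TensorFlow.Adam` implementation; the `ScalarAdam` type is hypothetical and the learning rate of 0.1 is chosen only to make this toy problem converge quickly.

```swift
import Foundation

// Scalar Adam state and update rule. The hyperparameter values are
// assumptions of this sketch, not settings taken from the tutorial.
struct ScalarAdam {
    var learningRate = 0.1
    var beta1 = 0.9
    var beta2 = 0.999
    var epsilon = 1e-8
    var firstMoment = 0.0   // running mean of gradients
    var secondMoment = 0.0  // running mean of squared gradients
    var step = 0

    mutating func update(_ parameter: inout Double, gradient: Double) {
        step += 1
        firstMoment = beta1 * firstMoment + (1 - beta1) * gradient
        secondMoment = beta2 * secondMoment + (1 - beta2) * gradient * gradient
        // Bias correction compensates for the zero-initialized moments.
        let correctedFirst = firstMoment / (1 - pow(beta1, Double(step)))
        let correctedSecond = secondMoment / (1 - pow(beta2, Double(step)))
        parameter -= learningRate * correctedFirst / (sqrt(correctedSecond) + epsilon)
    }
}

// Minimize f(p) = p * p, whose gradient is 2 * p, starting from p = 1.
var sketchAdam = ScalarAdam()
var sketchParameter = 1.0
for _ in 0..<500 {
    let gradient = 2 * sketchParameter
    sketchAdam.update(&sketchParameter, gradient: gradient)
}
print(sketchParameter)
```

The "adaptive" part is the division by the square root of the second moment: parameters with consistently large gradients take proportionally smaller steps.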

### Benchmarking

We can create a basic `struct` to accumulate the number of correct predictions and the total loss while training.

In [0]:
struct Stat {
    var correct: Int = 0
    var loss: Float = 0
    mutating func update(logits: Tensor<Float>, labels: Tensor<Int32>) {
        self.correct += Int(Tensor<Int32>(logits.argmax(squeezingAxis: 1) .== labels).sum().scalarized())
    }
}
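The counting done by `update` can be sketched without tensors: take the index of the largest score in each row and compare it with the label. The helper `argmaxIndex` and the sample values are made up for this illustration.

```swift
// Index of the largest score in one row (the predicted class).
func argmaxIndex(_ scores: [Float]) -> Int {
    return scores.indices.max(by: { scores[$0] < scores[$1] })!
}

// Three examples with three classes each; values are made up for illustration.
let sketchScores: [[Float]] = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
let sketchLabels = [1, 0, 1]

// Count rows whose predicted class equals the label.
let correctCount = zip(sketchScores, sketchLabels)
    .filter { argmaxIndex($0.0) == $0.1 }
    .count

print(correctCount)  // prints "2"
```

Dividing this count by the number of examples gives the accuracy reported after each epoch.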

We'll also create arrays to record the loss and accuracy after each epoch so that we can visualize them with Plotly once training is done.

In [0]:
var epochs = stride(from: Float(1), to: Float(numEpochs+1), by: Float(1))
var trainLoss: [Float] = []
var trainAccuracy: [Float] = []
var testLoss: [Float] = []
var testAccuracy: [Float] = []
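If the `+ 1` in the stride looks odd, this small sketch shows why it is there. The names are hypothetical and the epoch count mirrors the value chosen earlier.

```swift
let sketchEpochCount = 10

// `stride(from:to:by:)` excludes the upper bound, hence the `+ 1`.
let sketchEpochs = Array(stride(from: Float(1), to: Float(sketchEpochCount + 1), by: Float(1)))

print(sketchEpochs.count, sketchEpochs.first!, sketchEpochs.last!)  // prints "10 1.0 10.0"
```

Because the epochs start at 1 while arrays start at 0, the training loop reads the latest entry with `Int(epoch) - 1`.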

Next, we can start the model's training loop.

In [11]:
//training loop
for epoch in epochs {
    //training phase
    var trainStat = Stat()
    Context.local.learningPhase = .training
    for i in 0 ..< mnist.trainingSize / batchSize {
        //get batch of images/labels
        let images = mnist.trainingImages.minibatch(at: i, batchSize: batchSize)
        let labels = mnist.trainingLabels.minibatch(at: i, batchSize: batchSize)
        //compute gradient
        let (loss, gradients) = valueWithGradient(at: model) { model -> Tensor<Float> in
            let logits = model(images)
            trainStat.update(logits: logits, labels: labels)
            return softmaxCrossEntropy(logits: logits, labels: labels)
        }
        trainStat.loss += loss.scalarized()
        //update model
        optimizer.update(&model, along: gradients)
    }
    
    //inference phase
    var testStat = Stat()
    Context.local.learningPhase = .inference
    for i in 0 ..< mnist.testSize / batchSize {
        //get batch of images/labels
        let images = mnist.testImages.minibatch(at: i, batchSize: batchSize)
        let labels = mnist.testLabels.minibatch(at: i, batchSize: batchSize)
        //compute loss
        let logits = model(images)
        testStat.update(logits: logits, labels: labels)
        let loss = softmaxCrossEntropy(logits: logits, labels: labels)
        testStat.loss += loss.scalarized()
    }
    
    //calculate and store data
    trainLoss.append(trainStat.loss / Float(mnist.trainingSize / batchSize))
    trainAccuracy.append(Float(trainStat.correct) / Float(mnist.trainingSize))
    testLoss.append(testStat.loss / Float(mnist.testSize / batchSize))
    testAccuracy.append(Float(testStat.correct) / Float(mnist.testSize))
    
    //print data
    print("Epoch: \(Int(epoch))")
    print("Train Loss: \(trainLoss[Int(epoch)-1])")
    print("Train Accuracy: \(trainAccuracy[Int(epoch)-1])")
    print("Test Loss: \(testLoss[Int(epoch)-1])")
    print("Test Accuracy: \(testAccuracy[Int(epoch)-1])")
    print("\n")
}

Epoch: 1
Train Loss: 1.9428478
Train Accuracy: 0.6056833
Test Loss: 1.6811168
Test Accuracy: 0.8472


Epoch: 2
Train Loss: 1.6384782
Train Accuracy: 0.87121665
Test Loss: 1.5965374
Test Accuracy: 0.8994


Epoch: 3
Train Loss: 1.5905955
Train Accuracy: 0.9001333
Test Loss: 1.5723908
Test Accuracy: 0.914


Epoch: 4
Train Loss: 1.572142
Train Accuracy: 0.91148335
Test Loss: 1.5605278
Test Accuracy: 0.9195


Epoch: 5
Train Loss: 1.5617635
Train Accuracy: 0.91756666
Test Loss: 1.5532496
Test Accuracy: 0.923


Epoch: 6
Train Loss: 1.5548134
Train Accuracy: 0.9211
Test Loss: 1.5481648
Test Accuracy: 0.9259


Epoch: 7
Train Loss: 1.5496384
Train Accuracy: 0.9246333
Test Loss: 1.5442741
Test Accuracy: 0.9288


Epoch: 8
Train Loss: 1.5454963
Train Accuracy: 0.92775
Test Loss: 1.5412167
Test Accuracy: 0.9304


Epoch: 9
Train Loss: 1.5420247
Train Accuracy: 0.93065
Test Loss: 1.5386671
Test Accuracy: 0.9318


Epoch: 10
Train Loss: 1.5390309
Train Accuracy: 0.9328833
Test Loss: 1.5365183
Test Accuracy: 0.9341
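
One thing worth noticing in the log above: the losses level off near 1.54 instead of heading toward zero. A likely reason (this is a reading of the code, not something measured separately) is that the `output` layer already applies `softmax`, and `softmaxCrossEntropy(logits:labels:)` applies softmax to its input once more. Since the values it receives then lie between 0 and 1, the loss for ten classes cannot drop below log(1 + 9/e), which is about 1.461. The sketch below reproduces that floor in plain Swift; `softmaxSketch` and `crossEntropySketch` are hypothetical helpers written for this illustration, not Swift for TensorFlow APIs.

```swift
import Foundation

// Softmax of a vector of scores.
func softmaxSketch(_ scores: [Double]) -> [Double] {
    let exponentials = scores.map { exp($0) }
    let total = exponentials.reduce(0, +)
    return exponentials.map { $0 / total }
}

// Cross-entropy for one example: softmax is applied to `scores` first.
func crossEntropySketch(scores: [Double], label: Int) -> Double {
    return -log(softmaxSketch(scores)[label])
}

// Best case for a model whose output is already a probability vector:
// probability 1 on the correct class and 0 on the other nine.
var perfectProbabilities = [Double](repeating: 0, count: 10)
perfectProbabilities[3] = 1

let lossFloor = crossEntropySketch(scores: perfectProbabilities, label: 3)
print(lossFloor)  // approximately 1.461
```

The argmax used for accuracy is not affected, since softmax preserves the ordering of the scores. Feeding the loss the raw output of a `Dense` layer without an activation would remove the floor, though the numbers would then differ from the ones printed in this tutorial.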

## Visualizing the Data

Now, let's visualize our data with Plotly! As demonstrated below, Plotly is very flexible, easy, and intuitive to use.

From our arrays, we've gathered the following data:

In [12]:
print("Train Loss: \(trainLoss)")
print("Train Accuracy: \(trainAccuracy)")
print("Test Loss: \(testLoss)")
print("Test Accuracy: \(testAccuracy)")

Train Loss: [1.9428478, 1.6384782, 1.5905955, 1.572142, 1.5617635, 1.5548134, 1.5496384, 1.5454963, 1.5420247, 1.5390309]
Train Accuracy: [0.6056833, 0.87121665, 0.9001333, 0.91148335, 0.91756666, 0.9211, 0.9246333, 0.92775, 0.93065, 0.9328833]
Test Loss: [1.6811168, 1.5965374, 1.5723908, 1.5605278, 1.5532496, 1.5481648, 1.5442741, 1.5412167, 1.5386671, 1.5365183]
Test Accuracy: [0.8472, 0.8994, 0.914, 0.9195, 0.923, 0.9259, 0.9288, 0.9304, 0.9318, 0.9341]


### Accuracy
Next, we'll make two `Scatter` traces: one for training accuracy and the other for test accuracy.

In [0]:
var trainAccuracyTrace = Plotly.Scatter(
    name: "Train Accuracy",
    x: epochs,
    y: trainAccuracy,
    line: .init(color: .orange),
    xAxis: .init(uid: 1),
    yAxis: .init(uid: 1)
)
var testAccuracyTrace = Plotly.Scatter(
    name: "Test Accuracy",
    x: epochs,
    y: testAccuracy,
    line: .init(color: .lightBlue),
    xAxis: .init(uid: 1, title: "Epoch"),
    yAxis: .init(uid: 1, title: "Score")
)

We can then configure figure-wide settings, such as the title, with a `Layout` object:

In [0]:
let accuracyLayout = Plotly.Layout(
    title: "Accuracy"
)

Finally, we can display an interactive figure in the notebook:

In [15]:
let accuracyFigure = Plotly.Figure(data: [trainAccuracyTrace, testAccuracyTrace], layout: accuracyLayout)
accuracyFigure.display()

From this, we can see that both accuracies started out relatively low, improved sharply over the first few epochs, and then leveled off with smaller gains in each later epoch.

### Loss

We can make a similar plot for the losses, which drop quickly at first and then flatten out:


In [16]:
var trainLossTrace = Plotly.Scatter(
    name: "Train Loss",
    x: epochs,
    y: trainLoss,
    line: .init(color: .orange),
    xAxis: .init(uid: 1),
    yAxis: .init(uid: 1)
)
var testLossTrace = Plotly.Scatter(
    name: "Test Loss",
    x: epochs,
    y: testLoss,
    line: .init(color: .lightBlue),
    xAxis: .init(uid: 1, title: "Epoch"),
    yAxis: .init(uid: 1, title: "Score")
)

let lossLayout = Plotly.Layout(
    title: "Loss"
)

let lossFigure = Plotly.Figure(data: [trainLossTrace, testLossTrace], layout: lossLayout)
lossFigure.display()

### Confusion Matrix

A [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix) is a visualization of the kinds of errors our model makes. For a pair of classes `i` and `j`, the value at `[i][j]` shows how frequently class `j` was classified as `i`. Correct predictions form the diagonal of the matrix, where `i == j`, and off-diagonal values represent errors.

The following code calculates the confusion matrix on the test set:


In [0]:
let digits = Array(0...9)
var confusionMatrix = Tensor<Float>(zeros: [digits.count, digits.count])

Context.local.learningPhase = .inference
for i in 0 ..< mnist.testSize / batchSize {
    let images = mnist.testImages.minibatch(at: i, batchSize: batchSize)
    let labels = mnist.testLabels.minibatch(at: i, batchSize: batchSize)

    let logits = model(images)
    let predictions = logits.argmax(squeezingAxis: 1)

    for (prediction, label) in zip(predictions.scalars, labels.scalars) {
        let iPrediction = TensorRange.index(Int(prediction))
        let iLabel = TensorRange.index(Int(label))
        confusionMatrix[iPrediction, iLabel] += 1
    }
}
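The same bookkeeping can be sketched with plain arrays, which makes the `[prediction][label]` indexing easy to verify by hand. The predictions and labels below are made up, and three classes stand in for the ten digits.

```swift
// Made-up predictions and labels over three classes, for illustration only.
let sketchPredictions = [0, 1, 2, 2, 1, 0]
let sketchTruth = [0, 1, 1, 2, 1, 2]
let classCount = 3

// counts[i][j] = how many examples with label j were predicted as class i.
var counts = [[Int]](repeating: [Int](repeating: 0, count: classCount), count: classCount)
for (prediction, label) in zip(sketchPredictions, sketchTruth) {
    counts[prediction][label] += 1
}

print(counts)  // prints "[[1, 0, 1], [0, 2, 0], [0, 1, 1]]"
```

The diagonal entries (1, 2, and 1) add up to the four correct predictions, and the two off-diagonal ones are the two mistakes.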

The confusion matrix is best visualized as a heatmap, with its entries normalized by the frequency of each label in the dataset:

In [18]:
let datasetLabelFrequency = confusionMatrix.sum(squeezingAxes: 0)
let normalizedConfusionMatrix = confusionMatrix / datasetLabelFrequency * 100

let confusionMatrixHeatmap = Plotly.Heatmap(
    name: "Accuracy",
    z: normalizedConfusionMatrix,
    x: digits, y: digits,
    hoverTemplate: .constant("""
        <span style='font-size: 1.5em'>%{z:.1f}%</span>
        <b>Prediction</b>: <span style='color: red'>%{x}</span> |
        <b>Label</b>: <span style='color: green'>%{y}</span>
        """),
    zMin: 0, zMax: 5,
    colorScale: .blues,
    xAxis: .init(
        title: "Model Prediction",
        dTick: 1
    ),
    yAxis: .init(
        title: "Correct Label",
        dTick: 1
    )
)

let layout = Plotly.Layout(
    title: "Confusion Matrix",
    width: 600, height: 600
)

let confusionMatrixFigure = Plotly.Figure(data: [confusionMatrixHeatmap], layout: layout)
confusionMatrixFigure.display()
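
The normalization step divides every entry by the total of its label's column. Here is a minimal sketch of that arithmetic on a made-up two-class matrix, using nested arrays instead of tensors; all names are hypothetical.

```swift
// rawCounts[i][j] = examples with label j predicted as class i (made-up values).
let rawCounts: [[Double]] = [
    [3, 1],
    [1, 3],
]

// Frequency of each label = sum of its column.
let labelTotals: [Double] = (0..<2).map { column in
    rawCounts.reduce(0) { $0 + $1[column] }
}

// Divide every entry by its column total and scale to percent.
let percentages: [[Double]] = rawCounts.map { row in
    row.enumerated().map { column, value in value / labelTotals[column] * 100 }
}

print(labelTotals)  // prints "[4.0, 4.0]"
print(percentages)  // prints "[[75.0, 25.0], [25.0, 75.0]]"
```

After normalization each column sums to 100, so the heatmap compares error rates rather than raw counts, which would otherwise favor the more frequent digits.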

### Examples of Errors

The following code collects all test images that are not correctly classified by our model:


In [0]:
var misclassified: [(image: Tensor<Float>, label: Int, prediction: Int)] = []

Context.local.learningPhase = .inference
for i in 0 ..< mnist.testSize / batchSize {
    let images = mnist.testImages.minibatch(at: i, batchSize: batchSize)
    let labels = mnist.testLabels.minibatch(at: i, batchSize: batchSize)

    let logits = model(images)
    let predictions = logits.argmax(squeezingAxis: 1)

    // `j` indexes examples within the batch; `i` above indexes batches.
    for j in 0 ..< predictions.scalarCount where predictions[j] != labels[j] {
        let wrong = (image: images[j],
                     label: Int(labels.array[j].scalar!),
                     prediction: Int(predictions.array[j].scalar!))
        misclassified.append(wrong)
    }
}
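
Stripped of tensors, the collection step is a filtered loop over positions where prediction and label disagree. The sketch below uses made-up values and hypothetical names.

```swift
// Made-up predictions and labels for five test examples.
let samplePredictions = [7, 2, 1, 0, 9]
let sampleLabels = [7, 2, 1, 0, 4]

// Keep only the positions where the prediction differs from the label.
var sketchMisclassified: [(index: Int, label: Int, prediction: Int)] = []
for index in samplePredictions.indices where samplePredictions[index] != sampleLabels[index] {
    sketchMisclassified.append((index: index,
                                label: sampleLabels[index],
                                prediction: samplePredictions[index]))
}
print(sketchMisclassified.count)  // prints "1"

// `randomElement()` returns nil for an empty array, so unwrap it safely.
if let example = sketchMisclassified.randomElement() {
    print(example.label, example.prediction)  // prints "4 9"
}
```

The optional binding matters when a model happens to make no mistakes: force-unwrapping `randomElement()` on an empty array would crash.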

We display a random sample of the erroneously classified images in a grid:

In [20]:
let (rows, columns) = (5, 9)
var misclassifiedDigits: [Trace] = []

for row in 1...rows {
    for column in 1...columns {
        let randomlySelected = misclassified.randomElement()!
        let rgbComponents = Array(repeating: 255 * (1 - randomlySelected.image), count: 3)
        let grayscaleImage = Tensor<Float>(concatenating: rgbComponents, alongAxis: -1)
        
        let misclassifiedDigit = Plotly.Image(
            name: "Error",
            z: grayscaleImage,
            hoverTemplate: .constant("""
                <b>Prediction</b>: <span style='color: red;'>\(randomlySelected.prediction)</span> |
                <b>Label</b>: <span style='color: green;'>\(randomlySelected.label)</span>
                """),
            xAxis: .init(uid: UInt(column), ticks: .off, showTickLabels: false,
                         showGrid: false, zeroLine: false),
            yAxis: .init(uid: UInt(row), ticks: .off, showTickLabels: false,
                         showGrid: false, zeroLine: false)
        )
        misclassifiedDigits.append(misclassifiedDigit)
    }
}

let gridLayout = Plotly.Layout(
    title: "Examples of Errors",
    grid: .init(rows: rows, columns: columns)
)

let examplesOfErrors = Plotly.Figure(data: misclassifiedDigits, layout: gridLayout)
examplesOfErrors.display()
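
The image preparation in the loop above inverts each grayscale pixel and repeats it across three color channels. A plain-array sketch of that conversion for one row of pixels, assuming intensities in the range 0 to 1 (the values and names are made up):

```swift
// One row of grayscale pixel intensities in 0...1 (made-up values).
let grayscaleRow: [Float] = [0.0, 0.5, 1.0]

// Invert so that high intensities become dark, scale to 0...255,
// and repeat the value for the red, green, and blue channels.
let rgbRow: [[Float]] = grayscaleRow.map { intensity in
    Array(repeating: 255 * (1 - intensity), count: 3)
}

print(rgbRow)  // prints "[[255.0, 255.0, 255.0], [127.5, 127.5, 127.5], [0.0, 0.0, 0.0]]"
```

Equal red, green, and blue values always produce a shade of gray, so the digits keep their grayscale look while fitting the three-channel format that the `Image` trace is given here.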

## Troubleshooting

Don't be sad if you run into some difficulty with this tutorial. These are new developments in machine learning, and there are bound to be some hiccups along the way :)

### Swift for TensorFlow

If you're having trouble installing/running Swift for TensorFlow, please join the [Google Group](https://groups.google.com/a/tensorflow.org/forum/#!forum/swift) and ask for help! Be as detailed as possible, and nice people will help you find a solution.

### Plotly

If you're having trouble installing/running Plotly, please go to [GitHub Issues](https://github.com/vojtamolda/Plotly.swift/issues) and file an issue! We'll be happy to help you work out any problems you might be facing.

## Conclusion

And we're done! We successfully built, trained, analyzed, and visualized a machine learning model all natively in Swift!

This project is only a step in the right direction toward pure Swift machine learning development. Both Swift for TensorFlow and Plotly are still in early-stage, active development, with programmers working hard to add new features, fix bugs, and improve usability.

Hopefully, this tutorial shows that Swift for TensorFlow and open source Swift libraries are a real contender for the future of machine learning, and that it's already possible to create complete (albeit simple) projects without any Python fallback :P


## Credits/Acknowledgments

This tutorial wouldn't be possible without the previous hard work of other people. It was adapted from the original [SwiftPlot Version](https://github.com/KarthikRIyer/swiftplot/blob/master/Notebooks/Machine%20Learning%20with%20Swift%20for%20TensorFlow%20and%20SwiftPlot.ipynb) for Plotly and is licensed under the Apache 2.0 license. Big thank-you's to the following:

- Swift for TensorFlow team
- SwiftPlot contributors
- Karthik Iyer and Ayush Agrawal for their support and guidance

### References

Here are some references that I found helpful while working on this tutorial:

- [S4TF Tutorial (Wierenga)](https://rickwierenga.com/blog/s4tf/s4tf-mnist.html)
- [S4TF Tutorial (Bolella)](https://heartbeat.fritz.ai/swifty-ml-an-intro-to-swift-for-tensorflow-9edc7045bc0c)
- [Swift for TensorFlow GitHub](https://github.com/tensorflow/swift)
- [Swift for TensorFlow Documentation](https://www.tensorflow.org/swift)
- [Plotly GitHub](https://github.com/vojtamolda/Plotly.swift)
- [Plotly Documentation](https://vojtamolda.github.io/Plotly.swift)


Thanks for reading, and have fun playing around with Swift for TensorFlow and Plotly!

Written By: William Zhang

Adapted By: Vojta Molda