#  Complete - Meet TensorFlow! Training a XOR Model

In this example, we assemble a multilayer perceptron network that can perform XOR.

It's not very useful, but it showcases how you build up a model using layers, and how to execute training with that model. 

It's simple enough that you know whether it's correct... which is why we're doing it!


## Setting up

First, we need to `import` the TensorFlow framework:

In [0]:
import TensorFlow

## Creating the model

To represent our XOR neural network model, we need to create a `struct` that adheres to the [`Layer` protocol](https://www.tensorflow.org/swift/api_docs/Protocols/Layer) (which is part of Swift for TensorFlow's API). Ours is called `XORModel`.

Inside the model, we want three layers:
* an input layer, to take the input
* a hidden layer 
* an output layer, to provide the output

All three layers should be `Dense` layers ([densely-connected layers](https://www.tensorflow.org/swift/api_docs/Structs/Dense)), each of which takes an `inputSize` and an `outputSize`.

The `inputSize` specifies how many values the layer takes as input. Likewise, `outputSize` specifies how many values the layer produces as output.
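To make the two sizes concrete, here is a minimal plain-Swift sketch of the affine part of a dense layer (input times weights, plus bias). The function name `denseForward` and the weight values are made up for illustration, and this is not how the library implements `Dense`; it only shows that `inputSize` fixes how many values go in and `outputSize` how many come out.

```swift
// Plain-Swift sketch of the affine part of a dense layer: output = input · weights + bias.
// `denseForward` and the numbers below are illustrative assumptions, not library code.
func denseForward(_ input: [Float], weights: [[Float]], bias: [Float]) -> [Float] {
    // `weights` has inputSize rows and outputSize columns.
    precondition(weights.count == input.count, "one weight row per input value")
    var output = bias
    for (i, x) in input.enumerated() {
        precondition(weights[i].count == bias.count, "one weight column per output value")
        for j in output.indices {
            output[j] += x * weights[i][j]
        }
    }
    return output
}

// inputSize: 2, outputSize: 1, the same shape as the XOR model's output layer.
let denseExample = denseForward([1, 0], weights: [[0.5], [-0.5]], bias: [0.25])
print(denseExample.count)  // 1
```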

Each layer also takes an `activation` function, which shapes the output of each node in the layer. There are many available activations, but [ReLU](https://www.tensorflow.org/swift/api_docs/Functions#leakyrelu_:alpha:) and [Sigmoid](https://www.tensorflow.org/swift/api_docs/Functions#sigmoid_:) are common.

For our three layers, we'll use `sigmoid`.
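As a rough illustration of why sigmoid suits this problem, the sketch below writes the logistic function in plain Swift. It squashes any real number into the range between 0 and 1, which matches XOR's outputs of 0 and 1. The name `logistic` is hypothetical, chosen so it does not clash with the library's `sigmoid`.

```swift
import Foundation

// Logistic sigmoid: 1 / (1 + e^(-x)).
// `logistic` is a hypothetical name, not the library's `sigmoid`.
func logistic(_ x: Float) -> Float {
    return 1 / (1 + exp(-x))
}

print(logistic(0))  // 0.5
// Large positive inputs approach 1, large negative inputs approach 0.
```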

We'll also need to provide a definition of our `@differentiable` `func`, `callAsFunction()`. In this case, we want it to return the `input` sequenced through (passed through) the three layers. 

Helpfully, the `Differentiable` `protocol` that comes with Swift for TensorFlow has a method, [`sequenced()`](https://www.tensorflow.org/swift/api_docs/Protocols/Differentiable#sequencedthrough:_:) that makes this trivial.
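Conceptually, sequencing an input through layers is the same as nesting the calls, with each layer's output becoming the next layer's input. The plain-Swift sketch below shows the idea with toy "layers" that are just functions. `ToyLayer`, `passThrough`, `addOne` and `timesTwo` are all hypothetical names, not part of Swift for TensorFlow.

```swift
// A toy "layer" is just a function from [Float] to [Float].
typealias ToyLayer = ([Float]) -> [Float]

// Hypothetical helper: feed the input through each layer in order.
func passThrough(_ input: [Float], layers: [ToyLayer]) -> [Float] {
    return layers.reduce(input) { value, layer in layer(value) }
}

let addOne: ToyLayer = { values in values.map { $0 + 1 } }
let timesTwo: ToyLayer = { values in values.map { $0 * 2 } }

// Chaining through the helper gives the same result as nesting the calls by hand.
let chained = passThrough([1, 2], layers: [addOne, timesTwo])
let nested = timesTwo(addOne([1, 2]))
print(chained == nested)  // true
```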



In [0]:
// Create a XORModel Struct
struct XORModel: Layer
{
  // define three layers, each of Dense type
  var inputLayer = Dense<Float>(inputSize: 2, outputSize: 2, activation: sigmoid)
  var hiddenLayer = Dense<Float>(inputSize: 2, outputSize: 2, activation: sigmoid)
  var outputLayer = Dense<Float>(inputSize: 2, outputSize: 1, activation: sigmoid)
  
  // the forward pass, marked differentiable so gradients can be computed through it
  @differentiable func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float>
  {
    return input.sequenced(through: inputLayer, hiddenLayer, outputLayer)
  }
}

## Creating an instance of our model

Here we need to create an instance of the `XORModel` struct we defined above. This will be our model.

In [0]:
var model = XORModel()

## Creating an optimizer

We also need an [optimiser](https://www.tensorflow.org/swift/api_docs/Protocols/Optimizer). In this case we're going to use a [stochastic gradient descent (SGD) optimiser](https://www.tensorflow.org/swift/api_docs/Classes/SGD), which we can get from the Swift for TensorFlow library.

Our optimiser is, obviously, for the model instance we defined a moment ago, and we give it a learning rate of 0.02.

In [0]:
let optimiser = SGD(for: model, learningRate: 0.02)
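For intuition, a single step of plain gradient descent moves each parameter a small distance against its gradient, scaled by the learning rate. The sketch below shows that rule in plain Swift, assuming no momentum or decay. `sgdStep` and the parameter values are hypothetical and are not the library's `SGD` implementation.

```swift
// One plain gradient descent step: parameter -= learningRate * gradient.
// `sgdStep` is a hypothetical helper; momentum and decay are assumed to be zero.
func sgdStep(_ parameters: [Float], gradients: [Float], learningRate: Float) -> [Float] {
    precondition(parameters.count == gradients.count, "one gradient per parameter")
    var updated = parameters
    for i in updated.indices {
        updated[i] -= learningRate * gradients[i]
    }
    return updated
}

// With a learning rate of 0.02, each parameter moves by 2% of its gradient.
let stepped = sgdStep([1.0, -1.0], gradients: [0.5, -0.5], learningRate: 0.02)
print(stepped)  // approximately [0.99, -0.99]
```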

##  Creating and labelling training data



We need a [`Tensor`](https://www.tensorflow.org/swift/api_docs/Structs/Tensor) to hold our training data (`[0, 0], [0, 1], [1, 0], [1, 1]`):

In [0]:
let trainingData: Tensor<Float> = [[0, 0], [0, 1], [1, 0], [1, 1]]

And we need to label the training data so that we know the correct outputs:


In [0]:
let trainingLabels: Tensor<Float> = [[0], [1], [1], [0]]
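As a quick sanity check, the labels can be derived from the inputs with Swift's integer XOR operator (`^`). This plain-Swift sketch uses hypothetical names (`xorInputs`, `xorLabels`) and integer arrays rather than tensors.

```swift
// The four possible input pairs, in the same order as the training data.
let xorInputs: [[Int]] = [[0, 0], [0, 1], [1, 0], [1, 1]]

// XOR is 1 exactly when the two inputs differ.
let xorLabels = xorInputs.map { pair in pair[0] ^ pair[1] }
print(xorLabels)  // [0, 1, 1, 0]
```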

## Training the model

First, we need a hyperparameter for epochs:

In [0]:
let epochs = 100_000

Then we need a training loop. We train the model by iterating through our epochs, computing the gradient of the loss with respect to the model each time (the 𝛁 symbol, nabla, is often used to represent a gradient). Our gradient is of type [`TangentVector`](https://www.tensorflow.org/swift/api_docs/Protocols/Differentiable#tangentvector), and represents a differentiable value’s derivatives.
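To get a feel for what a derivative of the loss tells us, the sketch below estimates one numerically for a toy loss with a single weight. This is only an illustration: Swift for TensorFlow computes gradients by automatic differentiation, not by finite differences, and `numericalDerivative` and `toyLoss` are hypothetical names.

```swift
// Central finite difference: (f(w + h) - f(w - h)) / 2h approximates df/dw.
func numericalDerivative(of f: (Double) -> Double, at w: Double, step h: Double = 1e-5) -> Double {
    return (f(w + h) - f(w - h)) / (2 * h)
}

// Toy loss with a single weight: (w - 3)^2, whose exact derivative is 2 * (w - 3).
let toyLoss: (Double) -> Double = { w in (w - 3) * (w - 3) }

// At w = 1 the slope is close to -4: the loss falls as w increases,
// so a gradient descent step would move w upwards, towards 3.
let slope = numericalDerivative(of: toyLoss, at: 1.0)
```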

Each epoch, we compute the predicted value by running the model on our training data, take the expected value to be our training labels, and calculate the loss using [`meanSquaredError()`](https://www.tensorflow.org/swift/api_docs/Functions#meansquarederrorpredicted:expected:).
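Mean squared error is the average of the squared differences between predictions and labels. A minimal plain-Swift sketch follows, with the hypothetical name `meanSquared` and arrays instead of tensors. It also shows that a model which answers 0.5 for every input scores exactly 0.25 on the XOR labels, which is a useful reference point when reading the loss values printed during training.

```swift
// Average of squared differences between predictions and labels.
// `meanSquared` is a hypothetical stand-in for the library's `meanSquaredError`.
func meanSquared(predicted: [Float], expected: [Float]) -> Float {
    precondition(predicted.count == expected.count && !predicted.isEmpty)
    var total: Float = 0
    for (p, e) in zip(predicted, expected) {
        total += (p - e) * (p - e)
    }
    return total / Float(predicted.count)
}

// An "undecided" model that outputs 0.5 for all four XOR inputs.
let undecided = meanSquared(predicted: [0.5, 0.5, 0.5, 0.5], expected: [0, 1, 1, 0])
print(undecided)  // 0.25
```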

Every so often we also want to print out the epoch we're in, and the current loss, so we can watch the training. The closure also needs to return the loss.

Finally, we need to use our [optimizer](https://www.tensorflow.org/swift/api_docs/Protocols/Optimizer) to [update](https://www.tensorflow.org/swift/api_docs/Protocols/Optimizer#update_:along:) the differentiable variables, along the gradient.


In [0]:
for epoch in 0..<epochs
{
    // closure for the gradient
    let 𝛁model = model.gradient { model -> Tensor<Float> in

        // predicted value (the model's output for the training data)
        let ŷ = model(trainingData)

        // loss 
        let loss = meanSquaredError(predicted: ŷ, expected: trainingLabels)

        // sometimes we want to print an update
        if epoch % 5000 == 0
        {
          print("epoch: \(epoch) loss: \(loss)")
        }
        return loss
    }
    // update the model
    optimiser.update(&model, along: 𝛁model)
}

epoch: 0 loss: 0.28061706
epoch: 5000 loss: 0.24945608
epoch: 10000 loss: 0.2490734
epoch: 15000 loss: 0.24835764
epoch: 20000 loss: 0.2469819
epoch: 25000 loss: 0.24420786
epoch: 30000 loss: 0.23800798
epoch: 35000 loss: 0.2237089
epoch: 40000 loss: 0.20193933
epoch: 45000 loss: 0.18637396
epoch: 50000 loss: 0.17841955
epoch: 55000 loss: 0.17337665
epoch: 60000 loss: 0.16862151
epoch: 65000 loss: 0.1626256
epoch: 70000 loss: 0.15436587
epoch: 75000 loss: 0.14475374
epoch: 80000 loss: 0.1311733
epoch: 85000 loss: 0.056418493
epoch: 90000 loss: 0.02172982
epoch: 95000 loss: 0.012243575


## Testing the model

In [0]:
print(round(model.inferring(from: [[0, 0], [0, 1], [1, 0], [1, 1]])))

[[0.0],
 [1.0],
 [1.0],
 [0.0]]
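The call to `round` is there because a sigmoid output lies somewhere between 0 and 1 rather than being a clean 0 or 1, so we snap each prediction to the nearest whole number before reading it. The plain-Swift sketch below shows the idea. The values in `rawOutputs` are made up for illustration and are not output from the trained model.

```swift
// Made-up raw sigmoid outputs, for illustration only (not from the trained model).
let rawOutputs: [Float] = [0.08, 0.91, 0.92, 0.07]

// Rounding snaps each value to the nearest whole number, so 0.5 acts as the threshold.
let roundedOutputs = rawOutputs.map { $0.rounded() }

// Compare against the expected XOR labels.
let expectedOutputs: [Float] = [0, 1, 1, 0]
print(roundedOutputs == expectedOutputs)  // true
```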
