# Retroactive changes

<ol>
    <li><a href="https://github.com/SpaceAntelope/IfCntk/blob/master/notebooks/cntk-tutorials/Preparing%20workspace.ipynb">Prepare workspace notebook</a> (<a href="https://areslazarus.com/archive/net-deep-learning-stack-cntk-101-logistic-regression/">blog</a>) updated to include <a href="http://fsprojects.github.io/FSharp.Control.AsyncSeq/">FSharp.Control.AsyncSeq</a> dependency.</li>
    <li>Undescore_separated variable names now indicate a global variable original to the Python tutorial</li>
    <li>Comments starting with <code>// #</code> indicate an unchanged comment from the original python code.</li>    
</ol>
    
I'm also moving all reuseable chart related helper functions into their own script file at <strong>NbVisual.fsx</strong>

# Preparing the workspace for CNTK in jupyter, as always

If referencing CNTK fails, make sure you have followed the instructions in my [Preparing Workspace.ipynb](https://github.com/SpaceAntelope/IfCntk/tree/master/notebooks/cntk-tutorials/Preparing%20workspace.ipynb) notebook. The current notebook assumes that all necessary CNTK nuget DLLs have been copied to a folder named **bin** in the same path. 

In [1]:
#r "netstandard"
#r @"bin\Cntk.Core.Managed-2.6.dll"
#load @".paket\load\main.group.fsx"
 
open System
open System.IO

Environment.GetEnvironmentVariable("PATH")
|> fun path -> sprintf "%s%c%s" path (Path.PathSeparator) (Path.GetFullPath("bin"))
|> fun path -> Environment.SetEnvironmentVariable("PATH", path)

open CNTK
DeviceDescriptor.UseDefaultDevice().Type
|> printfn "Congratulations, you are using CNTK for: %A" 

Congratulations, you are using CNTK for: CPU


Additionally, let's reference any helper functions from the previous notebook:

In [2]:
// Setup display support
#load "AsyncDisplay.fsx"
#load "XPlot.Plotly.fsx"

let device = CNTK.DeviceDescriptor.CPUDevice
let dataType = CNTK.DataType.Float

#load "fsx/NBHelpers.fsx"
#load "fsx/MiscellaneousHelpers.fsx"
#load "fsx/NbVisual.fsx"
#load "fsx/CntkHelpers.fsx"

open NBHelpers
open MiscellaneousHelpers
open NbVisual
open CntkHelpers
open XPlot.Plotly
open MathNet.Numerics

<div class="alert alert-warning">
    <p>Since we are using the <code>storage: none</code> option in paket (see <a href="">Preparing the workspace</a> notebook) attempting to load <strong>AsyncDisplay.fsx</strong> as is produces errors. This is because the script looks to resolve its dependencies within the IfSharp.exe directory tree.</p>

<p>Running <strong>AsyncDisplay.Paket.fsx</strong> just like in the <a href="https://github.com/fsprojects/IfSharp/blob/master/FSharp_Jupyter_Notebooks.ipynb">IfSharp feature notebook</a> would make the library available there, but it has two important drawbacks:
    <ol type="i"> 
        <li>You are locked into using a particular version of the library</li>
        <li><i>IfSharp/.paket/load/main.group.fsx</i> now overrides the scope of this notebook, meaning we lose access to the libraries in the global .nuget folder unless we delete that particular script.</li>
    </ol>
</p>
<p>A simple solution is to just remove the first <i>#load</i> directive from <strong>AsyncDisplay.fsx</strong> to prevent it from looking for <strong>FSharp.Control.AsyncSeq</strong> in the IfSharp tree.</p></div>

<div class="alert alert-info">For more info on setting the device descriptor see the <a href="https://github.com/SpaceAntelope/IfCntk/notebooks/cntk-tutorials/101-LogReg-CPUOnly.ipynb">previous notebook</a> under the heading <strong><a href="https://github.com/SpaceAntelope/IfCntk/blob/master/notebooks/cntk-tutorials/101-LogReg-CPUOnly.ipynb#Global-variables">Global variables</a></strong></div>

# CNTK 102: Feed Forward Network with Simulated Data

[Link to original python notebook](https://github.com/Microsoft/CNTK/blob/master/Tutorials/CNTK_102_FeedForward.ipynb)

## Introduction

ML theory aside, the gist of this tutorial is:
<ol type="i">
    <li>Learn to connect multiple layers like the one in CNTK 101 in order to form a deep learning model and
    <li>Formalize a machine learning pipeline from data to model.
</ol>        

The pipeline consists of Data access, Data transformation, Model creation, Training and Evaluating. This is explictly modelled in [ML.NET](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet).

Apart from that, CNTK-102 is just CNTK-101 reloaded, with much of the same code reappearing. So in order to keep things interesting I embelished a bit: 
<ol type="i" start="3">
    <li> I added a section on using the freshly introduced hidden layer architecture to perform classification on a non linearly separable version of the data generator introduced in 101.</li>
    <li> I added an example of using <strong><a href="http://fsprojects.github.io/FSharp.Control.AsyncSeq/">FSharp.Control.AsyncSeq</a></strong> and IfSharp's <strong>AsyncDisplay.fsx</strong> to display training data on the notebook while training is ongoing.
        </ol>

In [3]:
// # Figure 1
 ImageUrl "https://www.cntk.ai/jup/cancer_data_plot.jpg" 400

## Feed forward network model


In [4]:
// # Figure 2
ImageUrl "https://upload.wikimedia.org/wikipedia/en/5/54/Feed_forward_neural_net.gif" 200

## Data Generation

In [5]:
// # Ensure we always get the same amount of randomness
MiscellaneousHelpers.seed <- 42

// # Define the data dimensions
let input_dim = 2
let num_output_classes = 2

## Input and Labels

In [6]:
// # Create the input variables denoting the features and the label data. Note: the input 
// # does not need additional info on number of observations (Samples) since CNTK first create only 
// # the network topology first 
let mysamplesize = 64
let features, labels = generateRandomDataSample mysamplesize input_dim num_output_classes

In [7]:
simpleScatterPlot features labels "Scaled age (in yrs)" "Tumor size (in cm)" |> Display

This expression was expected to have type
    'string'    
but here has type
    'LinearAlgebra.Matrix<float32>'    
This expression was expected to have type
    'string'    
but here has type
    'LinearAlgebra.Matrix<float32>'    
The type 'string' is not compatible with the type 'LinearAlgebra.Matrix<float32>'
The type 'string' is not compatible with the type 'LinearAlgebra.Matrix<float32>'

## Model creation

In [8]:
ImageUrl "http://cntk.ai/jup/feedforward_network.jpg" 200

In [9]:
let num_hidden_layers = 2
let hidden_layers_dim = 50

In [10]:
// # The input variable (representing 1 observation, in our example of age and size) x, which 
// # in this case has a dimension of 2. 
// #
// # The label variable has a dimensionality equal to the number of output classes in our case 2. 

let input = Variable.InputVariable(shape [|input_dim|], dataType, "Features")
let label = Variable.InputVariable(shape [|num_output_classes|], dataType, "Labels")
let initialization = CNTKLib.GlorotUniformInitializer(1.0)

<div class="alert alert-info">Unlike the python API, weight initialization is not automatic in the managed API</div>

## Feed forward network setup

In [11]:
"z_1 = W·x+b" |> Util.Math

{Latex = "$$z_1 = W·x+b$$";}

In C# Variables, Functions and Parameters types are mostly interchangeable when used as function arguments. These tiny helpers will help make the explicit conversion from CNTK.Function to CNTK.Variable easier to integrate in F# pipelines:

In [12]:
/// Convert Function to Variable
/// <remarks> CNTK helper </remarks>
let inline Var (x : CNTK.Function) = new Variable(x)

/// Convert Variable to Function
/// <remarks> CNTK helper </remarks>
let inline Fun (x : CNTK.Variable) = x.ToFunction()

In [13]:
/// Create a new linear layer in the W·x+b pattern
/// <remarks> CNTK helper </helper>
let linearLayer (inputVar : Variable) outputDim =
    let inputDim = inputVar.Shape.[0] 
    
    // Note that unlike the python example, the dimensionality of the output
    // goes first in the parameter declaration, otherwise the connection 
    // cannot be propagated.
    let weightParam = new Parameter(shape [outputDim; inputDim], dataType, initialization, device, "Weights")
    let biasParam = new Parameter(shape [outputDim], dataType, 0.0, device, "Bias")    
    
    let dotProduct = CNTKLib.Times(weightParam, inputVar, "Weighted input")
    CNTKLib.Plus(Var dotProduct, biasParam, "Layer")    

In [14]:
/// Create a new linear layer and fully connect it to 
/// an existing one through a specified differentiable function
/// <remarks> CNTK helper </helper>
let denseLayer (nonlinearity: Variable -> Function) inputVar outputDim  =
    linearLayer inputVar outputDim
    |> Var |> nonlinearity |> Var

I changed the order parameter order of the **denseLayer** generator function by putting the activation function parameter first, in order to facilitate composition (it will become obvious how very shortly). Also, since the purpose of **denseLayer** is to connect two layers together, and the existing layer is presented as a <code>CNTK.Variable</code>, it seems prudent to preemptively convert the result from Function to Variable as well.

#### Just F# problems
It's a matter of time before all this back and forth converting from Function to Variable and back gets tiring, but for now let's just persevere. If you're not inclined to, I urge you to check Mathias Brandewinder's [CNTK.FSharp](https://github.com/mathias-brandewinder/CNTK.FSharp) project, where he presents an elegant abstraction in the form of a [*Tensor* discriminated union](https://github.com/mathias-brandewinder/CNTK.FSharp/blob/master/CNTK.FSharp/Core.fs) that encapsulates Functions and Variables in a single type and a lot more on top.

In [15]:
// # Define a multilayer feedforward classification model
let fullyConnectedClassifierNet inputVar numOutputClasses hiddenLayerDim numHiddenLayers nonlinearity =
    let mutable h = denseLayer nonlinearity inputVar hiddenLayerDim
    for i in 1..numHiddenLayers do
        h <- denseLayer nonlinearity h hiddenLayerDim 
    
    // Note that we don't feed the output layer through 
    // the selected nonlinearity/activation function    
    linearLayer h numOutputClasses    

In [16]:
// # Create the fully connected classifier
let z = fullyConnectedClassifierNet input num_output_classes hidden_layers_dim num_hidden_layers (CNTKLib.Sigmoid)

#### Let's get idiomatic
Sadly there seem to be no managed helpers to facilitate dense layer creation. 

So instead, here's a more functional version of our linear layer composing function, without mutable variables or for loops, and with the added bonus of enabling you to arbitrarily set the number and dimension of hidden layers in a single parameter.

In [17]:
/// Fully connected linear layer composition function
/// <remarks> CNTK helper </remarks>
let fullyConnectedClassifierNet' inputVar (hiddenLayerDims: int seq) numOutputClasses nonlinearity =
    (inputVar, hiddenLayerDims) 
    ||> Seq.fold (denseLayer nonlinearity)
    |> fun model -> linearLayer model numOutputClasses

In addition, passing an empty sequence instead of a list of hidden layer dimensions produces a linear model, just like if you had run <code>linearLayer</code> with the same params.

And here's how we would use this to produce a model identical to <code>z</code>:

In [18]:
let z' = 
    fullyConnectedClassifierNet' 
        input [hidden_layers_dim;hidden_layers_dim]  
        num_output_classes (CNTKLib.Sigmoid)

## Learning model parameters

In [19]:
"p = softmax(z_{final~layer})" |> Util.Math

{Latex = "$$p = softmax(z_{final~layer})$$";}

## Training

In [20]:
"H(p)=-\sum_{j=1}^C y_j log(p_j)" |> Util.Math

{Latex = "$$H(p)=-\sum_{j=1}^C y_j log(p_j)$$";}

In [21]:
let loss = CNTKLib.CrossEntropyWithSoftmax(Var z, label)

## Evaluation

In [22]:
let eval_error = CNTKLib.ClassificationError(Var z, label)

## Configure training

In [23]:
// # Instantiate the trainer object ot drive the model training
let learning_rate = 0.05
let lr_schedule = new CNTK.TrainingParameterScheduleDouble(learning_rate, uint32 CNTK.DataUnit.Minibatch)
let learner = CNTKLib.SGDLearner(z.Parameters() |> ParVec, lr_schedule)
let trainer = CNTK.Trainer.CreateTrainer(z, loss, eval_error, ResizeArray<CNTK.Learner>([learner]))

The helper functions for reporting training progress first seen in the previous notebook have already been referenced with <code>open MiscellaneousHelpers</code> and can also be found [here](https://github.com/SpaceAntelope/IfCntk/blob/master/notebooks/cntk-tutorials/fsx/MiscellaneousHelpers.fsx)

## Run the trainer

In [24]:
// # Initialize the parameters for the trainer
let minibatch_size = 25
let num_samples = 20000
let num_minibatches_to_train = num_samples / minibatch_size

In [25]:
// # Run the trainer and perform model training
let training_progress_output_freq = 20

let plotdata = {
    BatchSize = ResizeArray<int>()
    Loss = ResizeArray<float>()
    Error = ResizeArray<float>()
}

for i in [0..num_minibatches_to_train] do
    let features,labels =
        generateRandomDataSample minibatch_size input_dim num_output_classes
        |> fun (x,y) -> matrixToBatch x, matrixToBatch y
    
    // # Specify the input variables mapping in the model to actual minibatch data for training
    let trainingBatch = [(input, features);(label, labels)] |> dict
    let status = trainer.TrainMinibatch(trainingBatch, true, device)
    
    // log training data
    match (printTrainingProgress trainer i training_progress_output_freq true) with
    | Some (i,loss,eval) ->         
        plotdata.BatchSize.Add <| i
        plotdata.Loss.Add <| loss
        plotdata.Error.Add <| eval
    | None -> ()

Minibatch: 0, Loss: 0.6857, Error: 0.40
Minibatch: 20, Loss: 0.8130, Error: 0.56
Minibatch: 40, Loss: 0.6675, Error: 0.40
Minibatch: 60, Loss: 0.6490, Error: 0.36
Minibatch: 80, Loss: 0.7596, Error: 0.64
Minibatch: 100, Loss: 0.6838, Error: 0.56
Minibatch: 120, Loss: 0.6588, Error: 0.44
Minibatch: 140, Loss: 0.7918, Error: 0.64
Minibatch: 160, Loss: 0.5768, Error: 0.12
Minibatch: 180, Loss: 0.5144, Error: 0.40
Minibatch: 200, Loss: 0.4737, Error: 0.28
Minibatch: 220, Loss: 0.4621, Error: 0.16
Minibatch: 240, Loss: 0.3006, Error: 0.08
Minibatch: 260, Loss: 0.2793, Error: 0.00
Minibatch: 280, Loss: 0.2932, Error: 0.12
Minibatch: 300, Loss: 0.3017, Error: 0.12
Minibatch: 320, Loss: 0.1820, Error: 0.12
Minibatch: 340, Loss: 0.6820, Error: 0.28
Minibatch: 360, Loss: 0.2221, Error: 0.08
Minibatch: 380, Loss: 0.3375, Error: 0.16
Minibatch: 400, Loss: 0.2038, Error: 0.12
Minibatch: 420, Loss: 0.3467, Error: 0.24
Minibatch: 440, Loss: 0.4062, Error: 0.24
Minibatch: 460, L

In [26]:
// # Compute the moving average loss to smooth out the noise in SGD
trainingResultPlotSmoothed plotdata |> Display

## Run evaluation / testing
This time, instead of reproducing the identical functionality of the python tutorial let's save our future selves a bit of time and create a reusable version of the evaluation/testing process.

In [27]:
open MathNet.Numerics.LinearAlgebra

/// Evaluation of a Matrix dataset for a trained model
/// <remarks> CNTK helper </remarks>
let testMinibatch (trainer: CNTK.Trainer) (features: Matrix<float32>) (labels: Matrix<float32>) =
    let x,y = matrixToBatch features, matrixToBatch labels
    
    // It should be interesting to see if this convention
    // will hold for any other topography     
    let input = trainer.Model().Arguments |> Seq.head
    let label = trainer.LossFunction().Arguments |> Seq.last
    
    let testBatch =
        [ (input, x);(label, y) ]
        |> dict
        |> AsUnorderedMapVariableValue
    
    trainer.TestMinibatch(testBatch , device)  

In [28]:
// # Generate new data
let test_minibatch_size = 25
let x_test,y_test = generateRandomDataSample test_minibatch_size input_dim num_output_classes

testMinibatch trainer x_test y_test

0.24

In [29]:
// # Figure 4
ImageUrl "http://cntk.ai/jup/feedforward_network.jpg" 200

In [30]:
let out = CNTKLib.Softmax(Var z)
let inputMap = [input, matrixToBatch x_test] |> dict
let outputMap = [(out.Output, null)] |> dataMap

In [31]:
let predicted_label_probs = out.Evaluate(inputMap, outputMap, device)

/// Get index of maximum value
/// <remarks> Helper function </remarks>
let argMax<'T when 'T : comparison and 'T : equality>(source: 'T seq) = 
    let max = source |> Seq.max 
    Seq.findIndex ((=)max) source

In [32]:
let result = outputMap.[out.Output].GetDenseData<float32>(out.Output)

y_test |> Matrix.toRowArrays |> Array.map argMax |> printfn "Label    : %A"    
result |> Seq.map argMax |> Array.ofSeq |> printfn "Predicted: %A"  

Label    : [|0; 0; 0; 1; 1; 1; 0; 0; 0; 0; 0; 0; 1; 0; 0; 0; 1; 1; 0; 0; 1; 1; 1; 0; 0|]
Predicted: [|0; 1; 0; 1; 1; 1; 0; 0; 0; 1; 0; 0; 1; 0; 1; 0; 1; 1; 1; 1; 1; 1; 1; 0; 1|]


<div class="alert alert-warning"> Some times when trying to evaluate <code>result</code> in a shell an exception of type <code>Expression evaluation failed: Value::CopyFrom is currently unsupported for PackedValue objects</code> occurs. The solution is to take it from the top, and reset the model and the input/output maps. </div>

## Vizualization

In [46]:
modelSoftmaxOutputHeatmap "Scaled age (in yrs)" "Tumor size (in cm)" [|1. .. 0.1 .. 10.|] z 

<img src="jupyheader.png"/>

# <span class="label label-success">Extra!</span> Non linear separation example

Seems a shame to go through all this trouble to build a hidden layer topography and not even try it on a more challenging example.

## Data Generation

In [34]:
open MathNet.Numerics.LinearAlgebra

// We achieve non linear separation by stealthily adding another output class,
// that we then assign to the first class, thus encircling the rest of the data.
let generateRandomNonlinearlySeparableDataSample sampleCount featureCount labelCount =     
    let x,y = generateRandomDataSample sampleCount featureCount (labelCount+1)
    let y' = 
        y 
        |> Matrix.toRowArrays 
        |> Array.map(
            fun line -> 
                if line.[labelCount] = 1.f 
                then line.[0] <- 1.f
                
                line.[0..labelCount-1])
        |> matrix
    
    x,y'

<div class="alert alert-warning"><p>If you are using the version of MathNet linked in the <a href="https://www.nuget.org/packages/FsLab/">FsLab</a> nuget package as of the time of this writing you will get a conversion error when creating the new matrix from the transformed label data.</p>
    <p></p>
<blockquote>Type mismatch. Expecting a
    'float32 [][] -> 'a'    
but given a
    'int list list -> Matrix<int>'    
    The type 'seq<float32 []>' does not match the type 'int list list'</blockquote>
    
<p>To resolve this make sure your paket dependencies point the latest version of <strong>MathNet.Numerics.FSharp</strong> </p>
    </div>

## Data Transformation

<p>Adding a new class and then assigning it to one of the previous classes after the fact has the side effect of creating and imbalanced dataset, since we now have a class with twice as many samples as any of the others.</p><p>In case of a two-class dataset this is the most pronounced, since samples from the two classes will be produced at a 2:1 rate relative to each other. This makes convergence harder than it has to be and also produces misleading evaluation results since for instance always predicting class 1 gives 67% success.</p>

In [35]:
let rnd = new Random()
let shuffle = Seq.sortBy (fun _ -> rnd.Next())

// This slightly awkard function truncates the overpopulated class to match
// the size of the others, and makes sure the selected subset is randomly 
// distributed between the two clusters (i.e. the original doubled class
// and the spurious additional set)
let stratifiedSampling (features: Matrix<float32>) (labels: Matrix<float32>) =    
    let minLength = 
        labels 
        |> Matrix.toRowArrays 
        |> Array.countBy id 
        |> Array.map snd 
        |> Array.min
    
    Seq.zip (features.ToRowArrays()) (labels.ToRowArrays())
    |> shuffle
    |> Seq.groupBy snd
    |> Seq.map (fun (key, grp) -> grp |> Seq.take minLength)
    |> Seq.collect id
    |> shuffle
    |> Seq.map (fun (f,l) -> Seq.append f l)
    |> matrix
    |> fun mtx -> mtx.[*,..1], mtx.[*,2..]

Let's compare:
### Raw data

In [36]:
generateRandomNonlinearlySeparableDataSample 128 input_dim num_output_classes    
||> simpleScatterPlot "feature 1" "feature 2"

### Balanced data:

In [37]:
generateRandomNonlinearlySeparableDataSample 128 input_dim num_output_classes    
||> stratifiedSampling 
||> simpleScatterPlot "feature 1" "feature 2"

## Model Creation

Here are all the parameters necessary for training gathered in one place for your convenience. I've placed them under their own module so we don't mess with the global scope to much, i.e. so you won't have to restart the kernel every time you want to experiment with the parameters, which I very much ecourage you to do.

In [92]:
module NonLinear = 
    let inputDim, numOutputClasses = 2,2
    let learningRate = 0.001
    let minibatchSize = 100   
    let trainingCycles = 15000
    let reportSampleRate = 25
    let input = Variable.InputVariable(shape [|inputDim|], dataType, "Features")
    let label = Variable.InputVariable(shape [|numOutputClasses|], dataType, "Labels")
    let z = fullyConnectedClassifierNet' input [50;50] numOutputClasses (CNTKLib.Sigmoid)
    let loss = CNTKLib.CrossEntropyWithSoftmax(Var z, label)
    let error = CNTKLib.ClassificationError(Var z, label)
    let lrSchedule = new CNTK.TrainingParameterScheduleDouble(learningRate, uint32 CNTK.DataUnit.Minibatch)
    let learner = CNTKLib.SGDLearner(z.Parameters() |> ParVec, lrSchedule)
    let trainer = CNTK.Trainer.CreateTrainer(z, loss, error, ResizeArray<CNTK.Learner>([learner]))

##  <span class="label label-success">Extra!</span> Training (with live progress report!)

In [93]:
// Data logger structure
let plotdata = { 
    BatchSize = ResizeArray<int>()
    Loss = ResizeArray<float>()
    Error = ResizeArray<float>()
}

In [94]:
for i in 0..NonLinear.trainingCycles do
    let features,labels =
        generateRandomNonlinearlySeparableDataSample 
            NonLinear.minibatchSize (NonLinear.input.Shape.[0]) NonLinear.numOutputClasses
        ||> stratifiedSampling
        |> fun (x,y) -> matrixToBatch x, matrixToBatch y
    
    let trainingBatch = [(NonLinear.input, features);(NonLinear.label, labels)] |> dict
    let status = NonLinear.trainer.TrainMinibatch(trainingBatch, true, device)
    
    match (printTrainingProgress NonLinear.trainer i NonLinear.reportSampleRate false) with
    | Some (i,loss,eval) ->         
        if plotdata.BatchSize.Count > 0 
        then plotdata.BatchSize 
             |> Seq.last 
             |> (+) NonLinear.reportSampleRate 
             |> plotdata.BatchSize.Add
        else plotdata.BatchSize.Add 1
        plotdata.Loss.Add <| loss
        plotdata.Error.Add <| eval        
    | None -> ()

trainingResultPlotSmoothed plotdata

Converting our synchronous training loop is as easy as placing it in an <code>async</code> computational expression. We could stop there and have a very serviceable Async<> object to iterate over with AsyncSeq, but I thought I should pass a few arguments both to make the iterations per cycle to allow for some customization, and to make some additional info available about the general progress of the training.

From then on, displaying an updateable label is as simple as returning a string from Async<> and applying IfSharp <code>Display</code> to the resulting AsyncSeq. And you can even return renderable html strings!

In fact, here's a simple Bootstrap progress bar to get you started:

In [95]:
/// Bootstrap progress bars for training data reporting
/// <remarks> Helper function </remarks>
let reportHtml info progress loss error =
    let progressBar kind label value =    
        System.String.Format(
            """<div class='progress' style='margin-top:5px; width: 700px'>
                   <div class='progress-bar progress-bar-{0} progress-bar-striped' 
                         role='progressbar' aria-valuenow='{0:f2}'
                         aria-valuemin='0' aria-valuemax='100' style='width: {1:f2}%'>
                        <span>{1:f2}% ({2})</span>
                   </div>
                </div>""", kind, value, label)

    [ progressBar "info" "Progress" progress
      progressBar "warning" "Loss" (loss * 100.)
      progressBar "danger" "Error" (error * 100.) ]
    |> List.reduce (+)
    |> sprintf """<div class='container'><h2>%s</h2>%s</div>""" info

And here's the asynchronous training loop:

In [96]:
open FSharp.Control

let trainCycle iterations finalCycle currentCycle htmlReport =
    async {
        // Our training cycle
        for i in 0..iterations do
            let features,labels =
                generateRandomNonlinearlySeparableDataSample NonLinear.minibatchSize 
                    (NonLinear.input.Shape.[0]) NonLinear.numOutputClasses
                ||> stratifiedSampling
                |> fun (x,y) -> matrixToBatch x, matrixToBatch y

            let trainingBatch = 
                [(NonLinear.input, features);(NonLinear.label, labels)] |> dict
            
            NonLinear.trainer.TrainMinibatch(trainingBatch, true, device)
            |> ignore
            
            (* Let's skip the logging code to keep things shorter *)
           
        
        // Calculate training info
        let lossAverage = NonLinear.trainer.PreviousMinibatchLossAverage()
        let evaluationAverage = NonLinear.trainer.PreviousMinibatchEvaluationAverage()
        let current = 100. * (float currentCycle + 1.)/(float finalCycle)
        
        // Create report text
        let progress = 
            sprintf "[%s] %.1f%%" ("".PadLeft(int current,'=').PadRight(100,' ')) current
        let info =
            sprintf "Minibatch: %d of %d, Loss: %.4f, Error: %.2f" 
                ((currentCycle+1)*iterations) (finalCycle * iterations) lossAverage evaluationAverage;
        let progressBar = 
            if htmlReport then 
                reportHtml info current lossAverage evaluationAverage
            else 
                sprintf "<pre>%s\n %s</pre>" progress info
        
        // Send result to AsyncSeq
        return progressBar |> Util.Html
    }

In [97]:
let totalCycles = 128
let iterationsPerCycle = NonLinear.trainingCycles / totalCycles

AsyncSeq.initAsync (int64 totalCycles) 
    (fun i -> trainCycle iterationsPerCycle totalCycles (int i) true)  
|> Display

## Evaluation

In [106]:
let test_minibatch_size = 128
let x_test,y_test = generateRandomNonlinearlySeparableDataSample test_minibatch_size input_dim num_output_classes

testMinibatch NonLinear.trainer x_test y_test

0.2265625

## Visualization

In [100]:
modelSoftmaxOutputHeatmap "feature 1" "feature 2" [|0. .. 0.1 .. 15.|] NonLinear.z 