# Train a Simple Model to Run on the BFree

(This code is based on the [TensorFlow Lite example](https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/hello_world/train/train_hello_world_model.ipynb).)

This notebook demonstrates the process of training a 4 kB model using TensorFlow and exporting it to an nngb string format for use on the BFree.

Deep learning networks learn to model patterns in underlying data. Here, we're going to train a network to model data generated by a [sine](https://en.wikipedia.org/wiki/Sine) function. This will result in a model that can take a value, `x`, and predict its sine, `y`.


## Configure Defaults

In [None]:
import os
# Ensure models directory exists
model_dir = "netdemo/models"
os.makedirs(model_dir, exist_ok=True)

# Define layers for each model
modelNames = ["Sine-16-16.py", "Sine-16-16-16.py", "Sine-32-32.py", "Sine-64-32.py", "Sine-64-64.py"]
models = {
    modelNames[0]: [{"units": 16, "activation": 'relu', "input_shape": (1,)},
                   {"units": 16, "activation": 'relu'}],
    modelNames[1]: [{"units": 16, "activation": 'relu', "input_shape": (1,)},
                   {"units": 16, "activation": 'relu'},
                   {"units": 16, "activation": 'relu'}],
    modelNames[2]: [{"units": 32, "activation": 'relu', "input_shape": (1,)},
                   {"units": 32, "activation": 'relu'}],
    modelNames[3]: [{"units": 64, "activation": 'relu', "input_shape": (1,)},
                   {"units": 32, "activation": 'relu'}],
    modelNames[4]: [{"units": 64, "activation": 'relu', "input_shape": (1,)},
                   {"units": 64, "activation": 'relu'}],
}
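
The layer lists above determine how many weights and biases each model has, which matters when the target is a small device. Below is a minimal sketch of that arithmetic, assuming every layer is fully connected and the network has one scalar input and one scalar output, as in this notebook. `count_parameters` is a hypothetical helper, not part of Keras.

```python
def count_parameters(hidden_units, n_inputs=1, n_outputs=1):
    """Count weights and biases in a stack of fully connected layers."""
    # Layer widths from input to output, e.g. [1, 16, 16, 1]
    sizes = [n_inputs] + list(hidden_units) + [n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # One weight per input/output pair, plus one bias per output
        total += fan_in * fan_out + fan_out
    return total


# Hidden layer widths of the five models defined above
for units in ([16, 16], [16, 16, 16], [32, 32], [64, 32], [64, 64]):
    print(units, count_parameters(units), "parameters")
```

The totals should match the `Total params` line that `model.summary()` prints later for the selected model. How many bytes that takes on the device depends on how each number is stored, which this notebook does not specify.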

In [None]:
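def exportModel(md):
  """Export a keras model to an nngb string representation"""
  # Note: `keras` must be imported (see "Import Dependencies") before calling this
  result = ""
  # First line: number of layers
  result += str(len(md.layers)) + "\n"
  for l in md.layers:
    insize = l.input.shape[1]
    outsize = l.output.shape[1]
    # Layer header: input size, output size and, for ReLU layers, the activator
    result += "fc_layer(" + str(insize) + ", " + str(outsize)
    if (l.activation == keras.activations.relu):
      result += ", activator = lambda x:max(0, x)"
    result += ")" + "\n"
    # Kernel weights, one per line, flattened in row-major (input, output) order
    for n in l.weights[0].numpy().flat:
      result += str(n) + "\n"
    # Biases, one per line
    for n in l.weights[1].numpy().flat:
      result += str(n) + "\n"
  return result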

## Setup Environment

Install Dependencies

In [None]:
! pip install tensorflow==2.4.0

Import Dependencies

In [None]:
# TensorFlow is an open source machine learning library
import tensorflow as tf

# Keras is TensorFlow's high-level API for deep learning
from tensorflow import keras
# Numpy is a math library
import numpy as np
# Pandas is a data manipulation library 
import pandas as pd
# Matplotlib is a graphing library
import matplotlib.pyplot as plt
# Math is Python's math library
import math

# Set seed for experiment reproducibility
seed = 1
np.random.seed(seed)
tf.random.set_seed(seed)

## Dataset

### 1. Generate Data

The code in the following cell will generate a set of random `x` values, calculate their sine values, and display them on a graph.

In [None]:
# Number of sample datapoints
SAMPLES = 2000

# Generate a uniformly distributed set of random numbers in the range from
# 0 to 2π, which covers a complete sine wave oscillation
x_values = np.random.uniform(
    low=0, high=2*math.pi, size=SAMPLES).astype(np.float32)

# Shuffle the values to guarantee they're not in order
np.random.shuffle(x_values)

# Calculate the corresponding sine values
y_values = np.sin(x_values).astype(np.float32)

# Plot our data. The 'b.' argument tells the library to print blue dots.
plt.plot(x_values, y_values, 'b.')
plt.show()

### 2. Add Noise
Since it was generated directly by the sine function, our data fits a nice, smooth curve.

However, machine learning models are good at extracting underlying meaning from messy, real world data. To demonstrate this, we can add some noise to our data to approximate something more life-like.

In the following cell, we'll add some random noise to each value, then draw a new graph:

In [None]:
# Add a small random number to each x value
x_values += 0.1 * np.random.randn(*x_values.shape)

# Plot our data
plt.plot(x_values, y_values, 'b.')
plt.show()

In [None]:
# Trim values outside the valid range
valid_x = (x_values > 0) & (x_values < 2*math.pi)
x_values = x_values[valid_x]
y_values = y_values[valid_x]

# Plot our trimmed data
plt.plot(x_values, y_values, 'b.')
plt.show()

### 3. Split the Data
We now have a noisy dataset that approximates real world data. We'll be using this to train our model.

To evaluate the accuracy of the model we train, we'll need to compare its predictions to real data and check how well they match up. This evaluation happens during training (where it is referred to as validation) and after training (referred to as testing). It's important in both cases that we use fresh data that was not already used to train the model.

The data is split as follows:
  1. Training: 60%
  2. Validation: 20%
  3. Testing: 20% 

The following code will split our data and then plot each set in a different color:


In [None]:
# We'll use 60% of our data for training and 20% for testing. The remaining 20%
# will be used for validation. Calculate the indices of each section.
TRAIN_SPLIT =  int(0.6 * len(x_values))
TEST_SPLIT = int(0.2 * len(x_values) + TRAIN_SPLIT)

# Use np.split to chop our data into three parts.
# The second argument to np.split is an array of indices where the data will be
# split. We provide two indices, so the data will be divided into three chunks.
x_train, x_test, x_validate = np.split(x_values, [TRAIN_SPLIT, TEST_SPLIT])
y_train, y_test, y_validate = np.split(y_values, [TRAIN_SPLIT, TEST_SPLIT])

# Plot the data in each partition in different colors:
plt.plot(x_train, y_train, 'b.', label="Train")
plt.plot(x_test, y_test, 'r.', label="Test")
plt.plot(x_validate, y_validate, 'y.', label="Validate")
plt.legend()
plt.show()
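
To see how `np.split` turns two indices into three chunks, here is a small standalone sketch on ten integers. The array and variable names are illustrative only; the 6 and 8 mirror the 60% and 80% marks used above.

```python
import numpy as np

# Ten values standing in for the dataset
data = np.arange(10)

# Splitting at indices 6 and 8 gives data[:6], data[6:8] and data[8:]
first, second, third = np.split(data, [6, 8])

print(first)   # [0 1 2 3 4 5]
print(second)  # [6 7]
print(third)   # [8 9]
```

In the cell above, the first chunk is the training set, the second is the test set, and the third is the validation set.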

## Training the model

### 1. Build the Model

First we choose a model, then we prepare it with Keras.

In [None]:
modelName = modelNames[4]
model = tf.keras.Sequential()

# Generate layers using model parameter data
for layer in models[modelName]:
  model.add(keras.layers.Dense(**layer))

# Final layer is a single neuron, since we want to output a single value
model.add(keras.layers.Dense(1))

# Compile the model using the standard 'adam' optimizer and the mean squared error or 'mse' loss function for regression.
model.compile(optimizer='adam', loss="mse", metrics=["mae"])

model.summary()
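
The `mse` loss and `mae` metric passed to `model.compile` are both averages over prediction errors. The sketch below computes them by hand in plain Python so the arithmetic is visible. The function names and sample numbers are made up for illustration; this is not how Keras implements them internally.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences (what the 'mse' loss measures)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences (what the 'mae' metric measures)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up targets and predictions
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.4, 0.7]

print("MSE:", mean_squared_error(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
```

Squaring makes MSE punish large errors more heavily, while MAE reads directly as the typical distance between a prediction and its target.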

### 2. Train the Model

We'll now train the new model.

In [None]:
# Train the model
history = model.fit(x_train, y_train, epochs=200, batch_size=64,
                    validation_data=(x_validate, y_validate))

### 3. Plot Metrics
During each training epoch, the model prints its loss and mean absolute error for both training and validation. You can read these in the output above (note that your exact numbers may differ):

```
Epoch 200/200
19/19 [==============================] - 0s 3ms/step - loss: 0.0052 - mae: 0.0556 - val_loss: 0.0065 - val_mae: 0.0631

```


In [None]:
# Draw a graph of the loss, which is the distance between
# the predicted and actual values during training and validation.
train_loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(train_loss) + 1)

# Exclude the first 100 epochs so the graph is easier to read
SKIP = 100

plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)

plt.plot(epochs[SKIP:], train_loss[SKIP:], 'g.', label='Training loss')
plt.plot(epochs[SKIP:], val_loss[SKIP:], 'b.', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.subplot(1, 2, 2)

# Draw a graph of mean absolute error, which is another way of
# measuring the amount of error in the prediction.
train_mae = history.history['mae']
val_mae = history.history['val_mae']

plt.plot(epochs[SKIP:], train_mae[SKIP:], 'g.', label='Training MAE')
plt.plot(epochs[SKIP:], val_mae[SKIP:], 'b.', label='Validation MAE')
plt.title('Training and validation mean absolute error')
plt.xlabel('Epochs')
plt.ylabel('MAE')
plt.legend()

plt.tight_layout()
plt.show()

From these graphs, we can check two things:

*   The loss and MAE should level off at low values by the end of training
*   The validation metrics should stay close to the training metrics; a validation loss that rises while the training loss keeps falling would suggest the network is overfitting

Validation metrics can even come out slightly better than training metrics. Validation metrics are calculated at the end of each epoch, while training metrics are averaged over the batches throughout the epoch, so validation happens on a model that has been trained slightly longer.

If both checks pass, the network seems to be performing well. To confirm, let's check its predictions against the test dataset we set aside earlier:


In [None]:
# Calculate and print the loss on our test dataset
test_loss, test_mae = model.evaluate(x_test, y_test)

# Make predictions based on our test dataset
y_test_pred = model.predict(x_test)

# Graph the predictions against the actual values
plt.clf()
plt.title('Comparison of predictions and actual values')
plt.plot(x_test, y_test, 'b.', label='Actual values')
plt.plot(x_test, y_test_pred, 'r.', label='TF predicted')
plt.legend()
plt.show()

### 4. Save the Model

If it looks good, we can now save the model to an nngb string format.

In [None]:
with open(f"{model_dir}/{modelName}", 'w') as fh:
    fh.write(exportModel(model))
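
As a sanity check, the exported text can be read back and evaluated without TensorFlow. The sketch below is not the BFree's own loader; it only mirrors the layout that `exportModel` writes (layer count, then for each layer a header line followed by its weights and biases). `parse_exported_model`, `predict` and the tiny `SAMPLE` model are hypothetical names and data for illustration.

```python
import re


def parse_exported_model(text):
    """Parse text in the layout written by exportModel into a list of layers."""
    lines = text.strip().split("\n")
    n_layers = int(lines[0])
    pos = 1
    layers = []
    for _ in range(n_layers):
        header = lines[pos]
        pos += 1
        match = re.match(r"fc_layer\((\d+), (\d+)", header)
        insize, outsize = int(match.group(1)), int(match.group(2))
        # insize * outsize weights come first, then outsize biases
        weights = [float(v) for v in lines[pos:pos + insize * outsize]]
        pos += insize * outsize
        biases = [float(v) for v in lines[pos:pos + outsize]]
        pos += outsize
        layers.append({"insize": insize, "outsize": outsize,
                       "relu": "activator" in header,
                       "weights": weights, "biases": biases})
    return layers


def predict(layers, x):
    """Run one scalar input through the parsed fully connected layers."""
    values = [x]
    for layer in layers:
        outputs = []
        for j in range(layer["outsize"]):
            total = layer["biases"][j]
            for i in range(layer["insize"]):
                # Weights are flattened row by row from an (inputs, outputs) matrix
                total += values[i] * layer["weights"][i * layer["outsize"] + j]
            outputs.append(max(0.0, total) if layer["relu"] else total)
        values = outputs
    return values[0]


# A tiny hand-written model: 1 input -> 2 ReLU units -> 1 output
SAMPLE = """2
fc_layer(1, 2, activator = lambda x:max(0, x))
1.0
-1.0
0.0
0.5
fc_layer(2, 1)
2.0
3.0
0.25
"""

layers = parse_exported_model(SAMPLE)
print(predict(layers, 2.0))  # 4.25
```

To try it on the file saved above, read `f"{model_dir}/{modelName}"` into a string, pass it to `parse_exported_model`, and compare `predict(layers, x)` with `model.predict` for a few values of `x`.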