<a href="https://colab.research.google.com/github/JohnTaylor2000/models/blob/master/Test_fit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Explore the universal approximation theorem by fitting a range of functions with a neural network. We also explore extrapolation, i.e. prediction beyond the range of data used to fit the model, which is one test of the model's ability to generalise.**

Author: John Taylor 07/2020

Here we demonstrate that a neural network can, in principle, approximate any continuous function on a bounded interval, given enough hidden units. The examples use functions of increasing complexity.

We also investigate extrapolation and the role the activation function plays in it. You will find that the activation function largely determines how the model behaves beyond the training range.
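
To see why the activation matters so much outside the training range, the sketch below evaluates a small, untrained one-hidden-layer network far from the origin. The weights are random and purely illustrative (they are an assumption, not taken from the model in this notebook). With `tanh` every hidden unit saturates, so the output becomes flat; with `relu` every unit is past its kink, so the output becomes a straight line.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 16

# Hypothetical random weights for an untrained one-hidden-layer network.
# Slopes are kept away from zero so every unit is deep in its tail at large |x|.
w = rng.uniform(0.5, 2.0, n_hidden) * rng.choice([-1.0, 1.0], n_hidden)
b = rng.normal(size=n_hidden)
v = rng.normal(size=n_hidden)

def net(x, act):
    """Return sum_j v_j * act(w_j * x + b_j) for each input in x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return act(np.outer(x, w) + b) @ v

def relu(z):
    return np.maximum(z, 0.0)

far = np.array([1e3, 2e3, 3e3])   # equally spaced points far outside any training range
tanh_out = net(far, np.tanh)
relu_out = net(far, relu)

# A straight line has zero second difference on equally spaced points.
relu_second_diff = relu_out[0] - 2.0 * relu_out[1] + relu_out[2]

print("tanh network far from the origin:", tanh_out)
print("relu network far from the origin:", relu_out)
print("relu second difference:", relu_second_diff)
```

Neither tail shape matches a function such as x squared, which is why a good fit inside the training range says little about the fit outside it.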

You can easily vary the number of layers and hidden units to explore the role these play in training a model of increasing complexity.

You can also investigate how key parameters, such as the batch size and the number of hidden units, affect the computation time. The batch size can be increased to match the number of training points in an epoch, so that each update uses the full data set and the fit is no longer stochastic.

Running with and without a GPU/TPU makes a big difference to training times. We also illustrate how you can add your own timers to the code to supplement the timing output that is already available.


Start by setting **environment variables** that suppress warning messages on NVIDIA GPUs. They are only needed when running on NVIDIA GPUs.

In [None]:
import os
os.environ["TF_DISABLE_NVTX_RANGES"] = "1"
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["NCCL_DEBUG"] = "WARN"

Add the **Matplotlib** plotting library and the **NumPy** maths library.


In [None]:
import matplotlib as mpl
import matplotlib.pyplot as plt

import numpy as np


Add the **time** library so we can time code

In [None]:
import time
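
As one possible way to time arbitrary sections of code, the sketch below wraps `time.perf_counter` in a context manager. The names `timer` and `timings` are hypothetical helpers introduced for illustration; they are not used elsewhere in this notebook.

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(label, results=None):
    """Time the enclosed block and optionally store the result in a dict."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed
        print(f"{label}: {elapsed:.4f} s")

timings = {}
with timer("sum of squares", timings):
    total = sum(i * i for i in range(100_000))
```

The same `with timer(...)` block could be placed around a call such as `model.fit` to record training time alongside other measurements.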


Add **TensorFlow** - we will use Keras to create the model.



In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.activations import elu

**Set the key program parameters**

1. Set the **X range** (xmin and xmax) for the function and the **range for predictions** (x_pred_min and x_pred_max) which includes **extrapolation** beyond the range used to fit the model (xmin and xmax). Set the **number of predictions** over the range (num_pred).

In [None]:
xmin = -10.
xmax =  10.

x_pred_min = -20.  # extrapolate beyond xmin
x_pred_max =  20.  # extrapolate beyond xmax
num_pred   = 100

2. Set the **number of data points** between xmin and xmax. This is the number of data points used to fit the model, i.e. one epoch. We also set the **batch size** and the **number of epochs**. The batch size must not exceed the number of points in an epoch.



In [None]:
num_points = 50000  # this is the total data size for model fitting - an Epoch
batch_size = 500
num_epochs = 50
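
The arithmetic behind these settings can be sketched as follows. It assumes the 20% validation split used in the fit call later in this notebook, and that the held-out points are removed before the data is divided into batches; the variable names are illustrative.

```python
import math

num_points = 50000
batch_size = 500
num_epochs = 50
validation_split = 0.20   # assumption: same value as in the model.fit call

# Points left for training once the validation fraction is held out
num_train = int(num_points * (1 - validation_split))

# Gradient updates per epoch, and over the whole run
steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * num_epochs

# Full-batch training: one update per epoch, so the fit is no longer stochastic
full_batch_steps = math.ceil(num_train / num_train)

print("training points      :", num_train)
print("updates per epoch    :", steps_per_epoch)
print("updates in total     :", total_updates)
print("updates if full batch:", full_batch_steps)
```

A larger batch size means fewer, larger updates per epoch, which usually shortens the time per epoch but also reduces the number of weight updates for the same number of epochs.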


3. Set the **activation** to use in each layer of the model.

In [None]:
act_num = 4     # see below

if act_num   == 0:
  activation = 'linear'

elif act_num == 1:
  activation = 'relu'

elif act_num == 2:
  activation = 'tanh'

elif act_num == 3:
  activation = 'sigmoid'

elif act_num == 4:
  activation = 'swish'

elif act_num == 5:
  activation = 'elu'

elif act_num == 6:
  activation = lambda xv: elu(xv, alpha=1.2)
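
For reference, two of the less familiar activations can be written out in NumPy. This is a sketch of the standard formulas (swish as x times sigmoid(x), ELU with a scale factor `alpha`), not the Keras implementation itself; `swish_np` and `elu_np` are hypothetical names chosen to avoid clashing with the `elu` imported above.

```python
import numpy as np

def swish_np(x):
    """Swish: x * sigmoid(x)."""
    x = np.asarray(x, dtype=float)
    return x / (1.0 + np.exp(-x))

def elu_np(x, alpha=1.0):
    """ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

samples = np.array([-20.0, 0.0, 20.0])
swish_values = swish_np(samples)
elu_values = elu_np(samples, alpha=1.2)

print("swish:", swish_values)   # close to 0 on the left, close to x on the right
print("elu  :", elu_values)     # close to -alpha on the left, equal to x on the right
```

Both functions are roughly linear for large positive inputs and level off for large negative inputs, which shapes how a network built from them behaves far from the training data.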

4. Select a **simple function** with which to train a model. First generate a set of x values drawn at random from a uniform distribution, then calculate the function f(x).

In [None]:
num_func = 2  # see below for definitions

x = np.random.uniform(low = xmin, high = xmax, size=(num_points,))
y = np.zeros(x.shape[0])

def my_fun (x, num_func):

  if num_func == 0:
    y = x      # identity function

  elif num_func == 1:
    y = 5.*x + 1.    # simple linear function

  elif num_func == 2:
    y = x**2   # x square function  

  elif num_func == 3:
    y = 0.01*x**3 - 0.1*x**2 + 2*x + 1   # cubic function

  elif num_func == 4: 
    y = np.sin(x)  # sine function
 
  elif num_func == 5:
    y = np.sin(x) + np.sin(2*x) + np.sin(0.5*x)  # mixture of sine functions

  elif num_func == 6:
    y = np.sin(x)  # sin(x) plus 40 sine terms with frequencies 0.0, 0.1, ..., 3.9
    scale = 0.
    for i in range (40):
      scale = i/10.
      y = y + np.sin(scale*x)

  else:
    raise ValueError('num_func must be an integer between 0 and 6')

  return y

y = my_fun (x, num_func)
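
The loop used for option 6 can also be written without an explicit loop. The sketch below compares the two forms; `sine_mixture_loop` and `sine_mixture_vec` are hypothetical names, and the vectorised version assumes `x` is a one-dimensional array.

```python
import numpy as np

def sine_mixture_loop(x):
    """Same computation as option 6: sin(x) plus sin(i/10 * x) for i = 0..39."""
    y = np.sin(x)
    for i in range(40):
        y = y + np.sin((i / 10.0) * x)
    return y

def sine_mixture_vec(x):
    """Vectorised equivalent using an outer product of frequencies and x."""
    scales = np.arange(40) / 10.0                     # 0.0, 0.1, ..., 3.9
    return np.sin(x) + np.sin(np.outer(scales, x)).sum(axis=0)

x_demo = np.linspace(-10.0, 10.0, 201)
y_loop = sine_mixture_loop(x_demo)
y_vec = sine_mixture_vec(x_demo)

print("max difference:", np.max(np.abs(y_loop - y_vec)))
```

Note that the i = 0 term contributes nothing, and the i = 10 term has frequency 1.0, so sin(x) effectively appears twice in the mixture.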

5. **Select the number of layers** in the model, then define and compile the model.

In [None]:
num_layers = 2      # extra hidden layers, in addition to the first hidden layer and the output layer
hidden_units = 64   # base number of hidden units
scale_fac = 1       # multiplier applied to the number of hidden units at each successive layer

model = Sequential()

model.add(Dense(hidden_units, input_dim=1, activation=activation))

if num_layers > 0:
  scale = scale_fac
  for i in range (num_layers):
    model.add(Dense(hidden_units*scale,  activation=activation))
    scale = scale * scale_fac

model.add(Dense(1,activation='linear'))

model.compile(loss='mse', optimizer='adam')

**Print a summary of the model**

In [None]:
model.summary()
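
The parameter count reported in the summary can be checked by hand: a dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases. The sketch below applies this to the layer sizes configured above; `dense_param_count` is a hypothetical helper, not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

num_layers = 2
hidden_units = 64
scale_fac = 1

# Rebuild the layer widths the same way the model definition does
sizes = [1, hidden_units]            # one input feature, then the first hidden layer
scale = scale_fac
for _ in range(num_layers):
    sizes.append(hidden_units * scale)
    scale = scale * scale_fac
sizes.append(1)                      # single linear output

total_params = dense_param_count(sizes)
print("layer sizes     :", sizes)
print("total parameters:", total_params)
```

Because the hidden-to-hidden layers scale with the square of the width, doubling `hidden_units` roughly quadruples the parameter count.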

**Time the Model fit and save the history**

In [None]:
t0 = time.time()

history = model.fit(x, y, validation_split=0.20, epochs=num_epochs, batch_size=batch_size)

elapsed_time = time.time() - t0
print (' Model Train Elapsed Time (sec) = ', elapsed_time)


**Plot the training and validation loss history**

In [None]:
plt.rcParams['figure.figsize'] = [10, 10]
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')
plt.show()

**Generate a series of predictions** using the model and compare with actual values based on evaluation of the function.
We also print predicted vs actual values and calculate the mean squared error.

In [None]:
x_pred = np.linspace(x_pred_min, x_pred_max, num_pred)

y_pred = my_fun (x_pred, num_func)   # actual function values

# ML model predictions, flattened from shape (num_pred, 1) to (num_pred,)
pred = model.predict(x_pred).flatten()

for i in range(len(y_pred)):
  print(' Predicted = ', pred[i],' Actual = ', y_pred[i], 'Difference = ', pred[i]-y_pred[i])

# Mean squared error over the whole prediction range, including the extrapolated part
mse = np.mean((y_pred - pred)**2)
print (' MSE = ', mse)
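
Because the prediction range extends beyond the training range, a single MSE mixes interpolation and extrapolation errors. One way to separate them is sketched below. The predictions here come from a hypothetical stand-in (exact inside the training range, flat outside it) rather than from the trained model, so the numbers are illustrative only.

```python
import numpy as np

xmin, xmax = -10.0, 10.0
x_pred = np.linspace(-20.0, 20.0, 100)
y_true = x_pred**2

# Hypothetical stand-in for the model's predictions:
# exact inside [xmin, xmax], constant outside it.
y_model = np.clip(x_pred, xmin, xmax)**2

inside = (x_pred >= xmin) & (x_pred <= xmax)   # points within the training range

mse_inside = np.mean((y_model[inside] - y_true[inside])**2)
mse_outside = np.mean((y_model[~inside] - y_true[~inside])**2)

print("MSE inside training range :", mse_inside)
print("MSE outside training range:", mse_outside)
```

Replacing `y_model` with the flattened output of the trained model gives the same breakdown for the real fit.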


**Plot the function and the corresponding ML Model predictions**

In [None]:
plt.rcParams['figure.figsize'] = [10, 10]
plt.plot(x_pred,y_pred,label='Actual Function')
plt.plot(x_pred,pred,label='ML Model Fit')
plt.xlabel('X')
plt.ylabel('F(X)')
plt.legend()
plt.grid(True)
plt.show()