# Polynomial fitting with Neural Networks

Neural networks generally do a poor job of extrapolating polynomial functions beyond the range of their training data. However, if the training and test data come from the same range, you can get quite good results.

Reference: https://stackoverflow.com/questions/44998910/keras-model-to-fit-polynomial

## Imports

Let's start with our imports. Here we import TensorFlow and call it tf for ease of use.

We then import a library called numpy, which lets us represent our data as arrays easily and efficiently.

The framework for defining a neural network as a set of Sequential layers is called Keras, so we import that too.

In [None]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

In [None]:
print(tf.__version__)
print(np.__version__)

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
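# Training data: 10,000 random points in [0, 1), reshaped to a column of shape (n, 1).
x_train = np.random.rand(10000)
y_train = x_train**4 + x_train**3 - x_train
x_train = x_train.reshape(-1, 1)

# Test data: 100 evenly spaced points over the same range.
x_test = np.linspace(0, 1, 100)
y_test = x_test**4 + x_test**3 - x_test
x_test = x_test.reshape(-1, 1)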

In [None]:
# Note: x_train is randomly generated and unsorted, so we use scatter instead of plot.
plt.scatter(x_train, y_train)

## Define and Compile the Neural Network

Note that I increased epochs to 40 to get more iterations and more accurate results, and set verbose=1 to see how the loss behaves. In my run the loss decreased to 7.4564e-04. In the plot at the end of this section, the red line is the network's prediction and the blue line is the true value. You can see that they are quite close to each other.

What happens if we change 'relu' to another activation function, for example sigmoid? In this example, sigmoid performs far worse than relu.  
With a single hidden layer and 1-D input, ReLU units perform piecewise linear regression.
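
To make the piecewise linear claim concrete, here is a minimal NumPy sketch that builds a one-hidden-layer ReLU network by hand rather than training it. The knot positions, the choice of 10 hidden units, and the helper names `target` and `relu_net` are assumptions made for illustration; this is not the model trained in this notebook.

```python
import numpy as np

def target(x):
    # Same polynomial as the training data in this notebook.
    return x**4 + x**3 - x

def relu(z):
    return np.maximum(z, 0.0)

# Hypothetical hand-built network: one hidden ReLU unit per segment.
knots = np.linspace(0.0, 1.0, 11)           # 11 knots -> 10 linear segments
values = target(knots)
slopes = np.diff(values) / np.diff(knots)   # slope of each segment

# Hidden unit i computes relu(x - knots[i]).
# Its output weight is the change in slope at knot i.
out_weights = np.concatenate(([slopes[0]], np.diff(slopes)))
out_bias = values[0]

def relu_net(x):
    x = np.asarray(x, dtype=float).reshape(-1, 1)   # shape (n, 1)
    hidden = relu(x - knots[:-1])                   # shape (n, 10)
    return hidden @ out_weights + out_bias          # shape (n,)

x = np.linspace(0.0, 1.0, 1001)
max_err = np.max(np.abs(relu_net(x) - target(x)))
print(f"max abs error with 10 ReLU units: {max_err:.4f}")
```

The network's output is a straight line between consecutive knots and matches the polynomial exactly at each knot. Adding hidden units adds segments, which is one way to see why a wider ReLU layer can follow the curve more closely inside the training range.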

## Dense: Units
    
units is the most basic parameter to understand, and the most important one we can set for this layer. It is a positive integer that gives the output size of the layer. It determines the size of both the weight matrix and the bias vector: the bias vector has units entries, while the weight matrix has shape (input size, units), so that the dot product with the input produces an output of size units.
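
The shape bookkeeping above can be checked with plain NumPy. The sketch below assumes the layer sizes 1 -> 200 -> 45 -> 1 used by the model in this notebook; `dense_params` is a hypothetical helper written for this illustration, not a Keras function, and the random initial values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_params(input_dim, units):
    """Return a (weights, bias) pair shaped like a fully connected layer's."""
    weights = rng.normal(size=(input_dim, units))  # one column per unit
    bias = np.zeros(units)                         # one bias per unit
    return weights, bias

# Layer sizes assumed from the model in this notebook: 1 -> 200 -> 45 -> 1
layer_sizes = [1, 200, 45, 1]
params = [dense_params(i, u) for i, u in zip(layer_sizes[:-1], layer_sizes[1:])]

x = rng.random((50, 1))   # a batch of 50 one-dimensional inputs
h = x
for k, (w, b) in enumerate(params):
    h = h @ w + b
    if k < len(params) - 1:
        h = np.maximum(h, 0.0)  # ReLU between layers, none on the output
    print(f"layer {k}: weights {w.shape}, bias {b.shape}, output {h.shape}")

n_params = sum(w.size + b.size for w, b in params)
print("total trainable parameters:", n_params)
```

Each layer contributes input size × units weights plus units biases, so the three layers hold 400, 9045, and 46 parameters, 9491 in total.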

In [None]:
# model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
# Below is equivalent.
#layer_0 = keras.layers.Dense(units=1, input_shape=[1])
#model = tf.keras.Sequential([layer_0])

model = keras.Sequential()
model.add(keras.layers.Dense(units=200, input_dim=1))
model.add(keras.layers.Activation('relu'))
#model.add(keras.layers.Activation('sigmoid'))
model.add(keras.layers.Dense(units=45))
model.add(keras.layers.Activation('relu'))
#model.add(keras.layers.Activation('sigmoid'))
model.add(keras.layers.Dense(units=1))


In [None]:
model.compile(loss='mean_squared_error',optimizer='sgd')
model.fit(x_train, y_train, epochs=40, batch_size=50, verbose=1)

In [None]:
loss_and_metrics = model.evaluate(x_test, y_test, batch_size=100, verbose=0)
print(loss_and_metrics)

In [None]:
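# This is a regression model, so the outputs are predicted values, not classes.
predictions = model.predict(x_test)

test = x_test.reshape(-1)
plt.plot(test, predictions, c='r', label='prediction')
plt.plot(test, y_test, c='b', label='true value')
plt.legend()
plt.show()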

# Fitting y = a * x^2 + b * x + c

## Example1: a = 0.5; b = 0.3; c = 0.5  
With these parameters, y has roughly the same range as x, and the model behaves well.

In [None]:
a = 0.5; b = 0.3; c = 0.5

xs = np.linspace(0, 1, 1000)
np.random.shuffle(xs)

# Column vector of targets, shape (1000, 1).
ys = (a * xs**2 + b * xs + c).reshape(-1, 1)
plt.scatter(xs, ys)

In [None]:
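# Note: this continues training the model fitted above; its weights are not re-initialized.
# The input is reshaped to (n, 1) explicitly to match the model's input dimension.
model.fit(xs.reshape(-1, 1), ys, epochs=40, batch_size=50, verbose=1)

In [None]:
model.evaluate(xs.reshape(-1, 1), ys, batch_size=100, verbose=0)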

## Example2: a = 3; b = 3; c = 5  
With these parameters, y has a far larger range than x, and the model performs worse.  
This example illustrates the importance of normalization.
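
One common form of normalization is to standardize the targets before fitting and map predictions back afterwards. The sketch below is a minimal NumPy illustration, assuming the coefficients a = 3, b = 3, c = 5; the `unscale` helper and the commented-out Keras calls are hypothetical, and whether this improves the fit here should be checked by running it.

```python
import numpy as np

a, b, c = 3, 3, 5
xs = np.linspace(0, 1, 1000)
ys = (a * xs**2 + b * xs + c).reshape(-1, 1)

# Standardize the targets: zero mean, unit variance.
y_mean, y_std = ys.mean(), ys.std()
ys_scaled = (ys - y_mean) / y_std

def unscale(y_scaled):
    """Map values in standardized units back to the original units."""
    return y_scaled * y_std + y_mean

print("raw range:   ", ys.min(), ys.max())
print("scaled range:", ys_scaled.min(), ys_scaled.max())

# Hypothetical usage with the Keras model from this notebook (not run here):
#   model.fit(xs.reshape(-1, 1), ys_scaled, epochs=40, batch_size=50)
#   preds = unscale(model.predict(xs.reshape(-1, 1)))
```

The statistics `y_mean` and `y_std` should come from the training data only and be reused unchanged when unscaling predictions on test data.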

In [None]:
a = 3; b = 3; c = 5
#ys  = np.array([a * xs[:,0] + b * xs[:,1] + c]).T
ys = (a * xs**2 + b * xs + c).reshape(-1, 1)
plt.scatter(xs, ys)

In [None]:
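# Note: this again continues training the same model on the new targets.
model.fit(xs.reshape(-1, 1), ys, epochs=40, batch_size=50, verbose=1)

In [None]:
model.evaluate(xs.reshape(-1, 1), ys, batch_size=100, verbose=0)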