<h1>Getting started with TensorFlow</h1>
<p>This notebook assumes little to no knowledge of TensorFlow, but it does require basic Python literacy and the TensorFlow, NumPy, and Matplotlib modules to be installed (they should be, unless you maliciously uninstalled them). The code is written against the TensorFlow 1.x graph API (tf.placeholder, tf.Session), so it expects a 1.x installation.</p>

In [None]:
import tensorflow as tf
import numpy as np

<h2>Lesson Goals:</h2>
<ol>
    <li>Understanding Machine Learning Models as Function Approximators.</li>
    <li>Defining ML Models in TensorFlow.</li>
    <li>Training ML Models.</li>
</ol>
<a name="FunctionApproximator"></a>
<h3>Understanding Machine Learning Models as Function Approximators</h3>
<p>In general, most ML models are made with a very simple purpose: to produce 'intelligent' output for a given problem. This is an oversimplified way of thinking about things, but it reveals our key goal: we want to make a model which produces output we like based on a set of input(s). In this sense, we can imagine our ML model as a universal function approximator: something which mimics the output we would like, given a set of examples. For this lesson, we will start by loading a set of data and then build a model which can accurately predict values for our unknown function.</p>
<h3>Loading the data</h3>
<p>We begin by using Python's pickle module to load the binary data file containing our unknown function's input and output examples.</p>

In [None]:
import pickle
file_name = 'data.p'
data = None
with open(file_name, 'rb') as in_file:
    data = pickle.load(in_file)
if data is not None:
    print('Loaded data from {}.\nData:{}'.format(file_name, data))
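
<p>The later cells index the loaded object with the keys 'inputs' and 'outputs', so data.p presumably holds a dictionary of two equal-length sequences. As a minimal sketch under that assumption, the round trip below builds a hypothetical dictionary of that shape (the values are made up for illustration and are not the contents of data.p), pickles it to bytes, and loads it back:</p>

```python
import pickle

# Hypothetical stand-in for the contents of data.p; the real values are unknown.
example = {
    'inputs': [0.0, 1.0, 2.0, 3.0],
    'outputs': [1.0, 3.0, 5.0, 7.0],
}

# pickle.dumps/loads do in memory what pickle.dump/load do with a file.
blob = pickle.dumps(example)
restored = pickle.loads(blob)

print(restored['inputs'])
print(restored['outputs'])
```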

<p>The previous code should print out the data for our inputs and outputs, but it would probably be helpful to display the data on a graph so we can get a better idea of what kind of function we are trying to model. The following code plots our data as a series of red points.</p>

In [None]:
import matplotlib.pyplot as plt
plt.plot(data['inputs'], data['outputs'], 'ro')
plt.show()

<p>Since we have a set of input/output examples showing the expected output of our model, we can classify this ML problem as a <i>supervised learning</i> problem. In this case, we know that the output of our model should closely match the graph above. To decide which of our candidate models approximates this function best, we need to define a loss function that quantifies how far off a model is. For continuous, regression-type problems like this, a common loss function is the average of the squared differences between the true value (denoted $y$) and the approximated value (denoted $\hat{y}$).</p>
$$loss=\frac{1}{n}\sum_{i=1}^{n}{(y_i-\hat{y}_i)^2}$$
<p>This is often referred to as the Mean Squared Error (MSE).</p>

In [None]:
def MSE(ys,ys_approx):
    loss = 0
    for y, y_approx in zip(ys, ys_approx):
        loss += (y - y_approx) ** 2
    return loss/len(ys)
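
<p>To make the formula concrete, here is a small worked example. The function is a compact restatement of the MSE defined above, and the three value pairs are hypothetical numbers chosen so the arithmetic is easy to check by hand:</p>

```python
def mse(ys, ys_approx):
    """Mean of the squared differences between true and approximated values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(ys, ys_approx)) / len(ys)

# Hypothetical values chosen so the arithmetic is easy to follow.
true_ys = [1.0, 2.0, 3.0]
approx_ys = [1.0, 2.0, 5.0]

# Squared errors are 0, 0 and 4, so the mean is 4/3.
example_loss = mse(true_ys, approx_ys)
print(example_loss)
```

<p>Only the third prediction is off, and because the error is squared, being off by 2 contributes 4 to the sum. A perfect approximation would give a loss of 0.</p>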

<h3>Linear Approximation</h3>
<p>We can try approximating our function with the slope-intercept formula:</p>
$$y = mx + b$$
<p>Try to find the best-fitting line possible (lowest loss). Do not spend more than 5 minutes on this portion.</p>

In [None]:
m = -1
b = 3

y = lambda x: m * x + b
print('Loss for function approximation: ', MSE(data['outputs'], y(data['inputs'])))

plt.plot(data['inputs'], y(data['inputs']), 'b-')
plt.plot(data['inputs'], data['outputs'], 'ro')
plt.show()

<p>We may have managed to find a line that works, but in practice we do not want to work out the ideal values for $m$ and $b$ by hand. This is what ML and gradient descent are for. Next, we will set up a simple linear approximator and train it with TensorFlow. The following code begins by resetting the TensorFlow graph, which is typically done before creating a network to ensure that there are no duplicate tensors left over from earlier runs. Afterwards, we define a placeholder for our X input(s). Placeholders let us fill in their values later with our data set when training the model.</p>

In [None]:
tf.reset_default_graph()
XS = tf.placeholder(tf.float32, shape=[None, 1], name='Xs')
#TO-DO: Define a placeholder for our expected output: TrueYS

M = tf.Variable(-1, dtype=tf.float32, name='Slope')
#TO-DO: Define a variable for bias: B

#Our TensorFlow Linear Approximator: YApprox
YApprox = M * XS + B

<p>With this taken care of, we can initialize and run our model with the following code. Whenever we want the output of a TensorFlow operation, we call 'sess.run()' on it. Additionally, we need to specify, in a dictionary, what takes the place of our XS placeholder.</p>

In [None]:
#Start a new TF session to run our model
with tf.Session() as sess:
    #Initialize our variables
    sess.run(tf.global_variables_initializer())
    #Determine our outputs for the given data inputs:
    y_approx = sess.run(YApprox, {XS : np.transpose([data['inputs']])})
    print('Loss for function approximation: ', MSE(data['outputs'], y_approx))
    

plt.plot(data['inputs'], y_approx, 'b-')
plt.plot(data['inputs'], data['outputs'], 'ro')
plt.show()
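
<p>The feed dictionary wraps the inputs in np.transpose([...]) because XS was declared with shape [None, 1]: one row per example and one column for the single feature. The NumPy sketch below uses made-up values (not the notebook's data) to show the reshaping, and mimics what M * XS + B computes using plain Python floats in place of the TensorFlow variables:</p>

```python
import numpy as np

# Hypothetical flat list of inputs, standing in for data['inputs'].
inputs = [0.0, 1.0, 2.0]

# Wrapping in a list gives shape (1, 3); transposing gives shape (3, 1).
column = np.transpose([inputs])
print(column.shape)

# The same elementwise arithmetic as M * XS + B, with plain floats.
m, b = -1.0, 3.0
y_column = m * column + b
print(y_column.ravel())
```

<p>The output keeps the column shape of the input, which is why the model's predictions come back as one single-element row per example.</p>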

<p>Now that we have defined our simple linear approximator, let's try to train it against the data to figure out the best values for $m$ and $b$. We do this by defining a loss function which compares our model's prediction to the true value for $y$. Running this code should produce nearly the same estimated loss as our MSE function:</p>

In [None]:
loss = tf.losses.mean_squared_error(TrueYS, YApprox)
with tf.Session() as sess:
    #Initialize our variables
    sess.run(tf.global_variables_initializer())
    #Compute our losses
    losses = sess.run(loss, {
        XS : np.transpose([data['inputs']]),
        TrueYS : np.transpose([data['outputs']])})
    print('Loss for function approximation:', losses)

<p>Now that we can use TensorFlow to calculate the loss of our model, we will train the model to minimize this loss function and find the ideal values of $m$ and $b$. We begin by defining our Gradient Descent Optimizer, which determines how to change our variables. From there, we define our train operation as using the optimizer to minimize our function approximator's loss. Once this is taken care of, we simply need to train the model on the data we have.</p>

In [None]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train_op = optimizer.minimize(loss)
training_epochs =  3
with tf.Session() as sess:
    #Initialize our variables
    sess.run(tf.global_variables_initializer())
    for epoch in range(training_epochs):
        feed_dict = {
            XS : np.transpose([data['inputs']]),
            TrueYS : np.transpose([data['outputs']])
        }
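        #Fetch the loss and the current slope and bias while running one training step
        losses, m, b, _ = sess.run([loss, M, B, train_op], feed_dict)
        print('Loss after {} epochs: {}'.format(epoch+1, losses))
        print('y = {} * x + {}\n'.format(m, b))
        #Plot the line for the fetched slope and bias
        plt.plot(data['inputs'], m * np.asarray(data['inputs']) + b, label='Epoch {}'.format(epoch+1))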
    plt.plot(data['inputs'], data['outputs'], 'ro')
    plt.legend()
    plt.show()

<p>Try messing with the learning_rate and training_epochs variables to train the line to better fit the data.</p>
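
<p>To see what the optimizer is doing with those two settings, here is a minimal sketch of gradient descent for $y = mx + b$ in plain Python. It is an illustration of the idea, not TensorFlow's implementation; the function name gradient_descent_line and the data (generated from $y = 2x + 1$) are hypothetical. Each step computes the partial derivatives of the MSE with respect to $m$ and $b$ and moves both values a small distance against the gradient:</p>

```python
def gradient_descent_line(xs, ys, m, b, learning_rate, steps):
    """Fit y = m * x + b by gradient descent on the mean squared error."""
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every example.
        errors = [m * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to m and b.
        grad_m = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient, scaled by the learning rate.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Hypothetical data drawn from y = 2x + 1, so the best fit is known in advance.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

fit_m, fit_b = gradient_descent_line(xs, ys, m=-1.0, b=3.0,
                                     learning_rate=0.05, steps=2000)
print(fit_m, fit_b)
```

<p>The learning rate scales every step: too large a value can overshoot and diverge, while too small a value needs many more steps to reach the same fit.</p>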

<h3>Nonlinear layers?</h3>
<p>Even if we trained our linear model perfectly, it would still fall short of accurately modeling our data. This is because our data and the function we are trying to approximate are not linear. To make our ML model more accurate, we will use neural networks with nonlinear transformations/activations.</p>

In [None]:
tf.reset_default_graph()
number_hidden_neurons = 128
XS = tf.placeholder(tf.float32, shape=[None, 1], name='Xs')
hidden1 = tf.layers.dense(XS, number_hidden_neurons, activation=tf.nn.sigmoid)
#TO-DO: add additional hidden layers. Try changing the activation functions or number of hidden neurons
YApprox = tf.layers.dense(hidden1, 1)
TrueYS = tf.placeholder(tf.float32, shape=[None, 1], name='TrueYs')
loss = tf.losses.mean_squared_error(TrueYS, YApprox)
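
<p>Conceptually, a dense layer multiplies its input by a weight matrix, adds a bias, and applies the activation function. The NumPy sketch below walks through that forward pass for one sigmoid hidden layer followed by a linear output layer. It is an illustration under stated assumptions: the weights are random rather than trained, the layer has 4 hidden units instead of 128, and the names w1, b1, w2, and b2 are hypothetical:</p>

```python
import numpy as np

def sigmoid(z):
    """Squash every element of z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
hidden_units = 4  # small hypothetical layer; the notebook uses 128

# Randomly initialised weights stand in for what training would learn.
w1 = rng.normal(size=(1, hidden_units))
b1 = np.zeros(hidden_units)
w2 = rng.normal(size=(hidden_units, 1))
b2 = np.zeros(1)

xs = np.transpose([np.linspace(0.0, 1.0, 5)])  # shape (5, 1)

hidden = sigmoid(xs @ w1 + b1)  # shape (5, 4), nonlinear hidden layer
y_approx = hidden @ w2 + b2     # shape (5, 1), linear output layer
print(hidden.shape, y_approx.shape)
```

<p>Without the sigmoid, the two matrix multiplications would collapse into a single linear map, and the network could fit nothing more than a straight line.</p>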

In [None]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0001)
train_op = optimizer.minimize(loss)
training_epochs =  100000
#Try changing the number of training epochs
with tf.Session() as sess:
    #Initialize our variables
    sess.run(tf.global_variables_initializer())
    for epoch in range(training_epochs):
        feed_dict = {
            XS : np.transpose([data['inputs']]),
            TrueYS : np.transpose([data['outputs']])
        }
        losses, _ = sess.run([loss, train_op], feed_dict)
    print('Loss after {} epochs: {}'.format(epoch+1, losses))
    #Evaluate the trained model on 100 evenly spaced inputs, shaped [100, 1]
    xs = np.transpose([np.linspace(0, max(data['inputs']), 100)])
    ys_approx = sess.run(YApprox, {XS : xs})
    #Flatten both column vectors to 1-D for plotting
    plt.plot(xs.ravel(), ys_approx.ravel(), label='Epoch {}'.format(epoch+1))
    plt.plot(data['inputs'], data['outputs'], 'ro')
    plt.legend()
    plt.show()