In [None]:
#The Beauty and Joy of TensorFlow!

In [None]:
"""This tutorial will give you a low level look into how tensorflow operates so you feel comfortable with the API
and run you through a basic NN example. From there you should be able to pick up other models and build them with
tensorflow functions. This tutorial also assumes you understand basic NN architecture and optimization
(you can check the education doc to review if needed)."""

In [None]:
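#First we need to import tensorflow and helper libraries.
#NOTE: This notebook uses the TensorFlow 1.x graph API (tf.placeholder, tf.layers, tf.Session),
#which is not available under these names in TensorFlow 2.x.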
import tensorflow as tf
import numpy as np
%matplotlib inline 
import matplotlib.pyplot as plt

In [None]:
"""Sweet! Now what? Well first let's start by 
importing the mother of all ML datasets: MNIST. We will use it as our example for a basic NN.
We will import the dataset from Keras and load the training and testing sets separately."""
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
#Here Keras automatically splits our dataset into training and testing sets, each with inputs and labels. Let's see what they look like.
"""(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
In case you want to try CIFAR (NOTE: It would be good to check the shape of the CIFAR data and adjust accordingly, since it differs from MNIST.)
"""

In [None]:
#Let's see what the shape of our data looks like
print("shape of our training input: {}".format(x_train.shape))
print("shape of our training labels: {}".format(y_train.shape))
print(x_train[0])


In [None]:
#TODO: Find a way to check which number the image x_train[0] represents without seeing the image itself.

In [None]:
"""So we have 60,000 training examples (which is a lot!). But I want to see the digits themselves!
This is where we can use matplotlib, which will also be useful later when we want to check, say, our
weather image predictions."""
plt.imshow(x_train[3])
plt.colorbar()
plt.show()

In [None]:
"""Looks like a 1! There are other ways to display images with PIL and Scipy 
but Matplotlib tends to have more features. Something to notice is that the labels are plain digits.
Usually when we have a neural model for classification, it outputs a vector with one probability per category.
EX with 5 categories: [0.01, 0.12, 0.84, 0.0, 0.03]
Because of this, we want the labels themselves to be vectors of probabilities to train the model with.
EX if the label is 2: [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] since ideally we want 100% probability for category 2 and 0 elsewhere.
This operation is called one-hot encoding."""


#NOTE: I originally had this as a TODO where you use tf.one_hot. Unfortunately that function returns a tensor instead of a
#NumPy array, and a tensor cannot be fed into our model through feed_dict. It's still usable in roundabout ways, but for the
#purposes of this tutorial I one-hot encoded the labels for you. It would be good to look at what tf.one_hot does though.
numpy_zero = np.zeros((y_train.shape[0], 10))
numpy_zero[np.arange(y_train.shape[0]), y_train] = 1
y_train = numpy_zero
numpy_zero2 = np.zeros((y_test.shape[0], 10))
numpy_zero2[np.arange(y_test.shape[0]), y_test] = 1
y_test = numpy_zero2
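
To make the one-hot step above concrete, here is a minimal NumPy sketch on a toy label vector. The helper name `one_hot` and the toy labels are assumptions made for illustration and are not part of the notebook. The indexing trick is the same one applied to y_train and y_test.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.shape[0], num_classes))
    # Fancy indexing: for every row i, set column labels[i] to 1.
    encoded[np.arange(labels.shape[0]), labels] = 1
    return encoded

toy_labels = np.array([2, 0, 3])  # hypothetical labels, not MNIST data
toy_one_hot = one_hot(toy_labels, num_classes=4)
print(toy_one_hot)

# np.argmax along axis 1 recovers the original integer labels.
recovered = np.argmax(toy_one_hot, axis=1)
print(recovered)
```

Taking the argmax of each row is also how integer predictions are read back out of a probability vector later on.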




In [None]:
"""
'Ok Pardhu this is great and all but I came here to learn tensorflow, not look at handwritten digits'
Alright, I won't hold you up any longer! Let's get into what a tensorflow model looks like.
While Arsh will have a more complete workshop on tensorflow, there are two important stages to building
a model in tensorflow:
1. Define the abstract graph of the model, along with all important variables inside the graph and how it will train.
2. Open a tensorflow session, feed in the inputs, and train over epochs.
Let's start with part one and see how that looks:
"""
#Define our model through a function:
def ourModel(inputs):
    #We assume the shape of inputs is [None, 784]. None represents an undefined, variable size for our number of images.
    #But why 784?
    #Let's create our first neural layer, which produces a [None, 128] matrix from the inputs matrix.
    first_layer = tf.layers.dense(inputs=inputs, units=128)
    
    #But wait! The whole point of neural networks is our activation function. Let's use relu in the next line.
    #If confused why, check out https://www.analyticsvidhya.com/blog/2017/10/fundamentals-deep-learning-activation-functions-when-to-use-them/
    first_layer = tf.nn.relu(first_layer)
    
    #And that's it! Though it seems a bit unclean to have separate lines like that. Let's simplify for the next layer
    second_layer = tf.layers.dense(inputs=first_layer, units=64, activation=tf.nn.relu)
    """Sweet! We have two layers that each have relu activation functions.
    TODO: Now it is up to you to finish the model. Add one more hidden layer with 32 units and then the output layer"""
    
    
    """NOTE: We want our output layer to be of size [None, 10] since there are 10 digits and we can use the vector as
    a probability output. But we want each probability to be between 0 and 1, which relu doesn't guarantee. What other
    activation function can we use? HINT: Look at the above article"""
    return output_layer
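
As a rough picture of what a dense layer followed by an activation computes, the sketch below runs a forward pass in plain NumPy rather than TensorFlow. The helper names (`dense`, `relu`, `softmax`), the random weights, and the batch of 5 fake images are all assumptions for illustration. In the notebook, tf.layers.dense creates and trains its own weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, weights, bias):
    """Fully connected layer: every output unit is a weighted sum of all inputs."""
    return x @ weights + bias

def relu(x):
    """Zero out negative values, keep positive values unchanged."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Turn each row of raw scores into probabilities that sum to 1."""
    # Subtracting the row-wise max avoids overflow and does not change the result.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

batch = rng.random((5, 784))  # 5 hypothetical flattened 28x28 images
w1, b1 = rng.normal(0, 0.05, (784, 128)), np.zeros(128)
w2, b2 = rng.normal(0, 0.05, (128, 10)), np.zeros(10)

hidden = relu(dense(batch, w1, b1))  # shape (5, 128)
logits = dense(hidden, w2, b2)       # shape (5, 10), raw scores
probs = softmax(logits)              # shape (5, 10), each row sums to 1

print(hidden.shape, logits.shape, probs.sum(axis=1))
```

The leading dimension (5 here) plays the role of None in the TensorFlow shapes: it is whatever number of images happens to be in the batch.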


    

In [None]:
"""
Here we have defined the operations of our model through a function. But this isn't actually where we create
the tensorflow graph. We do that right below, like a normal Python script. The important thing is that tf
recognizes this as a graph model because our operations are applied to what we define as tf placeholders.
When we actually run our session, we will feed NumPy arrays into the placeholders.
It is important to recognize that our placeholders have defined shapes. Tensorflow will automatically compute
the shapes at each level of the model based on this, essentially making it a "fixed" graph.
"""
inputs=tf.placeholder(tf.float32, shape=[None, 784])
#Because normal dense layers take in 1D data to connect to each neuron, we can't feed in the image as a 2D matrix.
#Instead for each image we will feed in each pixel: 28x28 = 784. Did you catch that when we built the model?

labels = 0 #TODO: Make a placeholder for the labels which will be a vector of size 10 for each image.

#Now we can create the abstract graph by essentially using these placeholders. Nothing actually runs in the following lines.
#All that happens is tf will use the placeholder shapes to then create space based on the defined model, so that each
#step has defined tensor shapes
outputs = ourModel(inputs)

#That was easy! Just one line in this case! Outputs stores predictions based on the given inputs

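#Ok now let's define our loss and how we will train it. It's ok if you don't understand these specific loss functions
#and optimizers. You can look into these in your own time, but for this notebook you can consider these standard for classification.
#NOTE: softmax_cross_entropy_with_logits applies softmax internally, so it expects raw (unscaled) scores as logits.
#If the output layer already applies softmax, the softmax ends up applied twice.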
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=outputs, labels=labels))

#And now we have to optimize this loss
trainer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)

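#Let's also define accuracy for our viewing. tf.metrics.accuracy returns (value, update_op); we keep the update_op,
#which updates the running accuracy over everything it has seen so far and returns it.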
_, accuracy = tf.metrics.accuracy(tf.argmax(labels, 1), tf.argmax(outputs, 1))

#Let's set some hyperparameters for batch size and epochs
batch_size = 128
num_epochs = 10
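
The loss and accuracy defined above can be sketched in NumPy to see what they measure. This is an illustration under assumptions (toy logits and labels, hypothetical helper names `softmax_cross_entropy` and `batch_accuracy`), not TensorFlow's implementation. In particular, it computes accuracy for a single batch rather than a running value.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    """Per-example cross-entropy computed from raw (unnormalized) scores."""
    # log-softmax, computed stably by shifting each row by its max.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Only the log-probability of the true class survives the multiplication.
    return -(one_hot_labels * log_probs).sum(axis=1)

def batch_accuracy(logits, one_hot_labels):
    """Fraction of rows whose highest score is at the true class."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return np.mean(predicted == actual)

toy_logits = np.array([[2.0, 0.5, 0.1],   # confident and correct
                       [0.1, 0.2, 3.0],   # confident and wrong
                       [1.0, 1.0, 1.0]])  # completely unsure
toy_labels = np.array([[1, 0, 0],
                       [0, 1, 0],
                       [0, 0, 1]])

per_example = softmax_cross_entropy(toy_logits, toy_labels)
mean_loss = per_example.mean()
acc = batch_accuracy(toy_logits, toy_labels)
print(per_example, mean_loss, acc)
```

The confident wrong prediction gets the largest loss, and the uniform row costs exactly log(3), which is the loss of guessing evenly among three classes.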
    

In [None]:
"""
NOTE: You might have gotten some warnings from the previous cell. You can just ignore them :)
We have constructed our model! Now let's move on to actually training it. This is where we have to run
a tensorflow session, in the following way:
"""
#First let's scale the values of our images from [0, 255] (as we saw earlier) to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0 
#x_train =  #TODO: keep reading until sess.run, then uncomment and finish this. There are many ways of doing this task, but look into a reshape operation in numpy
#x_test =

init=tf.global_variables_initializer() #initializer of all variables (The random weights of the model before training).
local_init = tf.local_variables_initializer() #This isn't always needed but tf.metrics.accuracy has local variables to initialize
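
The scaling step and the reshape hint above can be tried on a toy array first. The sketch below assumes a made-up stack of four 3x3 "images" (`toy_images` is a hypothetical stand-in, not MNIST) and shows how dividing by 255 and reshaping change the values and the shape.

```python
import numpy as np

# Hypothetical stand-in for image data: 4 "images" of 3x3 pixels, values in [0, 255].
toy_images = np.arange(36, dtype=np.uint8).reshape(4, 3, 3) * 7

scaled = toy_images / 255.0                 # floating-point values in [0, 1]
flat = scaled.reshape(scaled.shape[0], -1)  # -1 lets NumPy infer 3 * 3 = 9

print(toy_images.shape, "->", flat.shape)

# Flattening is row-major: the first row of each image comes first in its flat vector.
print(np.array_equal(flat[0, :3], scaled[0, 0]))
```

Keeping the first dimension fixed and passing -1 for the rest means the same line works for any image size.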


In [None]:
with tf.Session() as sess:
    #each call to sess.run() evaluates whatever we pass in within our tf session
    sess.run(init)
    sess.run(local_init)
    print("Starting to run session...")
    for epoch in range(num_epochs): #iterate through each epoch
        for i in range(x_train.shape[0]//batch_size): #iterate through the number of batches we can make from our samples
            batch_images = x_train[i*batch_size:(i+1)*batch_size]
            batch_labels = y_train[i*batch_size:(i+1)*batch_size]
            #We have a batch of training images and labels.
            #Now we need to run the model by feeding these into the placeholders we made
            _ = sess.run([trainer], feed_dict={inputs: batch_images, labels: batch_labels})
            """The output of sess.run will be the input array. Essentially, the model will run
            up to each requested variable, feeding the corresponding placeholders from feed_dict, and sess.run returns
            the computed values in the same order as the list we passed in.
            EX: model_loss = sess.run([loss], feed_dict={inputs: batch_images, labels: batch_labels}) will run
            until the loss of the model is calculated, stop, and return it (inside a one-element list).

            Your list can request multiple values: model_loss, _ = sess.run([loss, trainer], feed_dict={inputs: batch_images, labels: batch_labels})

            By having trainer as one of the requested values, we ensure that the model is run up to the trainer, which means
            the model will work to minimize loss.

            TODO: The sess.run line will error out. Why? Because we defined inputs as a placeholder with shape [None, 784]
            but we are feeding in something of shape [None, 28, 28]. How can we fix this? (Look at the above TODO now)."""
            
        """Once you have that fixed, we are done! Our model will train on its own
        TODO: Ok but I want to see how well the model is doing. Here, for each epoch, let's feed in testing images
        and output the accuracy as we defined earlier in the model. For each epoch, test a different segment
        of the test data (similar to what we did in the batching).
        NOTE: If you run into an error where your accuracy is outputting something involving <Tensor...>,
        make sure the variable you use to store your accuracy is NOT named 'accuracy' like the one we defined earlier.
        Reusing that name rebinds the Python variable, so it no longer refers to the accuracy op defined in the graph."""
        #YOUR CODE HERE
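
The index arithmetic behind the batching loop, and one possible way to pick a different test segment per epoch, can be checked without TensorFlow. The helper names `iter_batches` and `eval_segment` are hypothetical, and splitting the 10,000 MNIST test images evenly across the epochs is just one assumption about how to segment them.

```python
def iter_batches(num_samples, batch_size):
    """Yield (start, stop) index pairs for every full batch; the remainder is skipped."""
    for i in range(num_samples // batch_size):
        yield i * batch_size, (i + 1) * batch_size

batches = list(iter_batches(num_samples=60000, batch_size=128))
leftover = 60000 - batches[-1][1]
print(len(batches), "full batches,", leftover, "samples left over")

def eval_segment(epoch, num_test, num_epochs):
    """Pick a different, non-overlapping slice of the test set for each epoch."""
    size = num_test // num_epochs
    return epoch * size, (epoch + 1) * size

segments = [eval_segment(e, num_test=10000, num_epochs=10) for e in range(10)]
print(segments[0], segments[-1])
```

With a batch size of 128, the integer division leaves the last 96 training images unused in every epoch. That is usually harmless, but shuffling between epochs would let different images fall into the remainder.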