# TensorFlow 1

Google Colaboratory link: 
https://colab.research.google.com/drive/1giF0GHVgFTGEKCQho9iKd4p1osa80xGw



In [None]:
!pip install tensorflow-gpu==1.13.1

In [None]:
import tensorflow as tf

import numpy as np

from sklearn import datasets
from sklearn.model_selection import train_test_split

TensorFlow is a deep learning framework brought to you by Google. It allows you to build computational graphs from tensors and operations on them and then helps those _tensors flow_.

As in PyTorch, we'll start with the Iris dataset.

In [None]:
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris["data"], iris["target"], test_size=0.2)
iris["feature_names"], iris["target_names"]

A computational graph is made of:
* placeholders (inputs to the graph)
* variables
* operations on them and their results

In [None]:
# of course, you can also define your own operations - tensorflow's syntax is in many ways similar to numpy's 
def relu(activation):
    return activation * tf.cast((activation > 0), dtype=tf.float32)

In [None]:
D_in, H, D_out = 4, 10, 3

X = tf.placeholder(tf.float32, shape=(None, D_in))
y = tf.placeholder(tf.int32, shape=(None,))  # (None,) is a 1-D shape; (None) is just None, i.e. an unknown shape

L = tf.one_hot(y, D_out)
W1 = tf.Variable(tf.random_uniform((D_in, H)))
W2 = tf.Variable(tf.random_uniform((H, D_out)))

# and our graph is built here:
y_pred = relu(X @ W1) @ W2 

print(L.shape)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=L, logits=y_pred))


# this is a more explicit, but also impractical AF method of performing gradient descent
grad_W1, grad_W2 = tf.gradients(loss, [W1, W2])

lr = 1e-2

new_W1 = W1.assign(W1 - lr * grad_W1)
new_W2 = W2.assign(W2 - lr * grad_W2)
updates = tf.group(new_W1, new_W2)
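
To make the `loss` line less of a black box, here is a minimal pure-Python sketch of what one-hot encoding, softmax cross-entropy and the final mean compute together for a batch. The helper names (`one_hot`, `softmax`, `mean_cross_entropy`) and the toy numbers are made up for illustration; TensorFlow's own implementation is vectorised and more numerically careful.

```python
import math

def one_hot(label, depth):
    # e.g. label 2 with depth 3 -> [0.0, 0.0, 1.0]
    return [1.0 if i == label else 0.0 for i in range(depth)]

def softmax(logits):
    # subtract the max before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_cross_entropy(batch_logits, batch_labels, depth):
    # per-example loss is -sum(onehot * log(softmax(logits))), then averaged
    losses = []
    for logits, label in zip(batch_logits, batch_labels):
        probs = softmax(logits)
        target = one_hot(label, depth)
        losses.append(-sum(t * math.log(p) for t, p in zip(target, probs)))
    return sum(losses) / len(losses)

# toy batch: 2 examples, 3 classes (hypothetical numbers)
toy_logits = [[2.0, 0.5, 0.1], [0.0, 0.0, 0.0]]
toy_labels = [0, 2]
toy_loss = mean_cross_entropy(toy_logits, toy_labels, depth=3)
print(toy_loss)
```

Note that the function takes raw scores (logits), not probabilities: the softmax is applied inside the loss, which is also how `tf.nn.softmax_cross_entropy_with_logits` expects its `logits` argument.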


In TensorFlow 1, as opposed to PyTorch, graphs are constructed statically, which means you need to define them up front in your code and cannot change them later.

An interesting thing to note is that a graph can have many inputs and many outputs, such as `y_pred` and `loss` here.

Another important thing to know is that nothing has been calculated or initialized yet!

In order to actually run the computations, we'll have to run them in tensorflow's `Session`.
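
The define-then-run split can feel odd at first. As a rough analogy only, here is a toy pure-Python sketch of the idea: building an expression merely records what to compute, and a separate `run` call with a feed dictionary does the arithmetic. The classes `Node`, `Placeholder`, `Add`, `Mul` and the function `run` are hypothetical and do not reflect TensorFlow's actual internals.

```python
class Node:
    # building expressions with + and * only records the operation
    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, other):
        return Mul(self, other)

class Placeholder(Node):
    def __init__(self, name):
        self.name = name

class Add(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

class Mul(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

def run(node, feed_dict):
    # only here are values actually computed, by walking the graph
    if isinstance(node, Placeholder):
        return feed_dict[node]
    if isinstance(node, Add):
        return run(node.a, feed_dict) + run(node.b, feed_dict)
    if isinstance(node, Mul):
        return run(node.a, feed_dict) * run(node.b, feed_dict)
    raise TypeError("unknown node type")

x = Placeholder("x")
y = Placeholder("y")
z = x * y + x            # nothing is computed yet, z is just a graph node

print(type(z).__name__)          # Add
print(run(z, {x: 2.0, y: 5.0}))  # 12.0
```

The `feed_dict` here plays the same role as the one passed to `sess.run`: it maps placeholder objects to concrete values at run time.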

In [None]:
train_dict = {X: X_train, y: y_train}

num_iterations = 500

with tf.Session() as sess:
#     with tf.device("/gpu:0"): #"/cpu:0" or "/gpu:0"
    tf.global_variables_initializer().run()

    for i in range(num_iterations):
        loss_val, _ = sess.run([loss, updates], feed_dict=train_dict)
        if i % 50 == 0: print(loss_val)
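
Each `sess.run` of `updates` above applies one plain gradient-descent step, `W <- W - lr * grad`. A minimal sketch of the same rule on a one-parameter toy problem, with the gradient written by hand instead of obtained from `tf.gradients` (the function `f(w) = (w - 3) ** 2` and all names below are assumptions chosen for illustration):

```python
def grad_f(w):
    # derivative of f(w) = (w - 3) ** 2
    return 2.0 * (w - 3.0)

lr = 1e-2   # same learning rate as in the graph definition above
w = 0.0     # arbitrary starting point

for step in range(500):
    w = w - lr * grad_f(w)   # the analogue of W.assign(W - lr * grad_W)

print(w)  # approaches the minimum at w = 3
```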
        

Let's train again, but without performing gradient descent manually.

In [None]:
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

train_dict = {X: X_train, y: y_train}

num_iterations = 500

with tf.Session() as sess:
    with tf.device("/gpu:0"): #"/cpu:0" or "/gpu:0"
        tf.global_variables_initializer().run()
        for i in range(num_iterations):
            train_step.run(feed_dict=train_dict)
            loss_val = loss.eval(feed_dict=train_dict)
            if i % 50 == 0: print(loss_val)

## We don't have to write everything manually, do we?

TensorFlow 1 provides implementations of typical layers in its `layers` module. However, as opposed to PyTorch, it doesn't have one single way to minimize the amount of code you write when creating a model.

Some of the high-level wrappers include:

* `tf.layers`
* `TFLearn`
* `Estimator API`
* `Pretty Tensor`
* `Keras`

Of the above, the Estimator API (`tf.estimator`) has for a long time been the preferred approach.


In [None]:
def my_estimator(features, labels, mode):
    
    X = tf.cast(features, tf.float32)    
    
    hidden1 = tf.layers.dense(X, 10, activation=tf.nn.relu)
    
    hidden2 = tf.layers.dense(hidden1, 10, activation=tf.nn.relu)
    
    # the final layer outputs raw logits for the 3 iris classes
    # (no activation here: the loss below applies softmax itself)
    logits = tf.layers.dense(hidden2, 3)

    predictions = {
        "classes": tf.argmax(input=logits, axis=1),
        "probabilities": tf.nn.softmax(logits)
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # depth must match the number of classes (3), not the hidden layer size
    labels_onehot = tf.one_hot(indices=labels, depth=3)
    softmax_loss = tf.losses.softmax_cross_entropy(logits=logits, onehot_labels=labels_onehot)
  
    loss = softmax_loss 
    
    if mode == tf.estimator.ModeKeys.TRAIN:
        global_step = tf.train.get_global_step()
        start_lr = 1e-2
        lr = tf.train.exponential_decay(start_lr, global_step, 500, 0.9, staircase=True) # we can use tricks, such as a decaying learning rate
        optimizer = tf.train.AdamOptimizer(lr)
        train_op = optimizer.minimize(loss=loss, global_step=global_step)
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
    
    eval_metric = {
        "accuracy": tf.metrics.accuracy(labels=labels, predictions=predictions["classes"])
    }
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metric)
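
The decaying learning rate in the training branch follows a simple formula. With `staircase=True`, `tf.train.exponential_decay` computes `start_lr * decay_rate ** floor(global_step / decay_steps)`, so with the arguments used above the rate drops by a factor of 0.9 every 500 steps. A plain-Python sketch of that formula (the helper name `staircase_decay` is made up for illustration):

```python
def staircase_decay(start_lr, step, decay_steps, decay_rate):
    # integer division gives the "staircase": the exponent only changes
    # once every decay_steps steps
    return start_lr * decay_rate ** (step // decay_steps)

for step in (0, 499, 500, 1000):
    print(step, staircase_decay(1e-2, step, 500, 0.9))
```

Without `staircase=True` the exponent would be the real-valued `global_step / decay_steps`, giving a smooth decay instead of discrete drops.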

In order for the estimator to work, we must provide input functions for it:

In [None]:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x=X_train,
    y=y_train,
    batch_size=256,
    num_epochs=None,
    shuffle=True    
)

test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x=X_test,
    y=y_test,
    num_epochs=1,
    shuffle=False    
)

# this differs from train_input_fn only in its number of epochs, so we can evaluate on the training set
# if num_epochs is None, input_fn returns data for as long as we want it to (so it's good for training)
train_test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x=X_train,
    y=y_train,
    batch_size=256,
    num_epochs=1,
    shuffle=True    
)

In [None]:
model = tf.estimator.Estimator(model_fn=my_estimator, model_dir='/tmp/my_cnn')

In [None]:
model.train(input_fn=train_input_fn, steps=100)

In [None]:
test_results = model.evaluate(input_fn=test_input_fn)
train_results = model.evaluate(input_fn=train_test_input_fn)
print('train', train_results)
print('test', test_results)


Over the last few years, TensorFlow has become the most popular deep learning framework, especially among professionals who need their models production-ready.

Apart from the basics we've covered here, it provides awesome tools such as TensorBoard and TensorFlow Serving. Moreover, TensorFlow is also available in other programming languages, not only Python.

The one group that doesn't seem to have been won over is researchers:

https://twitter.com/karpathy/status/868178954032513024

Indeed, many people complain that TensorFlow is unintuitive. This is why TensorFlow 2.0 was created, with `eager mode` as a first-class citizen.

Let's now see what's changed in TF 2.0!