# TensorFlow Tutorial - Basics #
###  Štěpán Procházka, April 2019

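This notebook is written in Python 3. To run it, install the following dependencies (preferably in a virtual environment). The code uses the TensorFlow 1.x graph API (`tf.Session`, `tf.placeholder`), so it needs a TensorFlow 1.x release.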

```
# create and activate virtual environment
$ python3 -m venv env
$ . env/bin/activate

# install dependencies
# scikit-learn is used by the kNN comparison, ipympl by `%matplotlib widget`
$ pip install numpy matplotlib ipympl scipy scikit-learn tqdm tensorflow  # or tensorflow-gpu
```

In [None]:
# run this cell to effectively disable dedicated GPU (if present)
%env CUDA_VISIBLE_DEVICES=""

In [None]:
import matplotlib.pyplot as plt
import numpy as np

## ToC ##
1. TensorFlow basics
    - Installation
    - [`Hello, World!`](#Hello%2C-World!-in-TF)
    - [`Hello, ML!`](#Hello%2C-ML!)
    - [TensorFlow, Graph, Operation, Tensor, Session](#TF-basic-concepts)
    - [kNN classifier](#kNN-classifier)
    - [Variables](#tf.Variable)
    
----
## How to install ##

```sh
$ python3 -m venv env
$ . env/bin/activate

$ pip install tensorflow
# alternatively (needs nvidia driver, CUDA and cuDNN installation)
$ pip install tensorflow-gpu
```

----
## Hello, World! in TF ##

In [None]:
import tensorflow as tf

a = tf.constant(6)
b = tf.constant(7)

# result = a * b
c = tf.multiply(a, b)

with tf.Session():
    result = c.eval()
    print(result)

----
## Hello, ML! ##

In [None]:
import tensorflow as tf
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

In [None]:
model.summary()

In [None]:
%matplotlib widget

fig, (ax_im, ax_pred) = plt.subplots(ncols=2, figsize=(8.5, 4))
fig.set_label('MNIST Classifier')

def show_example(i, is_train=True):
    ax_im.clear()
    x, y = (x_train[i], y_train[i]) if is_train else (x_test[i], y_test[i])
    ax_im.set_axis_off()
    ax_im.set_title('{}[{}]; gold label {}'.format(
        'train' if is_train else 'test', i, y
    ))
    ax_im.imshow(x, cmap='gray')
    
    ax_pred.clear()
    probas = model.predict(x[None, ...])[0]
    ax_pred.set_xticks(np.arange(len(probas)))
    ax_pred.set_ylim(0, 1)
    ax_pred.set_title('predicted {} ({:.3f})'.format(
        np.argmax(probas), np.max(probas)
    ))
    ax_pred.bar(np.arange(len(probas)), probas)

show_example(0, is_train=False)

In [None]:
model.evaluate(x_test, y_test)  # untrained model: accuracy near chance (about 10 % for 10 classes)

In [None]:
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

In [None]:
show_example(0, is_train=False)

In [None]:
del model

----
## TF basic concepts ##
- library for numerical computation
    - with emphasis on numerical optimization for the purpose of machine learning
- supports transparent GPU-accelerated computation (in a properly configured environment)
- well maintained (open-source + actively developed for internal use by Google)

### tf.Graph ###
- computational **graph** (essentially a DAG, and treated as one)
- [more ...](https://www.tensorflow.org/guide/graphs)

- #### tf.Operation ####
    - **nodes** of the graph
    - input tensors -> output tensors
    
- #### tf.Tensor ####
    - **edges**
    - rank, shape, dtype
    - serve as inputs: arguments of an operation being constructed
    - serve as outputs: results of constructing an operation
    - plain descriptors that do not hold the actual values
    - [more ...](https://www.tensorflow.org/guide/tensors)

### tf.Session ###
- runtime environment for a graph
- resource **manager** (memory management for the values associated with tensors)
----
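
The split between describing a computation and running it can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not how TF is implemented: `Node`, `constant`, `placeholder`, `multiply` and `run` are hypothetical names made up for this illustration. Nodes play the role of operations, references between them play the role of tensors, and `run` plays the role of a session call with a `feed_dict`.

```python
import operator


class Node:
    """A graph node: an operation plus the nodes that feed it (toy model)."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op                  # callable computing the value, None for a placeholder
        self.inputs = tuple(inputs)   # incoming edges
        self.name = name


def constant(value):
    return Node(lambda: value, name='const')


def placeholder(name):
    return Node(None, name=name)


def multiply(a, b):
    return Node(operator.mul, inputs=(a, b), name='mul')


def run(fetch, feed_dict=None):
    """Evaluate `fetch`, computing each node at most once per call."""
    cache = dict(feed_dict or {})     # fed values are used instead of being computed

    def value_of(node):
        if node not in cache:
            if node.op is None:
                raise ValueError('placeholder {!r} must be fed'.format(node.name))
            cache[node] = node.op(*(value_of(i) for i in node.inputs))
        return cache[node]

    return value_of(fetch)


a = constant(6)
b = placeholder('b')
c = multiply(a, b)                    # nothing is computed here; `c` only describes the product

print(run(c, feed_dict={b: 7}))       # 42
print(run(c, feed_dict={c: 13}))      # 13, a fed node is never computed
```

Calling `run(c)` without feeding `b` raises `ValueError` in this sketch, which mirrors the idea that every input a requested value depends on has to be supplied.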

In [None]:
import tensorflow as tf

a = tf.constant(6)     # which graph do these operations belong to?
b = tf.constant(7)
c = tf.multiply(a, b)

with tf.Session():     # which graph does the session manage?
    result = c.eval()
    print(result)

# => implicitly, we are working with the default graph (TF creates one on import)

In [None]:
a, b, c, result  # as expected, `a`, `b` and `c` hold tensor descriptors; only `result` holds an actual value

In [None]:
graph = tf.Graph()
with graph.as_default():                   # explicit control of graph context
    a = tf.constant(6)
    b = tf.constant(7)

    c = tf.multiply(a, b)

with tf.Session(graph=graph) as sess:  # explicit control of graph to be managed
    result = c.eval(session=sess)      # explicit control of runtime environment for computation
    print(result)

In [None]:
graph = tf.Graph()
with graph.as_default():
    with tf.Session() as sess:      # the graph can still be extended while the session is open
        a = tf.constant(6)
        b = tf.constant(7)
        c = tf.multiply(a, b)

        result = c.eval()
        print(result)

In [None]:
graph = tf.Graph()
with graph.as_default():
    a = tf.constant(6)
    b = tf.constant(7)
    c = a * b

    with tf.Session():
        result = c.eval()
        print(result)

In [None]:
a, b, c, result

### Lessons learned ###
- the `tf.Graph.as_default()` context manager makes the given graph the default one
    - operations created in this context belong to this graph
    - sessions created in this context manage this graph
    - the context can be re-entered
- `tf.Session` manages a given graph (the default one unless another is passed)
    - it cannot be reopened; its lifetime is `init -> close`
    - the managed graph can be modified during the session's lifetime

----
**Note:** a tensor is just an edge in the graph. Within a single `run` call its value is computed once and stays the same for every operation that consumes it; separate runs compute it again.

In [None]:
graph = tf.Graph()
with graph.as_default():
    vec = tf.random_uniform(shape=[])
    rand_1 = vec + 1
    rand_2 = vec + 1

In [None]:
with tf.Session(graph=graph) as sess:
    print(rand_1.eval())

In [None]:
with tf.Session(graph=graph) as sess:
    print(rand_2.eval())

In [None]:
with tf.Session(graph=graph) as sess:
    print(sess.run((rand_1, rand_2)))  # sess.run for joint evaluation of multiple tensors

---
### tf.placeholder ###
- inputs/parameters of the computation
- passed to the session at run time (via `feed_dict`)

In [None]:
graph = tf.Graph()
with graph.as_default():
    a = tf.placeholder(tf.int32)
    b = tf.placeholder(tf.int32)
    c = tf.multiply(a, b)
        
with tf.Session(graph=graph) as sess:
    print('(a={}, b={}) -> c={}'.format(*sess.run((a, b, c), feed_dict={a: 6, b: 7})))
    print('(a={}, b={}) -> c={}'.format(*sess.run((a, b, c), feed_dict={a: 6, b: 111})))

In [None]:
with tf.Session(graph=graph) as sess:
    print('(a={}, b={}, c={}) -> c={}'.format(*sess.run((a, b, c, c), feed_dict={a: 6, b: 7, c: 13})))

# any tensor can be fed; a fed tensor is not computed, since its value is already given

In [None]:
with tf.Session(graph=graph):
    c.eval()  # raises InvalidArgumentError: placeholders `a` and `b` were not fed

# every placeholder that the requested tensor depends on must be fed

----
## kNN classifier ##
Knowing the basics of TF, we will implement a k-nearest-neighbours (kNN) classifier and compare the performance of several implementations.
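
The faster implementations in this section replace explicit pairwise distance computation with a single matrix product. As a small NumPy sketch of why that works (the array names and sizes here are illustrative, not taken from the notebook's classes): the squared Euclidean distance expands to `|a|^2 - 2 a.b + |b|^2`, and the cosine distance of unit-length vectors is `1 - a.b`.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.random((5, 4))   # 5 reference vectors
batch = rng.random((3, 4))   # 3 query vectors

# Reference: explicit pairwise differences, result has shape (5, 3).
diff = train[:, None, :] - batch[None, :, :]
sq_euclid_ref = np.square(diff).sum(axis=-1)

# The same quantity expressed through one matrix product.
train_norm2 = np.square(train).sum(axis=-1)
batch_norm2 = np.square(batch).sum(axis=-1)
dot = train @ batch.T
sq_euclid = train_norm2[:, None] - 2 * dot + batch_norm2[None, :]
assert np.allclose(sq_euclid, sq_euclid_ref)

# Dropping the per-query term shifts each column by a constant,
# so the neighbour ranking along axis 0 does not change.
partial = train_norm2[:, None] - 2 * dot
assert (np.argsort(partial, axis=0) == np.argsort(sq_euclid, axis=0)).all()

# Cosine distance: normalise the rows once, then take 1 - dot product.
train_unit = train / np.sqrt(train_norm2)[:, None]
batch_unit = batch / np.sqrt(batch_norm2)[:, None]
cosine = 1 - train_unit @ batch_unit.T
```

Since only the ranking of training examples per query matters for kNN, the query norm can be left out of the Euclidean variant, which saves one broadcasted addition per batch.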

In [None]:
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import mode
from tqdm import tqdm_notebook as tqdm
from tensorflow.keras.datasets import mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

class NumpyKNN:
    def __init__(self, k, metric='cosine'):
        self.k = k
        self.metric = metric
    
    def _preprocess_data(self, x_data):
        x_data = x_data.copy()
        return x_data.reshape(len(x_data), -1)
    
    def fit(self, x_train, y_train):
        self.x_train = self._preprocess_data(x_train)
        self.y_train = y_train
        
    def predict_batch(self, x_batch):
        x_batch = self._preprocess_data(x_batch)
        dists = cdist(self.x_train, x_batch, metric=self.metric)
        
        neighbours_idxs = np.argpartition(dists, self.k, axis=0)[:self.k]
        neighbours_y = self.y_train[neighbours_idxs]
        choices, counts = mode(neighbours_y, axis=0)
        
        return choices, counts / self.k
    
    def predict(self, x_data, batch_size=64):
        y_data = np.empty(shape=len(x_data), dtype=np.uint8)
        for begin_i in tqdm(range(0, len(x_data), batch_size)):
            end_i = begin_i + batch_size
            y_data[begin_i:end_i] = self.predict_batch(x_data[begin_i:end_i])[0]
        return y_data

    def evaluate(self, x_test, y_test, batch_size=64):
        y_pred = self.predict(x_test, batch_size=batch_size)
        return np.mean(y_test == y_pred)

In [None]:
kNN = NumpyKNN(3)
kNN.fit(x_train, y_train)

In [None]:
kNN.evaluate(x_test, y_test)

# ~5 min to test 10000 test examples against 60000 train examples

In [None]:
del kNN

In [None]:
import numpy as np
from scipy.stats import mode
from tqdm import tqdm_notebook as tqdm
from tensorflow.keras.datasets import mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

class FastNumpyKNN:
    def __init__(self, k, metric='cosine'):
        self.k = k
        self.metric = metric
    
    def _preprocess_data(self, x_data):
        x_data = x_data.copy()
        x_flat = x_data.reshape(len(x_data), -1)
        x_norm2 = np.square(x_flat).sum(axis=-1)
        if self.metric == 'cosine':
            x_flat /= np.sqrt(x_norm2)[:, None]
        elif self.metric == 'euclid':
            pass
        else:
            raise NotImplementedError
        
        return x_flat, x_norm2
    
    def fit(self, x_train, y_train):
        self.x_train, self.x_train_norm2 = self._preprocess_data(x_train)
        self.y_train = y_train
        
    def predict_batch(self, x_batch):
        x_batch, x_batch_norm2 = self._preprocess_data(x_batch)
        
        pairwise_dot = np.matmul(self.x_train, x_batch.T)
        if self.metric == 'cosine':
            dists = 1 - pairwise_dot
        elif self.metric == 'euclid':
            # the query norm is constant per column, so it can be dropped without changing the ranking
            dists = self.x_train_norm2[:, None] - 2 * pairwise_dot
        
        neighbours_idxs = np.argpartition(dists, self.k, axis=0)[:self.k]
        neighbours_y = self.y_train[neighbours_idxs]
        choices, counts = mode(neighbours_y, axis=0)
        
        return choices, counts / self.k
    
    def predict(self, x_data, batch_size=64):
        batch_size = len(x_data) if (batch_size is None) else batch_size
        y_data = np.empty(shape=len(x_data), dtype=np.uint8)
        for begin_i in tqdm(range(0, len(x_data), batch_size)):
            end_i = begin_i + batch_size
            y_data[begin_i:end_i] = self.predict_batch(x_data[begin_i:end_i])[0]
        return y_data

    def evaluate(self, x_test, y_test, batch_size=64):
        y_pred = self.predict(x_test, batch_size=batch_size)
        return np.mean(y_test == y_pred)

In [None]:
fast_kNN = FastNumpyKNN(3, metric='cosine')
fast_kNN.fit(x_train, y_train)

In [None]:
fast_kNN.evaluate(x_test, y_test, batch_size=64)

# ~1 min => ~5x speedup compared to the naive implementation

In [None]:
del fast_kNN

In [None]:
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from tqdm import tqdm_notebook as tqdm
from tensorflow.keras.datasets import mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

sklearn_kNN = KNeighborsClassifier(n_neighbors=3, metric='cosine', n_jobs=-1)
sklearn_kNN.fit(x_train.reshape(len(x_train), -1), y_train)

In [None]:
batch_size = 256
x_flat = x_test.reshape(len(x_test), -1)
y_data = np.empty(shape=len(x_test), dtype=np.uint8)
for begin_i in tqdm(range(0, len(x_test), batch_size)):
    end_i = begin_i + batch_size
    y_data[begin_i:end_i] = sklearn_kNN.predict(x_flat[begin_i:end_i])
print((y_test == y_data).mean())

# ~1 min => comparable to the fast NumPy implementation, but less customizable

In [None]:
del sklearn_kNN

----
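
The TensorFlow version that follows selects neighbours by taking the top k of the negated distances instead of calling `np.argpartition`. A small NumPy sketch, with made-up distances and labels, of why both pick the same neighbours and how the majority vote over their labels works (here `np.bincount` stands in for `scipy.stats.mode`):

```python
import numpy as np

k = 3
dists = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.4, 0.3],
                  [0.1, 0.7],
                  [0.7, 0.2]])        # shape (n_train, n_queries)
y_train = np.array([5, 3, 3, 8, 5])   # one label per training row

# k smallest distances per query column; the order within the k is unspecified.
idx_part = np.argpartition(dists, k, axis=0)[:k]

# The same selection phrased as "k largest of the negated distances".
idx_top = np.argsort(-dists, axis=0)[-k:]

# Both approaches select the same set of neighbours for every query.
assert (np.sort(idx_part, axis=0) == np.sort(idx_top, axis=0)).all()

# Majority vote over the neighbours' labels, one vote count per query.
neighbour_labels = y_train[idx_part]  # shape (k, n_queries)
votes = np.array([np.bincount(col, minlength=10).argmax()
                  for col in neighbour_labels.T])
print(votes)                          # [3 5]
```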

In [None]:
import numpy as np
import tensorflow as tf
from scipy.stats import mode
from tqdm import tqdm_notebook as tqdm
from tensorflow.keras.datasets import mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class TfKNN:
    def __init__(self, k, metric='cosine'):
        self.k = k
        self.metric = metric
        self.graph = tf.Graph()  # define TensorFlow resources
        self.session = tf.Session(graph=self.graph)
        
    def __del__(self):  # handle releasing resources
        self.session.close()

    def fit(self, x_train_np, y_train_np):
        # we are going to build the inference graph
        # - input - batch to be predicted
        # - preprocessing - normalization, norm computation
        # - core - similarity computation

        with self.graph.as_default():
            # definition of inputs
            self.x_tf = tf.placeholder(dtype=tf.float32, shape=((None,) + x_train_np.shape[1:]), name='x_data')
            
            # definition of preprocessing of data
            x_preprocessed_tf = tf.reshape(self.x_tf, shape=(tf.shape(self.x_tf)[0], -1))
            x_norm2_tf = tf.reduce_sum(tf.square(x_preprocessed_tf), axis=-1, keepdims=True)
            if self.metric == 'cosine':
                x_preprocessed_tf = x_preprocessed_tf / tf.sqrt(x_norm2_tf)
            elif self.metric == 'euclid':
                pass
            else:
                raise NotImplementedError
            
            # actual preprocessing of training data using just defined preprocessing operations
            x_train_prep_np, x_train_norm2_np = self.session.run(
                (x_preprocessed_tf, x_norm2_tf),
                feed_dict={self.x_tf: x_train_np}
            )
            
            # embedding of preprocessed training data as constants to TF graph
            x_train_tf = tf.constant(x_train_prep_np, dtype=tf.float32, name='x_train')
            x_train_norm2_tf = tf.constant(x_train_norm2_np, dtype=tf.float32, name='x_train_norm2')
            
            # definition of knn lookup
            pairwise_dot_tf = tf.matmul(x_train_tf, tf.transpose(x_preprocessed_tf))
            if self.metric == 'cosine':
                dists_tf = 1 - pairwise_dot_tf
            elif self.metric == 'euclid':
                # the query norm is constant per column, so it can be dropped without changing the ranking
                dists_tf = x_train_norm2_tf - 2 * pairwise_dot_tf
            
            # result of TF computation - indices of neighbours (the expensive part of computation)
            self.neighbours_idxs_tf = tf.transpose(tf.math.top_k(- tf.transpose(dists_tf), self.k, sorted=False)[1])
            
            self.y_train_np = y_train_np
            
            # run once on a single example so that lazily initialized structures are loaded up front (for convenience only)
            self.session.run(self.neighbours_idxs_tf, feed_dict={self.x_tf: x_train_np[:1]})
    
    def predict_batch(self, x_batch_np):
        neighbours_idxs_np = self.session.run(self.neighbours_idxs_tf, feed_dict={self.x_tf: x_batch_np})
        neighbours_y_np = self.y_train_np[neighbours_idxs_np]
        choices_np, counts_np = mode(neighbours_y_np, axis=0)
        
        return choices_np, counts_np / self.k
    
    def predict(self, x_data, batch_size=64):
        batch_size = len(x_data) if (batch_size is None) else batch_size
        y_data = np.empty(shape=len(x_data), dtype=np.uint8)
        for begin_i in tqdm(range(0, len(x_data), batch_size)):
            end_i = begin_i + batch_size
            y_data[begin_i:end_i] = self.predict_batch(x_data[begin_i:end_i])[0]
        return y_data

    def evaluate(self, x_test, y_test, batch_size=64):
        y_pred = self.predict(x_test, batch_size=batch_size)
        return np.mean(y_test == y_pred)

In [None]:
tf_kNN = TfKNN(3, metric='cosine')
tf_kNN.fit(x_train, y_train)

In [None]:
tf_kNN.evaluate(x_test, y_test)

# CPU ~ 15 sec => 4x speedup compared to sklearn/fast numpy, 20x speedup overall
# GPU ~ 4 sec => ~15x speedup compared to sklearn/fast numpy

In [None]:
del tf_kNN

----
#### tf.Variable ####
- can be assigned a new value
- lifecycle
    - needs to be initialized
    - keeps its value between session runs
    - is deallocated when the session closes
- an operation that behaves like a tensor
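
The lifecycle above can be mimicked in a few lines of plain Python. This is only a toy model under the assumption that the session owns the storage for variable values; `ToySession` and its methods are hypothetical names for this illustration, not part of TensorFlow.

```python
class ToySession:
    """Owns the storage for variable values (a toy stand-in, not tf.Session)."""

    def __init__(self):
        self._values = {}
        self._closed = False

    def initialize(self, name, value):
        self._values[name] = value

    def read(self, name):
        if self._closed:
            raise RuntimeError('session is closed')
        if name not in self._values:
            raise RuntimeError('variable {!r} is uninitialized'.format(name))
        return self._values[name]

    def assign_add(self, name, delta):
        self._values[name] = self.read(name) + delta
        return self._values[name]

    def close(self):
        self._values.clear()           # the values die with the session
        self._closed = True


sess = ToySession()
try:
    sess.read('acc')
except RuntimeError as err:
    print(err)                         # variable 'acc' is uninitialized

sess.initialize('acc', 0)
for i in range(1, 4):
    sess.assign_add('acc', i)          # the value persists between calls
print(sess.read('acc'))                # 6
sess.close()
```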

In [None]:
graph = tf.Graph()
with graph.as_default():
    with tf.Session() as sess:
        x = tf.placeholder(tf.int32, shape=(), name="x")
        acc = tf.get_variable("acc", shape=(), dtype=tf.int32, initializer=tf.zeros_initializer())
        add_x = acc.assign_add(x)

        tf.global_variables_initializer().run()
        print(acc.eval())
        for i in range(1, 11):
            print("+ {:2d}:".format(i), add_x.eval(feed_dict={x: i}))

In [None]:
import numpy as np
import tensorflow as tf
from scipy.stats import mode
from tqdm import tqdm_notebook as tqdm
from tensorflow.keras.datasets import mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class TfOnlineKNN:
    def __init__(self, k, metric='cosine'):
        self.k = k
        self.metric = metric
        self.graph = tf.Graph()
        self.session = tf.Session(graph=self.graph)
        self._is_built = False
    
    def __del__(self):  # handle releasing resources
        self.session.close()

    def _build(self, ex_shape):
        if self._is_built:
            return
        
        with self.graph.as_default():
            # definition of placeholders
            self.x_tf = tf.placeholder(dtype=tf.float32, shape=((None,) + tuple(ex_shape)), name='x_data')
            
            # definition of preprocessing of data
            x_preprocessed_tf = tf.reshape(self.x_tf, shape=(tf.shape(self.x_tf)[0], -1))
            x_norm2_tf = tf.reduce_sum(tf.square(x_preprocessed_tf), axis=-1, keepdims=True)
            if self.metric == 'cosine':
                x_preprocessed_tf = x_preprocessed_tf / tf.sqrt(x_norm2_tf)
            elif self.metric == 'euclid':
                pass
            else:
                raise NotImplementedError
            
            # definition of reference data (train) as mutable variables
            x_train_tf = tf.get_variable("x_train", shape=(0, np.prod(ex_shape)), dtype=tf.float32, trainable=False, validate_shape=False)
            x_train_norm2_tf = tf.get_variable("x_train_norm2", shape=(0, 1), dtype=tf.float32, trainable=False, validate_shape=False)
            
            self.enlarge_train = tf.group((
                tf.assign(x_train_tf, tf.concat((x_train_tf.read_value(), x_preprocessed_tf), axis=0), validate_shape=False),
                tf.assign(x_train_norm2_tf, tf.concat((x_train_norm2_tf.read_value(), x_norm2_tf), axis=0), validate_shape=False)
            ))
            
            # definition of knn lookup
            pairwise_dot_tf = tf.matmul(x_train_tf, tf.transpose(x_preprocessed_tf))
            if self.metric == 'cosine':
                dists_tf = 1 - pairwise_dot_tf
            elif self.metric == 'euclid':
                dists_tf = x_train_norm2_tf - 2 * pairwise_dot_tf
            
            self.neighbours_idxs_tf = tf.transpose(tf.math.top_k(- tf.transpose(dists_tf), self.k, sorted=False)[1])
            
            # initialize variables
            self.session.run(tf.global_variables_initializer())

        self._is_built = True
        
    def fit_append(self, x_train_np, y_train_np):
        self._build(x_train_np.shape[1:])
        self.session.run(self.enlarge_train, feed_dict={self.x_tf: x_train_np})
        self.y_train_np = np.concatenate((self.y_train_np, y_train_np)) if hasattr(self, "y_train_np") else y_train_np
    
    def predict_batch(self, x_batch_np):
        neighbours_idxs_np = self.session.run(self.neighbours_idxs_tf, feed_dict={self.x_tf: x_batch_np})
        neighbours_y_np = self.y_train_np[neighbours_idxs_np]
        choices_np, counts_np = mode(neighbours_y_np, axis=0)
        
        return choices_np, counts_np / self.k

    def predict(self, x_data, batch_size=64):
        batch_size = len(x_data) if (batch_size is None) else batch_size
        y_data = np.empty(shape=len(x_data), dtype=np.uint8)
        for begin_i in tqdm(range(0, len(x_data), batch_size), leave=False):
            end_i = begin_i + batch_size
            y_data[begin_i:end_i] = self.predict_batch(x_data[begin_i:end_i])[0]
        return y_data

    def evaluate(self, x_test, y_test, batch_size=64):
        y_pred = self.predict(x_test, batch_size=batch_size)
        return np.mean(y_test == y_pred)

In [None]:
tfo_kNN = TfOnlineKNN(3, metric='cosine')

In [None]:
grow_size = 500
for begin in range(0, 10000, grow_size):
    end = begin + grow_size
    tfo_kNN.fit_append(x_train[begin:end], y_train[begin:end])
    print("{:5d} samples:".format(end), tfo_kNN.evaluate(x_test, y_test))