## TensorFlow Estimators

Today we will learn to work with:
1. tf.estimator
2. tf.data
3. TensorBoard
4. tf.layers

In [1]:
import os
import tensorflow as tf
import numpy as np

from tensorflow.python.keras.datasets import imdb
from tensorflow.python.keras.preprocessing import sequence

tf.enable_eager_execution()
print(tf.__version__)

1.9.0


### Eager Execution
[TensorFlow's eager execution](https://www.tensorflow.org/guide/eager) is an imperative programming environment that evaluates operations immediately, without building graphs.

<img src="http://www.netlore.ru/upload/files/19/large_p19hom1f751nk1c40ml57hu2skj.jpg" width=300>

### How to check dimensions?

When prototyping new architectures, you need to keep track of tensor shapes. How do you do that in TensorFlow?

**NumPy-like** experimentation:

In [2]:
batch_size = 64
time_steps = 20
emb_size = 100

# create a toy dataset
x_toy = np.random.rand(batch_size, time_steps, emb_size)
y_toy = np.random.randint(0, 9, batch_size)

print('Y shape: {}'.format(y_toy.shape))
print('X shape: {}'.format(x_toy.shape))

Y shape: (64,)
X shape: (64, 20, 100)


<img src="https://ai2-s2-public.s3.amazonaws.com/figures/2017-08-08/73d826d4c2363701b88e3e234fe3b8756c0f9671/3-Figure1-1.png" width=700>

Let's apply a convolution to our data.

In [3]:
# convert the NumPy arrays to tensors

x_toy = tf.convert_to_tensor(x_toy, dtype=tf.float32)
y_toy = tf.convert_to_tensor(y_toy, dtype=tf.int64)

In [4]:
# randomly drop 10% of the words (zero them out)

x_toy = tf.layers.dropout(x_toy,
                          rate=0.1,
                          noise_shape=[batch_size, time_steps, 1],
                          training=True)
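
Because `noise_shape` ends in 1, one keep/drop decision is made per word and broadcast across the embedding dimension, so whole word vectors are zeroed. The surviving values are scaled by `1 / (1 - rate)`, which is why some entries of the printed tensor exceed 1. Below is a minimal pure-Python sketch of that behaviour; `word_dropout` is a hypothetical helper, not a TensorFlow function.

```python
import random

def word_dropout(batch, rate, seed=0):
    """Drop whole word vectors with probability `rate` (hypothetical helper).

    `batch` is a nested list shaped (batch_size, time_steps, emb_size).
    """
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - rate)  # inverted dropout: rescale the survivors
    result = []
    for sentence in batch:
        new_sentence = []
        for word_vector in sentence:
            # one decision per word, shared by every embedding dimension
            keep = rng.random() >= rate
            new_sentence.append([v * scale if keep else 0.0 for v in word_vector])
        result.append(new_sentence)
    return result

# toy input: 8 sentences, 20 words each, 4-dimensional embeddings
toy = [[[0.9] * 4 for _ in range(20)] for _ in range(8)]
dropped = word_dropout(toy, rate=0.1)
```

Every word vector in `dropped` is either all zeros or the original vector multiplied by `1 / 0.9`.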

In [5]:
# in eager mode we can print the tensor itself

print(x_toy)

tf.Tensor(
[[[0.         0.         0.         ... 0.         0.         0.        ]
  [0.60919577 1.0656664  0.31323364 ... 0.38649768 0.85007447 0.5085799 ]
  [0.40480652 0.5662459  0.7426091  ... 0.32881692 0.69997364 0.7571073 ]
  ...
  [0.7822716  0.07132865 0.02138079 ... 0.50230515 0.5138934  1.0532569 ]
  [0.3639544  0.40157005 0.9155454  ... 0.09712479 1.089555   1.0157101 ]
  [0.67328674 1.0308605  0.0679797  ... 0.78731126 0.8427862  0.91051745]]

 [[0.6855246  1.0526739  1.0584143  ... 0.05824028 0.08301507 0.2710551 ]
  [0.58707875 0.89950144 0.7581761  ... 0.15624158 0.06752342 0.70731944]
  [0.8280965  0.7006203  0.61123955 ... 0.06830905 0.02702304 0.38327578]
  ...
  [0.5886233  0.20313919 0.17277178 ... 0.5318525  0.536158   0.4839056 ]
  [0.         0.         0.         ... 0.         0.         0.        ]
  [0.397312   0.6482189  0.7136935  ... 0.635741   0.13885258 0.18561363]]

 [[0.81494445 0.11731581 0.2807727  ... 0.2318563  0.9170693  0.10700139]
  [1.011232

In [6]:
# apply the convolution

conv_2 = tf.layers.conv1d(x_toy,
                          filters=8,
                          kernel_size=3,
                          strides=2)

print(conv_2.shape)

# max pooling over the time axis
max_pool = tf.reduce_max(conv_2, axis=1)
print(max_pool.shape)

# dense layer
fc = tf.layers.dense(max_pool, 2)
print(fc.shape)

(64, 9, 8)
(64, 8)
(64, 2)
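
The time dimension shrinks from 20 to 9 because `tf.layers.conv1d` uses `padding='valid'` by default, so the output length is `floor((time_steps - kernel_size) / strides) + 1`. Here is a quick sanity-check helper (the function name is ours, for illustration only):

```python
def conv1d_valid_length(input_length, kernel_size, strides=1):
    """Output length of a 1-D convolution with 'valid' padding."""
    if input_length < kernel_size:
        return 0
    return (input_length - kernel_size) // strides + 1

print(conv1d_valid_length(20, kernel_size=3, strides=2))  # 9
```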


In [7]:
# softmax on the output

fc_soft = tf.nn.softmax(fc)

# predictions
preds = tf.argmax(fc_soft, axis=1)

print(preds.shape)

(64,)
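
Softmax is monotonic, so `tf.argmax` picks the same class whether it is applied to the logits or to the softmax output. The softmax is only needed when you want probabilities. A small pure-Python check, with illustrative helper names:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

logits = [0.3, -1.2]
probs = softmax(logits)
same_class = argmax(probs) == argmax(logits)
```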


### Data preparation

The IMDB dataset contains 25,000 training and 25,000 test movie reviews, labeled by sentiment (positive/negative). The reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers).

In [8]:
vocab_size = 5000
sentence_size = 200
model_dir = 'model_dir'

pad_id = 0
start_id = 1
oov_id = 2
index_offset = 2

print("Loading data...")
(x_train_variable, y_train), (x_test_variable, y_test) = imdb.load_data(num_words=vocab_size,
                                                                        start_char=start_id,
                                                                        oov_char=oov_id,
                                                                        index_from=index_offset)

word_index = imdb.get_word_index()

print(len(y_train), "train sequences")
print(len(y_test), "test sequences")

print("Pad sequences (samples x time)")
x_train = sequence.pad_sequences(x_train_variable, 
                                 maxlen=sentence_size,
                                 truncating='post',
                                 padding='post',
                                 value=pad_id)

x_test = sequence.pad_sequences(x_test_variable, 
                                maxlen=sentence_size,
                                truncating='post',
                                padding='post', 
                                value=pad_id)

print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)

Loading data...
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
x_train shape: (25000, 200)
x_test shape: (25000, 200)
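
With `padding='post'` and `truncating='post'`, short reviews get `pad_id` appended at the end and long reviews lose their tail. This is a minimal sketch of that behaviour for a single sequence; `pad_post` is a hypothetical stand-in, not the Keras implementation.

```python
def pad_post(seq, maxlen, value=0):
    """Truncate from the end, then pad at the end, to exactly `maxlen` items."""
    trimmed = list(seq[:maxlen])
    return trimmed + [value] * (maxlen - len(trimmed))

print(pad_post([1, 14, 22, 16], maxlen=6))      # [1, 14, 22, 16, 0, 0]
print(pad_post([1, 14, 22, 16, 43], maxlen=3))  # [1, 14, 22]
```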


## tf.data

[The tf.data API](https://www.tensorflow.org/api_docs/python/tf/data) enables you to build complex input pipelines from simple, reusable pieces. The tf.data API makes it easy to deal with large amounts of data, different data formats, and complicated transformations.


The tf.data module contains a collection of classes that allow you to easily load data, manipulate it, and pipe it into your model. [This document](https://www.tensorflow.org/guide/datasets_for_estimators) introduces the API by walking through two simple examples.

In [9]:
params = {
    'batch_size': 256,
    'num_epochs': 5,
    'train_size': int(len(x_train) * 0.9)
}

def input_fn(data, labels, params, is_training):
    # for file-based data, tf.data.TextLineDataset or tf.data.TFRecordDataset
    # can be used here instead of from_tensor_slices
    dataset = tf.data.Dataset.from_tensor_slices((data, labels))

    if is_training:
        # reshuffle the data for every epoch
        dataset = dataset.shuffle(buffer_size=params['train_size'])
        dataset = dataset.repeat(count=params['num_epochs'])

    dataset = dataset.batch(params['batch_size'])
    dataset = dataset.map(lambda x, y: ({'data': x}, y))
    # ask TensorFlow to prepare batches ahead of time so the GPU does not sit idle
    dataset = dataset.prefetch(buffer_size=100)
    return dataset
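
The order of the transformations matters. `shuffle` before `repeat` reshuffles every epoch without mixing epochs, and `batch` after `repeat` lets a batch straddle an epoch boundary. The pure-Python generator below imitates that order on a toy list. It is a sketch that assumes the shuffle buffer covers the whole dataset, and `toy_pipeline` is a made-up name.

```python
import random
from itertools import islice

def toy_pipeline(examples, num_epochs, batch_size, seed=0):
    """Imitate shuffle -> repeat -> batch on an in-memory list."""
    rng = random.Random(seed)

    def stream():
        for _ in range(num_epochs):
            epoch = list(examples)
            rng.shuffle(epoch)  # new order every epoch
            yield from epoch

    items = stream()
    while True:
        batch = list(islice(items, batch_size))
        if not batch:
            return
        yield batch

batches = list(toy_pipeline(range(10), num_epochs=3, batch_size=4))
print([len(b) for b in batches])  # [4, 4, 4, 4, 4, 4, 4, 2]
```

In the notebook the shuffle buffer holds `train_size` = 22,500 elements while `x_train` has 25,000 examples, so the real shuffle is only approximately uniform.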


### How to debug?

How can we inspect what a tf.data.Dataset produces?

In [None]:
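# make_initializable_iterator and tf.Session are graph-mode APIs:
# restart the kernel and re-run the cells above WITHOUT tf.enable_eager_execution()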

dataset = input_fn(x_train, y_train, params=params, is_training=True)

In [None]:
iterator = dataset.make_initializable_iterator()
batch = iterator.get_next()
init = iterator.initializer

In [None]:
with tf.Session() as sess:
    sess.run(init)
    for _ in range(5):
        x, y = sess.run(batch)
        print(x['data'].shape)
        print(y.shape)
        print()

### Premade estimators

<img src="https://www.tensorflow.org/images/tensorflow_programming_environment.png" width=600>

**An Estimator** is TensorFlow's high-level representation of a complete model. It handles the details of initialization, logging, saving and restoring, and many other features so you can concentrate on your model.

Estimators encapsulate the following actions:

* training
* evaluation
* prediction
* export for serving


Four steps to become an estimator master:

1. ~~Write one or more dataset importing functions~~
2. Define the [feature columns](https://www.tensorflow.org/guide/feature_columns)
3. Instantiate the relevant pre-made Estimator
4. Call a training, evaluation, or inference method

### Logistic Regression

We use a Bag-of-Words representation as the input to logistic regression.

<img src="https://cdn-images-1.medium.com/max/1600/1*j3HUg18QwjDJTJwW9ja5-Q.png" width=500>

In [None]:
all_classifiers = {}

# define a function that runs training and evaluation
def train_and_evaluate(classifier):
    all_classifiers[classifier.model_dir] = classifier
    classifier.train(lambda: input_fn(x_train, y_train, params=params, is_training=True))
    results = classifier.evaluate(lambda: input_fn(x_test, y_test, params=params, is_training=False))

    print()
    for key, value in results.items():
        print(f'{key}: {value}')
    
    # reset the default graph
    tf.reset_default_graph()

In [None]:
bow = tf.feature_column.categorical_column_with_identity(key='data',
                                                         num_buckets=vocab_size)
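
With an identity categorical column, every token id in the padded review is treated as an active feature, so the linear model effectively sees a vector of word counts of length `vocab_size`. Below is a sketch of that conversion. The helper `bag_of_words` is illustrative, and note that the padding id 0 is counted like any other id.

```python
from collections import Counter

def bag_of_words(token_ids, vocab_size):
    """Count how often each id in [0, vocab_size) occurs in one review."""
    counts = Counter(token_ids)
    return [counts.get(i, 0) for i in range(vocab_size)]

print(bag_of_words([1, 4, 4, 2, 0, 0], vocab_size=6))  # [2, 1, 1, 0, 2, 0]
```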

In [None]:
config = tf.estimator.RunConfig(tf_random_seed=123,
                                model_dir=os.path.join(model_dir, 'bow_sparse'),
                                save_summary_steps=5)

classifier = tf.estimator.LinearClassifier(feature_columns=[bow],
                                           config=config,
                                           optimizer='Adam',
                                           n_classes=2)

In [None]:
train_and_evaluate(classifier)

### Embeddings

We turn the sparse representation into a dense one.

The input changes: **(bs, vocab_size) -> (bs, time_steps, embedding_size)**

In [None]:
embedding_size = 50

# (bs, time_steps, embedding_size) -> (bs, embedding_size)
word_embedding_column = tf.feature_column.embedding_column(categorical_column=bow,
                                                           dimension=embedding_size,
                                                           combiner='mean',
                                                           initializer=tf.truncated_normal_initializer)
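
`combiner='mean'` averages the embedding vectors of all tokens in a review, which collapses the time dimension. Here is a pure-Python sketch with a tiny hand-written embedding table; the values and the `mean_embedding` helper are made up for illustration.

```python
def mean_embedding(token_ids, embedding_table):
    """Average the rows of `embedding_table` selected by `token_ids`."""
    dim = len(embedding_table[0])
    sums = [0.0] * dim
    for token_id in token_ids:
        for j, value in enumerate(embedding_table[token_id]):
            sums[j] += value
    return [s / len(token_ids) for s in sums]

# shape (vocab_size=3, embedding_size=2)
table = [[0.0, 0.0], [1.0, 2.0], [3.0, 6.0]]
print(mean_embedding([1, 2, 2, 0], table))  # [1.75, 3.5]
```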

In [None]:
params['num_epochs'] = 10

config = tf.estimator.RunConfig(tf_random_seed=123,
                                model_dir=os.path.join(model_dir, 'embeddings'),
                                save_summary_steps=5)

classifier = tf.estimator.DNNClassifier(
    hidden_units=[32, 16],
    activation_fn=tf.nn.tanh,
    feature_columns=[word_embedding_column],
    n_classes=2,
    config=config)

In [None]:
train_and_evaluate(classifier)

### Custom Estimators

<img src="https://sun1-2.userapi.com/c831409/v831409088/1596d6/3ZNzHyVKY_w.jpg" width=350>

Your own architecture, your own metrics, your own optimizers, and so on.

In [None]:
# define the model architecture

def build_model(features, params, is_training):
    with tf.name_scope('embeddings'):
        # the embedding dimension is embedding_size (defined above), not sentence_size
        emb_matrix = tf.get_variable('embedding_matrix',
                                     shape=[vocab_size, embedding_size],
                                     dtype=tf.float32)

        # (batch_size, time_steps, emb_dim)
        embeddings = tf.nn.embedding_lookup(emb_matrix, features['data'])
        # (batch_size, emb_dim)
        mean_embs = tf.reduce_mean(embeddings, axis=1)
    
    with tf.name_scope('fc_1'):
        out = tf.layers.dense(mean_embs, 50)
        # out = tf.layers.batch_normalization(out, training=is_training)
        out = tf.nn.tanh(out)

    with tf.name_scope('fc_2'):
        out = tf.layers.dense(out, 32)
        out = tf.nn.tanh(out)
        
    with tf.name_scope('fc_3'):
        out = tf.layers.dense(out, 2)

    return out

In [None]:
# define the Estimator: specify the loss function, the metrics, and the optimizer

def model_fn(features, labels, mode, params):
    
    is_training = (mode == tf.estimator.ModeKeys.TRAIN)
    
    with tf.variable_scope('model'):
        logits = build_model(features, params, is_training)
        
    preds = tf.argmax(logits, axis=1)
    
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {'preds': preds}
        return tf.estimator.EstimatorSpec(mode=mode,
                                          predictions=predictions)
    
    accuracy = tf.reduce_mean(tf.cast(tf.equal(preds, labels), tf.float32))
    labels = tf.one_hot(labels, 2)
    loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
    
    if mode == tf.estimator.ModeKeys.EVAL:
        with tf.variable_scope('metrics'):
            eval_metrics = {'accuracy': tf.metrics.mean(accuracy)}
        
        return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=eval_metrics)
    
    tf.summary.scalar('accuracy', accuracy)
    tf.summary.scalar('loss', loss)
    
    optimizer = tf.train.AdamOptimizer()
    
    global_step = tf.train.get_global_step()
    train_op = optimizer.minimize(loss, global_step=global_step)
    
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
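
For a single example, softmax cross-entropy with a one-hot label reduces to the negative log of the softmax probability assigned to the true class; `tf.losses.softmax_cross_entropy` then averages these values over the batch by default. This is a numeric sketch of the per-example value, with an illustrative helper name:

```python
import math

def softmax_cross_entropy(logits, label):
    """-log softmax(logits)[label], computed via a stable log-sum-exp."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return log_sum_exp - logits[label]

# with equal logits both classes get probability 0.5, so the loss is log(2)
uniform_loss = softmax_cross_entropy([0.0, 0.0], label=0)
print(uniform_loss)
```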

In [None]:
# model config

config = tf.estimator.RunConfig(tf_random_seed=123,
                                model_dir=os.path.join(model_dir, 'custom'),
                                save_summary_steps=5)

# Estimator object
estimator = tf.estimator.Estimator(model_fn,
                                   params=params,
                                   config=config)

In [None]:
# start training

train_and_evaluate(estimator)

#### How to predict?

Any tensors passed to **tf.estimator.EstimatorSpec** in the `predictions` dict will be available at prediction time.

In [None]:
predictions = estimator.predict(lambda: input_fn(x_test, y_test, params=params, is_training=False))

In [None]:
preds = []

for p in predictions:
    preds.append(p['preds'])

preds = np.array(preds, int)
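
Once the predictions are collected, they can be compared with the labels directly. A sketch with toy values standing in for `preds` and `y_test`:

```python
def accuracy(predicted, expected):
    """Fraction of positions where the two label sequences agree."""
    if len(predicted) != len(expected):
        raise ValueError('length mismatch')
    matches = sum(int(p == e) for p, e in zip(predicted, expected))
    return matches / len(expected)

toy_preds = [1, 0, 1, 1]
toy_labels = [1, 0, 0, 1]
print(accuracy(toy_preds, toy_labels))  # 0.75
```

With the notebook's NumPy arrays the same number is `np.mean(preds == y_test)`.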

### TensorBoard: Visualizing Learning

To view the loss and metric curves, run in a terminal:

> tensorboard --logdir **model_dir_path**