# TensorFlow Tutorial - Part B

There are lots of projects out there that are implemented in TensorFlow.

In this assignment, we are going to focus on:

- Transfer Learning
- Sequence to Sequence modelling

### 1. Transfer Learning

#### 1.1 Review the theory of transfer learning here: http://cs231n.github.io/transfer-learning/ (Step Through)

#### 1.2 TensorFlow Hub (Step Through)

TensorFlow Hub is a library to foster the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a TensorFlow graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning.

Modules contain variables that have been pre-trained for a task using a large dataset. By reusing a module on a related task, you can:

- train a model with a smaller dataset,
- improve generalization, or
- significantly speed up training.
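
To make the idea concrete, here is a minimal pure-Python sketch of transfer learning: a "pre-trained" feature extractor is kept frozen and only a small classifier head is trained on top of it. The weights, the toy dataset, and the function names are all made up for illustration; this is not how TensorFlow Hub is implemented.

```python
import math

# Stand-in for a pre-trained module: these weights are fixed and never updated.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]

# Toy dataset (hypothetical): label 1 when the first coordinate is larger.
EXAMPLES = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]


def frozen_features(x):
    """Map an input to features using the frozen weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in FROZEN_WEIGHTS]


def train_head(examples, steps=200, lr=0.5):
    """Fit a logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        for x, label in examples:
            feats = frozen_features(x)
            logit = sum(w * f for w, f in zip(weights, feats)) + bias
            prob = 1.0 / (1.0 + math.exp(-logit))
            error = prob - label
            # Only the head's parameters are updated; FROZEN_WEIGHTS is untouched.
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    """Return the predicted class (0 or 1) for input x."""
    feats = frozen_features(x)
    logit = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1 if logit > 0 else 0


head_weights, head_bias = train_head(EXAMPLES)
print([predict(x, head_weights, head_bias) for x, _ in EXAMPLES])
```

The retraining script used later in this section follows the same pattern at a much larger scale: the image module supplies the features and only a new final layer is trained.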

#### 1.3 Image Retraining (Transfer Learning) on Flowers Dataset (Step Through)

##### 1.3.1 Download the flowers dataset

In [None]:
!curl -LO http://download.tensorflow.org/example_images/flower_photos.tgz
!tar xzf flower_photos.tgz
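
The retraining script expects one sub-folder per class inside the image directory, with the folder name used as the label. As a quick sanity check before training, a small helper like the one below (a sketch; `count_images_per_class` is a hypothetical name, not part of the script) can count the JPEG images in each class folder.

```python
import os


def count_images_per_class(image_dir, extensions=(".jpg", ".jpeg")):
    """Return {class_name: image_count} for each sub-folder of image_dir."""
    counts = {}
    for entry in sorted(os.listdir(image_dir)):
        class_dir = os.path.join(image_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files that sit next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts


# Assumes the archive was extracted under /workspace, as in the later cells.
if os.path.isdir("/workspace/flower_photos"):
    print(count_images_per_class("/workspace/flower_photos"))
```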

##### 1.3.2 Download the retraining script

In [None]:
!curl -LO https://github.com/tensorflow/hub/raw/r0.1/examples/image_retraining/retrain.py

##### 1.3.3 Check the available options in that script

In [None]:
!python retrain.py -h

##### 1.3.4 Run the transfer learning training on flowers. (Note: it will take some time for data caching. So be patient ^_^). 

In [None]:
!python retrain.py --image_dir /workspace/flower_photos --how_many_training_steps 1000

##### 1.3.5 Once the training steps start to appear, run TensorBoard from a terminal, either through byobu/tmux or by opening a terminal from the Jupyter home page. Use the command `tensorboard --logdir /tmp/retrain_logs`; the logs are stored in `/tmp/retrain_logs` by default. This lets you see visually what transfer learning is doing.

##### 1.3.6 Run inference on the trained model

In [None]:
# download the inference script
!curl -LO https://github.com/tensorflow/tensorflow/raw/master/tensorflow/examples/label_image/label_image.py

# run the inference. note that the training procedure gave us an exported model in /tmp/output_graph.pb
!python label_image.py \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--input_layer=Placeholder \
--output_layer=final_result \
--image=/workspace/flower_photos/daisy/21652746_cc379e0eea_m.jpg
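
The inference script prints the highest-scoring labels for the image. The final ranking step can be sketched in plain Python as below; the scores are invented for illustration and `top_k_labels` is a hypothetical helper, not a function from `label_image.py`.

```python
def top_k_labels(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for the five flower classes, for illustration only.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
scores = [0.91, 0.04, 0.01, 0.03, 0.01]

for label, score in top_k_labels(scores, labels, k=3):
    print(label, score)
```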

##### 1.3.7 Check the available options in label_image.py

In [None]:
!python label_image.py -h

#### 1.4 Task 1 - Do transfer learning to classify dogs and cats! (Total 40 points)

##### 1.4.1 First study the transfer learning scripts you used in 1.3. You can find a detailed article on the script here: https://www.tensorflow.org/tutorials/image_retraining. You can also just read the "Key Concepts" part of TensorFlow Hub (https://www.tensorflow.org/hub/) and study the retrain.py and label_image.py scripts line by line.

##### 1.4.2 Download the dogs and cats dataset

In [None]:
!curl -LO http://people.rit.edu/~sxa1056/doggo_cattos.tar
!tar xvf doggo_cattos.tar

##### 1.4.3 Train using the following specifications:
- Pick an image module from here and pass it to the --tfhub_module argument: https://www.tensorflow.org/hub/modules/image. Note that the input height and input width for inference might vary based on which module you choose. (10 points)
- `--image_dir /workspace/doggo_cattos`
- `--how_many_training_steps 1000`
- Do some data augmentation by passing random distortions to the script (10 points). (One distortion flag is enough; more distortions will take longer to train.)
- Include TensorBoard screenshots of the accuracy, the loss, and the graph when zipping the assignment (same TensorBoard command as in the previous section). For the graph, double-click the "Module" box before taking the screenshot so that it shows the name of the graph you used. (10 points)
- Do an inference on one of the training images (10 points). For instance: infer on `/workspace/doggo_cattos/dog/dog.102.jpg`

In [None]:
# put training command here:


In [None]:
# put inference command here
# run the inference. note that the training procedure gave us an exported model in /tmp/output_graph.pb


### 2. Tensor2Tensor Framework

Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. T2T is actively used and maintained by researchers and engineers within the Google Brain team and a community of users. You can read more about it here: https://github.com/tensorflow/tensor2tensor

#### 2.1 Generating Poetry using T2T Framework (Step Through)

From: https://cloud.google.com/blog/big-data/2018/02/cloud-poetry-training-and-hyperparameter-tuning-custom-text-models-on-cloud-ml-engine

Let’s say we want to train a machine learning model to complete poems. Given one line of verse, the model should generate the next line. This is a hard problem—poetry is a sophisticated form of composition and wordplay. It seems harder than translation because there is no one-to-one relationship between the input (first line of a poem) and the output (the second line of the poem). It is somewhat similar to a model that provides answers to questions, except that we’re asking the model to be a lot more creative.

##### 2.1.1 Download the data

We will get some poetry anthologies from Project Gutenberg.

In [None]:
!curl -LO http://people.rit.edu/~sxa1056/raw.txt

In [None]:
!wc -l raw.txt # line count of the dataset

##### 2.1.2 Create training dataset
We are going to train a machine learning model to write poetry given a starting point. We'll give it one line, and it is going to tell us the next line. So, naturally, we will train it on real poetry. Our feature will be a line of a poem and the label will be the next line of that poem.

Our training dataset will consist of two files. The first file will consist of the input lines of poetry and the other file will consist of the corresponding output lines, one output line per input line.

In [None]:
with open('raw.txt', 'r') as rawfp, \
     open('input.txt', 'w') as infp, \
     open('output.txt', 'w') as outfp:
    prev_line = ''
    for curr_line in rawfp:
        curr_line = curr_line.strip()
        # poems break at empty lines, so this ensures we train only
        # on lines of the same poem
        if len(prev_line) > 0 and len(curr_line) > 0:
            infp.write(prev_line + '\n')
            outfp.write(curr_line + '\n')
        prev_line = curr_line

In [None]:
!head -5 *.txt
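
To see what the pairing logic does at a poem boundary, here is the same loop applied to a few made-up lines held in memory. The sample lines and the `line_pairs` name are for illustration only.

```python
def line_pairs(lines):
    """Yield (input_line, target_line) for consecutive non-empty lines."""
    prev_line = ""
    for curr_line in lines:
        curr_line = curr_line.strip()
        # An empty line resets prev_line, so pairs never span two poems.
        if prev_line and curr_line:
            yield prev_line, curr_line
        prev_line = curr_line


sample = [
    "first poem, line one",
    "first poem, line two",
    "first poem, line three",
    "",
    "second poem, line one",
    "second poem, line two",
]

pairs = list(line_pairs(sample))
for input_line, target_line in pairs:
    print(input_line, "->", target_line)
```

The six lines produce three pairs: two from the first poem and one from the second. The last line of the first poem is never paired with the first line of the second.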

**We do not need to generate the data beforehand - instead, we can have Tensor2Tensor create the training dataset for us. So, in the code below, I will use only `raw.txt` - obviously, this allows us to productionize our model better. Simply keep collecting raw data and generate the training/test data at the time of training.**

##### 2.1.3 Set up problem
The Problem in tensor2tensor is where you specify parameters like the size of your vocabulary and where to get the training data from.

In [None]:
!rm -rf poetry
!mkdir -p poetry/trainer

In [None]:
%%writefile poetry/trainer/problem.py
import os
import tensorflow as tf
from tensor2tensor.utils import registry
from tensor2tensor.models import transformer
from tensor2tensor.data_generators import problem
from tensor2tensor.data_generators import text_encoder
from tensor2tensor.data_generators import text_problems
from tensor2tensor.data_generators import generator_utils


@registry.register_problem
class PoetryLineProblem(text_problems.Text2TextProblem):
  """Predict next line of poetry from the last line. From Gutenberg texts."""

  @property
  def approx_vocab_size(self):
    return 2**13  # ~8k

  @property
  def is_generate_per_split(self):
    # generate_samples is called only once; T2T then divides the samples
    # across the TRAIN and EVAL splits defined in dataset_splits.
    return False

  @property
  def dataset_splits(self):
    """Splits of data to produce and number of output shards for each."""
    # 10% evaluation data
    return [{
        "split": problem.DatasetSplit.TRAIN,
        "shards": 90,
    }, {
        "split": problem.DatasetSplit.EVAL,
        "shards": 10,
    }]

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    with open('raw.txt', 'r') as rawfp:
      prev_line = ''
      for curr_line in rawfp:
        curr_line = curr_line.strip()
        # poems break at empty lines, so this ensures we train only
        # on lines of the same poem
        if len(prev_line) > 0 and len(curr_line) > 0:
          yield {
              "inputs": prev_line,
              "targets": curr_line
          }
        prev_line = curr_line


# Smaller than the typical translate model, and with more regularization
@registry.register_hparams
def transformer_poetry():
  hparams = transformer.transformer_base()
  hparams.num_hidden_layers = 2
  hparams.hidden_size = 128
  hparams.filter_size = 512
  hparams.num_heads = 4
  hparams.attention_dropout = 0.6
  hparams.layer_prepostprocess_dropout = 0.6
  hparams.learning_rate = 0.05
  return hparams

# hyperparameter tuning ranges
@registry.register_ranged_hparams
def transformer_poetry_range(rhp):
  rhp.set_float("learning_rate", 0.05, 0.25, scale=rhp.LOG_SCALE)
  rhp.set_int("num_hidden_layers", 2, 4)
  rhp.set_discrete("hidden_size", [128, 256, 512])
  rhp.set_float("attention_dropout", 0.4, 0.7)

In [None]:
%%writefile poetry/trainer/__init__.py
from . import problem

In [None]:
%%writefile poetry/setup.py
from setuptools import find_packages
from setuptools import setup

REQUIRED_PACKAGES = [
  'tensor2tensor'
]

setup(
    name='poetry',
    version='0.1',
    author = 'Google',
    author_email = 'training-feedback@cloud.google.com',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    include_package_data=True,
    description='Poetry Line Problem',
    requires=[]
)

In [None]:
!touch poetry/__init__.py

In [None]:
!find poetry

##### 2.1.4 Generate training data
Our problem (translation) requires the creation of text sequences from the training dataset. This is done using t2t-datagen and the Problem defined in the previous section.
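
Because `is_generate_per_split` returns False, a single stream of samples is divided across the 90 TRAIN and 10 EVAL shards declared in `dataset_splits`. The sketch below illustrates the resulting 90/10 proportion with a simple round-robin assignment. It is a simplification for intuition only; the actual sharding and shuffling inside T2T may differ, and `assign_splits` is a hypothetical name.

```python
from collections import Counter


def assign_splits(samples, train_shards=90, eval_shards=10):
    """Deal samples round-robin over all shards; the first ones count as train."""
    total_shards = train_shards + eval_shards
    assignments = []
    for index, sample in enumerate(samples):
        shard = index % total_shards
        split = "train" if shard < train_shards else "eval"
        assignments.append((split, shard, sample))
    return assignments


# 1000 dummy samples stand in for the (inputs, targets) pairs.
counts = Counter(split for split, _, _ in assign_splits(range(1000)))
print(counts["train"], counts["eval"])
```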

In [None]:
!rm -rf /workspace/t2t_data /workspace/t2t_data/tmp
!mkdir -p /workspace/t2t_data /workspace/t2t_data/tmp
# Generate data
!t2t-datagen \
    --t2t_usr_dir=./poetry/trainer \
    --problem="poetry_line_problem" \
    --data_dir=/workspace/t2t_data \
    --tmp_dir=/workspace/t2t_data/tmp

In [None]:
!ls /workspace/t2t_data | head

##### 2.1.5 Run training (This will take about 25 minutes. So sit and relax and write your own poetry!)

In [None]:
!rm -rf /workspace/t2t_train
!mkdir -p /workspace/t2t_train

In [None]:
!t2t-trainer \
    --data_dir=/workspace/t2t_data \
    --t2t_usr_dir=./poetry/trainer \
    --problems="poetry_line_problem" \
    --model=transformer \
    --hparams_set=transformer_poetry \
    --output_dir=/workspace/t2t_train \
    --job-dir=/workspace/t2t_train \
    --train_steps=7500 

##### 2.1.6 Run tensorboard with the logdir=/workspace/t2t_train for visualizations

**Notice that accuracy_per_sequence is 0. Considering that we are asking the NN to be rather creative, that doesn't surprise me. Why am I looking at accuracy_per_sequence and not the other metrics? Because it is more appropriate for the problem we are solving; metrics like the BLEU score are better suited to translation.**

**For our class's purposes, this accuracy is fine. To achieve better accuracy, we would need to train longer.**
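
To see why a per-sequence accuracy of 0 is not surprising, compare it with a per-token accuracy on a toy example. A sequence only counts as correct when every token matches, so one different word is enough to fail. This is a simplified sketch with made-up predictions, not the metric code used by T2T.

```python
def token_accuracy(predictions, targets):
    """Fraction of positions where the predicted token equals the target token."""
    correct = total = 0
    for pred, target in zip(predictions, targets):
        # Assumes each prediction has the same length as its target.
        for pred_token, target_token in zip(pred, target):
            correct += int(pred_token == target_token)
            total += 1
    return correct / total


def sequence_accuracy(predictions, targets):
    """Fraction of sequences that match the target exactly, token for token."""
    exact = sum(1 for pred, target in zip(predictions, targets) if pred == target)
    return exact / len(targets)


targets = [["the", "sun", "sets"], ["a", "rose", "blooms"]]
predictions = [["the", "sun", "rises"], ["a", "rose", "fades"]]

print(token_accuracy(predictions, targets))
print(sequence_accuracy(predictions, targets))
```

Here four of the six tokens are right, yet neither line matches exactly, so the per-sequence accuracy is 0.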

##### 2.1.7 Batch-predict
How will our poetry model do when faced with Rumi's spiritual couplets?

In [None]:
%%writefile rumi.txt
Where did the handsome beloved go?
I wonder, where did that tall, shapely cypress tree go?
He spread his light among us like a candle.
Where did he go? So strange, where did he go without me?
All day long my heart trembles like a leaf.
All alone at midnight, where did that beloved go?
Go to the road, and ask any passing traveler — 
That soul-stirring companion, where did he go?
Go to the garden, and ask the gardener — 
That tall, shapely rose stem, where did he go?
Go to the rooftop, and ask the watchman — 
That unique sultan, where did he go?
Like a madman, I search in the meadows!
That deer in the meadows, where did he go?
My tearful eyes overflow like a river — 
That pearl in the vast sea, where did he go?
All night long, I implore both moon and Venus — 
That lovely face, like a moon, where did he go?
If he is mine, why is he with others?
Since he’s not here, to what “there” did he go?
If his heart and soul are joined with God,
And he left this realm of earth and water, where did he go?
Tell me clearly, Shams of Tabriz,
Of whom it is said, “The sun never dies” — where did he go?

**Let's write out the odd-numbered lines. We'll compare how close our model can get to the beauty of Rumi's second lines given his first.**

In [None]:
!awk 'NR % 2 == 1' rumi.txt | tr '[:upper:]' '[:lower:]' | sed "s/[^a-z\'-\ ]//g" > rumi_leads.txt
!head -3 rumi_leads.txt
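
For readers less familiar with `awk`, `tr` and `sed`, the pipeline can be approximated in Python as below. This sketch assumes the intent is to keep only lowercase letters, apostrophes, hyphens and spaces; the exact set of characters kept by the `sed` expression depends on how its bracket expression is parsed, so the two outputs may not be identical. `make_leads` is a hypothetical name.

```python
import re


def make_leads(lines):
    """Keep odd-numbered lines, lowercase them, and drop other characters."""
    leads = []
    for number, line in enumerate(lines, start=1):
        if number % 2 == 1:  # same selection as awk 'NR % 2 == 1'
            lowered = line.lower()
            leads.append(re.sub(r"[^a-z' -]", "", lowered))
    return leads


sample = [
    "Where did the handsome beloved go?",
    "I wonder, where did that tall, shapely cypress tree go?",
    "He spread his light among us like a candle.",
]

for lead in make_leads(sample):
    print(lead)
```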

In [None]:
!t2t-decoder \
    --data_dir=/workspace/t2t_data \
    --problems="poetry_line_problem" \
    --model=transformer \
    --hparams_set=transformer_poetry \
    --output_dir=/workspace/t2t_train \
    --t2t_usr_dir=./poetry/trainer \
    --decode_hparams="beam_size=4,alpha=0.6" \
    --decode_from_file="rumi_leads.txt"
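
The `decode_hparams` above set the beam width (`beam_size=4`) and a length-normalization strength (`alpha=0.6`). The sketch below shows the effect of `alpha`, assuming the GNMT-style length penalty `((5 + length) / 6) ** alpha`; treat the formula as an assumption about the decoder rather than a statement of its exact implementation. The candidate scores are invented.

```python
def length_penalty(length, alpha):
    """GNMT-style length penalty: ((5 + length) / 6) ** alpha."""
    return ((5.0 + length) / 6.0) ** alpha


def normalized_score(log_prob, length, alpha=0.6):
    """Divide a candidate's total log-probability by its length penalty."""
    return log_prob / length_penalty(length, alpha)


# Two hypothetical beam candidates: (sum of token log-probabilities, length).
shorter = (-4.0, 4)
longer = (-5.0, 10)

print(normalized_score(*shorter))
print(normalized_score(*longer))
```

Without normalization the shorter candidate wins, because it has the higher raw log-probability. With `alpha=0.6` the longer candidate gets the better score, which counteracts the decoder's bias toward short outputs.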

In [None]:
!cat rumi_leads.txt.*.decodes

**Oh well! We have some output! Some say art is subjective.**

**You can check out more available models in the framework using the following:**

In [None]:
from tensor2tensor.utils import registry
registry.list_models()

**You can check out the available hparams in the framework using the following:**

In [None]:
registry.list_hparams()

#### 2.2 Task 2 - Answer the following questions and train the poetry generator using LSTM Seq2Seq. (Total 30 points)

##### 2.2.1 What kind of model is being used in 2.1? (2 points)

**Answer**: 

##### 2.2.2 What is the name of the model and the hparams_set in the framework if you were to use a Seq2Seq model using LSTM cells (no attention)? (2 points)

**Answer**: 

##### 2.2.3 What is the purpose of an encoder in a sequence to sequence model? (2 points)

**Answer**: 

##### 2.2.4 What are embeddings in a sequence to sequence model used in the context of a translation task? (2 points)

**Answer**: 

##### 2.2.5 What does the attention mechanism do in a translation task? (2 points)

**Answer**: 

##### 2.2.6 Train the poetry generator using the LSTM Seq2Seq model.  (Total 10 points)
- Train for 1000 steps
- Take a screenshot of the accuracy from TensorBoard and include in the zip file.
- Take a screenshot of the graph from TensorBoard (double-click on the "training" box so that the LSTM model is visible) and include it in the zip file.

In [None]:
!rm -rf /workspace/t2t_lstm_train
!mkdir -p /workspace/t2t_lstm_train

In [None]:
# put training command here:


##### 2.2.7 Infer using the LSTM Seq2Seq model.  (Total 10 points)

In [None]:
# put inference command here


In [None]:
!cat rumi_leads.txt.*.decodes