## Text generation using tensor2tensor

MODIFIED FROM: https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/09_sequence/poetry.ipynb

This notebook uses the <a href="https://github.com/tensorflow/tensor2tensor">tensor2tensor</a> library to train a lyric-generating model from scratch. The trained model is then used to complete new songs.

In [1]:
import os

# this is what this notebook is demonstrating
PROBLEM= 'lyric_generation_line_problem'

# for bash
os.environ['PROBLEM'] = PROBLEM

## Create training dataset

We are going to train a machine learning model to write lyrics given a starting point. We'll give it one line, and it will predict the next line. So, naturally, we will train it on real lyrics. Our feature will be a line of a song and the label will be the next line of that song.
<p>
Our training dataset will consist of two files. The first file will hold the input lines and the other file will hold the corresponding output lines, one output line per input line.

In [2]:
with open('data/structure/all_merged.txt', 'r') as rawfp,\
  open('data/structure/input.txt', 'w') as infp,\
  open('data/structure/output.txt', 'w') as outfp:
    
    prev_line = ''
    for curr_line in rawfp:
        curr_line = curr_line.strip()
        # poems break at empty lines, so this ensures we train only
        # on lines of the same poem
        if len(prev_line) > 0 and len(curr_line) > 0:       
            infp.write(prev_line + '\n')
            outfp.write(curr_line + '\n')
        prev_line = curr_line      

In [3]:
!head -5 data/structure/*.txt

==> data/structure/all_merged.txt <==


[Verse 1]
(check, yeah)
We was walking away

==> data/structure/input.txt <==
[Verse 1]
(check, yeah)
We was walking away
Repeated all we can say
Depart with a hug

==> data/structure/output.txt <==
(check, yeah)
We was walking away
Repeated all we can say
Depart with a hug
And it's a public display
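To make the pairing rule concrete, here is a minimal sketch of the same logic on an in-memory list. The helper name `pair_lines` and the toy lines are illustrative only and are not part of the notebook's pipeline.

```python
def pair_lines(raw_lines):
    """Return (previous, current) pairs for adjacent non-empty lines."""
    pairs = []
    prev_line = ''
    for curr_line in raw_lines:
        curr_line = curr_line.strip()
        # An empty line marks a break, so no pair is formed across it.
        if prev_line and curr_line:
            pairs.append((prev_line, curr_line))
        prev_line = curr_line
    return pairs


# Toy data: two blocks separated by an empty line.
toy_lines = ["a one\n", "a two\n", "\n", "b one\n", "b two\n", "b three\n"]
pairs = pair_lines(toy_lines)
for source, target in pairs:
    print("{} -> {}".format(source, target))
```

Note that no pair links "a two" to "b one", because the empty line between them resets the previous line.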


We do not need to generate the data beforehand. Instead, we can have Tensor2Tensor create the training dataset for us. So, in the code below, I will use only data/structure/all_merged.txt. This makes the model easier to productionize: simply keep collecting raw data and generate the training/test data at the time of training.

### Set up problem
The Problem in tensor2tensor is where you specify parameters like the size of your vocabulary and where to get the training data from.

In [4]:
%%bash
rm -rf lyric_generation
mkdir -p lyric_generation/trainer

In [5]:
%%writefile lyric_generation/trainer/problem.py
import os
import tensorflow as tf
from tensor2tensor.utils import registry
from tensor2tensor.models import transformer
from tensor2tensor.data_generators import problem
from tensor2tensor.data_generators import text_encoder
from tensor2tensor.data_generators import text_problems
from tensor2tensor.data_generators import generator_utils


@registry.register_problem
class LyricGenerationLineProblem(text_problems.Text2TextProblem):
  """Predict the next line of a lyric from the previous line."""

  @property
  def approx_vocab_size(self):
    return 2**13  # ~8k

  @property
  def is_generate_per_split(self):
    # generate_samples is called once; generate_data then shards the
    # samples into TRAIN and EVAL for us.
    return False

  @property
  def dataset_splits(self):
    """Splits of data to produce and number of output shards for each."""
    # 10% evaluation data
    return [{
        "split": problem.DatasetSplit.TRAIN,
        "shards": 90,
    }, {
        "split": problem.DatasetSplit.EVAL,
        "shards": 10,
    }]

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    with open("data/structure/all_merged.txt", 'r') as rawfp:
      prev_line = ''
      for curr_line in rawfp:
        curr_line = curr_line.strip()
        # poems break at empty lines, so this ensures we train only
        # on lines of the same poem
        if len(prev_line) > 0 and len(curr_line) > 0:       
            yield {
                "inputs": prev_line,
                "targets": curr_line
            }
        prev_line = curr_line          


# Smaller than the typical translate model, and with more regularization
@registry.register_hparams
def transformer_lyric_generation():
  hparams = transformer.transformer_base()
  hparams.num_hidden_layers = 2
  hparams.hidden_size = 128
  hparams.filter_size = 512
  hparams.num_heads = 4
  hparams.attention_dropout = 0.6
  hparams.layer_prepostprocess_dropout = 0.6
  hparams.learning_rate = 0.05
  return hparams

# hyperparameter tuning ranges
@registry.register_ranged_hparams
def transformer_lyric_generation_range(rhp):
  rhp.set_float("learning_rate", 0.05, 0.25, scale=rhp.LOG_SCALE)
  rhp.set_int("num_hidden_layers", 2, 4)
  rhp.set_discrete("hidden_size", [128, 256, 512])
  rhp.set_float("attention_dropout", 0.4, 0.7)

Writing lyric_generation/trainer/problem.py
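As a rough check on the "10% evaluation data" comment, the sketch below derives the split fractions from the shard counts. It assumes that examples are spread roughly evenly across all shards, which is my understanding of how the data generator behaves and not something this notebook verifies. The plain strings `"train"` and `"eval"` stand in for `problem.DatasetSplit.TRAIN` and `problem.DatasetSplit.EVAL`.

```python
def split_fractions(dataset_splits):
    """Fraction of examples per split, assuming evenly filled shards."""
    total_shards = sum(entry["shards"] for entry in dataset_splits)
    return {
        entry["split"]: entry["shards"] / float(total_shards)
        for entry in dataset_splits
    }


# Same shard counts as dataset_splits in problem.py.
splits = [
    {"split": "train", "shards": 90},
    {"split": "eval", "shards": 10},
]
fractions = split_fractions(splits)
print(fractions)
```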


In [6]:
%%writefile lyric_generation/trainer/__init__.py
from . import problem

Writing lyric_generation/trainer/__init__.py


In [7]:
%%writefile lyric_generation/setup.py
from setuptools import find_packages
from setuptools import setup

REQUIRED_PACKAGES = [
  'tensor2tensor'
]

setup(
    name='lyric_generation',
    version='0.1',
    author = 'Google',
    author_email = 'training-feedback@cloud.google.com',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    include_package_data=True,
    description='Lyric Generation Problem',
    requires=[]
)

Writing lyric_generation/setup.py


In [8]:
!touch lyric_generation/__init__.py

In [9]:
!find lyric_generation

lyric_generation
lyric_generation/__init__.py
lyric_generation/setup.py
lyric_generation/trainer
lyric_generation/trainer/problem.py
lyric_generation/trainer/__init__.py


## Generate training data 

Our problem (translation) requires the creation of text sequences from the training dataset.  This is done using t2t-datagen and the Problem defined in the previous section.

(Ignore any runtime warnings about a change in size of numpy.dtype; they are harmless.)

In [10]:
%%bash
DATA_DIR=data/t2t_data
TMP_DIR=$DATA_DIR/tmp
rm -rf $DATA_DIR $TMP_DIR
mkdir -p $DATA_DIR $TMP_DIR
# Generate data
t2t-datagen \
  --t2t_usr_dir=lyric_generation/trainer \
  --problem=$PROBLEM \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR

:::MLPv0.5.0 transformer 1544054274.786434889 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/data_generators/text_problems.py:311) preproc_tokenize_training
:::MLPv0.5.0 transformer 1544054296.207107067 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/data_generators/text_problems.py:311) preproc_num_train_examples: 50162


  from ._max_len_seq_inner import _max_len_seq_inner
  from ._upfirdn_apply import _output_len, _apply
  from ._spectral import _lombscargle
  from ._peak_finding_utils import (_argmaxima1d, _select_by_peak_distance,
INFO:tensorflow:Importing user module trainer from path /notebooks/workspace/lyric_generation
INFO:tensorflow:Generating problems:
    lyric:
      * lyric_generation_line_problem
INFO:tensorflow:Generating data for lyric_generation_line_problem.
INFO:tensorflow:Generating vocab file: data/t2t_data/vocab.lyric_generation_line_problem.8192.subwords
INFO:tensorflow:Trying min_count 500
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 1595
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 961
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 988
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 988
INFO:tensorflow:Trying min_count 250
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 2631
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_si

In [11]:
!ls data/t2t_data | head

lyric_generation_line_problem-dev-00000-of-00010
lyric_generation_line_problem-dev-00001-of-00010
lyric_generation_line_problem-dev-00002-of-00010
lyric_generation_line_problem-dev-00003-of-00010
lyric_generation_line_problem-dev-00004-of-00010
lyric_generation_line_problem-dev-00005-of-00010
lyric_generation_line_problem-dev-00006-of-00010
lyric_generation_line_problem-dev-00007-of-00010
lyric_generation_line_problem-dev-00008-of-00010
lyric_generation_line_problem-dev-00009-of-00010
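The listing above is cut off by `head`, so only the dev shards are visible. A small sketch like the following could count shards per split from the file names. The `-train-` naming and the helper `count_shards` are assumptions made for illustration; only the `-dev-` names appear in the output above.

```python
import re
from collections import Counter

# Matches names such as "<problem>-dev-00000-of-00010".
# The "train" alternative is an assumption; only "dev" is shown above.
SHARD_PATTERN = re.compile(r"-(train|dev)-(\d{5})-of-(\d{5})$")


def count_shards(filenames):
    """Count shard files per split, ignoring anything that does not match."""
    counts = Counter()
    for name in filenames:
        match = SHARD_PATTERN.search(name)
        if match:
            counts[match.group(1)] += 1
    return counts


example_names = [
    "lyric_generation_line_problem-dev-00000-of-00010",
    "lyric_generation_line_problem-dev-00001-of-00010",
    "lyric_generation_line_problem-train-00000-of-00090",
    "vocab.lyric_generation_line_problem.8192.subwords",
]
counts = count_shards(example_names)
print(dict(counts))
```

On the real directory this would be called as `count_shards(os.listdir("data/t2t_data"))` after importing `os`.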


## Train model

Let's run it locally for a small number of training steps to make sure it works.

Note: the following will work only if you are running this notebook on a reasonably powerful machine. Don't be alarmed if your process is killed.

In [13]:
%%bash
DATA_DIR=data/t2t_data
OUTDIR=t2t_model
rm -rf $OUTDIR
t2t-trainer \
  --data_dir=$DATA_DIR \
  --t2t_usr_dir=lyric_generation/trainer \
  --problem=$PROBLEM \
  --model=transformer \
  --hparams_set=transformer_lyric_generation \
  --output_dir=$OUTDIR --job-dir=$OUTDIR \
  --train_steps=7500

:::MLPv0.5.0 transformer 1544054473.870245934 (/usr/local/bin/t2t-trainer:28) run_set_random_seed
:::MLPv0.5.0 transformer 1544054474.282380104 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/data_generators/problem.py:759) input_max_length: 256
:::MLPv0.5.0 transformer 1544054474.285362959 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/data_generators/problem.py:872) input_order
:::MLPv0.5.0 transformer 1544054474.837483883 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:59) model_hp_embedding_shared_weights: {"vocab_size": 8154, "hidden_size": 128}
:::MLPv0.5.0 transformer 1544054474.910545111 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/t2t_model.py:228) model_hp_initializer_gain: 1.0
:::MLPv0.5.0 transformer 1544054475.214632034 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:186) model_hp_layer_postprocess_dropout: 0.6
:::MLPv0.5.0 transformer 1544054475.222029924 (/usr/local/lib/python2.7/dist-pac

  from ._max_len_seq_inner import _max_len_seq_inner
  from ._upfirdn_apply import _output_len, _apply
  from ._spectral import _lombscargle
  from ._peak_finding_utils import (_argmaxima1d, _select_by_peak_distance,
INFO:tensorflow:Importing user module trainer from path /notebooks/workspace/lyric_generation
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Configuring DataParallelism to replicate the model.
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_num_ps_replicas': 0, '_keep_checkpoint_max': 20, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fbb8751a5d0>, '_tf_config':

The job took about <b>11 minutes</b> for me and ended with these evaluation metrics:
<pre>
Saving dict for global step 7500: global_step = 7500, loss = 4.82295, metrics-lyric_generation_line_problem/targets/accuracy = 0.24801539, metrics-lyric_generation_line_problem/targets/accuracy_per_sequence = 0.0, metrics-lyric_generation_line_problem/targets/accuracy_top5 = 0.43814975, metrics-lyric_generation_line_problem/targets/approx_bleu_score = 0.035196595, metrics-lyric_generation_line_problem/targets/neg_log_perplexity = -4.7052927, metrics-lyric_generation_line_problem/targets/rouge_2_fscore = 0.059274863, metrics-lyric_generation_line_problem/targets/rouge_L_fscore = 0.25639904
</pre>
Notice that accuracy_per_sequence is 0. Considering that we are asking the NN to be rather creative, that doesn't surprise me. Why am I looking at accuracy_per_sequence and not the other metrics? Because it is more appropriate for the problem we are solving; metrics like BLEU score are better suited to translation.
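To see why per-token accuracy can be reasonable while per-sequence accuracy is zero, here is a simplified sketch of the two metrics. It works on whole-word tokens and ignores padding and subword encoding, so it is an illustration of the idea and not the library's implementation; the example lines are made up.

```python
def token_accuracy(predictions, targets):
    """Fraction of target tokens matched at the same position."""
    correct = 0
    total = 0
    for pred, tgt in zip(predictions, targets):
        for position, token in enumerate(tgt):
            total += 1
            if position < len(pred) and pred[position] == token:
                correct += 1
    return correct / float(total)


def sequence_accuracy(predictions, targets):
    """Fraction of sequences predicted exactly, token for token."""
    exact = sum(1 for pred, tgt in zip(predictions, targets) if pred == tgt)
    return exact / float(len(targets))


targets = [["get", "down", "get", "down"], ["it's", "all", "about", "you"]]
predictions = [["get", "down", "get", "up"], ["it's", "all", "about", "me"]]

print(token_accuracy(predictions, targets))     # 6 of 8 tokens match
print(sequence_accuracy(predictions, targets))  # no line matches exactly
```

A single wrong token is enough to make a whole line count as a miss, which is why the per-sequence figure is so much harsher.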

In [14]:
%%bash
ls t2t_model

checkpoint
eval
events.out.tfevents.1544054481.3877c36ce4b9
flags.txt
flags_t2t.txt
graph.pbtxt
hparams.json
model.ckpt-0.data-00000-of-00002
model.ckpt-0.data-00001-of-00002
model.ckpt-0.index
model.ckpt-0.meta
model.ckpt-1000.data-00000-of-00002
model.ckpt-1000.data-00001-of-00002
model.ckpt-1000.index
model.ckpt-1000.meta
model.ckpt-2000.data-00000-of-00002
model.ckpt-2000.data-00001-of-00002
model.ckpt-2000.index
model.ckpt-2000.meta
model.ckpt-3000.data-00000-of-00002
model.ckpt-3000.data-00001-of-00002
model.ckpt-3000.index
model.ckpt-3000.meta
model.ckpt-4000.data-00000-of-00002
model.ckpt-4000.data-00001-of-00002
model.ckpt-4000.index
model.ckpt-4000.meta
model.ckpt-5000.data-00000-of-00002
model.ckpt-5000.data-00001-of-00002
model.ckpt-5000.index
model.ckpt-5000.meta
model.ckpt-6000.data-00000-of-00002
model.ckpt-6000.data-00001-of-00002
model.ckpt-6000.index
model.ckpt-6000.meta
model.ckpt-7000.data-00000-of-00002
model.ckpt-7000.data-00001-of-00002
model.ckpt-7000.index
mode

## Training longer

Let's train for 75,000 steps. Note the change in the last line of the job.

In [17]:
%%bash

DATA_DIR=data/t2t_data
OUTDIR=t2t_model_full2
rm -rf $OUTDIR
t2t-trainer \
  --data_dir=$DATA_DIR \
  --t2t_usr_dir=lyric_generation/trainer \
  --problem=$PROBLEM \
  --model=transformer \
  --hparams_set=transformer_lyric_generation \
  --output_dir=$OUTDIR \
  --train_steps=75000

:::MLPv0.5.0 transformer 1544055593.587341070 (/usr/local/bin/t2t-trainer:28) run_set_random_seed
:::MLPv0.5.0 transformer 1544055593.972029924 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/data_generators/problem.py:759) input_max_length: 256
:::MLPv0.5.0 transformer 1544055593.975709915 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/data_generators/problem.py:872) input_order
:::MLPv0.5.0 transformer 1544055594.517838955 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:59) model_hp_embedding_shared_weights: {"vocab_size": 8154, "hidden_size": 128}
:::MLPv0.5.0 transformer 1544055594.592364073 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/t2t_model.py:228) model_hp_initializer_gain: 1.0
:::MLPv0.5.0 transformer 1544055594.946847916 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:186) model_hp_layer_postprocess_dropout: 0.6
:::MLPv0.5.0 transformer 1544055594.954674006 (/usr/local/lib/python2.7/dist-pac

  from ._max_len_seq_inner import _max_len_seq_inner
  from ._upfirdn_apply import _output_len, _apply
  from ._spectral import _lombscargle
  from ._peak_finding_utils import (_argmaxima1d, _select_by_peak_distance,
INFO:tensorflow:Importing user module trainer from path /notebooks/workspace/lyric_generation
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Configuring DataParallelism to replicate the model.
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_num_ps_replicas': 0, '_keep_checkpoint_max': 20, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fbbae926590>, '_tf_config':

This job took <b>2 hours</b> for me and ended with these metrics:
<pre>
global_step = 75000, loss = 4.6236544, metrics-lyric_generation_line_problem/targets/accuracy = 0.27194387, metrics-lyric_generation_line_problem/targets/accuracy_per_sequence = 0.00978044, metrics-lyric_generation_line_problem/targets/accuracy_top5 = 0.4787635, metrics-lyric_generation_line_problem/targets/approx_bleu_score = 0.060011823, metrics-lyric_generation_line_problem/targets/neg_log_perplexity = -4.4925637, metrics-lyric_generation_line_problem/targets/rouge_2_fscore = 0.08117228, metrics-lyric_generation_line_problem/targets/rouge_L_fscore = 0.27145436
</pre>
At least the accuracy per sequence is no longer zero. It is now 0.00978044. Note that we are using a relatively small dataset (63K lines), which is *tiny* in the world of natural language problems.
<p>
To set your expectations correctly: a high-performing translation model needs 400 million lines of input and takes a whole day on a TPU pod!
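The neg_log_perplexity values reported for the two runs can be turned into perplexities for an easier comparison. This sketch assumes the metric is a natural-log quantity, so that perplexity is `exp(-neg_log_perplexity)`; that is an assumption on my part and not something the logs state.

```python
import math


def perplexity(neg_log_perplexity):
    """Convert a negative log-perplexity (assumed natural log) to perplexity."""
    return math.exp(-neg_log_perplexity)


short_run = perplexity(-4.7052927)  # 7,500-step run
long_run = perplexity(-4.4925637)   # 75,000-step run
print(short_run, long_run)
```

Under that assumption, ten times as many steps lowers perplexity only modestly, which is consistent with the small gains in the other metrics.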

## Batch-predict

How will our lyric model do when faced with Kanye's lyrics?

In [6]:
%%writefile data/test_song.txt
[Hook: Kid Cudi]
Ain’t no question if I want it, I need it
I can feel it slowly drifting away from me
I’m on the edge, so why you playing? I’m saying
I will never ever let you live this down, down, down
Not for nothing I’ve foreseen it, I dream it
I can feel it slowly drifting away from me
No more chances if you blow this, you bogus
I will never ever let you live this down, down, down
[Verse 1: Kanye West]
Penitentiary chances, the devil dances
And eventually answers to the call of Autumn
All of them fallin’ for the love of ballin’
Got caught with 30 rocks, the cop look like Alec Baldwin
Inter century anthems based off inner city tantrums
Based off the way we was branded
Face it, Jerome get more time than Brandon
And at the airport they check all through my bag
And tell me that it’s random
But we stay winning, this week has been a bad massage
I need a happy ending and a new beginning
And a new fitted, and some job opportunities that's lucrative
This the real world, homie, school finished
They done stole your dreams, you dunno who did it
I treat the cash the way the government treats AIDS
I won’t be satisfied til all my niggas get it, get it?
[Hook: Kid Cudi]
Ain’t no question if I want it, I need it
I can feel it slowly drifting away from me
I’m on the edge, so why you playing? I’m saying
I will never ever let you live this down, down, down
[Verse 2: Kanye West]
Is hip hop just a euphemism for a new religion?
The soul music of the slaves that the youth is missing
This is more than just my road to redemption
Malcolm West had the whole nation standing at attention
As long as I’m in Polo smiling, they think they got me
But they would try to crack me if they ever see a black me
I thought I chose a field where they couldn’t sack me
If a nigga ain't shootin' a jump shot, running a track meet
But this pimp is, at the top of Mount Olympus
Ready for the World’s game, this is my Olympics
We make ‘em say ho cause the game is so pimpish
Choke a South Park writer with a fishstick
I insisted to get up offa this dick
And these drugs, niggas can't resist it
Remind me of when they tried to have Ali enlisted
If I ever wasn't the greatest nigga, I must have missed it!
[Hook: Kid Cudi]
Ain’t no question if I want it, I need it
I can feel it slowly drifting away from me
I’m on the edge, so why you playing? I’m saying
I will never ever let you live this down, down, down
[Verse 3: Kanye West]
I need more drinks and less lights
And that American Apparel girl in just tights
She told the director she tryna get in a school
He said “take them glasses off and get in the pool”
It’s been a while since I watched the tube
Cause like a Crip set, I got way too many blues for any more bad news
I was looking at my resume feeling real fresh today
They rewrite history, I don’t believe in yesterday
And what’s a black Beatle anyway, a fucking roach?
I guess that's why they got me sitting in fucking coach
My guy said I need a different approach
Cause people is looking at me like I’m sniffing coke
It's not funny anymore, try different jokes
Tell ‘em hug and kiss my ass, x and o
And kiss the ring while they at it, do my thing while I got it
Play strings for the dramatic ending of that wack shit
Act like I ain't had a belt in two classes
I ain't got it I’m coming after whoever who has it
I’m coming after whoever. Who has it?
You blowin' up, that’s good, fantastic
That y’all, it's like that y'all
I don’t really give a fuck about it at all
Cause the same people that tried to black ball me
Forgot about two things, my black balls
[Hook: Kid Cudi]
Ain’t no question if I want it, I need it
I can feel it slowly drifting away from me
I’m on the edge, so why you playing? I’m saying
I will never ever let you live this down, down, down
[Verse 4: Raekwon]
I done copped Timb's, lived in lenses, kid
Armani suits, fresh fruits, Bally boots and Benzes
Counting up, smoking, one cuff
Live as a red Jag, a Louis bag, grabbing a blunt, fuck it
Steam about a hundred and one L's
Kites off to jails, buying sweats, running up in Stetson
Nigga hat game was special
It matched every black pair of Nikes, throwing dice for decimals
The older head, bolder head, would train a soldier head
Make sure he right in the field, not a soldier dead
That meant code red, bent off the black skunk
The black dutch, back of the old shed
If you can’t live, you dying, you give or buy in
Keep it real or keep it moving, keep grinding
Keep shining, to every young man, this is a plan
Learn from others like your brothers Rae and Kanye
[Outro: Kid Cudi]
Not for nothing I've forseen it, I dream it
I can feel it slowly dripping away from me
No more chances if you blow this, you bogus
I will never ever let you live this down, down, down

Overwriting data/test_song.txt


Let's write out the odd-numbered lines, lowercased and with the bracketed section tags removed. We'll compare how close our model can get to the song's actual second lines given the first.

In [12]:
%%bash
awk 'NR % 2 == 1' data/test_song.txt | tr '[:upper:]' '[:lower:]' | sed "s/^\[.*\]//g" > data/test_song_leads.txt


i can feel it slowly drifting away from me
i will never ever let you live this down, down, down


In [15]:
%%bash
head -3 data/test_song_leads.txt

i can feel it slowly drifting away from me
i will never ever let you live this down, down, down
i can feel it slowly drifting away from me
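For readers less familiar with `awk`, `tr`, and `sed`, the pipeline above can be sketched in Python. The function name `lead_lines` and the toy input are illustrative only; the real cell reads data/test_song.txt.

```python
import re


def lead_lines(lines):
    """Keep odd-numbered lines, lowercase them, strip a leading [tag]."""
    leads = []
    for number, line in enumerate(lines, start=1):
        if number % 2 == 1:                      # awk 'NR % 2 == 1'
            line = line.rstrip("\n").lower()     # tr '[:upper:]' '[:lower:]'
            line = re.sub(r"^\[.*\]", "", line)  # sed "s/^\[.*\]//g"
            leads.append(line)
    return leads


toy_song = [
    "[Hook: Someone]\n",
    "First sung line\n",
    "Second Sung Line\n",
    "Third sung line\n",
]
leads = lead_lines(toy_song)
print(leads)
```

As in the shell version, a section tag that falls on an odd-numbered line becomes an empty string instead of being dropped.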


In [18]:
%%bash
# same as the above training job ...
DATA_DIR=data/t2t_data
OUTDIR=t2t_model_full2
MODEL=transformer
HPARAMS=transformer_lyric_generation

# the file with the input lines
DECODE_FILE=data/test_song_leads.txt

BEAM_SIZE=4
ALPHA=0.6

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$OUTDIR \
  --t2t_usr_dir=lyric_generation/trainer \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_from_file=$DECODE_FILE

:::MLPv0.5.0 transformer 1544062447.294462919 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:59) model_hp_embedding_shared_weights: {"vocab_size": 8154, "hidden_size": 128}
:::MLPv0.5.0 transformer 1544062447.580358982 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/expert_utils.py:231) model_hp_layer_postprocess_dropout: 0.0
:::MLPv0.5.0 transformer 1544062447.581224918 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:101) model_hp_hidden_layers: 2
:::MLPv0.5.0 transformer 1544062447.581873894 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:101) model_hp_attention_dropout: 0.0
:::MLPv0.5.0 transformer 1544062447.582518101 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:101) model_hp_attention_dense: {"num_heads": 4, "use_bias": "false", "hidden_size": 128}
:::MLPv0.5.0 transformer 1544062447.785921097 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/layers/t

  from ._max_len_seq_inner import _max_len_seq_inner
  from ._upfirdn_apply import _output_len, _apply
  from ._spectral import _lombscargle
  from ._peak_finding_utils import (_argmaxima1d, _select_by_peak_distance,
INFO:tensorflow:Importing user module trainer from path /notebooks/workspace/lyric_generation
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Configuring DataParallelism to replicate the model.
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_num_ps_replicas': 0, '_keep_checkpoint_max': 20, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f39cf084b10>, '_tf_config':

<b>Note:</b> if you get an error such as "AttributeError: 'HParams' object has no attribute 'problems'", please <b>Reset Session</b>, run the cell that defines PROBLEM, and run the above cell again.
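The `alpha` decode hyperparameter controls length normalization during beam search. The sketch below uses the GNMT-style penalty `((5 + length) / 6) ** alpha`, which I believe is what the library's beam search applies; treat the exact formula as an assumption. The candidate log-probabilities and lengths are made up for illustration.

```python
def length_penalty(length, alpha):
    """GNMT-style length penalty (assumed form, see the note above)."""
    return ((5.0 + length) / 6.0) ** alpha


def normalized_score(log_prob, length, alpha=0.6):
    """Beam score: total log-probability divided by the length penalty."""
    return log_prob / length_penalty(length, alpha)


# Hypothetical candidates: a short line and a longer, less probable one.
short_candidate = {"log_prob": -4.0, "length": 3}
long_candidate = {"log_prob": -5.0, "length": 8}

for alpha in (0.0, 0.6):
    short_score = normalized_score(alpha=alpha, **short_candidate)
    long_score = normalized_score(alpha=alpha, **long_candidate)
    winner = "long" if long_score > short_score else "short"
    print("alpha={}: {} candidate wins".format(alpha, winner))
```

With `alpha=0` the raw log-probability decides and shorter outputs tend to win; a larger `alpha` gives longer candidates a better chance.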

In [22]:
%%bash  
DECODE_FILE=data/test_song_leads.txt
cat ${DECODE_FILE}.*.decodes

And I'm a lot of the same thing
Get down, get down, get down
And I'm a lot of the same thing
Get down, get down, get down
And I'm a lot of the world
From the smog and the smoke
And I'm a lot of the world
And I'm a lot of the world
Somewhere far as a woman so heartless
And I'm a lot of the same thing
I'm a nigga, I'm a nigga
And I'm a lot of the world
And I'm a lot of the same thing
Get down, get down, get down
What's the fuck with me?
And I'm a lot of the same thing
And I'm a lot of the same thing
And I'm a lot of the same thing
And I'm a lot of the world
And I'm a lot of the world
And I'm a lot of the same thing
And I'm a lot of the world
And I'm a lot of the same thing
Get down, get down, get down
And I'm a lot of the same thing
And I'm tryna make it up
And I'm a lot of the same thing
And I'm a lot of the world
What's the fuck with me?
And I'm a lot of the same thing
And I'm a lot of the same thing
And I'm a lot of the same thing
And I'm a lot of the same thing
       SASHA: It's why

Now let's try creating an entire verse from scratch. We'll have to give the model a seed line to start off with, for example "data science students got bars for days". Any single line written to data/song_from_scratch.txt works as the seed.

In [64]:
%%writefile data/song_from_scratch.txt
yo, who dat boy? who him is?



Overwriting data/song_from_scratch.txt


In [None]:
%%bash
head -n 11 data/song_from_scratch.txt

Now run the cell below as many times as you wish to generate line after line. Each generated line is appended to the song and used as the next input.
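The feedback loop itself is simple and can be sketched independently of the decoder. Here `next_line_fn` is a hypothetical stand-in for one `t2t-decoder` call, and the canned lookup table exists only so the example runs on its own.

```python
def generate_song(seed_line, next_line_fn, num_lines):
    """Grow a song by feeding the latest line back in as the next input."""
    song = [seed_line]
    for _ in range(num_lines):
        song.append(next_line_fn(song[-1]))
    return song


# Hypothetical stand-in for the model: a fixed lookup table.
canned_replies = {"a": "b", "b": "c"}


def canned_model(line):
    # Unknown lines map to themselves, so the loop reaches a fixed point.
    return canned_replies.get(line, line)


song = generate_song("a", canned_model, 3)
print(song)
```

Because each step sees only the previous line, the loop repeats forever once the model maps a line to itself.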

In [77]:
%%bash

# Destroy temp file
rm -f data/input_line_tmp.txt

# Write out last line of song to temp file
tail --lines=1 data/song_from_scratch.txt > data/input_line_tmp.txt

DATA_DIR=data/t2t_data
OUTDIR=t2t_model_full2
MODEL=transformer
HPARAMS=transformer_lyric_generation

# the file with the input lines
DECODE_FILE=data/input_line_tmp.txt

BEAM_SIZE=4
ALPHA=0.6

# Generate next line, write out to data/input_line_tmp.txt.*.decodes
t2t-decoder \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$OUTDIR \
  --t2t_usr_dir=lyric_generation/trainer \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_from_file=$DECODE_FILE

# Append generated line to song
cat ${DECODE_FILE}.*.decodes >> data/song_from_scratch.txt

:::MLPv0.5.0 transformer 1544077005.564482927 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:59) model_hp_embedding_shared_weights: {"vocab_size": 8154, "hidden_size": 128}
:::MLPv0.5.0 transformer 1544077005.834992886 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/expert_utils.py:231) model_hp_layer_postprocess_dropout: 0.0
:::MLPv0.5.0 transformer 1544077005.835800886 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:101) model_hp_hidden_layers: 2
:::MLPv0.5.0 transformer 1544077005.836397886 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:101) model_hp_attention_dropout: 0.0
:::MLPv0.5.0 transformer 1544077005.836987019 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/transformer.py:101) model_hp_attention_dense: {"num_heads": 4, "use_bias": "false", "hidden_size": 128}
:::MLPv0.5.0 transformer 1544077006.026851892 (/usr/local/lib/python2.7/dist-packages/tensor2tensor/layers/t

  from ._max_len_seq_inner import _max_len_seq_inner
  from ._upfirdn_apply import _output_len, _apply
  from ._spectral import _lombscargle
  from ._peak_finding_utils import (_argmaxima1d, _select_by_peak_distance,
INFO:tensorflow:Importing user module trainer from path /notebooks/workspace/lyric_generation
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Configuring DataParallelism to replicate the model.
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_num_ps_replicas': 0, '_keep_checkpoint_max': 20, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f98e5fe0b10>, '_tf_config':

In [59]:
%%bash

# Destroy temp file
rm -f data/input_line_tmp.txt

# Admire our handiwork
cat data/song_from_scratch.txt

data science students got bars for days
And I'm a lot of the same thing
It's all about the same thing
It's all about you
It's all about you
It's all about you
It's all about you
It's all about you
It's all about you
It's all about you


Some of these are still phrases and not complete sentences, and the output quickly falls into repetition. This indicates that we might need to train longer or improve the model in some other way.

TensorBoard shows that the loss curves plateau fairly quickly and stop improving. What we really need is more data, but if that's not an option, we could try reducing the size of the NN and increasing the dropout regularization. We could also do hyperparameter tuning on the dropout and network sizes.
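As a rough illustration of what tuning over those ranges means, the sketch below draws random configurations from the same bounds as `transformer_lyric_generation_range`. It is a plain random search written for this example, not the library's tuner, and it assumes the integer range is inclusive at both ends.

```python
import math
import random


def sample_hparams(rng):
    """Draw one configuration from the ranges used in problem.py."""
    log_low, log_high = math.log(0.05), math.log(0.25)
    return {
        # Log-uniform draw, mirroring scale=rhp.LOG_SCALE.
        "learning_rate": math.exp(rng.uniform(log_low, log_high)),
        # Assumed inclusive on both ends.
        "num_hidden_layers": rng.randint(2, 4),
        "hidden_size": rng.choice([128, 256, 512]),
        "attention_dropout": rng.uniform(0.4, 0.7),
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
trials = [sample_hparams(rng) for _ in range(5)]
for trial in trials:
    print(trial)
```

Each sampled dictionary would correspond to one training run, with the best run chosen by an evaluation metric.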