In [None]:
# Author: Enock Niyonkuru
# Class: Deep Learning
# Topics: 
# - Time series encoding
# - Long Short Term Memory (LSTM) Networks
# - Natural Language Processing
# Date:  5 May 2022
# This project uses binary classification Cats Vs Dogs : 98%  Accuracy 

# **Deep_Learing Project Time_Series**

# **Tasks:**

**Part I** <br/>
- Task 1: Download and Preprocess data for training. 
- Task 2: Encoding Time series data for training
- Task 3: Using single feature data (IO_Type), design an LSTM based model to predict IO_Type. [Single feature binary classification]
- Task 4: Using all features, design an LSTM based model to predict IO_Type. [Multi-feature binary classification]
- Task 5: Using all features, design an LSTM based model to predict Response Time. [Multi-feature Regression]
- Task 6: Using all features, design an Transformer based model to predict Response Time. [Multi-feature Regression]<br/>


 
**Part II**
- Task 7: Train an neural network word by word from scratch to generate jokes.
<br/>
<br/>
<br/>

**Note:** 
- You need to set runtime to GPU for this exercise. 
- Task 7 is different from class exercise where we were generating jokes character by character.





# **Check GPU**

In [None]:
try:
    %tensorflow_version 2.x
    COLAB = True
    print("Note: using Google CoLab")
    gpu_info = !nvidia-smi
    gpu_info = '\n'.join(gpu_info)
    if gpu_info.find('failed') >= 0:
      print('Not connected to a GPU')
    else:
      print(gpu_info)
except:
    print("Note: not using Google CoLab")
    COLAB = False

Note: using Google CoLab
Thu May  5 18:30:44 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+--------------------------------------------------------------

# **Dataset:**

For this exercise, we will be working with the same SSD IO traces from Assignment 2, wherein each row contain the following Information. IO stands for Input/Output. <br/> 
- **Timestamp:** Time of initiating the IO request (float64) <br/>
- **Response :**  Time to complete the IO request (float64)<br/>
- **IOType:**  Type of IO requested (string)<br/>
- **LUN:** Logical Unit Number in SSD handling the IO request<br/>
- **Offset:** The Logical Block address handling the IO request. Also known as LBA (Logical Block Address)<br/>
- **Size:** The size of IO requested

**Task 1: Download and Preprocess data.**

We first need to preprocess the data for the neural networks. You may take help from the lecture slides, class notebooks and official Python documentation to complete the following exerices. Please follow the steps below:

- Download all the sub-traces.
- Concatenate all sub-traces and aggregate to a single dataframe.
- Sort dataframe by Timestamp.
- Categorize IOType. It should be in numeric format (R: 0 and W: 1)
- Add a feature (called size_class) which rounds up IO size to next power of 2. [125 is class 7 as pow(2,7) = 128].
- Normalize Data.
- Drop IOSize as feature.
- Drop Timestamp as feature.
- Make sure all data is numeric and contains no n/a or missing values.
- Separate data to train and test set
- Prepare X and y for training (3D Tensor format)

**Task 2: Encoding Time series data for training**

To create a model that will predict future values, you will need to consider how to encode this data to be presented to the algorithm. The data must be submitted as sequences, using a sliding window algorithm to encode the data.

We must define how large the window will be. Consider an n-sized window. Each sequence's  values will be a sequence of  data points. The 's will be the next value, after the sequence, that we are trying to predict. You can use the following function to take a series of values, and generate sequences (X) and predicted values (y).

The preprocessed training data (X) must have the following format: 
`(num_samples, sequence_size, num_features)`


**Example:**
If we have 5000 training samples and we are using a sequence size/lookback of 10 with 5 features, then the shape of your training data (X) should be:
`(5000,10,5)`

In [None]:
import os
import pandas as pd
import numpy as np
import math
import tensorflow as tf

In [None]:
# Links to traces
csv_1 = 'https://people.ucsc.edu/~cchakrab/data/ssd_traces/2016030807-LUN0.csv'
csv_2 = 'https://people.ucsc.edu/~cchakrab/data/ssd_traces/2016030807-LUN1.csv'
csv_3 = 'https://people.ucsc.edu/~cchakrab/data/ssd_traces/2016030807-LUN2.csv'
csv_4 = 'https://people.ucsc.edu/~cchakrab/data/ssd_traces/2016030807-LUN3.csv'
csv_5 = 'https://people.ucsc.edu/~cchakrab/data/ssd_traces/2016030807-LUN4.csv'
csv_6 = 'https://people.ucsc.edu/~cchakrab/data/ssd_traces/2016030807-LUN0.csv'

csv_columns = ['Timestamp', 'Response','IOType','LUN','Offset','Size']


##Task 1: Download and Preprocess data.

In [None]:
csv_columns

['Timestamp', 'Response', 'IOType', 'LUN', 'Offset', 'Size']

In [None]:
# Download all the sub-traces.
# Concatenate all sub-traces and aggregate to a single dataframe.
# Print number of rows in the aggregated trace
ser1 = pd.read_csv(csv_1)
ser2 = pd.read_csv(csv_2)
ser3 = pd.read_csv(csv_3)
ser4 = pd.read_csv(csv_4)
ser5 = pd.read_csv(csv_5)
ser6 = pd.read_csv(csv_6)
df = pd.concat([ser1,ser2,ser3,ser4,ser5,ser6])
number_rows = len(df.index)
print('Number of Columns: ',number_rows)
df

Number of Columns:  84721


Unnamed: 0,Timestamp,Response,IOType,LUN,Offset,Size
0,1.457391e+09,0.000505,R,0,4255049925632,122880
1,1.457391e+09,0.000513,R,0,4691229524992,131072
2,1.457391e+09,0.000520,R,0,4691229656064,131072
3,1.457391e+09,0.000515,R,0,4255050056704,122880
4,1.457391e+09,0.000543,R,0,501654973440,131072
...,...,...,...,...,...,...
12758,1.457391e+09,0.005793,R,0,2569248124928,118784
12759,1.457391e+09,0.000209,R,0,1166301144064,4096
12760,1.457391e+09,0.000524,R,0,4262587082240,65536
12761,1.457391e+09,0.005391,R,0,4749426532352,4096


In [None]:
 # Sort datadframe by Timestamp column
 df.sort_values(by = ['Timestamp'])

Unnamed: 0,Timestamp,Response,IOType,LUN,Offset,Size
0,1.457391e+09,0.000505,R,0,4255049925632,122880
0,1.457391e+09,0.000505,R,0,4255049925632,122880
1,1.457391e+09,0.000513,R,0,4691229524992,131072
1,1.457391e+09,0.000513,R,0,4691229524992,131072
2,1.457391e+09,0.000520,R,0,4691229656064,131072
...,...,...,...,...,...,...
30774,1.457391e+09,0.015655,R,2,4382891679232,131072
30771,1.457391e+09,0.010398,R,2,4383387964928,131072
30779,1.457391e+09,0.017071,R,2,4383388096000,131072
30778,1.457391e+09,0.016121,R,2,4382878015488,131072


In [None]:
#Add a feature (called size_class) which rounds up IO size to next power of 2.  [Example: 125 is class 7 as pow(2,7) = 128] 
sizeClass_list = []
size_list = df['Size'].to_list()
for i in size_list:
  sizeClass_list.append(round(math.log(i,2)))
size_class = pd.Series(sizeClass_list)
df['size_class'] = size_class
df

Unnamed: 0,Timestamp,Response,IOType,LUN,Offset,Size,size_class
0,1.457391e+09,0.000505,R,0,4255049925632,122880,17
1,1.457391e+09,0.000513,R,0,4691229524992,131072,17
2,1.457391e+09,0.000520,R,0,4691229656064,131072,17
3,1.457391e+09,0.000515,R,0,4255050056704,122880,17
4,1.457391e+09,0.000543,R,0,501654973440,131072,17
...,...,...,...,...,...,...,...
12758,1.457391e+09,0.005793,R,0,2569248124928,118784,17
12759,1.457391e+09,0.000209,R,0,1166301144064,4096,12
12760,1.457391e+09,0.000524,R,0,4262587082240,65536,16
12761,1.457391e+09,0.005391,R,0,4749426532352,4096,12


In [None]:
#Drop Size as feature.
df.drop(['Size'], axis=1)

Unnamed: 0,Timestamp,Response,IOType,LUN,Offset,size_class
0,1.457391e+09,0.000505,R,0,4255049925632,17
1,1.457391e+09,0.000513,R,0,4691229524992,17
2,1.457391e+09,0.000520,R,0,4691229656064,17
3,1.457391e+09,0.000515,R,0,4255050056704,17
4,1.457391e+09,0.000543,R,0,501654973440,17
...,...,...,...,...,...,...
12758,1.457391e+09,0.005793,R,0,2569248124928,17
12759,1.457391e+09,0.000209,R,0,1166301144064,12
12760,1.457391e+09,0.000524,R,0,4262587082240,16
12761,1.457391e+09,0.005391,R,0,4749426532352,12


In [None]:
# Categorize IOType (0 for Reads and 1 for writes). It should be in numeric format
iotype_list = []
io_list = df['IOType'].to_list()
for i in io_list:
  #print(i)
  if (i == 'R'):
    iotype_list.append(0)
  elif(i == 'W'):
    iotype_list.append(1)

df['IOType'] = iotype_list


In [None]:
#Drop IOSize as feature
df = df.drop(columns = ['Size', 'Timestamp'])
df

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.000505,0,0,4255049925632,17
1,0.000513,0,0,4691229524992,17
2,0.000520,0,0,4691229656064,17
3,0.000515,0,0,4255050056704,17
4,0.000543,0,0,501654973440,17
...,...,...,...,...,...
12758,0.005793,0,0,2569248124928,17
12759,0.000209,0,0,1166301144064,12
12760,0.000524,0,0,4262587082240,16
12761,0.005391,0,0,4749426532352,12


In [None]:
# Make sure all data is numeric and contains no n/a or missing values.
print(df.isnull().any())

Response      False
IOType        False
LUN           False
Offset        False
size_class    False
dtype: bool


In [None]:
import pandas as pd
from sklearn import preprocessing

cols = df.columns
x = df.values #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df = pd.DataFrame(x_scaled)
df.columns = cols

In [None]:
# df = df.astype(np.float64)

In [None]:
df.dtypes

Response      float64
IOType        float64
LUN           float64
Offset        float64
size_class    float64
dtype: object

In [None]:
df.head(5)

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.003368,0.0,0.0,0.798363,0.888889
1,0.003422,0.0,0.0,0.880205,0.888889
2,0.003469,0.0,0.0,0.880205,0.888889
3,0.003435,0.0,0.0,0.798363,0.888889
4,0.003623,0.0,0.0,0.094102,0.888889


In [None]:
#from sklearn.model_selection import train_test_split

In [None]:
point_to_split = int(len(df)*0.8)
train = df[:point_to_split]
test = df[point_to_split:]

In [None]:
train.shape

(67776, 5)

In [None]:
#Separate data to train and test set
#train, test = train_test_split(df, test_size = 0.2)

In [None]:
train

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.003368,0.0,0.0,0.798363,0.888889
1,0.003422,0.0,0.0,0.880205,0.888889
2,0.003469,0.0,0.0,0.880205,0.888889
3,0.003435,0.0,0.0,0.798363,0.888889
4,0.003623,0.0,0.0,0.094102,0.888889
...,...,...,...,...,...
67771,0.061611,0.0,1.0,0.028897,0.333333
67772,0.086389,0.0,1.0,0.029816,0.333333
67773,0.063646,0.0,1.0,0.028918,0.333333
67774,0.000690,1.0,1.0,0.070214,0.333333


In [None]:
test

Unnamed: 0,Response,IOType,LUN,Offset,size_class
67776,0.033089,0.0,1.0,0.025979,0.888889
67777,0.001607,1.0,1.0,0.031548,0.000000
67778,0.000857,1.0,1.0,0.031548,0.888889
67779,0.003636,0.0,1.0,0.590581,0.333333
67780,0.000830,1.0,1.0,0.031548,0.333333
...,...,...,...,...,...
84716,0.038781,0.0,0.0,0.482051,0.888889
84717,0.001386,0.0,0.0,0.218811,0.333333
84718,0.003496,0.0,0.0,0.799778,0.777778
84719,0.036089,0.0,0.0,0.891125,0.333333


In [None]:
# Normalize Data.
# Prepare X and y for training (3D Tensor format)

In [None]:
numeric_feature_names = ['Response', 'IOType', 'LUN',  'Offset', 'size_class']
numeric_features = df[numeric_feature_names]
numeric_features.head()

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.003368,0.0,0.0,0.798363,0.888889
1,0.003422,0.0,0.0,0.880205,0.888889
2,0.003469,0.0,0.0,0.880205,0.888889
3,0.003435,0.0,0.0,0.798363,0.888889
4,0.003623,0.0,0.0,0.094102,0.888889


In [None]:
tf.convert_to_tensor(numeric_features)

<tf.Tensor: shape=(84721, 5), dtype=float64, numpy=
array([[0.00336849, 0.        , 0.        , 0.79836337, 0.88888889],
       [0.00342207, 0.        , 0.        , 0.88020516, 0.88888889],
       [0.00346894, 0.        , 0.        , 0.88020518, 0.88888889],
       ...,
       [0.00349573, 0.        , 0.        , 0.79977759, 0.77777778],
       [0.03608907, 0.        , 0.        , 0.89112485, 0.33333333],
       [0.04097104, 0.        , 0.        , 0.48205105, 0.44444444]])>

In [None]:
#Train
numeric_feature_names = ['Response', 'IOType', 'LUN',  'Offset', 'size_class']
numeric_features = train[numeric_feature_names]
numeric_features.head()
train_tf = tf.convert_to_tensor(numeric_features)

In [None]:
#Train
numeric_feature_names = ['Response', 'IOType', 'LUN',  'Offset', 'size_class']
numeric_features = test[numeric_feature_names]
numeric_features.head()
test_tf = tf.convert_to_tensor(numeric_features)

##Task 2: Encoding Time series data for training

###Task 3: Using single feature data (IO_Type), design an LSTM based model to predict IO_Type. [Single feature binary classification]


In [None]:
train

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.003368,0.0,0.0,0.798363,0.888889
1,0.003422,0.0,0.0,0.880205,0.888889
2,0.003469,0.0,0.0,0.880205,0.888889
3,0.003435,0.0,0.0,0.798363,0.888889
4,0.003623,0.0,0.0,0.094102,0.888889
...,...,...,...,...,...
67771,0.061611,0.0,1.0,0.028897,0.333333
67772,0.086389,0.0,1.0,0.029816,0.333333
67773,0.063646,0.0,1.0,0.028918,0.333333
67774,0.000690,1.0,1.0,0.070214,0.333333


In [None]:
spots_train = train['IOType'].tolist()
spots_test = test['IOType'].tolist()

In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    x = []
    y = []

    for i in range(len(obs)-SEQUENCE_SIZE):
        #print(i)
        window = obs[i:(i+SEQUENCE_SIZE)]
        after_window = obs[i+SEQUENCE_SIZE]
        window = [[x] for x in window]
        #print("{} - {}".format(window,after_window))
        x.append(window)
        y.append(after_window)
        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 10
x_train,y_train = to_sequences(SEQUENCE_SIZE,spots_train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,spots_test)

print("Shape of training set: {}".format(x_train.shape))
print("Shape of test set: {}".format(x_test.shape))

Shape of training set: (67766, 10, 1)
Shape of test set: (16935, 10, 1)


In [None]:
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding
from tensorflow.keras.layers import LSTM
from tensorflow.keras.datasets import imdb
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np

print('Build model...')
model = Sequential()
model.add(LSTM(64, dropout=0.0, recurrent_dropout=0.0,input_shape=(None, 1)))
model.add(Dense(32))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, 
                        verbose=1, mode='auto', restore_best_weights=True)
print('Train...')

model.fit(x_train,y_train,validation_data=(x_test,y_test),
          callbacks=[monitor],verbose=1,epochs=500)

Build model...
Train...
Epoch 1/500
Epoch 2/500
Epoch 3/500
Epoch 4/500
Epoch 5/500
Epoch 6/500
Epoch 6: early stopping


<keras.callbacks.History at 0x7f9da752b610>

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.3648067662957701



###Task 4: Using all features, design an LSTM based model to predict IO_Type. [Multi-feature binary classification]


**Task 4: Using all features, design an LSTM based model to predict IO_Type.** [Multi-feature binary classification]


- Features to train : `['Response','IOType','LUN','Offset','Size']`
- Feature to predict: IO_Type
- Training data     : 80%
- Test data         : 20%
- Sequence Size     : 10/25      

Please use a initial sequence size of 10 and compute accuracy. Compare accuracy with a sequence size of 25.

###### Sequence Size 10

In [None]:
print(train.isnull().any())
print(test.isnull().any())

Response      False
IOType        False
LUN           False
Offset        False
size_class    False
dtype: bool
Response      False
IOType        False
LUN           False
Offset        False
size_class    False
dtype: bool


In [None]:
print(train.isna().any())
print(test.isna().any())

Response      False
IOType        False
LUN           False
Offset        False
size_class    False
dtype: bool
Response      False
IOType        False
LUN           False
Offset        False
size_class    False
dtype: bool


In [None]:
df

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.003368,0.0,0.0,0.798363,0.888889
1,0.003422,0.0,0.0,0.880205,0.888889
2,0.003469,0.0,0.0,0.880205,0.888889
3,0.003435,0.0,0.0,0.798363,0.888889
4,0.003623,0.0,0.0,0.094102,0.888889
...,...,...,...,...,...
84716,0.038781,0.0,0.0,0.482051,0.888889
84717,0.001386,0.0,0.0,0.218811,0.333333
84718,0.003496,0.0,0.0,0.799778,0.777778
84719,0.036089,0.0,0.0,0.891125,0.333333


In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    length_obs = obs.shape[0]
    #print("length_obs")
    #print(length_obs)
    x = []
    y = []

    for i in range(length_obs-SEQUENCE_SIZE-1):
        window = obs.iloc[i:(i+SEQUENCE_SIZE)]
        after_window = obs.iloc[i+SEQUENCE_SIZE]
        x.append(np.array(window))
        y.append(after_window['IOType'])        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 10
x_train,y_train = to_sequences(SEQUENCE_SIZE,train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,test)

print("Training : Shape of X: {} & Y shape = {}".format(x_train.shape,len(y_train)))
print("Test : Shape of X: {} & Y shape = {} ".format(x_test.shape,len(y_test)))

Training : Shape of X: (67765, 10, 5) & Y shape = 67765
Test : Shape of X: (16934, 10, 5) & Y shape = 16934 


In [None]:
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(67765, 10, 5)
(16934, 10, 5)
(67765,)
(16934,)


In [None]:
# look_back, num_features = x_train[0].shape

# print('Build model...')
# model = Sequential()

# model.add(LSTM(32, activation='relu', input_shape=(look_back, num_features)))
# model.add(Dense(32, activation='relu'))
# model.add(Dense(units=1, kernel_initializer = 'random_normal'))
# model.compile(loss='mean_squared_error', optimizer='adam')
# monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, 
#                         verbose=1, mode='auto', restore_best_weights=True)


In [None]:
import tensorflow as tf
import numpy as np
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
from keras.callbacks import EarlyStopping
from sklearn import metrics
from keras.layers.core import Dense, Activation
from keras.callbacks import ModelCheckpoint


look_back, num_features = x_train[0].shape

print('Build model...')
model = Sequential()
model.add(LSTM(32, return_sequences = True, activation='relu', input_shape=(look_back, num_features)))
model.add(LSTM(32, activation='relu'))
#model.add(Dense(32, activation='relu'))

model.add(Dense(units=1, kernel_initializer = 'random_normal'))
model.compile(loss='binary_crossentropy', optimizer='adam',metrics = 'accuracy')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, 
                        verbose=1, mode='auto', restore_best_weights=True)

print('Train...')

model.fit(x_train,y_train,validation_data=(x_test,y_test),
          callbacks=[monitor],verbose=1,epochs=500)


Build model...
Train...
Epoch 1/500
Epoch 2/500
Epoch 3/500
Epoch 4/500
Epoch 5/500
Epoch 6/500
Epoch 7/500
Epoch 7: early stopping


<keras.callbacks.History at 0x7f9da8b8bb10>

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.3642408220019959


###### Sequence Size 25

In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    length_obs = obs.shape[0]
    #print("length_obs")
    #print(length_obs)
    x = []
    y = []

    for i in range(length_obs-SEQUENCE_SIZE-1):
        window = obs.iloc[i:(i+SEQUENCE_SIZE)]
        after_window = obs.iloc[i+SEQUENCE_SIZE]
        x.append(np.array(window))
        y.append(after_window['IOType'])        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 25
x_train,y_train = to_sequences(SEQUENCE_SIZE,train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,test)

print("Training : Shape of X: {} & Y shape = {}".format(x_train.shape,len(y_train)))
print("Test : Shape of X: {} & Y shape = {} ".format(x_test.shape,len(y_test)))

Training : Shape of X: (67750, 25, 5) & Y shape = 67750
Test : Shape of X: (16919, 25, 5) & Y shape = 16919 


In [None]:
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(67750, 25, 5)
(16919, 25, 5)
(67750,)
(16919,)


In [None]:
import tensorflow as tf
import numpy as np
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
from keras.callbacks import EarlyStopping
from sklearn import metrics
from keras.layers.core import Dense, Activation
from keras.callbacks import ModelCheckpoint


look_back, num_features = x_train[0].shape

print('Build model...')
model = Sequential()
model.add(LSTM(32, return_sequences = True, activation='relu', input_shape=(look_back, num_features)))
model.add(LSTM(32, activation='relu'))

model.add(Dense(units=1, kernel_initializer = 'random_normal'))
model.compile(loss='binary_crossentropy', optimizer='adam',metrics = 'accuracy')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, 
                        verbose=1, mode='auto', restore_best_weights=True)
print('Train...')

model.fit(x_train,y_train,validation_data=(x_test,y_test),
          callbacks=[monitor],verbose=1,epochs=500)


Build model...
Train...
Epoch 1/500
Epoch 2/500
Epoch 3/500
Epoch 4/500
Epoch 5/500
Epoch 6/500
Epoch 7/500
Epoch 8/500
Epoch 9/500
Epoch 10/500
Epoch 11/500
Epoch 11: early stopping


<keras.callbacks.History at 0x7f9da7c9c710>

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.3637013087228296


###Task 5: Using all features, design an LSTM based model to predict Response Time. [Multi-feature Regression]


**Task 5: Using all features, design an LSTM based model to predict Response Time.**  [Multi-feature Regression]

- Features to train : `['Response,'IOType','LUN','Offset','Size']`
- Feature to predict: Response
- Training data     : 80%
- Test data         : 20%
- Sequence Size     : 10/25      

Please use a initial sequence size of 10 and compute accuracy. Compare accuracy with a sequence size of 25.

###### Sequence Size 10

In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    length_obs = obs.shape[0]
    #print("length_obs")
    #print(length_obs)
    x = []
    y = []

    for i in range(length_obs-SEQUENCE_SIZE-1):
        window = obs.iloc[i:(i+SEQUENCE_SIZE)]
        after_window = obs.iloc[i+SEQUENCE_SIZE]
        x.append(np.array(window))
        y.append(after_window['Response'])        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 10
x_train,y_train = to_sequences(SEQUENCE_SIZE,train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,test)

print("Training : Shape of X: {} & Y shape = {}".format(x_train.shape,len(y_train)))
print("Test : Shape of X: {} & Y shape = {} ".format(x_test.shape,len(y_test)))

Training : Shape of X: (67765, 10, 5) & Y shape = 67765
Test : Shape of X: (16934, 10, 5) & Y shape = 16934 


In [None]:
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(67765, 10, 5)
(16934, 10, 5)
(67765,)
(16934,)


In [None]:
df.head(5)

Unnamed: 0,Response,IOType,LUN,Offset,size_class
0,0.003368,0.0,0.0,0.798363,0.888889
1,0.003422,0.0,0.0,0.880205,0.888889
2,0.003469,0.0,0.0,0.880205,0.888889
3,0.003435,0.0,0.0,0.798363,0.888889
4,0.003623,0.0,0.0,0.094102,0.888889


In [None]:
import tensorflow as tf
import numpy as np
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
from keras.callbacks import EarlyStopping
from sklearn import metrics
from keras.layers.core import Dense, Activation
from keras.callbacks import ModelCheckpoint


look_back, num_features = x_train[0].shape

print('Build model...')
model = Sequential()

model.add(LSTM(32, activation='relu', input_shape=(look_back, num_features)))
model.add(Dense(32, activation='relu'))
model.add(Dense(units=1, kernel_initializer = 'random_normal'))
model.compile(loss='mean_squared_error', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, 
                        verbose=1, mode='auto', restore_best_weights=True)

print('Train...')

model.fit(x_train,y_train,validation_data=(x_test,y_test),
          callbacks=[monitor],verbose=1,epochs=500)


Build model...
Train...
Epoch 1/500
Epoch 2/500
Epoch 3/500
Epoch 4/500
Epoch 5/500
Epoch 6/500
Epoch 6: early stopping


<keras.callbacks.History at 0x7f9da6e888d0>

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.0235299134071851


###### Sequence Size 25

In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    length_obs = obs.shape[0]
    #print("length_obs")
    #print(length_obs)
    x = []
    y = []

    for i in range(length_obs-SEQUENCE_SIZE-1):
        window = obs.iloc[i:(i+SEQUENCE_SIZE)]
        after_window = obs.iloc[i+SEQUENCE_SIZE]
        x.append(np.array(window))
        y.append(after_window['Response'])        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 25
x_train,y_train = to_sequences(SEQUENCE_SIZE,train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,test)

print("Training : Shape of X: {} & Y shape = {}".format(x_train.shape,len(y_train)))
print("Test : Shape of X: {} & Y shape = {} ".format(x_test.shape,len(y_test)))

Training : Shape of X: (67750, 25, 5) & Y shape = 67750
Test : Shape of X: (16919, 25, 5) & Y shape = 16919 


In [None]:
import tensorflow as tf
import numpy as np
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
from keras.callbacks import EarlyStopping
from sklearn import metrics
from keras.layers.core import Dense, Activation
from keras.callbacks import ModelCheckpoint


look_back, num_features = x_train[0].shape

print('Build model...')
model = Sequential()

model.add(LSTM(32, return_sequences = True, activation='relu', input_shape=(look_back, num_features)))
model.add(LSTM(32, activation='relu'))



model.add(Dense(units=1, kernel_initializer = 'random_normal'))
model.compile(loss='mean_squared_error', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, 
                        verbose=1, mode='auto', restore_best_weights=True)

print('Train...')

model.fit(x_train,y_train,validation_data=(x_test,y_test),
          callbacks=[monitor],verbose=1,epochs=500)


Build model...
Train...
Epoch 1/500
Epoch 2/500
Epoch 3/500
Epoch 4/500
Epoch 5/500
Epoch 6/500
Epoch 6: early stopping


<keras.callbacks.History at 0x7f9da6c8af50>

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.023691565840595483


###Task 6: Using all features, design an Transformer based model to predict Response Time. [Multi-feature Regression]

**Task 6: Using all features, design a Transformer based model to predict Response Time.**  [Multi-feature Regression]

- Features to train : `['Response,'IOType','LUN','Offset','Size']`
- Feature to predict: Response
- Training data     : 80%
- Test data         : 20%
- Sequence Size     : 10/25      

Please use a initial sequence size of 10 and compute accuracy. Compare accuracy with a sequence size of 25.

###### Sequence Size 10

In [None]:
spots_train = train['Response'].tolist()
spots_test = test['Response'].tolist()

In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    x = []
    y = []

    for i in range(len(obs)-SEQUENCE_SIZE):
        #print(i)
        window = obs[i:(i+SEQUENCE_SIZE)]
        after_window = obs[i+SEQUENCE_SIZE]
        window = [[x] for x in window]
        #print("{} - {}".format(window,after_window))
        x.append(window)
        y.append(after_window)
        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 10
x_train,y_train = to_sequences(SEQUENCE_SIZE,spots_train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,spots_test)

print("Shape of training set: {}".format(x_train.shape))
print("Shape of test set: {}".format(x_test.shape))

Shape of training set: (67766, 10, 1)
Shape of test set: (16935, 10, 1)


In [None]:
from tensorflow import keras
from tensorflow.keras import layers

def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0):
    # Normalization and Attention
    x = layers.LayerNormalization(epsilon=1e-6)(inputs)
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(x, x)
    x = layers.Dropout(dropout)(x)
    res = x + inputs

    # Feed Forward Part
    x = layers.LayerNormalization(epsilon=1e-6)(res)
    x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    return x + res

In [None]:
def build_model(
    input_shape,
    head_size,
    num_heads,
    ff_dim,
    num_transformer_blocks,
    mlp_units,
    dropout=0,
    mlp_dropout=0,
):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for _ in range(num_transformer_blocks):
        x = transformer_encoder(x, head_size, num_heads, ff_dim, dropout)

    x = layers.GlobalAveragePooling1D(data_format="channels_first")(x)
    for dim in mlp_units:
        x = layers.Dense(dim, activation="relu")(x)
        x = layers.Dropout(mlp_dropout)(x)
    outputs = layers.Dense(1)(x)
    return keras.Model(inputs, outputs)

In [None]:
input_shape = x_train.shape[1:]

model = build_model(
    input_shape,
    head_size=256,
    num_heads=4,
    ff_dim=4,
    num_transformer_blocks=4,
    mlp_units=[128],
    mlp_dropout=0.4,
    dropout=0.25,
)

model.compile(
    loss="mean_squared_error",
    optimizer=keras.optimizers.Adam(learning_rate=1e-4)
)
#model.summary()

callbacks = [keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)]

model.fit(
    x_train,
    y_train,
    validation_split=0.2,
    epochs=200,
    batch_size=64,
    callbacks=callbacks,
)

model.evaluate(x_test, y_test, verbose=1)

Epoch 1/200
Epoch 2/200
Epoch 3/200
Epoch 4/200
Epoch 5/200
Epoch 6/200
Epoch 7/200
Epoch 8/200
Epoch 9/200
Epoch 10/200
Epoch 11/200
Epoch 12/200
Epoch 13/200
Epoch 14/200
Epoch 15/200


0.0005449175368994474

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.023343467284265833


###### Sequence Size 25

In [None]:
spots_train = train['Response'].tolist()
spots_test = test['Response'].tolist()

In [None]:
import numpy as np

def to_sequences(seq_size, obs):
    x = []
    y = []

    for i in range(len(obs)-SEQUENCE_SIZE):
        #print(i)
        window = obs[i:(i+SEQUENCE_SIZE)]
        after_window = obs[i+SEQUENCE_SIZE]
        window = [[x] for x in window]
        #print("{} - {}".format(window,after_window))
        x.append(window)
        y.append(after_window)
        
    return np.array(x),np.array(y)
    
    
SEQUENCE_SIZE = 25
x_train,y_train = to_sequences(SEQUENCE_SIZE,spots_train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,spots_test)

print("Shape of training set: {}".format(x_train.shape))
print("Shape of test set: {}".format(x_test.shape))

Shape of training set: (67751, 25, 1)
Shape of test set: (16920, 25, 1)


In [None]:
from tensorflow import keras
from tensorflow.keras import layers

def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0):
    # Normalization and Attention
    x = layers.LayerNormalization(epsilon=1e-6)(inputs)
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(x, x)
    x = layers.Dropout(dropout)(x)
    res = x + inputs

    # Feed Forward Part
    x = layers.LayerNormalization(epsilon=1e-6)(res)
    x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    return x + res

In [None]:
def build_model(
    input_shape,
    head_size,
    num_heads,
    ff_dim,
    num_transformer_blocks,
    mlp_units,
    dropout=0,
    mlp_dropout=0,
):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for _ in range(num_transformer_blocks):
        x = transformer_encoder(x, head_size, num_heads, ff_dim, dropout)

    x = layers.GlobalAveragePooling1D(data_format="channels_first")(x)
    for dim in mlp_units:
        x = layers.Dense(dim, activation="relu")(x)
        x = layers.Dropout(mlp_dropout)(x)
    outputs = layers.Dense(1)(x)
    return keras.Model(inputs, outputs)

In [None]:
input_shape = x_train.shape[1:]

model = build_model(
    input_shape,
    head_size=256,
    num_heads=4,
    ff_dim=4,
    num_transformer_blocks=4,
    mlp_units=[128],
    mlp_dropout=0.4,
    dropout=0.25,
)

model.compile(
    loss="mean_squared_error",
    optimizer=keras.optimizers.Adam(learning_rate=1e-4)
)
#model.summary()

callbacks = [keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)]

model.fit(
    x_train,
    y_train,
    validation_split=0.2,
    epochs=200,
    batch_size=64,
    callbacks=callbacks,
)

model.evaluate(x_test, y_test, verbose=1)

Epoch 1/200
Epoch 2/200
Epoch 3/200
Epoch 4/200
Epoch 5/200
Epoch 6/200
Epoch 7/200
Epoch 8/200
Epoch 9/200
Epoch 10/200
Epoch 11/200
Epoch 12/200
Epoch 13/200
Epoch 14/200
Epoch 15/200
Epoch 16/200
Epoch 17/200
Epoch 18/200
Epoch 19/200
Epoch 20/200
Epoch 21/200
Epoch 22/200
Epoch 23/200


0.0005409072618931532

In [None]:
from sklearn import metrics

pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Score (RMSE): {}".format(score))

Score (RMSE): 0.023257410479211142


# Part II: Train an neural network word by word from scratch to generate jokes.

Using the jokes data, train a LSTM model to generate new jokes. You may refer to `“Notebook_11_Natural_Language_Processing.ipynb”` for reference. 

Please follow the steps below.

- Extract all unique words using Space from the text and assign a unique ID to each character.
- Remove stop words
- Build two dictionaries.
- The first one will be used to convert a word into its ID.
- The second one will convert an ID back into its word.
- Tokenize text
- Build the actual sequences from the data using the unique word id.
- Use sequence length of 10 words and sample every 5 words.
- Convert the text into vectors.
-  Create the neural network. This neural network's primary feature is the LSTM layer, which allows the sequences to be processed. Please use a single LSTM layer. (Too many layers will crash your GPU)
- Display text at several "temperatures”. Temperature refers to the amount of randomness allowed in words chosen by the NN 
- Print your best generated joke in a separate cell.

The model will produce new text character by character. Use Lambda callback to generate predictions while training the model. You will need to sample the correct word from the predictions each time.

In [1]:
# Code for downloading the data

import os
import datetime
import math

import IPython
import IPython.display
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
from sklearn import preprocessing
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping
from sklearn import metrics
from tensorflow.keras.callbacks import LambdaCallback
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.utils import get_file
import numpy as np
import random
import sys
import io
import requests
import re


In [2]:
r = requests.get("https://raw.githubusercontent.com/Rachnog/Rap_Generation/master/jokes_dataset_text.txt")
download = r.text
print(download[0:300])

input_text = download.lower()
# Taking a million samples of data. You may use full text if your environment has resources
input_text = input_text[:1000000]
print('corpus length:', len(input_text))

[me narrating a documentary about narrators] "I can't hear what they're saying cuz I'm talking"
<|endoftext|>Telling my daughter garlic is good for you. Good immune system and keeps pests away.Ticks, mosquitos, vampires... men.
<|endoftext|>I've been going through a really rough period at work this 
corpus length: 1000000


In [3]:
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp(input_text)
vocab = set()
tokenized_text = []

for token in doc:
    word = ''.join([i if ord(i) < 128 else ' ' for i in token.text])
    word = word.strip()
    if not token.is_digit \
        and not token.like_url \
        and not token.like_email:
        vocab.add(word)
        tokenized_text.append(word)
        
print(f"Vocab size: {len(vocab)}")

Vocab size: 15523


In [4]:
print(list(vocab)[:20])

['', '|endoftext|>wanted', 'impressive', 'worrying', 'kin', 'rabbit', '|endoftext|>so', 'production', 'patient', 'silver', 'sets', 'solutions', 'gas', 'vice', 'hong', 'bowling', 'apprentice', 'irrelevant', 'lightsaber', '|endoftext|>johnson']


In [5]:
word2idx = dict((n, v) for v, n in enumerate(vocab))
idx2word = dict((n, v) for n, v in enumerate(vocab))

In [6]:
tokenized_text = [word2idx[word] for word in tokenized_text]

In [7]:
tokenized_text

[9677,
 11298,
 5364,
 9183,
 8345,
 12758,
 15417,
 12617,
 5379,
 3325,
 14354,
 5520,
 9179,
 7680,
 14039,
 5723,
 11434,
 13969,
 3325,
 198,
 4283,
 5379,
 0,
 1635,
 405,
 1347,
 9385,
 4126,
 9345,
 11999,
 1252,
 4070,
 13863,
 11999,
 9510,
 13428,
 11183,
 4264,
 6226,
 2453,
 6777,
 10036,
 6777,
 11708,
 10718,
 5948,
 13863,
 0,
 1635,
 3239,
 5645,
 54,
 4527,
 9183,
 13546,
 11004,
 10555,
 4850,
 5335,
 336,
 3107,
 903,
 12396,
 1347,
 15075,
 3646,
 1252,
 1568,
 1347,
 11590,
 1252,
 4227,
 7450,
 13863,
 0,
 1635,
 4090,
 3325,
 10118,
 11454,
 1273,
 13606,
 14048,
 6777,
 13236,
 11690,
 1095,
 10718,
 10718,
 3325,
 4504,
 4769,
 1095,
 13863,
 9755,
 13863,
 7050,
 0,
 1635,
 14174,
 13588,
 3382,
 12708,
 9183,
 13329,
 13863,
 7324,
 6530,
 7442,
 2637,
 13863,
 0,
 1635,
 9773,
 14354,
 5520,
 5250,
 11944,
 6543,
 3090,
 3818,
 8093,
 5321,
 12127,
 9183,
 93,
 5502,
 13863,
 7195,
 0,
 1635,
 9773,
 11735,
 7324,
 6692,
 2396,
 3090,
 2414,
 11071,
 12127,

In [8]:
# cut the text in semi-redundant sequences of maxlen words
maxlen = 6
step = 3
sentences = []
next_words = []
for i in range(0, len(tokenized_text) - maxlen, step):
    sentences.append(tokenized_text[i: i + maxlen])
    next_words.append(tokenized_text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 72238


In [9]:
sentences[0:5]

[[9677, 11298, 5364, 9183, 8345, 12758],
 [9183, 8345, 12758, 15417, 12617, 5379],
 [15417, 12617, 5379, 3325, 14354, 5520],
 [3325, 14354, 5520, 9179, 7680, 14039],
 [9179, 7680, 14039, 5723, 11434, 13969]]

In [10]:
import numpy as np

print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(vocab)), dtype=np.bool)
y = np.zeros((len(sentences), len(vocab)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, word in enumerate(sentence):
        x[i, t, word] = 1
    y[i, next_words[i]] = 1

Vectorization...


Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  after removing the cwd from sys.path.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  """


In [11]:
print(x.shape)
print(y.shape)
print(y[0:5])

(72238, 6, 15523)
(72238, 15523)
[[False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]]


In [12]:
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(vocab))))
model.add(Dense(len(vocab), activation='softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

Build model...


  super(RMSprop, self).__init__(name, **kwargs)


In [13]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128)               8013824   
                                                                 
 dense (Dense)               (None, 15523)             2002467   
                                                                 
Total params: 10,016,291
Trainable params: 10,016,291
Non-trainable params: 0
_________________________________________________________________


In [14]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

In [15]:
def on_epoch_end(epoch, _):
    # Function invoked at end of each epoch. Prints generated text.
    print("******************************************************")
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(tokenized_text) - maxlen)
    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('----- temperature:', temperature)

        #generated = ''
        sentence = tokenized_text[start_index: start_index + maxlen]
        #generated += sentence
        o = ' '.join([idx2word[idx] for idx in sentence])
        print(f'----- Generating with seed: "{o}"')
        #sys.stdout.write(generated)

        for i in range(100):
            x_pred = np.zeros((1, maxlen, len(vocab)))
            for t, word in enumerate(sentence):
                x_pred[0, t, word] = 1.
                

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_word = idx2word[next_index]

            #generated += next_char
            sentence = sentence[1:]
            sentence.append(next_index)

            sys.stdout.write(next_word)
            sys.stdout.write(' ') 
            sys.stdout.flush()
        print()

In [None]:
"is your nose in the middle"
it must have been so dark
who stole the cookie


In [16]:
print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])

Epoch 1/60
----- Generating text after Epoch: 0
----- temperature: 0.2
----- Generating with seed: "first time .  < |endoftext|>approach"
is the difference between a bar and a man and a bar and a bar .  < |endoftext|>what do you call a bar ? a man .  < |endoftext|>why was the difference between a jew and a bar ? a man .  < |endoftext|>why was the difference between a and and a bar ? a man .  < |endoftext|>why did the difference between a jew and a bar and a man .  < |endoftext|>what do you call a man and a bar and a bar .  < |endoftext|>i was a man and a bar 
----- temperature: 0.5
----- Generating with seed: "first time .  < |endoftext|>approach"
 < |endoftext|>a man walks into a bar ? a |endoftext|>me .  < |endoftext|>why was the difference between difference between toilet and a bar and the other ? your body .  < |endoftext|>i was my lawyer 'm on a better .  < |endoftext|>how do you call a bar .  < |endoftext|>why did the difference between a feminist and a thanks ? a good of joke .

  after removing the cwd from sys.path.


call a raining i cow bartender small doing it when he watch into world for room name filled girlfriend sean that |endoftext|>what do you people feminists only great lady they sure whatever that about with no light then |endoftext|>how do i cut light reading 's a restaurant debate . young : husband i realize a five no i got most in party ! - , what at ... since there hand an months ... his figure make for halloween " man here years new 
Epoch 55/60
----- Generating text after Epoch: 54
----- temperature: 0.2
----- Generating with seed: "about my outstanding balance . "
< |endoftext|>do you know why there are more you times |endoftext|>why n't has a seen the it will a very . to stop a were a woman , for my she 's the difference between my dick is my dick .  < |endoftext|>what 's the difference between a women and a baby eye one had a did one to say about on the two things in , first wife by the man . " know , i is n't ok someone else .  < |endoftext|>why did n't the black people ... ... 

<keras.callbacks.History at 0x7f96c3acca10>

In [22]:
print('Favorite Joke: ')
print("'... after its seen it usually'")

Favorite Joke: 
'... after its seen it usually'


In [18]:
print('Done!')

Done!
