In [66]:
from keras.datasets import reuters

(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=None, test_split=0.2)

Each news wire is encoded as a sequence of integer tokens: every word corresponds to one integer in a numeric dictionary.
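
The encoding can be illustrated on a toy vocabulary. This is a minimal sketch, not the Keras implementation: `toy_word_index`, `encode` and `decode` are hypothetical names, and the sketch assumes that indices 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary markers, so that word ranks are shifted by 3.

```python
# Toy illustration of integer encoding with reserved indices.
# Assumption: 0 = padding, 1 = start of sequence, 2 = out-of-vocabulary,
# so real words start at index 3 + rank.
PAD, START, OOV, INDEX_FROM = 0, 1, 2, 3

# Hypothetical frequency-ranked vocabulary (rank 1 = most frequent word).
toy_word_index = {"the": 1, "of": 2, "said": 3, "company": 4}


def encode(text):
    """Map a whitespace-tokenised string to a list of integer tokens."""
    tokens = [START]
    for word in text.lower().split():
        rank = toy_word_index.get(word)
        tokens.append(OOV if rank is None else rank + INDEX_FROM)
    return tokens


def decode(tokens):
    """Invert encode(); reserved indices are rendered as placeholders."""
    reverse = {rank + INDEX_FROM: word for word, rank in toy_word_index.items()}
    reverse.update({PAD: "<pad>", START: "<start>", OOV: "<oov>"})
    return " ".join(reverse[t] for t in tokens)


encoded = encode("The company said profits")
print(encoded)          # [1, 4, 7, 6, 2]
print(decode(encoded))  # <start> the company said <oov>
```

The word "profits" is not in the toy vocabulary, so it is mapped to the out-of-vocabulary index.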

In [69]:
print(x_train[0])

[1, 27595, 28842, 8, 43, 10, 447, 5, 25, 207, 270, 5, 3095, 111, 16, 369, 186, 90, 67, 7, 89, 5, 19, 102, 6, 19, 124, 15, 90, 67, 84, 22, 482, 26, 7, 48, 4, 49, 8, 864, 39, 209, 154, 6, 151, 6, 83, 11, 15, 22, 155, 11, 15, 7, 48, 9, 4579, 1005, 504, 6, 258, 6, 272, 11, 15, 22, 134, 44, 11, 15, 16, 8, 197, 1245, 90, 67, 52, 29, 209, 30, 32, 132, 6, 109, 15, 17, 12]
87


In [98]:
word_index = reuters.get_word_index(path="reuters_word_index.json")

In [102]:
print(word_index['computer'])

803


In [99]:
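# Invert the mapping so that integers can be looked up as words.
reverse_word_index = {value: key for (key, value) in word_index.items()}
# Token indices are offset by 3: 0, 1 and 2 are reserved for padding, start of sequence and unknown words.
decoded_news = ' '.join([reverse_word_index.get(i - 3, '') for i in x_train[0]])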

In [100]:
print(decoded_news)

              mcgrath rentcorp said as a result of its december acquisition of space co it expects earnings per share in 1987 of 1 15 to 1 30 dlrs per share up from 70 cts in 1986 the company said pretax net should rise to nine to 10 mln dlrs from six mln dlrs in 1986 and rental operation revenues to 19 to 22 mln dlrs from 12 5 mln dlrs it said cash flow per share this year should be 2 50 to three dlrs reuter 3


![Embedding](https://www.tensorflow.org/images/audio-image-text.png)

## Word embedding

Represented as one-hot vectors, words live in a discrete space that is sparse and orthogonal, and such a space suffers severely from the curse of dimensionality. A word embedding maps this space to a lower-dimensional vector space that is dense and in which related words get correlated vectors.

https://blog.acolyer.org/2016/04/21/the-amazing-power-of-word-vectors/
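
The mapping can be sketched in plain Python. The sizes and the random matrix below are assumptions made for illustration only; in a real model the embedding values are learned during training. The sketch shows that multiplying a one-hot vector by the embedding matrix is the same as reading one row of that matrix.

```python
import random

random.seed(0)

vocab_size, embed_dim = 6, 3   # toy sizes, chosen only for illustration

# Embedding matrix: one dense row per word index (randomly initialised here;
# in a real model these values are learned during training).
embedding = [[random.uniform(-1, 1) for _ in range(embed_dim)]
             for _ in range(vocab_size)]


def one_hot(index, size):
    """Sparse representation: a single 1 in a vector of zeros."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def embed_by_matmul(index):
    """Multiply the one-hot row vector by the embedding matrix."""
    x = one_hot(index, vocab_size)
    columns = zip(*embedding)           # iterate over matrix columns
    return [dot(x, col) for col in columns]


def embed_by_lookup(index):
    """Equivalent shortcut: read row `index` of the matrix."""
    return embedding[index]


# Distinct one-hot vectors are always orthogonal ...
print(dot(one_hot(2, vocab_size), one_hot(4, vocab_size)))  # 0.0
# ... while the lookup and the matrix product give the same dense vector.
print(embed_by_matmul(4) == embed_by_lookup(4))  # True
```

An embedding layer is typically implemented as the lookup, since it avoids building the sparse one-hot vectors at all.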

![Recurrent](http://colah.github.io/posts/2015-08-Understanding-LSTMs/img/RNN-unrolled.png)

## Recurrent neural networks

Recurrent neural networks capture language patterns beyond individual keywords, because entire sentences can be fed in as input sequences. They handle input sequences of varying lengths and share their parameters across time steps.

http://colah.github.io/posts/2015-08-Understanding-LSTMs/


https://www.coursera.org/lecture/nlp-sequence-models/backpropagation-through-time-bc7ED
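
Parameter sharing and variable-length input can be sketched with a plain recurrent cell. Note that this is a simple tanh RNN, not the LSTM used further down, and that the sizes, weights and helper names (`rnn_step`, `run_rnn`) are assumptions made for illustration.

```python
import math
import random

random.seed(0)

input_dim, hidden_dim = 3, 4   # toy sizes, chosen only for illustration


def rand_matrix(rows, cols):
    """Small random weight matrix as a list of rows."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# One set of parameters, reused at every time step.
W_x = rand_matrix(hidden_dim, input_dim)   # input-to-hidden weights
W_h = rand_matrix(hidden_dim, hidden_dim)  # hidden-to-hidden weights
b = [0.0] * hidden_dim                     # bias


def rnn_step(x_t, h_prev):
    """One time step: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    return [
        math.tanh(
            sum(w * x for w, x in zip(W_x[i], x_t))
            + sum(w * h for w, h in zip(W_h[i], h_prev))
            + b[i]
        )
        for i in range(hidden_dim)
    ]


def run_rnn(inputs):
    """Fold a sequence of input vectors into one final hidden state."""
    h = [0.0] * hidden_dim
    for x_t in inputs:
        h = rnn_step(x_t, h)
    return h


short_seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
long_seq = short_seq + [[0.0, 0.0, 1.0]] * 5

# Both sequences yield a hidden state of the same size.
print(len(run_rnn(short_seq)), len(run_rnn(long_seq)))  # 4 4
```

The same `W_x`, `W_h` and `b` are applied at every step, so the number of parameters does not depend on the sequence length.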

In [106]:
import keras
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM

In [70]:
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=100)
x_test = sequence.pad_sequences(x_test, maxlen=100)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

Pad sequences (samples x time)
('x_train shape:', (8982, 100))
('x_test shape:', (2246, 100))
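
What the padding step does to a single sequence can be sketched as follows. This assumes the default behaviour of `pad_sequences`, where both padding and truncation happen at the front and the pad value is 0; `pad_pre` is a hypothetical helper, not a Keras function.

```python
def pad_pre(seq, maxlen, value=0):
    """Keep the last `maxlen` tokens and left-pad shorter sequences.

    Assumes maxlen > 0.
    """
    truncated = seq[-maxlen:]
    return [value] * (maxlen - len(truncated)) + truncated


print(pad_pre([5, 6, 7], maxlen=5))           # [0, 0, 5, 6, 7]
print(pad_pre([1, 2, 3, 4, 5, 6], maxlen=4))  # [3, 4, 5, 6]
```

With `maxlen=100`, news wires longer than 100 tokens lose their beginning and shorter ones are filled with zeros on the left, which is why every row ends up with exactly 100 columns.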


In [81]:
print('Convert class vector to binary class matrix '
      '(for use with categorical_crossentropy)')
num_classes = y_train.max() + 1  # labels run from 0 to 45, giving 46 topic classes
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

Convert class vector to binary class matrix (for use with categorical_crossentropy)
('y_train shape:', (8982, 46))
('y_test shape:', (2246, 46))
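
The conversion from class indices to a binary class matrix can be sketched without Keras. `to_one_hot` is a hypothetical stand-in that mirrors the idea behind `to_categorical`, shown here on four classes instead of 46.

```python
def to_one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 each."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


encoded = to_one_hot([2, 0], num_classes=4)
print(encoded)  # [[0.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
```

Each label becomes one row, so 8982 training labels with 46 classes give the `(8982, 46)` shape printed above.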


In [94]:
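model = Sequential()
# Map each of the 30980 token indices to a 20-dimensional dense vector.
model.add(Embedding(30980, 20))
# A single LSTM layer reads the sequence and returns its final 20-dimensional state.
model.add(LSTM(20, dropout=0.2, recurrent_dropout=0.2))
# Softmax output over the 46 topic classes.
model.add(Dense(46, activation='softmax'))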

In [95]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_4 (Embedding)      (None, None, 20)          619600    
_________________________________________________________________
lstm_4 (LSTM)                (None, 20)                3280      
_________________________________________________________________
dense_4 (Dense)              (None, 46)                966       
Total params: 623,846
Trainable params: 623,846
Non-trainable params: 0
_________________________________________________________________
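
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gates, each with input weights, recurrent weights and a bias.

```python
vocab_size, embed_dim = 30980, 20
lstm_units, num_classes = 20, 46

# One 20-dimensional vector per token index.
embedding_params = vocab_size * embed_dim

# Four gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * (embed_dim * lstm_units + lstm_units * lstm_units + lstm_units)

# Weight matrix from the LSTM state to the classes, plus one bias per class.
dense_params = lstm_units * num_classes + num_classes

print(embedding_params, lstm_params, dense_params)     # 619600 3280 966
print(embedding_params + lstm_params + dense_params)   # 623846
```

Almost all parameters sit in the embedding matrix, which is typical for small text models with a large vocabulary.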


In [96]:
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
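
The loss chosen here pairs with the softmax output layer. The following is a minimal sketch of both for a single sample with three classes; the logits are made-up numbers, and the functions are simplified stand-ins rather than the Keras implementations.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                         # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(target, predicted):
    """Negative log-probability assigned to the true class (one-hot target)."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


probs = softmax([2.0, 1.0, 0.1])   # made-up scores for three classes
target = [1.0, 0.0, 0.0]           # one-hot vector: the true class is 0
loss = categorical_crossentropy(target, probs)

print(round(sum(probs), 6))  # 1.0
print(loss)
```

The loss shrinks towards 0 as the probability assigned to the true class approaches 1.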

In [97]:
batch_size = 32  # assumed value; the original cell does not show where batch_size is defined
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=15,
          validation_data=(x_test, y_test))

Train on 8982 samples, validate on 2246 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


## Further reading

https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks

https://bair.berkeley.edu/blog/2018/08/06/recurrent/