In the lecture we talked about RBF networks and LSTM networks; now we will try to program them. We will use RBF networks to show how a custom layer is created in TensorFlow.

# RBF networks

Implementing a custom layer in TensorFlow is straightforward: it is enough to write a class with a `build` method, which initializes the parameters according to the input size, and a `call` method, which implements the actual computation.

In [48]:
import tensorflow as tf

class RBFLayer(tf.keras.layers.Layer):
    def __init__(self, num_outputs):
        super(RBFLayer, self).__init__()
        self.num_outputs = num_outputs
    
    def build(self, input_shape):
        # add_weight is the supported API; add_variable was a deprecated alias of it in TF 2.x
        self.centers = self.add_weight(name="centers", shape=(self.num_outputs, int(input_shape[-1])))
        self.beta = self.add_weight(name="beta", shape=(self.num_outputs,))
    
    def compute_output_shape(self, input_shape):
        return input_shape[0], self.num_outputs
    
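    def call(self, x):
        # centers: (k, d) -> C: (k, d, 1); x: (n, d) -> transpose(x): (d, n)
        C = tf.expand_dims(self.centers, -1)
        # broadcast difference has shape (k, d, n); transposing gives (n, d, k)
        H = tf.transpose(C - tf.transpose(x))
        # sum of squares over d -> (n, k); output is exp(-beta_j * ||x_i - c_j||^2)
        return tf.math.exp(-self.beta * tf.reduce_sum(tf.pow(H, 2), axis=1))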
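
The broadcasting in `call` is easy to get wrong, so it can help to check the formula on a tiny example outside TensorFlow. The following is only a NumPy sketch of the quantity the layer is meant to compute, $\exp(-\beta_j \lVert x_i - c_j \rVert^2)$; the function name `rbf_activations` and the toy values are made up for illustration and are not part of the notebook.

```python
import numpy as np

def rbf_activations(x, centers, beta):
    """Return exp(-beta_j * ||x_i - c_j||^2) for every sample i and center j."""
    # x: (n, d), centers: (k, d) -> pairwise differences: (n, k, d)
    diff = x[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)   # (n, k)
    return np.exp(-beta * sq_dist)         # beta: (k,), broadcast over rows

# Toy values (assumed for illustration): 2 samples, 3 centers, 2 features
x = np.array([[0.0, 0.0], [1.0, 0.0]])
centers = np.array([[0.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
beta = np.array([1.0, 0.5, 2.0])

out = rbf_activations(x, centers, beta)
print(out.shape)  # one row per sample, one column per center
print(out)
```

A sample that coincides with a center gets activation 1 for that center, and the activation decays with squared distance at a rate set by the corresponding `beta`.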

In [52]:
from sklearn import datasets
import numpy as np

iris = datasets.load_iris()
x, y = iris.data, iris.target

model = tf.keras.Sequential([
    RBFLayer(10),
    tf.keras.layers.Dense(3, activation=tf.nn.softmax),
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])

model.fit(x, y, epochs=1000, verbose=False)
np.mean(np.argmax(model.predict(x), axis=1)==y)

0.3333333333333333

## Exercise

We can see that our implementation does not work very well. In the lecture we said that the centers of the RBF neurons are initialized with the $k$-means algorithm. Try to modify our implementation so that it takes this into account. (Hint: the `add_weight` method, of which `add_variable` is a deprecated alias, takes an `initializer` parameter.)
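
As a possible starting point, the centers themselves can be computed before the model is built. The sketch below is a plain NumPy version of Lloyd's algorithm, not the lecture's implementation; the function name `kmeans_centers`, the iteration count, and the toy data are assumptions made for illustration. scikit-learn's `KMeans` would serve equally well. Wiring the result into the layer is left as the exercise.

```python
import numpy as np

def kmeans_centers(data, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns an array of shape (k, n_features)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every center: shape (n, k).
        d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

# Toy data (assumed): two well-separated blobs in 2-D
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])
centers = kmeans_centers(data, k=2)
print(centers.shape)
```

An array of this shape matches the `(num_outputs, input_dim)` shape of the layer's `centers` weight, so it could, for instance, be wrapped in `tf.constant_initializer` and passed through the `initializer` argument.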

# LSTM networks

LSTM networks are used for processing text and time series, so we will show how to use them to generate text. As the training set we will use the writings of Nietzsche.
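
To make the data preparation in the script below concrete, here is a toy sketch of the same idea on a short string: the text is cut into overlapping windows of `maxlen` characters, each paired with the character that follows it, and both are one-hot encoded. The string and the small values of `maxlen` and `step` are chosen only for illustration.

```python
import numpy as np

text = "hello world"
chars = sorted(set(text))                      # fixed, reproducible ordering
char_indices = {c: i for i, c in enumerate(chars)}

maxlen, step = 4, 3
starts = range(0, len(text) - maxlen, step)
windows = [text[i:i + maxlen] for i in starts]  # network inputs
targets = [text[i + maxlen] for i in starts]    # character to predict

# One-hot encoding: X[i, t, c] is True if window i has character c at position t.
X = np.zeros((len(windows), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(windows), len(chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, ch in enumerate(window):
        X[i, t, char_indices[ch]] = True
    y[i, char_indices[targets[i]]] = True

print(windows)   # ['hell', 'lo w', 'worl']
print(targets)   # ['o', 'o', 'd']
print(X.shape, y.shape)
```

The LSTM then receives each window as a sequence of `maxlen` one-hot vectors and is trained to output a distribution over the next character.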

In [None]:
import numpy as np
import random
import sys

'''
    Example script to generate text from Nietzsche's writings.
    At least 20 epochs are required before the generated text
    starts sounding coherent.
    It is recommended to run this script on GPU, as recurrent
    networks are quite computationally intensive.
    If you try this script on new data, make sure your corpus
    has at least ~100k characters. ~1M is better.
'''

path = tf.keras.utils.get_file('nietzsche.txt', origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt")
with open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(set(text))  # sorted so the character-to-index mapping is reproducible
print('total chars:', len(chars))
char_indices = {c: i for i, c in enumerate(chars)}
indices_char = {i: c for i, c in enumerate(chars)}

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 20
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
# np.bool was removed from NumPy; the built-in bool is the supported spelling
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
    
print('Build model...')
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(512, return_sequences=True, input_shape=(maxlen, len(chars))))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.LSTM(512, return_sequences=False))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(len(chars), activation=tf.nn.softmax))

model.compile(loss='categorical_crossentropy', optimizer='rmsprop')


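def sample(a, temperature=1.0):
    # helper function to sample an index from a probability array;
    # temperature < 1 sharpens the distribution, temperature > 1 flattens it
    a = np.asarray(a, dtype=np.float64)  # float64 keeps the sum within multinomial's tolerance
    a = np.exp(np.log(a) / temperature)
    a = a / np.sum(a)
    return np.argmax(np.random.multinomial(1, a, 1))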

# train the model, output generated text after each iteration
for iteration in range(1, 60):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    model.fit(X, y, batch_size=128, epochs=1)  # tf.keras uses epochs, not the old nb_epoch

    start_index = random.randint(0, len(text) - maxlen - 1)

    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print()
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for _ in range(400):
            x = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x[0, t, char_indices[char]] = 1.

            preds = model.predict(x, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()


Running the cell above on a machine without a GPU would take several hours, possibly even days. I therefore ran it on Google Colab, and you can [look at the results](https://colab.research.google.com/drive/1B7zys275xmpPqahPwNvuYMPLmgvlV3l5).
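
The `diversity` values in the generation loop are passed to `sample` as a temperature. The effect is easiest to see on a small hand-made distribution. The sketch below isolates only the rescaling step; the helper name `apply_temperature` and the three probabilities are assumptions for illustration, not output of the trained model.

```python
import numpy as np

def apply_temperature(probs, temperature=1.0):
    """Rescale a probability vector: p_i^(1/T), renormalized to sum to 1."""
    probs = np.asarray(probs, dtype=np.float64)
    logits = np.log(probs) / temperature
    scaled = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return scaled / scaled.sum()

probs = [0.7, 0.2, 0.1]  # made-up next-character distribution
for t in [0.2, 1.0, 1.2]:
    print(t, np.round(apply_temperature(probs, t), 3))
```

A low temperature concentrates almost all mass on the most likely character, which gives repetitive but safe text. A temperature of 1 leaves the distribution unchanged, and values above 1 flatten it, which gives more varied but more error-prone text.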