This chapter covers:
    Text generation with LSTM
    Implementing DeepDream
    Performing neural style transfer
    Variational autoencoders
    Understanding generative adversarial networks

## 8.1 Text generation with LSTM

RNNs can be used to generate sequence data. Text generation is the running example here, but the same techniques
generalize to any kind of sequence data: they can be applied to sequences of musical notes to generate new music,
or to timeseries of brushstroke data (e.g., recorded while an artist paints on an iPad) to generate paintings
stroke by stroke, and so on. Sequence data generation is not limited to artistic content: it is also usable in
speech synthesis and in dialogue generation for chatbots. The Smart Reply feature that Google released in 2016 is
powered by similar techniques.

Character-level neural language model: the output of the model is a softmax over all possible characters, that is,
a probability distribution for the next character. For example, take an LSTM layer, feed it strings of N characters
extracted from a text corpus, and train it to predict character N + 1.

## 8.1.3 The importance of the sampling strategy

Greedy sampling: a naive approach that always chooses the most likely next element. It produces repetitive,
predictable strings that don't look like coherent language.

Stochastic sampling: makes slightly more surprising choices by introducing randomness into the sampling process,
drawing the next character from the model's probability distribution instead of always taking its maximum.
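
The difference between the two strategies can be seen on a toy example. This is only a sketch: the three-character vocabulary and its probabilities are made up for illustration and do not come from a trained model.

```python
import numpy as np

# Hypothetical next-character distribution (not produced by a real model).
chars = ['a', 'b', 'c']
probs = np.array([0.6, 0.3, 0.1])
rng = np.random.default_rng(42)

# Greedy sampling: always take the most likely character.
greedy_picks = [chars[int(np.argmax(probs))] for _ in range(10)]

# Stochastic sampling: draw each character according to its probability.
stochastic_picks = [chars[rng.choice(len(chars), p=probs)] for _ in range(1000)]
counts = {c: stochastic_picks.count(c) for c in chars}

print('greedy:', ''.join(greedy_picks))
print('stochastic counts:', counts)
```

Greedy sampling repeats the same character every time, while the stochastic draws follow the distribution, so less likely characters still appear occasionally.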

## Listing 8.1: Reweighting a probability distribution to a different temperature

In [None]:
# Given a temperature value, a new probability distribution is computed from the original one
# (the softmax output of the model) by reweighting it in the following way.

import numpy as np

def reweight_distribution(original_distribution, temperature=0.5):
    # original_distribution is a 1D NumPy array of probability values that must sum to 1.
    # temperature is a factor quantifying the entropy of the output distribution.
    distribution = np.log(original_distribution) / temperature
    distribution = np.exp(distribution)
    # The sum of the reweighted values may no longer be 1, so divide by the sum
    # to obtain the new distribution.
    return distribution / np.sum(distribution)

# Higher temperatures result in sampling distributions of higher entropy, which generate more surprising and
# unstructured data, whereas a lower temperature results in less randomness and much more predictable generated data.
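
To make the effect of temperature concrete, here is a small worked sketch. The three-value distribution and the helper names `reweight` and `entropy` are assumptions chosen for illustration; the reweighting itself follows the same log / divide / exponentiate / normalize steps as Listing 8.1.

```python
import numpy as np

def reweight(probs, temperature):
    # Same reweighting steps as Listing 8.1: log, divide by temperature, exp, renormalize.
    scaled = np.exp(np.log(probs) / temperature)
    return scaled / scaled.sum()

def entropy(probs):
    # Shannon entropy in nats; assumes every probability is strictly positive.
    return float(-np.sum(probs * np.log(probs)))

original = np.array([0.5, 0.3, 0.2])   # hypothetical softmax output

for temperature in (0.2, 1.0, 2.0):
    reweighted = reweight(original, temperature)
    print(temperature, np.round(reweighted, 3), round(entropy(reweighted), 3))

cold = reweight(original, 0.2)   # sharper: mass concentrates on the most likely value
same = reweight(original, 1.0)   # temperature 1.0 leaves the distribution unchanged
hot = reweight(original, 2.0)    # flatter: closer to uniform
```

A temperature of 1.0 returns the original distribution, values below 1.0 lower the entropy, and values above 1.0 raise it.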

## 8.1.4 Implementing character-level LSTM text generation

## Listing 8.2: Downloading and parsing the initial text file

In [None]:
import keras 
import numpy as np

path = keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with open(path, encoding='utf-8') as f:   # Closes the file once it has been read
    text = f.read().lower()
print('Corpus length:', len(text))

## Listing 8.3: Vectorizing sequences of characters

Extract partially overlapping sequences of length maxlen, one-hot encode them, and pack them in a 3D NumPy array x
of shape (sequences, maxlen, unique_characters). Simultaneously, prepare an array y containing the corresponding
targets: the one-hot-encoded characters that come after each extracted sequence.

In [None]:
maxlen = 60   # Extract sequences of 60 characters
step = 3      # Sample a new sequence every three characters
sentences = []    # Holds the extracted sequences
next_chars = []   # Holds the targets (the follow-up characters)

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

chars = sorted(set(text))    # List of unique characters in the corpus
print('Unique characters:', len(chars))
char_indices = {char: i for i, char in enumerate(chars)}  # Maps each unique character to its index in "chars"

print('Vectorization...')
# One-hot encodes the characters into binary arrays.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1   # Marks the target character for sequence i
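
To check the shapes on something small, the same vectorization can be run on a toy string. This is a sketch only: the text "hello world" and the values maxlen = 4 and step = 3 are assumptions picked to keep the arrays readable.

```python
import numpy as np

text = "hello world"   # toy corpus, for illustration only
maxlen, step = 4, 3

starts = range(0, len(text) - maxlen, step)
sentences = [text[i:i + maxlen] for i in starts]   # overlapping windows
next_chars = [text[i + maxlen] for i in starts]    # character following each window

chars = sorted(set(text))
char_indices = {char: i for i, char in enumerate(chars)}

x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = True
    y[i, char_indices[next_chars[i]]] = True

print(sentences, next_chars)
print(x.shape, y.shape)
```

Each row of x[i] contains exactly one True value (the character at that position), and each row of y contains exactly one True value (the target character).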


### BUILDING THE NETWORK

The network is a single LSTM layer followed by a Dense classifier and a softmax over all possible characters.

## Listing 8.4: Single layer LSTM model for next-character prediction

In [None]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

Because the targets are one-hot encoded, categorical_crossentropy is used as the loss to train the model.

## Listing 8.5: Model compilation configuration

In [None]:
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)   # "lr" is the deprecated name of this argument
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

### TRAINING THE LANGUAGE MODEL AND SAMPLING FROM IT

1. Draw from the model a probability distribution for the next character, given the generated text available so far.
2. Reweight the distribution to a certain temperature.
3. Sample the next character at random according to the reweighted distribution.
4. Add the new character at the end of the available text.
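
The four steps above can be sketched without a trained network. In this sketch the "model" is a hypothetical stand-in (a smoothed bigram table built from a toy sentence), and the names `predict_next` and `generate` are made up for illustration; only the loop structure mirrors the steps.

```python
import numpy as np

corpus = "the quick brown fox jumps over the lazy dog. " * 3   # toy corpus
chars = sorted(set(corpus))
char_indices = {char: i for i, char in enumerate(chars)}

# Stand-in "model": bigram counts with add-one smoothing, normalized per row.
counts = np.ones((len(chars), len(chars)))
for a, b in zip(corpus[:-1], corpus[1:]):
    counts[char_indices[a], char_indices[b]] += 1
bigram_probs = counts / counts.sum(axis=1, keepdims=True)

def predict_next(text_so_far):
    # Hypothetical replacement for model.predict: looks only at the last character.
    return bigram_probs[char_indices[text_so_far[-1]]]

def generate(seed, length, temperature, rng):
    text = seed
    for _ in range(length):
        probs = predict_next(text)                            # 1. next-character distribution
        reweighted = np.exp(np.log(probs) / temperature)      # 2. reweight to the temperature
        reweighted /= reweighted.sum()
        next_index = rng.choice(len(chars), p=reweighted)     # 3. sample at random
        text += chars[next_index]                             # 4. append to the text so far
    return text

sample_text = generate("the ", 40, 0.5, np.random.default_rng(0))
print(sample_text)
```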

## Listing 8.6: Function to sample the next character given the model's predictions

In [None]:
# Reweights the original probability distribution coming out of the model and draws a character index from it
# (the sampling function).

def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
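
One possible variation, shown only as a sketch and not as the listing's method: if a prediction contains an exact zero, np.log produces -inf with a warning, and at very low temperatures the exponentials can all underflow. The hypothetical helper `sample_stable` below clips the probabilities and subtracts the maximum log value before exponentiating; the `eps` value and the use of `numpy.random.Generator.choice` are assumptions of this sketch.

```python
import numpy as np

def sample_stable(preds, temperature=1.0, rng=None, eps=1e-12):
    # Hypothetical variant of sample(): same reweighting, with guards against
    # log(0) and against underflow at low temperatures.
    rng = np.random.default_rng() if rng is None else rng
    preds = np.asarray(preds, dtype='float64')
    logits = np.log(np.clip(preds, eps, None)) / temperature
    logits -= logits.max()              # largest value becomes 0, so exp() stays in range
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

preds = np.array([0.7, 0.3, 0.0])       # made-up prediction containing an exact zero
rng = np.random.default_rng(0)
index = sample_stable(preds, temperature=0.5, rng=rng)
cold_index = sample_stable(preds, temperature=0.01, rng=rng)
print(index, cold_index)
```

At a temperature as low as 0.01 the draw is effectively greedy, so the most likely index is returned.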

## Listing 8.7: Text-generation Loop

In [None]:
import random
import sys

for epoch in range(1, 61):     # Trains the model for 60 epochs
    print('epoch', epoch)
    model.fit(x, y, batch_size=128, epochs=1)   # Fits the model for one iteration on the data
    start_index = random.randint(0, len(text) - maxlen - 1)   # Selects a text seed at random 
    generated_text = text[start_index: start_index + maxlen]
    print('---Generating with seed: "' + generated_text +'"')
    
    for temperature in [0.2, 0.5, 1.0, 1.2]:     # Tries a range of different sampling temperatures
        print('------temperature:', temperature)
        sys.stdout.write(generated_text)
        
        for i in range(400):     # Generates 400 characters, starting from the seed text
            sampled = np.zeros((1, maxlen, len(chars)))    # One-hot encodes the characters generated so far
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]   # Probability distribution for the next character
            next_index = sample(preds, temperature)        # Samples the next character index
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]            # Keeps the window at maxlen characters

            sys.stdout.write(next_char)