# Chapter 8

### Reweighting a probability distribution to a different temperature

In [None]:
import numpy as np

# original_distribution is a 1D NumPy array of probability values that add up to 1.
# temperature is a factor quantifying the entropy of the output distribution.

def reweight_distribution(original_distribution, temperature=0.5):
    distribution = np.log(original_distribution) / temperature
    distribution = np.exp(distribution)

    return distribution / np.sum(distribution)

# Reweighted version of the original distribution. 
# The sum of the distribution may no longer be 1, 
# so you divide it by its sum to obtain the new distribution
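
To make the effect of temperature concrete, here is a minimal sketch that applies the recipe described above (take the log, divide by the temperature, exponentiate, renormalize) to a small distribution. The helper name `reweight` and the four-value distribution `probs` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def reweight(probs, temperature):
    # log -> divide by temperature -> exp -> renormalize so the result sums to 1
    logits = np.log(probs) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

# Hypothetical 4-way distribution, chosen only for illustration
probs = np.array([0.5, 0.3, 0.15, 0.05])

entropies = {}
for temperature in (0.2, 1.0, 2.0):
    reweighted = reweight(probs, temperature)
    # Shannon entropy of the reweighted distribution (in nats)
    entropies[temperature] = float(-np.sum(reweighted * np.log(reweighted)))
    print(temperature, np.round(reweighted, 3), round(entropies[temperature], 3))
```

A temperature of 1.0 leaves the distribution unchanged, lower temperatures concentrate the mass on the most likely entry (lower entropy), and higher temperatures flatten it (higher entropy).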

### Implementing character-level LSTM text generation

In [2]:
# Downloading and parsing the initial text file
import keras
import numpy as np

path = keras.utils.get_file('nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')

text = open(path).read().lower()
print('Corpus Length', len(text))

Using TensorFlow backend.


Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt



Next, you’ll extract partially overlapping sequences of length maxlen, one-hot encode them, and pack them in a 3D NumPy array x of shape (sequences, maxlen, unique_characters). Simultaneously, you’ll prepare an array y containing the corresponding targets: the one-hot-encoded characters that come after each extracted sequence.
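
As a toy illustration of the windowing and one-hot encoding just described, the sketch below runs the same steps on a very short string so the shapes are easy to inspect. The names `toy_text`, `toy_maxlen`, `toy_step`, `windows`, `targets`, `vocab`, and `index_of` are hypothetical and exist only for this example.

```python
import numpy as np

toy_text = "hello world"  # hypothetical stand-in for the real corpus
toy_maxlen = 4            # window length
toy_step = 3              # stride between consecutive windows

windows = []  # extracted sequences
targets = []  # the character that follows each sequence
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i:i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

vocab = sorted(set(toy_text))
index_of = {ch: idx for idx, ch in enumerate(vocab)}

# One-hot encode windows into (sequences, maxlen, unique_characters)
toy_x = np.zeros((len(windows), toy_maxlen, len(vocab)), dtype=bool)
# One-hot encode targets into (sequences, unique_characters)
toy_y = np.zeros((len(windows), len(vocab)), dtype=bool)
for i, window in enumerate(windows):
    for t, ch in enumerate(window):
        toy_x[i, t, index_of[ch]] = True
    toy_y[i, index_of[targets[i]]] = True

print(windows, targets)
print(toy_x.shape, toy_y.shape)
```

Each row of `toy_x` contains exactly one `True` per time step, and each row of `toy_y` contains exactly one `True` marking the next character.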

In [3]:
# Vectorizing sequences of characters

# You'll extract sequences of 60 characters
maxlen = 60

# You'll sample a new sequence every 3 characters
step = 3

# Holds the extracted sequences
sentences = []

# Holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i:i+maxlen])
    next_chars.append(text[i+maxlen])
    
print('Number of sequences:',len(sentences))

chars = sorted(list(set(text)))
print('Unique characters:', len(chars))

# Dictionary that maps each unique character to its index in the list "chars"
char_indices = dict((char, chars.index(char)) for char in chars)

print('Vectorization...')
# One-hot encodes the characters into binary arrays
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)

for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1


### Building The Network


In [None]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

### Compile the Model
Because your targets are one-hot encoded, you’ll use categorical_crossentropy as
the loss to train the model.


In [None]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
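
To see why categorical cross-entropy pairs naturally with one-hot targets, here is a minimal NumPy sketch of the per-sample loss. It is an illustration only: the function below is written for this example, the arrays are made up, and Keras's own implementation differs in details such as clipping and averaging.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then sum -y_true * log(y_pred) over the classes
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Two hypothetical samples over a 3-character vocabulary
y_true = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.2, 0.7],
                   [0.3, 0.4, 0.3]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)
```

Because each target row has a single 1, the loss for a sample reduces to the negative log of the probability the model assigned to the correct character.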

Given a trained model and a seed text snippet, you can generate new text by doing the following repeatedly:

1. Draw from the model a probability distribution for the next character, given the generated text available so far.
2. Reweight the distribution to a certain temperature.
3. Sample the next character at random according to the reweighted distribution.
4. Add the new character at the end of the available text.

This is the code you use to reweight the original probability distribution coming out of the model and draw a character index from it (the sampling function).

### Function to sample the next character given the model's predictions

In [None]:
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
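
The sketch below illustrates how temperature changes what such a sampling function returns. It redefines the same logic under the hypothetical name `sample_index` so the block runs on its own, and it uses a made-up prediction vector `toy_preds` in place of real model output.

```python
import numpy as np

def sample_index(preds, temperature=1.0):
    # Same steps as the sampling function: reweight, renormalize, draw once
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    return np.argmax(np.random.multinomial(1, preds, 1))

np.random.seed(0)  # fixed seed so repeated runs give the same draws
toy_preds = [0.6, 0.25, 0.1, 0.05]  # hypothetical next-character probabilities

frequencies = {}
for temperature in (0.2, 1.0, 1.2):
    draws = [sample_index(toy_preds, temperature) for _ in range(2000)]
    counts = np.bincount(draws, minlength=len(toy_preds))
    frequencies[temperature] = counts / counts.sum()
    print(temperature, frequencies[temperature])
```

At a low temperature almost every draw picks the most likely index, while higher temperatures give the less likely indices a larger share of the draws.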

Finally, the following loop repeatedly trains the model and generates text. After every epoch, you generate text at a range of different temperatures. This lets you see how the generated text evolves as the model begins to converge, as well as the impact of temperature on the sampling strategy.

### Text Generation Loop

In [None]:
import random
import sys

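# Trains the model for 60 epochs
for epoch in range(1, 61):
    print('epoch', epoch)
    # Fits the model for one iteration on the data
    model.fit(x, y, batch_size=128, epochs=1)

    # Selects a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    seed_text = text[start_index: start_index + maxlen]
    print('----Generating with seed:"' + seed_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('--------- temperature:', temperature)
        # Restarts from the same seed for each temperature
        generated_text = seed_text
        sys.stdout.write(generated_text)

        # Generates 400 characters (an arbitrary length, chosen as an assumption)
        for i in range(400):
            # One-hot encodes the characters generated so far
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            # Samples the next character
            preds = model.predict(sampled, verbose=0)[0]
            next_char = chars[sample(preds, temperature)]

            # Slides the window forward by one character
            generated_text = generated_text[1:] + next_char
            sys.stdout.write(next_char)
        print()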

### Implementing DeepDream in Keras

In [None]:
# Loading the pretrained Inception V3 model

from keras.applications import inception_v3
from keras import backend as K

# You won't be training the model, so this command disables all training-specific operations
K.set_learning_phase(0)

# Builds the Inception V3 network without its classification top (include_top=False).
# The model will be loaded with pretrained ImageNet weights.
model = inception_v3.InceptionV3(weights='imagenet',
                                 include_top=False)

Next, you’ll compute the loss: the quantity you’ll seek to maximize during the gradient-ascent process. In chapter 5, for filter visualization, you tried to maximize the value of a specific filter in a specific layer. Here, you’ll simultaneously maximize the activation of all filters in a number of layers. Specifically, you’ll maximize a weighted sum of the L2 norm of the activations of a set of high-level layers. The exact set of layers you choose (as well as their contribution to the final loss) has a major influence on the visuals you’ll be able to produce, so you want to make these parameters easily configurable. Lower layers result in geometric patterns, whereas higher layers result in visuals in which you can recognize some classes from ImageNet (for example, birds or dogs). You’ll start from a somewhat arbitrary configuration involving four layers—but you’ll definitely want to explore many different configurations later.


### Setting Up the DeepDream Configuration 

In [None]:
# Dictionary mapping layer names to a coefficient quantifying how much the layer's
# activation contributes to the loss you'll seek to maximize. Note that the layer
# names are hardcoded in the built-in Inception V3 application. You can list all
# layer names using model.summary()

layer_contributions = {
    'mixed2': 0.2,
    'mixed3': 3.,
    'mixed4': 2.,
    'mixed5': 1.5,
}

### Defining the loss to be maximized
Now, let’s define a tensor that contains the loss: the weighted sum of the L2 norm of
the activations of the layers in listing 8.9.

In [None]:
# Creates a dictionary that maps layer names to layer instances
layer_dict = dict([(layer.name, layer) for layer in model.layers])

# You'll define the loss by adding layer contributions to this scalar variable
loss = K.variable(0.)
for layer_name in layer_contributions:
    coeff = layer_contributions[layer_name]
    # Retrieves the layer's output
    activation = layer_dict[layer_name].output

    # Adds the L2 norm of the features of a layer to the loss. You avoid border
    # artifacts by only involving nonborder pixels in the loss.
    scaling = K.prod(K.cast(K.shape(activation), 'float32'))
    loss += coeff * K.sum(K.square(activation[:, 2: -2, 2: -2, :])) / scaling
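
The same weighted sum can be sketched in plain NumPy, which makes the border cropping and the scaling easy to check by hand. Everything here is an assumption for illustration: the helper `layer_loss_term`, the layer names, the coefficients, and the random arrays standing in for real activations of shape (batch, height, width, channels).

```python
import numpy as np

def layer_loss_term(activation, coeff):
    # Scale by the total number of elements in the activation tensor
    scaling = float(np.prod(activation.shape))
    # Drop a 2-pixel border on the height and width axes
    interior = activation[:, 2:-2, 2:-2, :]
    return coeff * np.sum(np.square(interior)) / scaling

rng = np.random.default_rng(0)
# Hypothetical activations standing in for real layer outputs
fake_activations = {
    'layer_a': rng.standard_normal((1, 8, 8, 4)),
    'layer_b': rng.standard_normal((1, 6, 6, 3)),
}
fake_contributions = {'layer_a': 0.2, 'layer_b': 3.0}

total_loss = 0.0
for name, coeff in fake_contributions.items():
    total_loss += layer_loss_term(fake_activations[name], coeff)
print(total_loss)
```

For an all-ones activation of shape (1, 8, 8, 4) and a coefficient of 2.0, the interior has 4 x 4 x 4 = 64 elements and the scaling is 256, so the term equals 2.0 x 64 / 256 = 0.5.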

### Gradient-ascent Process

In [None]:
# This tensor holds the generated image: the dream
dream = model.input

# Computes the gradients of the loss with regard to the dream
grads = K.gradients(loss, dream)[0]

# Normalizes the gradients (important trick)
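
The normalization code itself is not shown in this section. As a rough, self-contained illustration of gradient ascent with normalized gradients, the sketch below maximizes a toy objective in NumPy. Dividing by the mean absolute gradient is an assumed, commonly used form of normalization and not necessarily the one this notebook goes on to use. The objective, `target`, and `step` are hypothetical.

```python
import numpy as np

target = np.array([1.0, -2.0, 0.5])  # hypothetical optimum of the toy objective

def toy_loss(x):
    # Toy objective to maximize: largest (zero) when x equals target
    return -np.sum((x - target) ** 2)

def toy_grads(x):
    # Analytical gradient of toy_loss with respect to x
    return -2.0 * (x - target)

x = np.zeros(3)
step = 0.05
history = [toy_loss(x)]
for _ in range(20):
    grads = toy_grads(x)
    # Assumed normalization: divide by the mean absolute gradient,
    # with a small floor to avoid division by zero
    grads = grads / max(np.mean(np.abs(grads)), 1e-7)
    # Gradient ascent: move in the direction that increases the loss
    x = x + step * grads
    history.append(toy_loss(x))

print(history[0], history[-1])
```

Normalizing keeps the size of each update roughly constant regardless of the raw gradient magnitude, so the step size alone controls how fast the input changes.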
