## Getting started with the Keras Sequential model

The Sequential model is a linear stack of layers.

You can create a Sequential model by passing a list of layer instances to the constructor:

In [1]:
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])

Using TensorFlow backend.


You can also add layers one at a time via the `.add()` method:

In [3]:
model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))

### Specifying the input shape 

The model needs to know what input shape it should expect. For this reason, the first layer in a Sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this:

* Pass an `input_shape` argument to the first layer. This is a shape tuple (a tuple of integers or `None` entries, where `None` indicates that any positive integer may be expected). In `input_shape`, the batch dimension is not included.

* Some 2D layers, such as `Dense`, support the specification of their input shape via the argument `input_dim`, and some 3D temporal layers support the arguments `input_dim` and `input_length`.

* If you ever need to specify a fixed batch size for your inputs (this is useful for stateful recurrent networks), you can pass a `batch_size` argument to a layer. If you pass both `batch_size=32` and `input_shape=(6, 8)` to a layer, it will then expect every batch of inputs to have the batch shape `(32, 6, 8)`.

As such, the following snippets are strictly equivalent:

In [4]:
model = Sequential()
model.add(Dense(32, input_shape=(784,)))

In [5]:
model = Sequential()
model.add(Dense(32, input_dim=784))
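
To make the shape rules above concrete, here is a minimal plain-Python sketch (no Keras required). The helper names `full_batch_shape` and `dense_param_count` are hypothetical and exist only for this illustration. They mirror how the batch dimension is prepended to `input_shape`, and how many trainable values a fully connected layer holds (one weight per input/unit pair, plus one bias per unit).

```python
def full_batch_shape(input_shape, batch_size=None):
    """Prepend the batch dimension to a per-sample shape tuple.

    batch_size=None means the batch size is left unspecified.
    """
    return (batch_size,) + tuple(input_shape)


def dense_param_count(input_dim, units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return input_dim * units + units


# input_shape=(784,) and input_dim=784 describe the same per-sample shape.
print(full_batch_shape((784,)))                 # (None, 784)

# batch_size=32 together with input_shape=(6, 8) fixes the whole batch shape.
print(full_batch_shape((6, 8), batch_size=32))  # (32, 6, 8)

# Dense(32) on 784 inputs: 784 * 32 weights + 32 biases.
print(dense_param_count(784, 32))               # 25120
```

Because only the per-sample shape matters to the first layer, both snippets above should produce a layer with the same number of parameters.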

### Compilation

Before training a model, you need to configure the learning process, which is done via the `compile` method. It receives three arguments:

* An optimizer. This could be the string identifier of an existing optimizer (such as `rmsprop` or `adagrad`), or an instance of the `Optimizer` class. (Here, an optimizer is the algorithm that updates the model's weights in order to minimize the loss function.)

* A loss function. This is the objective function that the model will try to minimize. It can be the string identifier of an existing loss function (such as `categorical_crossentropy` or `mse`), or it can be a custom objective function.

* A list of metrics. For any classification problem you will want to set this to `metrics=['accuracy']`. A metric could be the string identifier of an existing metric or a custom metric function.

In [6]:
# For a multi-class classification problem
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# For a binary classification problem
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# For a mean squared error regression problem
model.compile(optimizer='rmsprop',
              loss='mse')

# For custom metrics
import keras.backend as K

def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])
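
To give a feel for what a loss and a metric actually compute, the following is a rough NumPy sketch of binary cross-entropy, binary accuracy, and the `mean_pred` custom metric from the cell above. It is an approximation for illustration only: Keras evaluates these on backend tensors, and the `eps` clipping constant and the sample values used here are assumptions.

```python
import numpy as np


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 and 1 so that log() stays finite.
    # eps is an assumed small constant chosen for this sketch.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(losses))


def binary_accuracy(y_true, y_pred):
    # A prediction counts as correct when rounding it gives the true label.
    return float(np.mean(np.round(y_pred) == y_true))


def mean_pred(y_true, y_pred):
    # Same idea as the custom metric above: the average predicted value.
    return float(np.mean(y_pred))


# Made-up labels and predicted probabilities for four samples.
y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.9, 0.2, 0.4, 0.6])

print(binary_crossentropy(y_true, y_pred))
print(binary_accuracy(y_true, y_pred))  # 0.5 (two of the four are correct)
print(mean_pred(y_true, y_pred))
```

The loss is what the optimizer minimizes, while metrics such as accuracy or `mean_pred` are only reported and do not influence the weight updates.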

### Training

Keras models are trained on Numpy arrays of input data and labels. To train a model, you will typically use the `fit` function.

#### The fit function

Here is a brief aside on the `fit` function. Its signature looks something like this:

`fit(self, x, y, batch_size=32, epochs=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)`

This trains the model for a fixed number of epochs.

##### Arguments

* *x*: input data, as a Numpy array or list of Numpy arrays (if the model has multiple inputs).
* *y*: labels, as a Numpy array.
* *batch_size*: integer. Number of samples per gradient update.
* *epochs*: integer, the number of epochs to train the model.
* *verbose*: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
* *callbacks*: list of keras.callbacks.Callback instances. List of callbacks to apply during training.
* *validation_split*: float (0. < x < 1). Fraction of the data to use as held-out validation data.
* *validation_data*: tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) to be used as held-out validation data. Will override validation_split.
* *shuffle*: boolean or str (for 'batch'). Whether to shuffle the samples at each epoch. 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks.
* *class_weight*: dictionary mapping classes to a weight value, used for scaling the loss function (during training only).
* *sample_weight*: Numpy array of weights for the training samples, used for scaling the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode="temporal" in compile().
* *initial_epoch*: epoch at which to start training (useful for resuming a previous training run)
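
The interplay between `batch_size` and `validation_split` can be sketched with a little arithmetic. The helpers below (`updates_per_epoch` and `split_counts`, both hypothetical names) assume, as the Keras documentation describes, that the validation samples are taken from the end of the data before any shuffling, and that a final partial batch still counts as one gradient update.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of gradient updates in one epoch (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


def split_counts(num_samples, validation_split):
    """Return (training samples, validation samples) for a given split fraction."""
    split_at = int(num_samples * (1.0 - validation_split))
    return split_at, num_samples - split_at


# 1000 samples in batches of 32: 31 full batches plus one batch of 8.
print(updates_per_epoch(1000, 32))    # 32

# Holding out 20% leaves 800 samples for training and 200 for validation.
n_train, n_val = split_counts(1000, 0.2)
print(n_train, n_val)                 # 800 200

# Only the training part is split into batches.
print(updates_per_epoch(n_train, 32))  # 25
```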

##### Returns

A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).
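
As a small illustration of working with that record, the sketch below treats `History.history` as a plain dictionary of per-epoch lists. The values are copied (rounded) from the Example 1 output shown later in this document, and `best_epoch` is a hypothetical helper, not part of Keras.

```python
# Stand-in for hist.history; keys match the ones printed in Example 1.
history = {
    'acc':  [0.510, 0.547, 0.532, 0.557, 0.558,
             0.563, 0.566, 0.569, 0.575, 0.572],
    'loss': [0.7039, 0.6908, 0.6925, 0.6875, 0.6832,
             0.6829, 0.6779, 0.6739, 0.6720, 0.6668],
}


def best_epoch(values, mode):
    """Return the 1-based epoch with the best value; mode is 'max' or 'min'."""
    pick = max if mode == 'max' else min
    return values.index(pick(values)) + 1


print(best_epoch(history['acc'], 'max'))   # 9
print(best_epoch(history['loss'], 'min'))  # 10
```

Each list has one entry per epoch, so its length equals the `epochs` value passed to `fit` (10 here).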

##### Raises

RuntimeError: if the model was never compiled.

### Predict 

`predict(self, x, batch_size=32, verbose=0)`

Generates output predictions for the input samples.

The input samples are processed batch by batch.

#### Arguments

* *x*: the input data, as a Numpy array.
* *batch_size*: integer.
* *verbose*: verbosity mode, 0 or 1.

#### Returns

A Numpy array of predictions.
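
The returned array contains raw model outputs, so turning them into class labels is left to the caller. A minimal sketch, using made-up probabilities and assuming the common conventions of a 0.5 threshold for a single sigmoid output and `argmax` for a softmax output:

```python
import numpy as np

# Illustrative sigmoid outputs: one probability per sample, shape (5, 1).
binary_probs = np.array([[0.60], [0.44], [0.50], [0.45], [0.48]])
# Label 1 when the probability is strictly above the assumed 0.5 threshold.
binary_labels = (binary_probs > 0.5).astype(int).ravel()
print(binary_labels)  # [1 0 0 0 0]

# Illustrative softmax outputs: one row of class probabilities per sample.
class_probs = np.array([[0.1, 0.7, 0.2],
                        [0.5, 0.3, 0.2]])
# The predicted class is the index of the largest probability in each row.
class_labels = np.argmax(class_probs, axis=1)
print(class_labels)   # [1 0]
```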

## Some examples

Now we're going to see a couple of examples.

In [17]:
### Example 1
# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
#print("Input data")
#print(data)
labels = np.random.randint(2, size=(1000, 1))
#print("\nLabels")
#print(labels)
# Train the model, iterating on the data in batches of 32 samples
hist = model.fit(data, labels, verbose=0, epochs=10, batch_size=32)
print(hist.history)

{'acc': [0.51000000000000001, 0.54700000000000004, 0.53200000000000003, 0.55700000000000005, 0.55800000000000005, 0.56299999999999994, 0.56599999999999995, 0.56899999999999995, 0.57499999999999996, 0.57199999999999995], 'loss': [0.70387321805953984, 0.69076384258270263, 0.69247787952423101, 0.68751943874359134, 0.68315855884552001, 0.68294048547744746, 0.67790679168701173, 0.67390061044692995, 0.67202281093597416, 0.66684537315368653]}


In [18]:
prediction = model.predict(np.random.random((5, 100)))
print(prediction)

[[ 0.60028714]
 [ 0.43608826]
 [ 0.49870765]
 [ 0.45481431]
 [ 0.4793022 ]]


In [22]:
### Example 2
# For a single-input model with 10 classes (categorical classification):

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
import numpy as np
import keras
data = np.random.random((1000, 100))
labels = np.random.randint(10, size=(1000, 1))
#print("Labels")
#print(labels)
# Convert labels to categorical one-hot encoding
one_hot_labels = keras.utils.to_categorical(labels, num_classes=10)
#print("one-hot labels")
#print(one_hot_labels)
# Train the model, iterating on the data in batches of 32 samples
hist = model.fit(data, one_hot_labels, verbose=0, epochs=10, batch_size=32)
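
For readers curious about what the one-hot conversion does to the labels, here is a rough NumPy equivalent. The `one_hot` helper is hypothetical and only meant to illustrate the idea; it is not how `keras.utils.to_categorical` is implemented.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label index."""
    labels = np.asarray(labels, dtype=int).ravel()
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]


labels = np.array([[3], [0], [9]])
encoded = one_hot(labels, num_classes=10)

print(encoded.shape)               # (3, 10)
print(encoded[0])                  # a single 1.0 at index 3
print(np.argmax(encoded, axis=1))  # [3 0 9], the original labels
```

This is the target format that `categorical_crossentropy` expects: one column per class, matching the 10 units of the final softmax layer.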

In [30]:
data = np.random.random((5,100))
#one_hot_data = keras.utils.to_categorical(data, num_classes = 10)
prediction = model.predict(data)
print(prediction)

[[ 0.07852356  0.09424989  0.09305289  0.11304904  0.1484883   0.1208453
   0.10267175  0.05804131  0.07525445  0.11582354]
 [ 0.07616915  0.12711795  0.07674219  0.0805225   0.15082555  0.09188802
   0.11951567  0.13229284  0.07625939  0.06866679]
 [ 0.11161524  0.12163014  0.06704808  0.092452    0.09380397  0.09955897
   0.11215826  0.12134843  0.08861399  0.09177095]
 [ 0.08985351  0.05642612  0.08618461  0.05884392  0.09129987  0.14388478
   0.13352601  0.12331535  0.06924877  0.14741705]
 [ 0.07444849  0.09991132  0.09586181  0.09125136  0.12575373  0.12258089
   0.08111702  0.09624729  0.10771045  0.1051176 ]]
