# Introduction to Neural Networks: Datasets, Loss, and Models
Author: Pierre Nugues

## Dataset

The dataset comes from the novel *Salammbô* by Flaubert. For each of its 15 chapters, we extracted the total number of letters (`X`) and the number of occurrences of the letter _A_ (`y`).

In [None]:
import numpy as np

X = np.array(
    [[36961],
     [43621],
     [15694],
     [36231],
     [29945],
     [40588],
     [75255],
     [37709],
     [30899],
     [25486],
     [37497],
     [40398],
     [74105],
     [76725],
     [18317]])

y = np.array(
    [2503, 2992, 1042, 2487, 2014, 2805, 5062, 2643, 2126, 1784, 2641, 2766,
     5047, 5312, 1215])


## Visualizing the Dataset

In [None]:
import matplotlib.pyplot as plt

fr = plt.scatter(X, y, c='b', marker='x')
plt.title("Salammbô")
plt.xlabel("Letter count")
plt.ylabel("A count")
plt.show()
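
The scatter plot suggests a nearly linear relation between the two counts. A quick way to quantify this impression is to compute the Pearson correlation coefficient and the relative frequency of _A_ in each chapter. The following is a minimal sketch; the names `r` and `ratios` are introduced here for illustration only.

```python
import numpy as np

# Letter counts and A counts per chapter, copied from the dataset above
x = np.array([36961, 43621, 15694, 36231, 29945, 40588, 75255, 37709,
              30899, 25486, 37497, 40398, 74105, 76725, 18317])
y = np.array([2503, 2992, 1042, 2487, 2014, 2805, 5062, 2643, 2126, 1784,
              2641, 2766, 5047, 5312, 1215])

# Pearson correlation coefficient between the two counts
r = np.corrcoef(x, y)[0, 1]

# Relative frequency of the letter A in each chapter
ratios = y / x

print(f"Pearson r: {r:.4f}")
print(f"A frequency: min {ratios.min():.4f}, max {ratios.max():.4f}")
```

A correlation close to 1 and a narrow range of frequencies are what we would expect if a straight line describes the data well.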

## Models

We fit three polynomial models of degree 1, 8, and 9 to the data and plot each of them against the observations.

In [None]:
# The polynomial degrees we will test and their color
x = X.flatten()
degrees_col = [(1, 'r-'), (8, 'b-'), (9, 'g-')]

f, axes = plt.subplots(len(degrees_col), sharex=True, sharey=True)
x_vals = np.linspace(min(x), max(x), 1000)

for idx, (degree, color) in enumerate(degrees_col):
    axes[idx].scatter(x, y, marker='x')
    # We find the fitting coefficients
    z = np.polyfit(x, y, degree)
    # We use them to create a polynomial
    p = np.poly1d(z)
    legend = axes[idx].plot(x_vals, p(x_vals), color)
plt.show()

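Higher-degree polynomials follow the training points more closely, but they tend to oscillate between them, which is a sign of overfitting. For this dataset, the simpler linear model is the better choice.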
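
One way to check a claim like this is leave-one-out cross-validation: fit each polynomial on 14 chapters and measure the squared error on the chapter left out. The sketch below is an illustration, not part of the original notebook. It assumes `numpy.polynomial.Polynomial.fit` in place of `np.polyfit`, because it rescales the inputs internally and is better conditioned for high degrees. The helper `loo_mse` is a hypothetical name.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Letter counts and A counts per chapter, copied from the dataset above
x = np.array([36961, 43621, 15694, 36231, 29945, 40588, 75255, 37709,
              30899, 25486, 37497, 40398, 74105, 76725, 18317], dtype=float)
y = np.array([2503, 2992, 1042, 2487, 2014, 2805, 5062, 2643, 2126, 1784,
              2641, 2766, 5047, 5312, 1215], dtype=float)


def loo_mse(x, y, degree):
    """Leave-one-out mean squared error of a polynomial fit (hypothetical helper)."""
    errors = []
    for i in range(len(x)):
        # Train on every chapter except chapter i
        mask = np.arange(len(x)) != i
        p = Polynomial.fit(x[mask], y[mask], degree)
        # Squared prediction error on the held-out chapter
        errors.append((p(x[i]) - y[i]) ** 2)
    return float(np.mean(errors))


scores = {degree: loo_mse(x, y, degree) for degree in (1, 8, 9)}
for degree, score in scores.items():
    print(f"degree {degree}: leave-one-out MSE = {score:.1f}")
```

If the higher-degree models overfit, their held-out error should be larger than that of the linear model, even though their training error is smaller.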

### Using the Keras Engine to Carry out a Linear Regression

We use the mean squared error as the loss function and Nadam, a variant of stochastic gradient descent, as the optimizer.

In [None]:
from keras import models
from keras.layers import Dense

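model = models.Sequential()
# A single dense unit with a linear activation computes y = w * x + b
model.add(Dense(1, input_dim=1, activation='linear'))
model.compile(optimizer='nadam', loss='mse', metrics=['mse'])
model.summary()
# batch_size=1 updates the weights after each sample
history = model.fit(x, y, batch_size=1, epochs=200, verbose=0)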
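
To make the loss and the update rule behind `model.fit` more concrete, here is a minimal NumPy sketch of gradient descent on the mean squared error for the model `y = w * x + b`. It is not what Keras runs: it uses plain full-batch gradient descent instead of Nadam, and it rescales both variables to [0, 1], which is an assumption of this sketch. The learning rate and the number of iterations are illustrative values.

```python
import numpy as np

# Letter counts and A counts per chapter, copied from the dataset above
x = np.array([36961, 43621, 15694, 36231, 29945, 40588, 75255, 37709,
              30899, 25486, 37497, 40398, 74105, 76725, 18317], dtype=float)
y = np.array([2503, 2992, 1042, 2487, 2014, 2805, 5062, 2643, 2126, 1784,
              2641, 2766, 5047, 5312, 1215], dtype=float)

# Assumption of this sketch: scale both variables to [0, 1]
xs, ys = x / x.max(), y / y.max()


def mse(w, b):
    """Mean squared error of the line w * xs + b on the scaled data."""
    return np.mean((w * xs + b - ys) ** 2)


w, b = 0.0, 0.0
learning_rate = 0.5  # illustrative value, not taken from the notebook
losses = []
for epoch in range(2000):
    error = w * xs + b - ys
    # Gradients of the MSE with respect to w and b
    grad_w = 2 * np.mean(error * xs)
    grad_b = 2 * np.mean(error)
    # Move the parameters against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    losses.append(mse(w, b))

print(f"w = {w:.4f}, b = {b:.4f}, final loss = {losses[-1]:.2e}")
```

The parameters `w` and `b` apply to the scaled variables, so they are not directly comparable to the weights Keras learns on the raw counts.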

### Visualizing the Loss

We plot the training loss at each epoch to check that it decreases during training.

In [None]:
import matplotlib.pyplot as plt

loss = history.history['loss']
epochs = range(1, len(loss) + 1)
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.title('Training loss')
plt.legend()
plt.show()

### The Model

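The trained model consists of two parameters: a weight (the slope of the line) and a bias (its intercept). `get_weights()` returns them as a list of arrays.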

In [None]:
model.get_weights()
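
As a point of comparison for the learned weights, the least-squares line can also be computed in closed form. The sketch below uses `np.polyfit` with degree 1, as in the polynomial fits above; the names `slope` and `intercept` are introduced here for illustration.

```python
import numpy as np

# Letter counts and A counts per chapter, copied from the dataset above
x = np.array([36961, 43621, 15694, 36231, 29945, 40588, 75255, 37709,
              30899, 25486, 37497, 40398, 74105, 76725, 18317])
y = np.array([2503, 2992, 1042, 2487, 2014, 2805, 5062, 2643, 2126, 1784,
              2641, 2766, 5047, 5312, 1215])

# np.polyfit returns the coefficients from the highest degree down:
# first the slope, then the intercept
slope, intercept = np.polyfit(x, y, 1)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

The weight returned by `model.get_weights()` should approach this slope if training has converged. The bias may differ more, since the inputs are not scaled and the optimizer runs for a fixed number of epochs.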

### Visualizing the Final Model

In [None]:
fr = plt.scatter(X, y, c='b', marker='x')
plt.plot(x, model.predict(x), color='red')
plt.title("Salammbô")
plt.xlabel("Letter count")
plt.ylabel("A count")
plt.show()