# Neural Networks with Keras

In [None]:
# Generate some fake data with 3 features
from sklearn.datasets import make_classification

X, y = make_classification(n_features=3, n_redundant=0, n_informative=3,
                           random_state=42, n_classes=2, n_clusters_per_class=1)

y = y.reshape(-1, 1)

print(X.shape)
print(y.shape)

### Split the data into training and testing sets

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

## Data Preprocessing


#### Scale the data
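It is important to scale our data before training multilayer perceptron models.
Without scaling, features on very different scales often make it difficult for training to converge.
The scaler is fit on the training data only, so that no information about the testing data leaks into preprocessing.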

In [None]:
from sklearn.preprocessing import StandardScaler

# Fit the scaler with the training data
X_scaler = StandardScaler().fit(X_train)

# after fitting the scaler with the training data, we transform the training and testing data
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
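
As a rough illustration of what the scaler does, the sketch below standardizes a tiny, made-up dataset by hand with NumPy. The `*_demo` arrays are hypothetical and are not part of the notebook's data; the point is only that the mean and standard deviation come from the training rows and are then reused for the testing rows.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 1 testing row, 2 features each.
X_train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test_demo = np.array([[2.5, 50.0]])

# "Fit": learn the per-feature mean and standard deviation from the training rows only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population standard deviation (ddof=0)

# "Transform": apply the same training statistics to both splits.
X_train_demo_scaled = (X_train_demo - mean) / std
X_test_demo_scaled = (X_test_demo - mean) / std

print(X_train_demo_scaled.mean(axis=0))  # approximately 0 for every feature
print(X_train_demo_scaled.std(axis=0))   # approximately 1 for every feature
print(X_test_demo_scaled)                # testing rows need not have mean 0
```

The testing row is scaled with the training statistics, so its values are not guaranteed to be centered.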

#### One-hot encode the labels

In [None]:
from keras.utils import to_categorical

y_train_categorical = to_categorical(y_train)
y_test_categorical = to_categorical(y_test)
print(y_train_categorical[:10])
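
To make the layout of one-hot labels concrete, here is a minimal NumPy sketch that builds the same kind of array by hand. The `labels` array is a made-up stand-in shaped like `y_train`; this is an illustration of the encoding, not how `to_categorical` is implemented.

```python
import numpy as np

# Hypothetical integer labels with shape (n_samples, 1), like y_train.
labels = np.array([[0], [1], [1], [0]])
num_classes = 2

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix by label yields one row per sample.
one_hot = np.eye(num_classes)[labels.ravel()]
print(one_hot)
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]
```

Each row has a single 1 in the column of its class, which is the format that the two-unit output layer is compared against.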

## Creating our Model

We must first decide what kind of model to apply to our data. 

- To predict a numerical (continuous) target, we use a regressor model.
- To predict a categorical target (a class label), we use a classifier model.

In this example, we will use a classifier to build the following network:

<img src="../Images/nnet.png" width=200 height=300>

## Defining our Model Architecture (the layers)


#### Initialize the sequential model

The [sequential](https://keras.io/models/sequential/) model in the keras library allows us to create a linear stack of layers.

In [None]:
from keras.models import Sequential
model = Sequential()

### Define the hidden layer and add it to the model
This layer requires you to specify both the number of inputs and the number of nodes that you want in the hidden layer.

<img src="../Images/nnet_first_layer.png" width=300 height=300>

In [None]:
from keras.layers import Dense

# Variables that define the number of input nodes and the number of nodes in the hidden layer
number_inputs = 3
number_hidden_nodes = 4
model.add(Dense(units=number_hidden_nodes, activation='relu', input_dim=number_inputs))
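
The computation that a dense layer with a ReLU activation performs can be sketched in a few lines of NumPy. The weights below are random placeholders chosen for illustration (Keras initializes and trains its own values), and the names `W`, `b`, and `x` are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_hidden = 3, 4

# Placeholder parameters: one weight per (input, hidden node) pair and one bias per hidden node.
W = rng.normal(size=(n_inputs, n_hidden))
b = np.zeros(n_hidden)

# One sample with 3 features, shaped (1, 3).
x = np.array([[-1.2, 0.3, 0.4]])

# Dense layer: weighted sum plus bias, then ReLU replaces negative values with 0.
hidden = np.maximum(0.0, x @ W + b)
print(hidden.shape)  # (1, 4)
```

The input width (3) fixes the number of rows in `W`, and the number of hidden nodes (4) fixes the number of columns, which is why the layer needs both values.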

### Define the output layer and add it to the model

Here, we need to specify the activation function (typically `softmax` for classification) and the number of classes (labels) that we are trying to predict (2 in this example).

<img src="../Images/nnet_output_layer.png" width=300 height=300>

In [None]:
number_classes = 2

# softmax is the usual output activation for classification.
# Use it in the output layer to convert the model's raw outputs into class probabilities that sum to 1.
model.add(Dense(units=number_classes, activation='softmax'))
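
For intuition, softmax can be written directly in NumPy. This is a minimal sketch with made-up input values, not the Keras implementation; subtracting the row maximum is a common trick to keep the exponentials numerically stable.

```python
import numpy as np

def softmax(logits):
    # Shift by the row maximum so np.exp never sees large positive values.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    # Normalize so that each row sums to 1.
    return exps / exps.sum(axis=-1, keepdims=True)

# Hypothetical raw outputs for two samples and two classes.
demo_logits = np.array([[2.0, 0.5], [-1.0, 3.0]])
demo_probs = softmax(demo_logits)
print(demo_probs)
print(demo_probs.sum(axis=-1))  # each row sums to 1
```

The larger raw output in each row receives the larger probability, so the ordering of the classes is preserved.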

## Model Summary

In [None]:
model.summary()
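
The parameter counts reported in the summary can be checked by hand: each dense layer has one weight per input-output pair plus one bias per output. A small sketch of that arithmetic for this network:

```python
number_inputs = 3
number_hidden_nodes = 4
number_classes = 2

# Weights (inputs x outputs) plus one bias per output node.
hidden_params = number_inputs * number_hidden_nodes + number_hidden_nodes
output_params = number_hidden_nodes * number_classes + number_classes
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 16 10 26
```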

## Compile the Model

Now that we have our model architecture defined, we must compile the model using a loss function and optimizer. We can also specify additional training metrics such as accuracy.

The [**optimizer**](https://keras.io/optimizers/) is the method that we'd like to use to reduce the model's error.

**Loss** is a measure of the error in our model that training tries to minimize.
- For classification models with one-hot encoded labels, set `loss` to `categorical_crossentropy`.
- For regression models, set `loss` to `mean_squared_error`.

In [None]:
# Hint: your output layer in this example is using softmax for logistic regression (categorical)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
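
To see what categorical cross-entropy measures, the sketch below computes it by hand for made-up predictions. The function and the example arrays are illustrative assumptions; clipping the probabilities only guards against taking the logarithm of 0.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Keep probabilities away from 0 and 1 so the logarithm stays finite.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # For one-hot labels, this picks out -log(probability of the true class).
    per_sample = -np.sum(y_true * np.log(y_pred), axis=-1)
    return per_sample.mean()

# Hypothetical one-hot labels and two sets of predicted probabilities.
y_true_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
confident = np.array([[0.9, 0.1], [0.2, 0.8]])
unsure = np.array([[0.5, 0.5], [0.5, 0.5]])

loss_confident = categorical_crossentropy(y_true_demo, confident)
loss_unsure = categorical_crossentropy(y_true_demo, unsure)
print(loss_confident, loss_unsure)  # the confident, correct predictions score lower
```

The loss shrinks as the probability assigned to the correct class approaches 1, which is the quantity the optimizer pushes down during training.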

## Training the Model

Training updates the network's weights, using the optimizer to reduce the loss. An epoch is one full pass over the training data; in this example, we train for 1000 epochs. We also shuffle the training data before each epoch and set `verbose=2` so that Keras prints one line of progress per epoch.

In [None]:
# Fit (train) the model
model.fit(
    X_train_scaled,
    y_train_categorical,
    epochs=1000,
    shuffle=True,
    verbose=2
)
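
As a rough sense of scale, the arithmetic below estimates how many weight updates this call performs. It assumes the defaults used in the cells above are left unchanged: 100 samples from `make_classification`, a 25% test split from `train_test_split`, and a batch size of 32 in `model.fit`.

```python
import math

n_samples = 100       # make_classification default
test_fraction = 0.25  # train_test_split default
batch_size = 32       # model.fit default
epochs = 1000

# Rows left for training after the test rows are held out.
n_train = n_samples - math.ceil(n_samples * test_fraction)

# One weight update per batch; the last batch of an epoch may be smaller.
batches_per_epoch = math.ceil(n_train / batch_size)
total_updates = batches_per_epoch * epochs

print(n_train, batches_per_epoch, total_updates)  # 75 3 3000
```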

## Quantifying the Model
We use our testing data to evaluate our model. Its loss and accuracy on data that it did not see during training indicate how well it predicts new, previously unseen data points.

In [None]:
model_loss, model_accuracy = model.evaluate(X_test_scaled, y_test_categorical, verbose=2)
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")
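
The accuracy figure is the fraction of samples whose most probable predicted class matches the true class. A minimal sketch with made-up probabilities and one-hot labels (the `demo_*` arrays are hypothetical):

```python
import numpy as np

# Hypothetical softmax outputs and one-hot true labels for 4 samples.
demo_outputs = np.array([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]])
demo_true = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

# The predicted class is the column with the highest probability.
predicted = demo_outputs.argmax(axis=-1)
actual = demo_true.argmax(axis=-1)

accuracy = (predicted == actual).mean()
print(accuracy)  # 0.75
```

The third sample is the only miss here: the model favors class 0 while the true label is class 1.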

## Making Predictions with new data

We can use our trained model to make predictions using `model.predict`.

In [None]:
import numpy as np
new_data = X_scaler.transform(np.array([[-1.2, 0.3, 0.4]]))
print(f"Model output: {model.predict(new_data)}")
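# predict_classes is no longer available in newer Keras releases; take the argmax of the probabilities instead
print(f"Predicted class: {np.argmax(model.predict(new_data), axis=-1)}")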

### View prediction probabilities

In [None]:
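import numpy as np
import pandas as pd

# predict_classes and predict_proba are no longer available in newer Keras releases.
# model.predict already returns the softmax probability of each class.
probs = model.predict(X_test_scaled)
predictions = np.argmax(probs, axis=-1)
pred_df = pd.DataFrame({
    "Prediction": predictions,
    "Actual": y_test.ravel(),
    "P(0)": [round(p[0], 5) for p in probs],
    "P(1)": [round(p[1], 5) for p in probs]
})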

pred_df.head(25)