# Artificial Neural Networks
In this exercise, several parts of the code are missing and must be completed by you.

In [None]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

import sklearn
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from tqdm.notebook import tqdm
import seaborn as sns
sns.set()

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

%matplotlib inline

## MLP for Skin disease dataset using `scikit-learn`

Now let us apply a neural network to the skin disease data. To reduce the training time, we shuffle the dataset and keep only the first 100,000 rows.

In [None]:
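# Load the data, shuffle all rows, and keep at most the first 100,000
df = pd.read_csv("skin_disease.csv")
df = df.sample(frac=1)  # frac=1 returns every row in random order
df = df.iloc[0:100000]
df.head()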

In [None]:
X = df.drop(columns=["class"])
y = df["class"]

> Split the data into a train and test set. Use 40% of the data for the test set.

*Click on the dots to display the solution*

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)
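
To make the 60/40 split concrete, here is a minimal NumPy sketch of a shuffled hold-out split on row indices. It only illustrates the idea and is not how `train_test_split` is implemented internally; the helper name `holdout_split` is hypothetical.

```python
import numpy as np

def holdout_split(n_samples, test_size=0.4, seed=42):
    """Return shuffled, disjoint train and test index arrays (illustrative helper)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)          # random order of all row indices
    n_test = int(np.ceil(test_size * n_samples))  # number of rows held out for testing
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = holdout_split(10)
print(len(train_idx), len(test_idx))
```

Every row ends up in exactly one of the two sets, which is the property that makes the test score an estimate of performance on unseen data.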

We define our multilayer perceptron (MLP) with two hidden layers of 30 and 15 units. This time we use the [MLPClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) implementation from scikit-learn.

In [None]:
mlp = MLPClassifier(hidden_layer_sizes=(30,15),
                    activation='relu',  # activation function
                    solver='adam',  # optimizer
                    batch_size=1024)  # size of minibatches
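
With `batch_size=1024`, each epoch consists of one weight update per minibatch. The arithmetic can be sketched as follows, assuming the training set holds 60,000 rows (60% of the 100,000 sampled rows; the real number depends on the size of `skin_disease.csv`) and that the final, smaller batch is also used.

```python
import math

n_train = 60_000    # assumption: 60% of 100,000 sampled rows
batch_size = 1024

updates_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (updates_per_epoch - 1) * batch_size
print(updates_per_epoch, last_batch_size)
```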

> Train the neural network on `X_train`, `y_train` and plot the loss by accessing the attribute `loss_curve_`.

In [None]:
def plot_costs(costs):
    fig, ax = plt.subplots()
    ax.plot(costs)
    ax.set_title("Loss curve")
    plt.show()

*Click on the dots to display the solution*

In [None]:
mlp.fit(X_train, y_train)
plot_costs(mlp.loss_curve_)

> Implement your own predict function. For that we need two activation functions: `relu` for the hidden layers and `sigmoid` for the output layer.

In [None]:
def relu(x):
    ### START YOUR CODE ###
    
    ### END YOUR CODE ###
    pass

*Click on the dots to display the solution*

In [None]:
def relu(x):
    return np.maximum(0, x)
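
A quick check of this definition on a few hand-picked values (the array `z` is made up for illustration): negative entries are clipped to zero and non-negative entries pass through unchanged.

```python
import numpy as np

def relu(x):
    # element-wise maximum of 0 and x
    return np.maximum(0, x)

z = np.array([-2.0, 0.0, 3.5])
print(relu(z))
```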

In [None]:
def sigmoid(x):
    ### START YOUR CODE ###
    
    ### END YOUR CODE ###
    pass

*Click on the dots to display the solution*

In [None]:
def sigmoid(x):
    return 1/(1+np.exp(-x))
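
The direct formula works here, but `np.exp(-x)` overflows for large negative `x` and NumPy then emits a runtime warning. One possible alternative, shown as a sketch rather than a required part of the exercise, only ever exponentiates non-positive numbers. The name `stable_sigmoid` is hypothetical; SciPy also provides `scipy.special.expit` for the same purpose.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that only exponentiates non-positive numbers."""
    x = np.asarray(x, dtype=float)
    e = np.exp(-np.abs(x))  # always in (0, 1], so it cannot overflow
    # for x >= 0: 1 / (1 + exp(-x)); for x < 0: exp(x) / (1 + exp(x))
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

print(stable_sigmoid([-1000.0, 0.0, 1000.0]))
```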

In [None]:
def predict(mlp, X):
    ### START YOUR CODE ###

    # define the first activations, i.e. the inputs
    
    # forward propagate through layers
            
    # last layer output activation `sigmoid`
    
    # transform to 1-D and threshold
    
    ### END YOUR CODE ###
    
    pass

*Click on the dots to display the solution*

In [None]:
def predict(mlp, X):
    # define the first activations, i.e. the inputs
    A = X
    
    # forward propagate through layers
    for i, (W, B) in enumerate(zip(mlp.coefs_, mlp.intercepts_)):
        z = A.dot(W) + B
        # if hidden layer, apply `relu`
        if i != mlp.n_layers_ - 2:
            A = relu(z)
            
    # last layer output activation `sigmoid`
    out = sigmoid(z)  
    # transform to 1-D and threshold
    out = np.squeeze(out)
    out = np.array(out > 0.5, dtype=int)
    
    return out
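
To see the forward pass at work without training anything, the sketch below pushes three inputs through a tiny network whose weights are set by hand. The object `toy` is a hypothetical stand-in that only mimics the `coefs_` and `intercepts_` attributes of a fitted `MLPClassifier`; its weights are made up so that the result can be checked on paper.

```python
import numpy as np
from types import SimpleNamespace

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical stand-in for a fitted model: 2 inputs, 2 hidden units, 1 output
toy = SimpleNamespace(
    coefs_=[np.array([[1.0, -1.0], [1.0, -1.0]]), np.array([[1.0], [1.0]])],
    intercepts_=[np.zeros(2), np.array([-1.0])],
)

X_toy = np.array([[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]])

# hidden layers use relu
A = X_toy
for W, b in zip(toy.coefs_[:-1], toy.intercepts_[:-1]):
    A = relu(A @ W + b)

# output layer uses sigmoid, then threshold at 0.5
z_out = A @ toy.coefs_[-1] + toy.intercepts_[-1]
labels = (sigmoid(z_out).squeeze() > 0.5).astype(int)
print(z_out.squeeze(), labels)
```

The pre-activations of the output unit are 1, -1 and 1, so the first and third inputs are assigned to class 1 and the second to class 0.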

> Compare your implementation with the scikit-learn `predict` method.

In [None]:
# y_pred_scikit = ...
# y_pred_own = ...

# print('Are the outputs the same: %s' % ... )

*Click on the dots to display the solution*

In [None]:
y_pred_scikit = mlp.predict(X_test)
y_pred_own = predict(mlp, X_test.values)

print('Are the outputs the same: %s' % (y_pred_scikit == y_pred_own).all())
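
Comparing thresholded labels can hide small numerical differences, so it can also be useful to compare the probabilities themselves. The sketch below does this on synthetic data (the arrays `X_demo` and `y_demo` are made up and unrelated to the skin disease dataset), assuming a binary problem in which the second column of `predict_proba` holds the sigmoid output.

```python
import warnings
import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))                       # synthetic features
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)   # synthetic binary labels

clf = MLPClassifier(hidden_layer_sizes=(30, 15), max_iter=50, random_state=0)
with warnings.catch_warnings():
    # a short training run may not converge; that is irrelevant for this check
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    clf.fit(X_demo, y_demo)

# manual forward pass: relu on hidden layers, sigmoid on the output layer
A = X_demo
for W, b in zip(clf.coefs_[:-1], clf.intercepts_[:-1]):
    A = np.maximum(0, A @ W + b)
z_out = (A @ clf.coefs_[-1] + clf.intercepts_[-1]).squeeze()
p_manual = 1 / (1 + np.exp(-z_out))

p_sklearn = clf.predict_proba(X_demo)[:, 1]
print(np.allclose(p_manual, p_sklearn))
```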

> Predict the labels for the test set and calculate the accuracy and the F1 score.

*Click on the dots to display the solution*

In [None]:
y_pred = predict(mlp, X_test.values)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print("Accuracy: %.4f" % accuracy)
print("F1: %.4f" % f1)
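
For intuition about what the two metrics measure, they can be computed by hand from the confusion counts. The labels below are made up for illustration and unrelated to the dataset.

```python
import numpy as np

y_true_demo = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_hat_demo = np.array([1, 0, 1, 0, 1, 0, 1, 0])

tp = np.sum((y_true_demo == 1) & (y_hat_demo == 1))  # true positives
fp = np.sum((y_true_demo == 0) & (y_hat_demo == 1))  # false positives
fn = np.sum((y_true_demo == 1) & (y_hat_demo == 0))  # false negatives
tn = np.sum((y_true_demo == 0) & (y_hat_demo == 0))  # true negatives

accuracy_manual = (tp + tn) / len(y_true_demo)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_manual = 2 * precision * recall / (precision + recall)

print(accuracy_manual, f1_manual)
```

Accuracy counts all correct predictions, while the F1 score is the harmonic mean of precision and recall and ignores the true negatives.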

## MLP for Skin disease dataset using `TensorFlow`
In practice, the MLP from `scikit-learn` is rarely used for deep learning because it offers limited customisation and does not support GPU training. `TensorFlow` is a library specialised in deep learning and therefore also provides implementations of more advanced techniques. The section below is a quick introduction to how the same network can be implemented using `TensorFlow`. The two networks' results need not be identical since, as mentioned above, the `scikit-learn` implementation cannot be customised to the same degree as the `TensorFlow` one.

In [None]:
import tensorflow as tf

A model in `TensorFlow` can be implemented using the `Sequential` API, which makes it easy to extend a model by calling `.add()`. To implement the same MLP as above, we sequentially add `Dense` layers to the model. Here the additional customisation compared to `scikit-learn` becomes evident: for example, the activation function can be set for each layer separately, whereas `MLPClassifier` applies a single activation function to all hidden layers.

In [None]:
dataset_dim = X_train.shape[1]

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(30, input_shape=(dataset_dim, ), activation='relu'))
model.add(tf.keras.layers.Dense(15, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.summary()
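
The parameter counts reported by `model.summary()` can be reproduced by hand: a fully connected layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below assumes 10 input features purely for illustration (the real value is `dataset_dim`), and the helper name `count_dense_params` is hypothetical.

```python
def count_dense_params(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# assumption: 10 input features, followed by the 30-15-1 architecture
print(count_dense_params([10, 30, 15, 1]))  # 330 + 465 + 16 = 811
```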

After defining the model, it needs to be compiled with an optimizer and a loss function. In our case, we use Adam as the optimizer and binary cross-entropy as the loss. The model can then be trained by specifying the number of epochs and the batch size.

In [None]:
adam = tf.keras.optimizers.Adam()
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, batch_size=1024, epochs=150) 
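
The binary cross-entropy loss averages `-[y * log(p) + (1 - y) * log(1 - p)]` over all samples. A minimal NumPy sketch of this formula follows; the clipping value `eps` is an assumption added here to avoid `log(0)`, and the labels and probabilities are made up.

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    p = np.clip(p, eps, 1 - eps)  # keep log() finite
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_bce = np.array([1, 0, 1, 0])
loss_good = binary_cross_entropy(y_bce, np.array([0.9, 0.1, 0.8, 0.3]))
loss_bad = binary_cross_entropy(y_bce, np.array([0.1, 0.9, 0.2, 0.7]))
print(loss_good, loss_bad)
```

Confident correct predictions give a loss near zero, while confident wrong predictions are penalised heavily, which is why the loss curve falls as training progresses.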

We can also plot the loss curve using the same function as above.

In [None]:
plot_costs(history.history["loss"])

The model can also be evaluated on the test set with familiar code.

In [None]:
y_pred = model.predict(X_test)
y_pred = np.array(y_pred > 0.5, dtype=int).squeeze()

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print("Accuracy: %.4f" % accuracy)
print("F1: %.4f" % f1)
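
The `.squeeze()` call matters because a final `Dense(1)` layer yields predictions of shape `(n, 1)`, while the labels have shape `(n,)`. The sketch below uses made-up probabilities to show how comparing the two shapes directly would broadcast to an `(n, n)` array.

```python
import numpy as np

probs = np.array([[0.2], [0.7], [0.5]])  # shape (3, 1), like a single-unit output layer
y_true_demo = np.array([0, 1, 0])        # shape (3,)

labels_2d = (probs > 0.5).astype(int)    # still shape (3, 1)
labels_1d = labels_2d.squeeze()          # shape (3,)

print((y_true_demo == labels_2d).shape)  # broadcasts to (3, 3)
print((y_true_demo == labels_1d).shape)  # element-wise, (3,)
```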

# Assignment

> Now answer the Ilias Quiz 08A Neural Networks - Notebook Verification