# Neural Networks with TensorFlow

In this notebook, we classify the Palmer Penguins dataset again, this time using TensorFlow. We will use a simple neural network with one hidden layer.

## Preparations

We start by importing the necessary libraries. 

In [8]:
import tensorflow as tf
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler


## Loading and preprocessing the data

In [None]:
penguin_data = pd.read_csv('./penguins-numeric-all.csv', index_col=0)
penguin_data = penguin_data.sample(frac=1, random_state=42).reset_index(drop=True)

penguin_data.head(n=4)

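When using neural networks, it is usually a good idea to normalize the data so that all features have a similar scale. We will do this using the `StandardScaler` from `sklearn`. The scaler is fitted on the training set only, and the same transformation is then applied to the test set.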

In [10]:
y = penguin_data["species"]
x = penguin_data.drop(columns="species")

# Split the data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
scaler.fit(x_train)
x_train = scaler.transform(x_train)
x_test = scaler.transform(x_test)
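
To make the scaling step more concrete, here is a minimal sketch of what standardization does for a single feature, written in plain Python. The function name `standardize_column` and the flipper lengths are made up for illustration and are not taken from the dataset. The sketch assumes the population standard deviation (dividing by `n`), which is the default behaviour of `StandardScaler`.

```python
import math

def standardize_column(train_column, test_column):
    """Scale both columns with the mean and std of the training column only."""
    n = len(train_column)
    mean = sum(train_column) / n
    # Population standard deviation (divide by n, not n - 1)
    std = math.sqrt(sum((v - mean) ** 2 for v in train_column) / n)
    scaled_train = [(v - mean) / std for v in train_column]
    scaled_test = [(v - mean) / std for v in test_column]
    return scaled_train, scaled_test

# Hypothetical flipper lengths in mm (illustrative values only)
example_train_scaled, example_test_scaled = standardize_column(
    [180.0, 190.0, 200.0, 210.0], [195.0]
)
print(example_train_scaled)  # values centred around 0
print(example_test_scaled)   # 195 is the training mean, so this is [0.0]
```

Reusing the training statistics for the test set keeps information about the test data out of the preprocessing step.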


When setting up a neural network, we need to know the number of input features. We can get this from the shape of the training data using `x_train.shape`. In this case, we have 4 features and 266 samples.

In [None]:
print(x_train.shape)

We create a model with one hidden layer with 10 neurons. We use the `relu` activation function for the hidden layer and the `softmax` activation function for the output layer. The output layer has 3 neurons, one for each class.

In [15]:
# create a neural network in tensorflow with one hidden layer

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(x_train.shape[1],)),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
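
As a rough illustration of what such a network computes, the following sketch runs a forward pass through a much smaller network in plain Python. The helper names (`dense`, `relu`, `softmax`) and all weights are hypothetical and chosen by hand. Keras initializes its weights randomly and uses optimized tensor operations instead.

```python
import math

def dense(inputs, weights, biases):
    """One dense layer: output[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]

def relu(values):
    """Replace negative values by zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn arbitrary scores into probabilities that sum to 1."""
    largest = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical tiny network: 2 inputs -> 2 hidden neurons -> 3 outputs
example_w_hidden = [[1.0, -1.0], [0.5, 0.5]]
example_b_hidden = [0.0, 0.0]
example_w_out = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
example_b_out = [0.0, 0.0, 0.0]

example_hidden = relu(dense([1.0, 2.0], example_w_hidden, example_b_hidden))
example_probs = softmax(dense(example_hidden, example_w_out, example_b_out))
print(example_hidden)  # [2.0, 0.0]
print(example_probs)   # three probabilities, largest for the first class
```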

The function `model.summary()` gives us an overview of the model.


In [None]:
model.summary()
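
The summary also lists the number of trainable parameters per layer. As a small cross-check, the sketch below computes these counts by hand, assuming the 4 input features mentioned above. The helper `dense_param_count` is a hypothetical name introduced for this example.

```python
def dense_param_count(n_inputs, n_units):
    """A dense layer has one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 4  # assumption: the number of features stated in the text
hidden_params = dense_param_count(n_features, 10)
output_params = dense_param_count(10, 3)
total_params = hidden_params + output_params

print(hidden_params)  # 50
print(output_params)  # 33
print(total_params)   # 83
```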

We can already use the model to make predictions. In the following example, we extract the first example from the training set and pass it through the model. The output is a vector of length 3, where each entry corresponds to the probability of the example belonging to one of the three classes.

In [None]:
first_example = x_train[0:1, :]
model.predict(first_example)

As we haven't trained the model yet, the parameters are random and the output is not meaningful. Before we can train the model, we need to compile it. Compiling means that we specify the loss function, the optimizer, and the metrics we want to use. In this case, we use the `sparse_categorical_crossentropy` loss function, the `adam` optimizer, and the `accuracy` metric. The `sparse_categorical_crossentropy` loss function is used when we have multiple classes and the labels are integer class indices rather than one-hot vectors.

In [17]:
model.compile(
  optimizer='adam', 
  loss='sparse_categorical_crossentropy', # sparse: labels are integer class indices (0, 1, 2 for three outputs), not one-hot vectors
  metrics=['accuracy'])
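
To illustrate the loss, here is a simplified plain-Python sketch of sparse categorical cross-entropy: for each example, it takes the negative logarithm of the probability that the model assigned to the true class, and then averages over the examples. The function name `sparse_cce` and the numbers are made up for illustration. The Keras implementation additionally clips the probabilities to avoid `log(0)`, which is omitted here.

```python
import math

def sparse_cce(labels, probabilities):
    """Mean of -log(p[true class]) over all examples; labels are integer indices."""
    losses = [-math.log(p[y]) for y, p in zip(labels, probabilities)]
    return sum(losses) / len(losses)

# Two hypothetical examples with three classes each
example_labels = [0, 2]
example_outputs = [
    [0.8, 0.1, 0.1],  # true class 0 gets probability 0.8 -> small loss
    [0.2, 0.3, 0.5],  # true class 2 gets probability 0.5 -> larger loss
]
example_loss = sparse_cce(example_labels, example_outputs)
print(example_loss)
```

The loss approaches 0 as the probability of the true class approaches 1.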

Now we are ready to train the model. We use the `fit` function and pass the training data and the labels. We also specify the number of epochs and the batch size.
We can also instruct the `fit` function to set aside part of the training data as a validation set. This is useful for monitoring the performance of the model on held-out data during training. Conveniently, the `fit` function returns a history object, which we can use to plot the training and validation loss and accuracy.

In [None]:
# train the model
history = model.fit(x_train, y_train, epochs=100, batch_size=16, validation_split=0.2)
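
The following sketch estimates how the training data is divided with these settings. It assumes the 266 training samples mentioned above and assumes that Keras rounds the training part down, so the exact numbers may differ slightly between versions. According to the Keras documentation, the validation samples are taken from the end of the data before shuffling. The helper `split_sizes` is a hypothetical name.

```python
import math

def split_sizes(n_samples, validation_split):
    """Return (training size, validation size), rounding the training part down."""
    n_train = math.floor(n_samples * (1.0 - validation_split))
    return n_train, n_samples - n_train

# Assumption: 266 samples are passed to fit, as stated earlier in the text
example_n_train, example_n_val = split_sizes(266, 0.2)
example_steps = math.ceil(example_n_train / 16)  # weight updates per epoch

print(example_n_train)  # 212
print(example_n_val)    # 54
print(example_steps)    # 14
```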

In [None]:
# plot the training history
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

Next we test the model on the test data. We have two options to evaluate the model. We can use the `evaluate` function, which returns the loss and the metrics we specified when compiling the model. Alternatively, we can use the `predict` function to get the probabilities for each class. We can then use `argmax` to get the predicted class.

In [None]:
# evaluate the model
loss, accuracy = model.evaluate(x_test, y_test)
print(f'Accuracy: {accuracy}')


The following code shows the same evaluation but using the `predict` method:

In [None]:
y_probs = model.predict(x_test)
y_pred = y_probs.argmax(axis=1)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
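
The step from probabilities to predicted classes can be sketched in plain Python as follows. The probabilities and labels are made up for illustration, and the helper names `argmax_index` and `fraction_correct` are hypothetical.

```python
def argmax_index(values):
    """Index of the largest entry, i.e. the most probable class."""
    return max(range(len(values)), key=lambda i: values[i])

def fraction_correct(true_labels, predicted_labels):
    """Share of examples where the prediction matches the true label."""
    hits = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return hits / len(true_labels)

# Hypothetical model outputs for four examples and three classes
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.2, 0.6],
]
toy_true = [0, 2, 0, 2]
toy_pred = [argmax_index(p) for p in toy_probs]

print(toy_pred)                              # [0, 2, 1, 2]
print(fraction_correct(toy_true, toy_pred))  # 0.75
```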

To see how well the model performs on each class, we can plot the confusion matrix. We use the `confusion_matrix` function from `sklearn` to compute it and the `heatmap` function from `seaborn` to plot it.

In [None]:
# plot the confusion matrix
from sklearn.metrics import confusion_matrix
import seaborn as sns

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
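
To read the plot, it helps to know how the matrix is built: entry `[i][j]` counts the examples whose true class is `i` and whose predicted class is `j`. The following plain-Python sketch reproduces this counting for a few made-up labels. The function name `count_confusions` is hypothetical.

```python
def count_confusions(true_labels, predicted_labels, n_classes):
    """Rows correspond to true classes, columns to predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

# Hypothetical labels for six examples and three classes
toy_cm = count_confusions([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
for row in toy_cm:
    print(row)
# [1, 1, 0]
# [0, 2, 0]
# [1, 0, 1]
```

Correct predictions end up on the diagonal, so a good classifier has small off-diagonal entries.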

## Saving and loading a model

Once we are happy with our model, we might want to save it to a file such that we can use it later. We can do this using the `save` function. 

In [23]:
model.save('my_model.keras')  

We can then load the model using the `load_model` function from `tensorflow.keras.models`. We specify `compile=False` because we only want to make predictions and no longer need the optimizer, loss, and metrics that are used for training.

In [None]:
# loading the model
loaded_model = tf.keras.models.load_model('my_model.keras', compile=False)
loaded_model.summary()

We can check that the loaded model gives the same results as the original model by computing its accuracy on the test data with `predict`.

In [None]:
# evaluate the model
y_probs = loaded_model.predict(x_test)
y_pred = y_probs.argmax(axis=1)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
