# **TensorFlow/Keras model development**

Here we process the data and develop a model that predicts peanut allergy incidence from the categorical features provided.

The model is built with TensorFlow and Keras.


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation

In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

**Import data**

In [8]:
X_train = pd.read_csv('/content/drive/MyDrive/Projects/allergy_prediction/data/X_train_resampled.csv')
y_train = pd.read_csv('/content/drive/MyDrive/Projects/allergy_prediction/data/y_train_resampled.csv')
X_test = pd.read_csv('/content/drive/MyDrive/Projects/allergy_prediction/data/X_test_encoded.csv')
y_test = pd.read_csv('/content/drive/MyDrive/Projects/allergy_prediction/data/y_test.csv')

**Design model**

In [4]:
model = Sequential()

In [9]:
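# Input: one value per encoded feature column, so the model is built before summary()
model.add(keras.Input(shape=(X_train.shape[1],)))

# First dense layer
model.add(Dense(64, activation='relu'))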

# Hidden layers
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(16, activation='relu'))

# Output layer (binary classification)
model.add(Dense(1, activation='sigmoid'))

In [10]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In [11]:
model.summary()
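
As a rough cross-check on what `model.summary()` reports, the trainable parameter count of each `Dense` layer can be worked out by hand as (inputs + 1) × units, where the extra 1 is the bias term; `Dropout` adds no parameters. The sketch below assumes a hypothetical `n_features = 20`, since the notebook does not state how many encoded feature columns there are, and `dense_param_count` is an illustrative helper, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return (n_inputs + 1) * n_units

# Hypothetical feature count: replace with X_train.shape[1] in the notebook.
n_features = 20

# Units of the Dense layers in the order they are added to the model.
layer_units = [64, 32, 16, 1]

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_param_count(n_in, units))
    n_in = units  # the next layer consumes this layer's outputs

total = sum(counts)
print(counts, total)
```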

**Train model**

In [15]:
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

Epoch 1/10
12982/12982 ━━━━━━━━━━━━━━━━━━━━ 31s 2ms/step - accuracy: 0.6334 - loss: 0.6336 - val_accuracy: 0.0477 - val_loss: 0.9059
Epoch 2/10
12982/12982 ━━━━━━━━━━━━━━━━━━━━ 32s 2ms/step - accuracy: 0.6329 - loss: 0.6348 - val_accuracy: 0.0478 - val_loss: 0.9138
Epoch 3/10
12982/12982 ━━━━━━━━━━━━━━━━━━━━ 30s 2ms/step - accuracy: 0.6312 - loss: 0.6345 - val_accuracy: 0.0477 - val_loss: 0.9161
Epoch 4/10
12982/12982 ━━━━━━━━━━━━━━━━━━━━ 41s 2ms/step - accuracy: 0.6322 - loss: 0.6339 - val_accuracy: 0.0476 - val_loss: 0.9010
Epoch 5/10
12982/12982 ━━━━━━━━━━━━━━━━━━━━ 42s 2ms/step - accuracy: 0.6327 - loss: 0.6341 - val_accuracy: 0.0477 - val_loss: 0.8897
Epoch 6/10
12982/12982 ━━━━━━━━━━━━━━━━━━━━ 41s 2ms/step - accuracy: 0.6324 - loss: 0.6338 - val_accuracy: 0.0478 - val_loss: 0.896

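The validation accuracy of about 0.05 sits far below the training accuracy of about 0.63. One possible explanation, not confirmed by the notebook, is how `validation_split` chooses its rows: Keras takes the last fraction of the arrays passed to `fit`, before any shuffling. If the resampled training file stores synthetic minority rows at the end, the validation slice could consist almost entirely of one class. The sketch below illustrates this with made-up labels; the layout of `y`, the class rates, and the helpers `tail_split` and `stratified_split` are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labels laid out like an oversampled training set:
# original rows first, synthetic positive rows appended at the end.
y_original = (rng.random(8000) < 0.05).astype(int)  # mostly negatives
y_synthetic = np.ones(6000, dtype=int)              # appended positives
y = np.concatenate([y_original, y_synthetic])

def tail_split(n_samples, validation_split):
    """Mimic validation_split: the last fraction of rows becomes validation."""
    split_at = int(n_samples * (1.0 - validation_split))
    idx = np.arange(n_samples)
    return idx[:split_at], idx[split_at:]

def stratified_split(labels, validation_split, rng):
    """Shuffle within each class, then hold out the same fraction of each."""
    train_parts, val_parts = [], []
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        n_val = int(round(len(cls_idx) * validation_split))
        val_parts.append(cls_idx[:n_val])
        train_parts.append(cls_idx[n_val:])
    return np.concatenate(train_parts), np.concatenate(val_parts)

tail_train, tail_val = tail_split(len(y), 0.2)
strat_train, strat_val = stratified_split(y, 0.2, rng)

overall_pos_rate = y.mean()
tail_val_pos_rate = y[tail_val].mean()
strat_val_pos_rate = y[strat_val].mean()

print(f"positive rate overall:     {overall_pos_rate:.3f}")
print(f"positive rate, tail split: {tail_val_pos_rate:.3f}")
print(f"positive rate, stratified: {strat_val_pos_rate:.3f}")
```

If this turns out to apply to the real data, the stratified indices could be used to build explicit validation arrays and passed to `model.fit` through its `validation_data` argument in place of `validation_split`.
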
**Evaluate on the held-out test set**

In [20]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)

print(f'Test Loss: {test_loss:.4f}')
print(f'Test Accuracy: {test_accuracy:.4f}')

2083/2083 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9595 - loss: 0.4975
Test Loss: 0.4989
Test Accuracy: 0.9590


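The model reaches c. 96% accuracy on the held-out test set. Treat this figure with caution: validation accuracy stayed at c. 5% throughout training and the training loss did not improve across epochs, so the test accuracy may largely reflect the class balance of the test set rather than what the model has learned.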
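
Accuracy alone can hide poor performance on a rare class. As a minimal sketch, the helper below (a hypothetical `binary_report`, not part of Keras) derives confusion-matrix counts, precision, and recall from predicted probabilities. The example data are invented: 1,000 people with an assumed 4% positive rate and a model that always outputs a low probability.

```python
import numpy as np

def binary_report(y_true, y_prob, threshold=0.5):
    """Confusion-matrix counts and basic metrics for a binary classifier."""
    y_true = np.asarray(y_true).ravel().astype(int)
    y_pred = (np.asarray(y_prob).ravel() >= threshold).astype(int)

    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))

    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(y_true),
        # Define precision/recall as 0.0 when their denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Invented test set: 40 positives and 960 negatives (assumed 4% prevalence).
y_true = np.array([1] * 40 + [0] * 960)

# A degenerate model that assigns every person a low allergy probability.
y_prob_all_negative = np.full(1000, 0.1)

report = binary_report(y_true, y_prob_all_negative)
print(report)
```

In this invented case accuracy is 0.96 while recall is 0, because no positive case is ever detected. In the notebook, the same helper could be called as `binary_report(y_test, model.predict(X_test))` to see whether the reported accuracy is accompanied by usable recall.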