In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

Data preparation and processing

In [2]:
from random import randint
from sklearn.utils import shuffle
from sklearn.preprocessing import MinMaxScaler

In [12]:
train_labels = []
train_samples = []

As motivation for this data, let’s suppose that an experimental drug was tested in a clinical trial on individuals aged 13 to 100. The trial had 2100 participants. Half of the participants were under 65 years old, and the other half were 65 or older.

The trial showed that around 95% of patients 65 or older experienced side effects from the drug, and around 95% of patients under 65 experienced no side effects, generally showing that elderly individuals were more likely to experience side effects.

Ultimately, we want to build a model to tell us whether or not a patient will experience side effects solely based on the patient's age. The judgement of the model will be based on the training data.

Note that given how simple this data is, and how simple the conclusions drawn from it are, a neural network may be overkill. The goal here is only to introduce working with data for deep learning; later, we'll make use of more advanced data sets.

The block of code below shows how to generate this dummy data.

In [13]:
for i in range(50):
    # The ~5% of younger individuals who did experience side effects
    random_younger = randint(13,64)
    train_samples.append(random_younger)
    train_labels.append(1)

    # The ~5% of older individuals who did not experience side effects
    random_older = randint(65,100)
    train_samples.append(random_older)
    train_labels.append(0)

for i in range(1000):
    # The ~95% of younger individuals who did not experience side effects
    random_younger = randint(13,64)
    train_samples.append(random_younger)
    train_labels.append(0)

    # The ~95% of older individuals who did experience side effects
    random_older = randint(65,100)
    train_samples.append(random_older)
    train_labels.append(1)
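
As a quick sanity check on the loops above, the sketch below regenerates the data inside a helper and measures the side-effect rate in each age group. The function name `generate_trial_data` and the fixed seed are assumptions added for illustration; they are not part of the original notebook.

```python
import random

def generate_trial_data(seed=0):
    """Recreate the dummy trial data; the seed is an assumption for repeatability."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(50):
        # ~5% outliers: younger with side effects, older without
        samples.append(rng.randint(13, 64))
        labels.append(1)
        samples.append(rng.randint(65, 100))
        labels.append(0)
    for _ in range(1000):
        # ~95% majority: younger without side effects, older with
        samples.append(rng.randint(13, 64))
        labels.append(0)
        samples.append(rng.randint(65, 100))
        labels.append(1)
    return samples, labels

samples, labels = generate_trial_data()

younger = [label for age, label in zip(samples, labels) if age < 65]
older = [label for age, label in zip(samples, labels) if age >= 65]

print(len(samples))                 # 2100 participants in total
print(sum(younger) / len(younger))  # about 0.048: ~5% of younger patients had side effects
print(sum(older) / len(older))      # about 0.952: ~95% of older patients had side effects
```

The rates do not depend on the seed, because the number of samples per group and label is fixed by the loop counts; only the individual ages vary.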

Data Processing

We now convert both lists into NumPy arrays, since that is the format the fit() function expects, and then shuffle the arrays to remove any order that was imposed on the data during the creation process.

In [15]:
train_labels = np.array(train_labels)
train_samples = np.array(train_samples)
train_labels, train_samples = shuffle(train_labels, train_samples)
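
The important property of shuffling here is that both arrays are reordered together, so each label stays paired with its sample. The sketch below shows the same idea with a shared NumPy permutation instead of scikit-learn's shuffle; the small arrays are made up for illustration.

```python
import numpy as np

ages = np.array([15, 70, 30, 90, 50])
labels = np.array([0, 1, 0, 1, 0])   # in this toy example, label 1 simply marks age >= 65

rng = np.random.default_rng(seed=0)  # seed chosen only for repeatability
order = rng.permutation(len(ages))   # one random ordering of the indices

shuffled_ages = ages[order]
shuffled_labels = labels[order]      # the same ordering keeps each pair together

# Every label still matches its age after shuffling.
print(np.array_equal(shuffled_labels, (shuffled_ages >= 65).astype(int)))  # True
```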

In this form, the data is in a format that we could pass to the model. Before doing that, however, we'll first scale it down to a range from 0 to 1.

We'll use scikit-learn’s MinMaxScaler class to rescale the ages from their original range of 13 to 100 down to a range of 0 to 1.

In [18]:
scaler = MinMaxScaler(feature_range=(0,1))
scaled_train_samples = scaler.fit_transform(train_samples.reshape(-1,1))

> We reshape the data because fit_transform() expects 2D input of shape (n_samples, n_features) and doesn’t accept 1D data.
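
To make the scaling concrete, here is a minimal NumPy sketch of the min-max formula, (x - min) / (max - min), applied to a few hand-picked ages. The sample values are made up for illustration, and the code mirrors what MinMaxScaler computes for feature_range=(0,1) rather than calling it.

```python
import numpy as np

# Hypothetical ages covering the trial's range of 13 to 100
ages = np.array([13, 40, 65, 100], dtype=float)

# Reshape to a single column: shape (n_samples, n_features)
column = ages.reshape(-1, 1)

# Min-max scaling: shift by the minimum, then divide by the range
col_min = column.min(axis=0)
col_max = column.max(axis=0)
scaled = (column - col_min) / (col_max - col_min)

print(column.shape)    # (4, 1)
print(scaled.ravel())  # the smallest age maps to 0.0 and the largest to 1.0
```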

In [25]:
type(scaled_train_samples)

numpy.ndarray

In [26]:
scaled_train_samples.shape

(2100, 1)

In [30]:
train_labels.shape

(2100,)

##### Create An Artificial Neural Network With TensorFlow's Keras API

In [21]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import categorical_crossentropy

Build a Sequential Model

In [None]:
# Layers can also be added one at a time --> model.add(l4)
# I prefer to pass them all at once --> model = Sequential([l1, l2, l3])

In [27]:
model = Sequential([
    Dense(units=16, input_shape=(1,), activation='relu'),
    Dense(units=32, activation='relu'),
    Dense(units=2, activation='softmax')
])

model is an instance of a Sequential object. A tf.keras.Sequential model is a linear stack of layers. It accepts a list, and each element in the list should be a layer.

As you can see, we have passed a list of layers to the Sequential constructor. Let's go through each of the layers in this list now.


>  Note, if you don’t explicitly set an activation function, then Keras will use the linear activation function.

*First Hidden Layer*
Our first layer is a Dense layer. This type of layer is our standard fully-connected or densely-connected neural network layer. The first required parameter that the Dense layer expects is the number of neurons or units the layer has, and we’re arbitrarily setting this to 16.

Additionally, the model needs to know the shape of the input data. For this reason, we specify the shape of the input data in the first hidden layer in the model (and only this layer). The parameter called input_shape is how we specify this.

As discussed, we’ll be training our network on the data that we generated and processed above, and recall that this data is one-dimensional. The input_shape parameter expects a tuple of integers that matches the shape of the input data, so we correspondingly specify (1,) as the input_shape of our one-dimensional data.

You can think of the way we specify the input_shape here as acting as an implicit input layer. The input layer of a neural network is the underlying raw data itself, so we don't create an explicit input layer. This first Dense layer that we're working with now is actually the first hidden layer.

Lastly, an optional parameter that we’ll set for the Dense layer is the activation function to use after this layer. We’ll use the popular choice of relu.
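
As a rough picture of what this layer computes, the sketch below applies relu(x @ W + b) in NumPy to a small batch of scaled ages. The weights are random placeholders, not the values Keras would initialize or learn, and the names W and b are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed only for repeatability

# A batch of 3 scaled ages, shape (batch_size, 1) to match input_shape=(1,)
x = np.array([[0.0], [0.5], [1.0]])

# Placeholder parameters for a Dense layer with 16 units
W = rng.normal(size=(1, 16))  # one weight per (input feature, unit) pair
b = np.zeros(16)              # one bias per unit

def relu(z):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(z, 0.0)

hidden = relu(x @ W + b)

print(hidden.shape)     # (3, 16): one 16-value activation vector per sample
print(W.size + b.size)  # 32 parameters in this layer
```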

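*Second Hidden Layer*
The second hidden layer is also a Dense layer with relu activation, this time with 32 units. It doesn't need input_shape, because Keras infers its input shape from the output of the previous layer.

*Output Layer*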
Lastly, we specify the output layer. This layer is also a Dense layer, and it will have 2 neurons. This is because we have two possible outputs: either a patient experienced side effects, or the patient did not experience side effects.

This time, the activation function we’ll use is softmax, which will give us a probability distribution among the possible outputs.
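
A minimal NumPy sketch of softmax follows, showing that it turns two arbitrary output scores into probabilities that sum to 1. The input scores are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into a probability distribution along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for [no side effects, side effects]
scores = np.array([[2.0, 0.5],
                   [-1.0, 3.0]])

probs = softmax(scores)
print(probs.sum(axis=1))     # each row sums to 1
print(probs.argmax(axis=1))  # [0 1]: index of the more likely class per row
```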

*Note that we can call summary() on our model to get a quick visualization of it.*

In [28]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 16)                32        
_________________________________________________________________
dense_1 (Dense)              (None, 32)                544       
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 66        
Total params: 642
Trainable params: 642
Non-trainable params: 0
_________________________________________________________________
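
The Param # column can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A short sketch of that arithmetic (the helper name `dense_params` is hypothetical):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# (inputs, units) for each of the three Dense layers in the model
layer_shapes = [(1, 16), (16, 32), (32, 2)]
counts = [dense_params(n_in, n_out) for n_in, n_out in layer_shapes]

print(counts)       # [32, 544, 66]
print(sum(counts))  # 642
```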


##### Train An Artificial Neural Network With TensorFlow's Keras API

Compiling The Model

In [29]:
model.compile(optimizer=Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

This function configures the model for training and expects a number of parameters. First, we specify the optimizer Adam. Adam accepts an optional parameter learning_rate, which we’ll set to 0.0001.

The next parameter we specify is loss. We’ll be using sparse_categorical_crossentropy, given that our labels are in integer format.
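
To illustrate what this loss measures, here is a small NumPy sketch: for each sample it takes the negative log of the probability the model assigned to the true class, then averages over the samples. The probabilities and labels are invented for the example, and the function is a simplified stand-in for the Keras loss, not its implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Mean negative log-probability of the true class, given integer labels."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return -np.log(true_class_probs).mean()

# Hypothetical softmax outputs for 3 patients: [P(no side effects), P(side effects)]
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.5, 0.5]])
labels = np.array([0, 1, 1])  # integer class labels, not one-hot vectors

loss = sparse_categorical_crossentropy(labels, probs)
print(loss)  # about 0.34
```

Confident, correct predictions (0.9 and 0.8 on the true class) contribute little to the loss, while the uncertain 0.5 prediction contributes the most.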

Note that when we have only two classes, we could instead configure our output layer to have only one output, rather than two, and use binary_crossentropy as our loss, rather than sparse_categorical_crossentropy. For two classes, the two formulations are mathematically equivalent and should perform comparably.

With binary_crossentropy, however, the last layer would need to use sigmoid, rather than softmax, as its activation function.
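
The link between the two setups can be seen numerically: with two classes, the softmax probability of class 1 equals the sigmoid of the difference between the two output scores. A minimal NumPy sketch with made-up scores:

```python
import numpy as np

def softmax(logits):
    """Softmax along the last axis, shifted for numerical stability."""
    exps = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exps / exps.sum(axis=-1, keepdims=True)

def sigmoid(z):
    """Logistic function, mapping any real score to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical two-unit output scores for 3 samples
logits = np.array([[2.0, 0.5],
                   [-1.0, 3.0],
                   [0.0, 0.0]])

p_softmax = softmax(logits)[:, 1]                 # P(class 1) from a 2-unit softmax layer
p_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])  # P(class 1) from a single sigmoid unit

print(np.allclose(p_softmax, p_sigmoid))  # True
```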

Moving on, the last parameter we specify in compile() is metrics. This parameter expects a list of metrics that we’d like to be evaluated by the model during training and testing. We’ll set this to a list that contains the string ‘accuracy’.

In [34]:
model.fit(x=scaled_train_samples, y=train_labels, batch_size=10, epochs=30, verbose=2)

Epoch 1/30
210/210 - 0s - loss: 0.2381 - accuracy: 0.9500
Epoch 2/30
210/210 - 0s - loss: 0.2377 - accuracy: 0.9471
Epoch 3/30
210/210 - 0s - loss: 0.2377 - accuracy: 0.9519
Epoch 4/30
210/210 - 0s - loss: 0.2377 - accuracy: 0.9481
Epoch 5/30
210/210 - 0s - loss: 0.2375 - accuracy: 0.9481
Epoch 6/30
210/210 - 0s - loss: 0.2374 - accuracy: 0.9524
Epoch 7/30
210/210 - 0s - loss: 0.2372 - accuracy: 0.9490
Epoch 8/30
210/210 - 0s - loss: 0.2370 - accuracy: 0.9467
Epoch 9/30
210/210 - 0s - loss: 0.2370 - accuracy: 0.9510
Epoch 10/30
210/210 - 0s - loss: 0.2368 - accuracy: 0.9514
Epoch 11/30
210/210 - 0s - loss: 0.2368 - accuracy: 0.9505
Epoch 12/30
210/210 - 0s - loss: 0.2365 - accuracy: 0.9514
Epoch 13/30
210/210 - 0s - loss: 0.2367 - accuracy: 0.9462
Epoch 14/30
210/210 - 0s - loss: 0.2362 - accuracy: 0.9514
Epoch 15/30
210/210 - 0s - loss: 0.2364 - accuracy: 0.9500
Epoch 16/30
210/210 - 0s - loss: 0.2360 - accuracy: 0.9524
Epoch 17/30
210/210 - 0s - loss: 0.2360 - accuracy: 0.9524
Epoch 

<tensorflow.python.keras.callbacks.History at 0x7fdc3c463b90>

We specify verbose=2, which controls how much output is printed to the console during training. The verbosity levels range from 0 to 2: 0 is silent, 1 shows a progress bar, and 2 prints one line per epoch.

When we call fit() on the model, the model trains, and we get this output.
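
The 210/210 shown on each line of the output is the number of batches per epoch, which follows from the data size and batch_size. A quick check of that arithmetic:

```python
import math

n_samples = 2100  # rows in scaled_train_samples
batch_size = 10   # value passed to fit()

steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)  # 210
```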