## INTRODUCTION

Keras was developed with a focus on enabling fast experimentation. Because of this, it's very user friendly and allows us to go from idea to implementation with only a few steps.<br>

Keras was originally created by François Chollet. Historically, Keras was a high-level API that sat on top of one of three lower-level neural network APIs and acted as a wrapper around them. These libraries were referred to as Keras backend engines.<br>

Now, when you install TensorFlow, you also automatically get Keras, as it is now part of the TensorFlow library.


## Samples And Labels

To train any neural network in a supervised learning task, we first need a data set of samples and the corresponding labels for those samples.

When we refer to samples, we mean the underlying data set, where each individual item or data point within that set is called a sample. A label is the known target value for a sample, that is, the answer we want the model to learn to predict.

If we were to train a model to do sentiment analysis on headlines from a media source, for example, the corresponding label for each sample headline from the media source could be “positive” or “negative.”

If we were training a model on images of cats and dogs, then the label for each of the images would either be “cat” or “dog.”
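
As a minimal sketch of how samples and labels line up, the snippet below pairs a few made-up headlines with sentiment labels. The headlines and variable names are illustrative assumptions, not data used elsewhere in this notebook.

```python
# Hypothetical samples: each headline is one data point.
samples = [
    "Local team wins championship",
    "Storm causes widespread damage",
    "New park opens downtown",
]
# One label per sample, in the same order.
labels = ["positive", "negative", "positive"]

# Sample i and label i must always refer to the same data point.
for sample, label in zip(samples, labels):
    print(f"{label:>8} : {sample}")
```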




## SEQUENTIAL MODEL

## DATA FORMATION AND PREPROCESSING

For this model we will form our own dummy data

In [1]:
import numpy as np
from random import randint
from sklearn.utils import shuffle
from sklearn.preprocessing import MinMaxScaler

### DUMMY STORY BEHIND THE DATA

Suppose that an experimental drug was tested on individuals ranging from age 13 to 100 in a clinical trial. 

The trial had 2100 participants. 

Half of the participants were under 65 years old, and the other half was 65 years of age or older.
 
The trial showed that around 95% of patients 65 or older experienced side effects from the drug. 

Around 95% of patients under 65 experienced no side effects. 

This suggests that elderly individuals were generally more likely to experience side effects.

In [2]:
train_labels = []
train_samples = []

#### PERSON WITH SIDE EFFECT === 1 (FOR THE OUTPUT)

#### PERSON WITHOUT SIDE EFFECT === 0 (FOR THE OUTPUT)


TOTAL PEOPLE = 2100<br>
TOTAL YOUNG (>=13 and <65) => 2100/2 = 1050<br>
THEREFORE,<br>
  - YOUNG WHO GOT SIDE EFFECTS => 5% of 1050 = 50 (APPROX)<br>
  - YOUNG WHO ARE WITHOUT SIDE EFFECTS => 95% of 1050 = 1000 (APPROX)
        
TOTAL OLD (>=65) => 2100/2 = 1050<br>
THEREFORE,<br>
   - OLD WHO GOT SIDE EFFECTS => 95% of 1050 = 1000 (APPROX)<br>
   - OLD WHO ARE WITHOUT SIDE EFFECTS => 5% of 1050 = 50 (APPROX)       
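
The rounding in the breakdown above can be checked with a few lines of arithmetic. This is only a sketch; the variable names are made up for illustration.

```python
total_people = 2100
group_size = total_people // 2           # 1050 young, 1050 old

# Exact 5% / 95% shares of one age group.
exact_minority = group_size * 5 / 100    # 52.5
exact_majority = group_size * 95 / 100   # 997.5

# The notebook rounds these to convenient values.
minority, majority = 50, 1000

print(group_size, exact_minority, exact_majority)
# Two age groups, each with a minority and a majority outcome.
print("samples generated:", 2 * (minority + majority))  # 2100
```

The rounded counts still add up to the 2100 participants in the story.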

### DATASET GENERATION CODE

In [3]:
for i in range(50):
    # GENERATE A RANDOM AGE FOR A YOUNG
    random_young_age = randint(13,64)
    train_samples.append(random_young_age)
    # THESE 50 ARE THOSE YOUNG ONES WHO HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 1
    train_labels.append(1)
    
    # GENERATE A RANDOM AGE FOR A OLD
    random_old_age = randint(65,100)
    train_samples.append(random_old_age)
    # THESE 50 ARE THOSE OLD ONES WHO DON'T HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 0
    train_labels.append(0)
    
    
for i in range(1000):
    # GENERATE A RANDOM AGE FOR A YOUNG
    random_young_age = randint(13,64)
    train_samples.append(random_young_age)
    # THESE 1000 ARE THOSE YOUNG ONES WHO DON'T HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 0
    train_labels.append(0)
    
    # GENERATE A RANDOM AGE FOR AN OLD
    random_old_age = randint(65,100)
    train_samples.append(random_old_age)
    # THESE 1000 ARE THOSE OLD ONES WHO HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 1
    train_labels.append(1)
    

In [None]:
# VIEW THE AGES WE GENERATED
for age in train_samples:
    print(age)

In [None]:
# VIEW THE LABELS
for lab in train_labels:
    print(lab)

WE NEED TO PASS THE DATA TO OUR MODEL IN THE FORM OF A NUMPY ARRAY

THUS, WE WILL NOW CONVERT THE DATASET INTO NUMPY ARRAY FORM

In [4]:
train_labels = np.array(train_labels)
train_samples = np.array(train_samples)

NOW WE WILL SHUFFLE THE DATASET TO GET RID OF ANY ORDER WHICH MAY CAUSE OUR MODEL TO RECOGNIZE ANY IMPROPER PATTERN

In [5]:
train_labels, train_samples = shuffle(train_labels, train_samples)
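
To see why both arrays are passed to `shuffle` together, here is a NumPy-only sketch of the same idea: one random permutation is applied to both arrays, so each sample keeps its label. The toy arrays and the fixed seed are assumptions for illustration; the notebook itself uses `sklearn.utils.shuffle`.

```python
import numpy as np

# Toy data: the label is 1 when age >= 65 (illustrative only).
ages = np.array([20, 70, 35, 90, 50, 66])
labels = np.array([0, 1, 0, 1, 0, 1])

rng = np.random.default_rng(seed=0)   # fixed seed for reproducibility
order = rng.permutation(len(ages))    # one permutation shared by both arrays

shuffled_ages = ages[order]
shuffled_labels = labels[order]

# Every sample still carries its original label after shuffling.
assert np.array_equal(shuffled_labels, (shuffled_ages >= 65).astype(int))
print(shuffled_ages, shuffled_labels)
```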

WE STILL NEED TO DO MORE PRE-PROCESSING ON THE DATA TO DO THINGS LIKE:
- NORMALIZATION
- STANDARDIZATION

THESE THINGS HELP TO TRAIN BETTER AND FASTER

In [6]:
# NOW WE USE THE MINMAXSCALER TO SCALE OUR DATA FROM 0 TO 1 WHICH IS CURRENTLY FROM 13 TO 100

# SCALER OBJECT WILL BE CREATED
scaler = MinMaxScaler(feature_range=(0,1))

# TRANSFORM OUR DATASET TO SCALED DATASET
# THIS FIT_TRANSFORM DOESN'T ACCEPT ANY 1-D DATA
# THUS WE RESHAPE OUR DATA TO BECOME 2-D AND THEN PASS IT TO THE FUNCTION
scale_train_samples = scaler.fit_transform(train_samples.reshape(-1,1))

In [None]:
for scale_age in scale_train_samples:
    print(scale_age)

In [8]:
scale_train_samples.shape

(2100, 1)
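
With `feature_range=(0,1)`, min-max scaling maps each value x to (x - min) / (max - min). A small NumPy sketch of that formula follows; the sample ages are made up.

```python
import numpy as np

# A 2-D column of ages, shaped like the reshaped training samples.
ages = np.array([13, 40, 65, 100], dtype=float).reshape(-1, 1)

# Min-max scaling to [0, 1]: subtract the minimum, divide by the range.
scaled = (ages - ages.min()) / (ages.max() - ages.min())

print(scaled.ravel())
```

The smallest age maps to 0, the largest to 1, and everything else falls in between in proportion.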

## THE SEQUENTIAL MODEL CONSTRUCTION

SEQUENTIAL MODEL CAN BE DESCRIBED AS A LINEAR STACK OF LAYERS

In [9]:
# IMPORT NECESSARY MODULES
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import categorical_crossentropy

In [10]:
# TO CHECK FOR AVAILABILITY OF A GPU
physical_devices = tf.config.experimental.list_physical_devices('GPU')
print("NUMBER OF GPU's : {}".format(len(physical_devices)))

NUMBER OF GPU's : 0


In [11]:
# MODEL CONSTRUCTION

model = Sequential([
    # DENSE LAYER 1
    Dense(units=16, input_shape=(1,), activation='relu'),
    # THE INPUT SHAPE ONLY NEEDS TO BE SPECIFIED FOR THE FIRST LAYER
    
    # DENSE LAYER 2
    Dense(units=32, activation='relu'),
    
    # DENSE LAYER 3
    # SOFTMAX GIVES A PROBABILITY FOR EACH CATEGORY
    # HERE, THE TWO OUTPUTS ARE THE PROBABILITIES THAT
    # THE PERSON HAS NO SIDE EFFECT (0) OR HAS A SIDE EFFECT (1)
    Dense(units=2, activation='softmax')
])
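
To illustrate what the softmax activation in the output layer does, the sketch below applies it to a made-up pair of raw scores using NumPy. The scores are assumptions, not values taken from this model.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = z - np.max(z)        # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

raw_scores = np.array([0.5, 2.0])  # hypothetical scores for classes 0 and 1
probs = softmax(raw_scores)

print(probs)        # two probabilities, one per class
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```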

In [12]:
# TO VIEW THE SUMMARY OF MODEL
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 16)                32        
_________________________________________________________________
dense_1 (Dense)              (None, 32)                544       
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 66        
Total params: 642
Trainable params: 642
Non-trainable params: 0
_________________________________________________________________
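
The parameter counts in the summary can be reproduced by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper name below is made up for this sketch.

```python
def dense_params(n_inputs, n_units):
    # One weight per input-unit pair, plus one bias per unit.
    return n_inputs * n_units + n_units

# (inputs, units) for the three Dense layers defined above.
layer_sizes = [(1, 16), (16, 32), (32, 2)]
counts = [dense_params(i, u) for i, u in layer_sizes]

print(counts)       # [32, 544, 66]
print(sum(counts))  # 642
```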


## MODEL TRAINING

In [13]:
# COMPILE OUR MODEL FOR TRAINING
model.compile(
    optimizer = Adam(learning_rate=0.0001),
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)
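
`sparse_categorical_crossentropy` takes integer class labels (0 or 1 here) rather than one-hot vectors. The loss for a sample is the negative log of the probability predicted for its true class. A NumPy sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax outputs for three samples: [P(class 0), P(class 1)].
probs = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
])
labels = np.array([0, 1, 1])   # integer labels, as in train_labels

# Pick the predicted probability of each sample's true class.
true_class_probs = probs[np.arange(len(labels)), labels]

# Per-sample loss is -log(p_true); the reported loss is the mean.
loss = -np.log(true_class_probs).mean()
print(true_class_probs, loss)
```

The third sample contributes the most to the loss because the model gave its true class only 0.4.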

In [14]:
# START TRAINING OUR MODEL
model.fit(
    x = scale_train_samples,
    y = train_labels,
    batch_size = 10,
    epochs = 30,
    # SHUFFLE THE TRAINING DATA BEFORE EACH EPOCH
    # THIS PARAMETER IS ALREADY TRUE BY DEFAULT
    shuffle = True,
    # CONTROLS HOW MUCH PROGRESS OUTPUT IS SHOWN
    # IT CAN BE SET TO 0, 1 OR 2
    # 0 IS SILENT, 1 SHOWS A PROGRESS BAR
    # 2 PRINTS ONE LINE PER EPOCH
    verbose = 2
)

Epoch 1/30
210/210 - 0s - loss: 0.6510 - accuracy: 0.5538
Epoch 2/30
210/210 - 0s - loss: 0.6175 - accuracy: 0.6390
Epoch 3/30
210/210 - 0s - loss: 0.5862 - accuracy: 0.7048
Epoch 4/30
210/210 - 0s - loss: 0.5549 - accuracy: 0.7495
Epoch 5/30
210/210 - 0s - loss: 0.5232 - accuracy: 0.7914
Epoch 6/30
210/210 - 0s - loss: 0.4914 - accuracy: 0.8195
Epoch 7/30
210/210 - 0s - loss: 0.4600 - accuracy: 0.8448
Epoch 8/30
210/210 - 0s - loss: 0.4301 - accuracy: 0.8624
Epoch 9/30
210/210 - 0s - loss: 0.4023 - accuracy: 0.8762
Epoch 10/30
210/210 - 0s - loss: 0.3774 - accuracy: 0.8876
Epoch 11/30
210/210 - 0s - loss: 0.3560 - accuracy: 0.8957
Epoch 12/30
210/210 - 0s - loss: 0.3379 - accuracy: 0.9019
Epoch 13/30
210/210 - 0s - loss: 0.3229 - accuracy: 0.9076
Epoch 14/30
210/210 - 0s - loss: 0.3110 - accuracy: 0.9224
Epoch 15/30
210/210 - 0s - loss: 0.3012 - accuracy: 0.9176
Epoch 16/30
210/210 - 0s - loss: 0.2933 - accuracy: 0.9167
Epoch 17/30
210/210 - 0s - loss: 0.2872 - accuracy: 0.9229
Epoch 

<tensorflow.python.keras.callbacks.History at 0x12491f96d48>

## USE OF VALIDATION SET

VALIDATION SET CAN BE FORMED USING TWO WAYS :
 - BUILD IT FROM SCRATCH, JUST LIKE THE TRAINING SET, AND PASS IT TO THE VALIDATION_DATA PARAMETER OF THE FIT FUNCTION
 - USE THE VALIDATION_SPLIT PARAMETER OF THE FIT FUNCTION
 
IN THIS MODEL WE WILL USE THE SECOND METHOD

In [15]:
model.fit(
    x = scale_train_samples,
    y = train_labels,
    
    # WITH THIS VALIDATION SPLIT
    # 10% OF THE TRAINING SET IS HELD OUT AS THE VALIDATION SET
    # THE VALIDATION SET IS NOT USED FOR TRAINING
    # IT IS ONLY USED FOR VALIDATION
    # KERAS TAKES THE LAST 10% OF THE TRAINING DATA
    # THE SHUFFLE ONLY HAPPENS AFTER THE SPLIT
    # SO THE DATA SHOULD ALREADY BE SHUFFLED BEFORE CALLING FIT
    validation_split = 0.1,
    
    batch_size = 10,
    epochs = 30,
    shuffle = True,
    verbose = 2
)

Epoch 1/30
189/189 - 0s - loss: 0.2571 - accuracy: 0.9328 - val_loss: 0.2743 - val_accuracy: 0.9286
Epoch 2/30
189/189 - 0s - loss: 0.2569 - accuracy: 0.9407 - val_loss: 0.2732 - val_accuracy: 0.9286
Epoch 3/30
189/189 - 0s - loss: 0.2563 - accuracy: 0.9407 - val_loss: 0.2720 - val_accuracy: 0.9286
Epoch 4/30
189/189 - 0s - loss: 0.2558 - accuracy: 0.9392 - val_loss: 0.2722 - val_accuracy: 0.9286
Epoch 5/30
189/189 - 0s - loss: 0.2550 - accuracy: 0.9344 - val_loss: 0.2737 - val_accuracy: 0.9524
Epoch 6/30
189/189 - 0s - loss: 0.2552 - accuracy: 0.9450 - val_loss: 0.2704 - val_accuracy: 0.9286
Epoch 7/30
189/189 - 0s - loss: 0.2544 - accuracy: 0.9413 - val_loss: 0.2702 - val_accuracy: 0.9286
Epoch 8/30
189/189 - 0s - loss: 0.2539 - accuracy: 0.9423 - val_loss: 0.2699 - val_accuracy: 0.9286
Epoch 9/30
189/189 - 0s - loss: 0.2535 - accuracy: 0.9429 - val_loss: 0.2690 - val_accuracy: 0.9286
Epoch 10/30
189/189 - 0s - loss: 0.2531 - accuracy: 0.9381 - val_loss: 0.2690 - val_accuracy: 0.9286

<tensorflow.python.keras.callbacks.History at 0x12492397b48>
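
The `189/189` in the log follows from the split: the last 10% of the arrays is set aside before any shuffling, leaving 1890 samples for training. A NumPy sketch of that slicing, using placeholder data of the same length:

```python
import math
import numpy as np

n_samples = 2100
batch_size = 10
validation_split = 0.1

x = np.arange(n_samples)   # placeholder standing in for scale_train_samples

# The last fraction of the data is held out for validation.
split_at = int(n_samples * (1 - validation_split))
x_train, x_val = x[:split_at], x[split_at:]

print(len(x_train), len(x_val))               # 1890 210
print(math.ceil(len(x_train) / batch_size))   # 189 batches per epoch
```

Without a validation split, the same calculation gives 2100 / 10 = 210 batches per epoch, which matches the earlier training log.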

## TRY OUR MODEL ON A TEST DATASET

TEST SET IS ALSO BUILT JUST LIKE THE TRAINING DATA

In [17]:
test_labels = []
test_samples = []


for i in range(50):
    # GENERATE A RANDOM AGE FOR A YOUNG
    random_young_age = randint(13,64)
    test_samples.append(random_young_age)
    # THESE 50 ARE THOSE YOUNG ONES WHO HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 1
    test_labels.append(1)
    
    # GENERATE A RANDOM AGE FOR A OLD
    random_old_age = randint(65,100)
    test_samples.append(random_old_age)
    # THESE 50 ARE THOSE OLD ONES WHO DON'T HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 0
    test_labels.append(0)
    
    
for i in range(1000):
    # GENERATE A RANDOM AGE FOR A YOUNG
    random_young_age = randint(13,64)
    test_samples.append(random_young_age)
    # THESE 1000 ARE THOSE YOUNG ONES WHO DON'T HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 0
    test_labels.append(0)
    
    # GENERATE A RANDOM AGE FOR AN OLD
    random_old_age = randint(65,100)
    test_samples.append(random_old_age)
    # THESE 1000 ARE THOSE OLD ONES WHO HAVE SIDE EFFECTS
    # THUS, FOR OUTPUT THEY WILL HAVE 1
    test_labels.append(1)
    
    
    
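# CONVERT THE TEST LISTS (NOT THE TRAINING DATA) TO NUMPY ARRAYS
test_labels = np.array(test_labels)
test_samples = np.array(test_samples)


# SHUFFLE THE TEST LABELS AND SAMPLES TOGETHER SO THEY STAY ALIGNED
test_labels, test_samples = shuffle(test_labels, test_samples)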



# REUSE THE SCALER FITTED ON THE TRAINING DATA
# SO THAT THE TEST DATA IS SCALED IN EXACTLY THE SAME WAY
scale_test_samples = scaler.transform(test_samples.reshape(-1,1))


In [18]:
# WE WILL USE THE PREDICT FUNCTION TO PREDICT ON OUR TEST DATASET
predictions = model.predict(
    x = scale_test_samples,
    batch_size = 10,
    verbose = 0
    
)

In [22]:
for pred in predictions:
    print(pred)

[0.25795028 0.74204975]
[0.03473315 0.9652668 ]
[0.96121126 0.03878871]
[0.13877377 0.86122626]
[0.95586836 0.0441316 ]
[0.0221829 0.9778171]
[0.8480114  0.15198869]
[0.9712494  0.02875063]
[0.29492146 0.7050785 ]
[0.01981458 0.9801854 ]
[0.01001842 0.9899816 ]
[0.09222174 0.90777826]
[0.04839557 0.9516044 ]
[0.9735059  0.02649404]
[0.9725542  0.02744572]
[0.92653656 0.07346341]
[0.96121126 0.03878871]
[0.95586836 0.0441316 ]
[0.01258462 0.9874154 ]
[0.12555608 0.87444395]
[0.97333807 0.02666189]
[0.870359   0.12964097]
[0.8480114  0.15198869]
[0.1710369  0.82896316]
[0.9712494  0.02875063]
[0.0221829 0.9778171]
[0.870359   0.12964097]
[0.09222174 0.90777826]
[0.68865573 0.31134427]
[0.25795028 0.74204975]
[0.8480114  0.15198869]
[0.4215406  0.57845944]
[0.8480114  0.15198869]
[0.00893695 0.991063  ]
[0.5593919 0.4406081]
[0.03106751 0.9689325 ]
[0.9712494  0.02875063]
[0.4215406  0.57845944]
[0.95586836 0.0441316 ]
[0.97327065 0.02672932]
[0.4215406  0.57845944]
[0.9736061  0.02639383

[0.97327065 0.02672932]
[0.01579766 0.98420227]
[0.973692   0.02630799]
[0.05399186 0.94600815]
[0.09222174 0.90777826]
[0.01122928 0.9887707 ]
[0.9717329  0.02826707]
[0.96301985 0.03698009]
[0.82259643 0.17740352]
[0.8480114  0.15198869]
[0.9734389  0.02656106]
[0.11343163 0.88656837]
[0.9338565  0.06614349]
[0.01122928 0.9887707 ]
[0.9721466  0.02785345]
[0.01769454 0.98230547]
[0.01769454 0.98230547]
[0.05399186 0.94600815]
[0.01579766 0.98420227]
[0.9694751  0.03052495]
[0.37718776 0.6228122 ]
[0.22414058 0.7758595 ]
[0.9448697  0.05513027]
[0.9725542  0.02744572]
[0.11343163 0.88656837]
[0.96301985 0.03698009]
[0.15361187 0.84638816]
[0.97327065 0.02672932]
[0.01769454 0.98230547]
[0.68865573 0.31134427]
[0.9717329  0.02826707]
[0.9735059  0.02649404]
[0.9524897  0.04751033]
[0.8896471 0.1103529]
[0.92653656 0.07346341]
[0.1710369  0.82896316]
[0.4215406  0.57845944]
[0.9733044  0.02669559]
[0.01122928 0.9887707 ]
[0.5593919 0.4406081]
[0.97042537 0.0295746 ]
[0.4215406  0.578459

In [20]:
# TO GET THE PREDICTION OF HIGHER PROBABILITY
rounded_prediction = np.argmax(predictions, axis = -1)
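
As a quick check of what `np.argmax` does here, the sketch below applies it to the first three rows of the prediction output shown earlier.

```python
import numpy as np

# First three rows of the printed predictions:
# [P(no side effect), P(side effect)].
sample_predictions = np.array([
    [0.25795028, 0.74204975],
    [0.03473315, 0.9652668],
    [0.96121126, 0.03878871],
])

# argmax along the last axis returns the index of the larger probability per row.
print(np.argmax(sample_predictions, axis=-1))  # [1 1 0]
```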

In [21]:
for r_pred in rounded_prediction:
    print(r_pred)

1
1
0
1
0
1
0
0
1
1
1
1
1
0
0
0
0
0
1
1
0
0
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
0
1
0
1
1
0
0
1
0
0
1
1
1
0
1
1
1
0
1
0
1
1
1
0
0
1
1
1
1
0
1
1
0
1
1
1
0
1
1
1
1
0
0
1
1
1
1
0
0
0
1
1
0
0
0
1
0
1
1
1
1
0
0
1
1
0
0
1
1
1
1
1
1
0
1
1
1
0
1
1
1
0
1
1
1
0
1
1
0
1
0
1
1
0
0
0
1
0
0
1
0
1
0
1
0
1
1
0
0
1
0
0
0
1
0
0
0
1
1
0
1
1
0
0
1
0
1
0
0
1
0
1
1
0
1
1
1
0
0
1
1
0
0
1
1
1
1
1
0
1
0
1
0
1
0
0
0
0
0
0
0
1
1
0
0
1
1
0
0
1
0
1
1
0
0
1
1
0
1
1
0
1
1
1
0
0
0
0
0
1
0
1
0
1
1
1
1
0
1
1
1
0
1
1
0
0
1
0
0
0
1
0
1
0
0
0
0
0
0
1
1
0
1
0
0
0
1
1
1
1
0
0
0
1
1
1
0
0
0
1
1
1
1
0
0
0
0
0
1
0
0
1
1
0
1
0
0
1
0
1
0
0
0
0
1
0
0
1
1
1
1
1
1
1
0
1
0
1
1
0
1
1
0
0
1
1
0
1
1
0
1
0
0
1
1
1
1
0
0
1
0
1
0
0
1
0
0
0
1
1
1
1
0
0
1
0
1
1
1
0
0
0
1
0
1
1
0
0
1
1
0
1
1
0
1
1
1
0
1
1
0
1
0
0
0
1
1
1
1
1
0
0
1
0
0
0
0
0
0
1
0
1
1
1
0
1
1
0
1
0
1
1
1
0
0
1
0
1
1
1
0
0
0
1
1
0
0
1
0
0
0
0
1
1
1
1
0
1
1
0
1
1
0
1
1
0
1
0
0
1
1
1
0
1
0
1
1
0
0
1
1
1
1
1
0
1
0
1
0
1
0
1
1
1
1
1
1
0
1
1
1
1
1
0
1
0
1
0
0
0
0
0
1
1
0
0
1
0
0
0
0
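
A natural next step is to compare the predicted classes with the true test labels. The sketch below does this with NumPy on small made-up arrays. With the real data, `rounded_prediction` and `test_labels` would take their place, provided the labels are in the same order as the samples passed to `predict`.

```python
import numpy as np

# Hypothetical stand-ins for test_labels and rounded_prediction.
true_labels = np.array([1, 0, 0, 1, 1, 0])
predicted = np.array([1, 0, 1, 1, 0, 0])

# Fraction of predictions that match the true label.
accuracy = (true_labels == predicted).mean()

# 2x2 confusion matrix: rows are true classes, columns are predicted classes.
confusion = np.zeros((2, 2), dtype=int)
for t, p in zip(true_labels, predicted):
    confusion[t, p] += 1

print(accuracy)
print(confusion)
```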
