## Day 82 Lecture 1 Assignment

In this assignment, we will learn about activation functions. We will create a neural network and measure the model's performance using different activations.

In [1]:
import numpy as np
import pandas as pd

We will import the famous Titanic dataset below and build a neural network that predicts a passenger's chance of survival.

In [2]:
titanic = pd.read_csv('https://tf-assets-prod.s3.amazonaws.com/tf-curric/data-science/titanic.csv')

In [3]:
titanic.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


We'll perform some feature engineering.

Let's start by keeping only the columns we'd like to use for our analysis. Keep only these columns: Survived, Pclass, Sex, SibSp, Parch, and Embarked.

In [4]:
# Answer below:
# Note: 'Parch' is dropped here as well, although the prompt lists it among the columns to keep.
dft = titanic.drop(columns=['PassengerId', 'Name', 'Age', 'Parch', 'Ticket', 'Fare', 'Cabin'])
dft.head()

Unnamed: 0,Survived,Pclass,Sex,SibSp,Embarked
0,0,3,male,1,S
1,1,1,female,1,C
2,1,3,female,0,S
3,1,1,female,1,S
4,0,3,male,0,S


Now examine how many rows contain missing data. Given how much missing data we have, should we remove the column with the most missing data, or remove all rows containing missing data? Do what you think is best.

In [5]:
# Percentage of missing values in each column of df.
def missingness_summary(df, print_log=False, sort='ascending'):
  # isnull().count() counts every row, so this is missing / total * 100 per column
  s = df.isnull().sum()*100/df.isnull().count()

  if sort.lower() == 'ascending':
    s = s.sort_values(ascending=True)
  elif sort.lower() == 'descending':
    s = s.sort_values(ascending=False)
  if print_log:
    print(s)

  return s

In [6]:
# Answer below: 
missingness_summary(dft)


Survived    0.000000
Pclass      0.000000
Sex         0.000000
SibSp       0.000000
Embarked    0.224467
dtype: float64
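
As a quick sanity check on what the helper computes, the sketch below applies the same per-column calculation to a small made-up frame (`toy` and its values are illustrative, not taken from the Titanic data). It also shows that `isnull().mean() * 100` should give the same percentages, since the mean of a boolean column is its fraction of `True` values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column "a" has 1 missing value out of 4 rows.
toy = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0], "b": [1, 2, 3, 4]})

# Same formula as missingness_summary: missing count / row count * 100.
pct_explicit = toy.isnull().sum() * 100 / toy.isnull().count()

# Equivalent shortcut: the mean of a boolean mask is the fraction of True values.
pct_shortcut = toy.isnull().mean() * 100

print(pct_explicit)
print(pct_shortcut)
```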

In [7]:
missingness_summary(titanic)

PassengerId     0.000000
Survived        0.000000
Pclass          0.000000
Name            0.000000
Sex             0.000000
SibSp           0.000000
Parch           0.000000
Ticket          0.000000
Fare            0.000000
Embarked        0.224467
Age            19.865320
Cabin          77.104377
dtype: float64

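Cabin (about 77% missing) and Age (about 20% missing) were already removed with the unused columns, so the only missing values left in `dft` are in Embarked, about 0.22% of rows. I'll drop those rows.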

In [8]:
dft.dropna(inplace=True)

Now we'll create a one-hot encoding of the variables Pclass, Sex, and Embarked.

In [9]:
# Answer below:
dums = pd.get_dummies(dft, columns=['Pclass', 'Sex', 'Embarked'], drop_first=True)
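
To make the effect of `drop_first=True` concrete, here is a minimal sketch on a made-up frame (`toy` is hypothetical and much smaller than the real data). Each encoded column loses its first category, which becomes the implicit baseline, so a 3-level variable yields 2 indicator columns and a 2-level variable yields 1.

```python
import pandas as pd

# Hypothetical toy data with one 3-level and one 2-level categorical column.
toy = pd.DataFrame({
    "Pclass": [1, 2, 3, 3],
    "Sex": ["male", "female", "female", "male"],
    "SibSp": [1, 0, 0, 2],
})

# drop_first=True removes the first (sorted) category of each encoded column.
encoded = pd.get_dummies(toy, columns=["Pclass", "Sex"], drop_first=True)

print(list(encoded.columns))
# ['SibSp', 'Pclass_2', 'Pclass_3', 'Sex_male']
```

A row with `Pclass_2 == 0` and `Pclass_3 == 0` is therefore a first-class passenger.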

Split the data into train and test. 20% of the data should be set aside for testing. Use Survived as your target variable.

In [10]:
# Answer below
y = dums.Survived
X = dums.drop(columns=['Survived'])

from sklearn.model_selection import train_test_split

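# Note: test_size=0.3 holds out 30% of the rows, which the next cell splits into validation and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [11]:
# 90% of the held-out rows become the validation set; the remaining 10% (about 3% of all rows) is the test set,
# which is smaller than the 20% the prompt asks for.
X_val, X_test, y_val, y_test = train_test_split(X_test, y_test, test_size=0.1, random_state=42)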

In [12]:
X_val.shape

(240, 6)

In [13]:
X_test.shape

(27, 6)
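
The shapes above follow from the two-stage split: 30% of the rows are held out, and 10% of that hold-out becomes the test set. The sketch below reproduces that arithmetic on placeholder arrays and contrasts it with a single split that sets aside 20% for testing, as the prompt describes. It assumes 889 rows remain after `dropna` (the usual 891-row Titanic file minus the 2 rows with missing `Embarked`; the notebook does not print this number), and `X_demo` / `y_demo` are stand-ins for the real features and target.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_rows = 889  # assumption: 891 rows minus 2 with missing Embarked
X_demo = np.zeros((n_rows, 6))
y_demo = np.zeros(n_rows)

# Two-stage split as in the cells above: 30% hold-out, then 10% of that for test.
X_tr, X_hold, y_tr, y_hold = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42)
X_v, X_te, y_v, y_te = train_test_split(
    X_hold, y_hold, test_size=0.1, random_state=42)
print(len(X_tr), len(X_v), len(X_te))

# Single split that sets aside 20% of the rows for testing.
X_tr20, X_te20, y_tr20, y_te20 = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42)
print(len(X_tr20), len(X_te20))
```

With only 27 test rows, each misclassified passenger moves accuracy by almost 4 percentage points, so small differences between models on this test set should be read cautiously.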

At this point, we are ready to create a model. Import `Sequential` and `Dense` from Keras.

In [14]:
# Answer below:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

Create a model with 5 layers. The first layer should be a dense layer that receives the input, the last layer should be of size 1. You determine the remaining layer sizes.

Use a tanh activation for the output layer.

In [15]:
# Answer below
model = Sequential()
#One
model.add(Dense(32, input_dim=X_train.shape[1], activation='relu'))
#Two
model.add(Dense(16, activation='relu'))
#Three
model.add(Dense(9, activation='sigmoid'))
#Four
model.add(Dense(3, activation='relu'))
#Five
model.add(Dense(1, activation='tanh'))
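
To see why the choice of output activation matters here, the sketch below compares tanh and sigmoid on a few hand-picked pre-activation values (`z` is illustrative). tanh maps to the interval (-1, 1), so it can emit negative values that are not valid probabilities, while sigmoid stays in (0, 1). The last check uses the identity (tanh(z) + 1) / 2 = sigmoid(2z), which shows the two functions differ only by a shift and a rescale.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative pre-activation values for the output unit.
z = np.array([-2.0, 0.0, 2.0])

tanh_out = np.tanh(z)   # values in (-1, 1); negative for z < 0
sig_out = sigmoid(z)    # values in (0, 1); 0.5 at z = 0

print(tanh_out)
print(sig_out)

# tanh is a shifted and rescaled sigmoid.
identity_holds = np.allclose((np.tanh(z) + 1.0) / 2.0, sigmoid(2.0 * z))
print(identity_holds)
```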


Compile the model using the adam optimizer, binary crossentropy loss, and the accuracy metric.

Fit the model using a batch size of 80 over 200 epochs.

In [16]:
# Answer below:
model.compile(loss='binary_crossentropy', optimizer='adam', 
              metrics=['accuracy'])
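
For reference, binary cross-entropy averages -[y log(p) + (1 - y) log(1 - p)] over the samples. The sketch below is a plain NumPy version of that formula, not Keras's implementation; the clipping constant `eps` and the sample values are assumptions chosen for illustration. The second call shows what happens with an out-of-range prediction such as a negative tanh output: after clipping it sits at the edge of the valid range and incurs a very large loss.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to [eps, 1 - eps]."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(losses))

# Confident, correct predictions give a small loss.
good = binary_crossentropy([1, 0], [0.9, 0.1])

# A negative "probability" for a positive example is clipped to eps and penalized heavily.
bad = binary_crossentropy([1, 0], [-0.5, 0.1])

print(good, bad)
```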

In [17]:
model.fit(X_train, y_train, epochs=200, batch_size=80, 
          validation_data=(X_val, y_val))

[Training log: Epoch 1/200 through Epoch 77/200, no per-epoch metrics shown; the log is cut off at epoch 78.]

<tensorflow.python.keras.callbacks.History at 0x7fe944392198>

In [18]:
pred=model.predict(X_test)
from sklearn.metrics import classification_report

#predicted_class_indices=np.argmax(pred,axis=-1) 
predicted_class_indices=(pred > 0.5).astype("int32") #binary class identification.
y_true = y_test

print(classification_report(y_true, 
                            predicted_class_indices))

              precision    recall  f1-score   support

           0       0.61      0.79      0.69        14
           1       0.67      0.46      0.55        13

    accuracy                           0.63        27
   macro avg       0.64      0.62      0.62        27
weighted avg       0.64      0.63      0.62        27
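
The report can be read back into raw counts. The numbers below are the confusion-matrix counts implied by the supports and the rounded precision and recall above; they are inferred, not printed by the notebook. The sketch shows how each reported metric follows from them, treating class 1 (survived) as the positive class.

```python
# Counts inferred from the report above (not printed by the notebook).
tn, fp = 11, 3   # the 14 passengers whose true class is 0
fn, tp = 7, 6    # the 13 passengers whose true class is 1

precision_1 = tp / (tp + fp)   # of those predicted 1, how many were 1
recall_1 = tp / (tp + fn)      # of the true 1s, how many were found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)

accuracy = (tp + tn) / (tp + tn + fp + fn)

print(round(precision_1, 2), round(recall_1, 2), round(f1_1, 2))  # 0.67 0.46 0.55
print(round(precision_0, 2), round(recall_0, 2))                  # 0.61 0.79
print(round(accuracy, 2))                                         # 0.63
```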



Redefine the model using a sigmoid activation for the last layer. What is the difference in accuracy?

In [19]:
# Answer below
model = Sequential()
#One
model.add(Dense(32, input_dim=X_train.shape[1], activation='relu'))
#Two
model.add(Dense(16, activation='relu'))
#Three
model.add(Dense(9, activation='sigmoid'))
#Four
model.add(Dense(3, activation='relu'))
#Five
model.add(Dense(1, activation='sigmoid'))

# Answer below:
model.compile(loss='binary_crossentropy', optimizer='adam', 
              metrics=['accuracy'])


model.fit(X_train, y_train, epochs=200, batch_size=80, 
          validation_data=(X_val, y_val))

[Training log: Epoch 1/200 through Epoch 77/200, no per-epoch metrics shown; the log is cut off at epoch 78.]

<tensorflow.python.keras.callbacks.History at 0x7fe940a981d0>

In [20]:
pred=model.predict(X_test)
from sklearn.metrics import classification_report

#predicted_class_indices=np.argmax(pred,axis=-1) 
predicted_class_indices=(pred > 0.5).astype("int32") #binary class identification.
y_true = y_test

print(classification_report(y_true, 
                            predicted_class_indices))

              precision    recall  f1-score   support

           0       0.61      0.79      0.69        14
           1       0.67      0.46      0.55        13

    accuracy                           0.63        27
   macro avg       0.64      0.62      0.62        27
weighted avg       0.64      0.63      0.62        27

