# Building deep learning models with Keras
>  In this chapter, you'll use the Keras library to build deep learning models for both regression and classification. You'll learn about the Specify-Compile-Fit workflow that you can use to make predictions, and by the end of the chapter, you'll have all the tools necessary to build deep neural networks.

- toc: true 
- badges: true
- comments: true
- author: Lucas Nunes
- categories: [Datacamp]
- image: images/datacamp/___

> Note: This is a summary of the exercises in chapter 3 of the DataCamp course "Introduction to Deep Learning in Python". <br>[Github repo](https://github.com/lnunesAI/Datacamp/) / [Course link](https://www.datacamp.com/tracks/machine-learning-scientist-with-python)

In [39]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams['figure.figsize'] = (8, 8)

# Import necessary modules
import keras
from keras.layers import Dense
from keras.models import Sequential
from keras.utils import to_categorical

## Creating a keras model


### Understanding your data

<div class=""><p>You will soon start building models in Keras to predict wages based on various professional and demographic factors. Before you start building a model, it's good to understand your data by performing some exploratory analysis.</p>
<p>The data is pre-loaded into a pandas DataFrame called <code>df</code>.  Use the <code>.head()</code> and <code>.describe()</code> methods in the IPython Shell for a quick overview of the DataFrame.</p>
<p>The target variable you'll be predicting is <code>wage_per_hour</code>. Some of the predictor variables are binary indicators, where a value of 1 represents <code>True</code>, and 0 represents <code>False</code>.</p>
<p>Of the 9 predictor variables in the DataFrame, how many are binary indicators? The min and max values as shown by <code>.describe()</code> will be informative here.
How many binary indicator predictors are there?</p></div>

In [19]:
df = pd.read_csv('https://github.com/lnunesAI/Datacamp/raw/main/2-machine-learning-scientist-with-python/15-introduction-to-deep-learning-in-python/datasets/wages_534x10.csv')

In [21]:
df.describe()

| | wage_per_hour | union | education_yrs | experience_yrs | age | female | marr | south | manufacturing | construction |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 534.0 | 534.0 | 534.0 | 534.0 | 534.0 | 534.0 | 534.0 | 534.0 | 534.0 | 534.0 |
| mean | 9.024064 | 0.179775 | 13.018727 | 17.822097 | 36.833333 | 0.458801 | 0.655431 | 0.292135 | 0.185393 | 0.044944 |
| std | 5.139097 | 0.38436 | 2.615373 | 12.37971 | 11.726573 | 0.498767 | 0.475673 | 0.45517 | 0.388981 | 0.207375 |
| min | 1.0 | 0.0 | 2.0 | 0.0 | 18.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 5.25 | 0.0 | 12.0 | 8.0 | 28.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% | 7.78 | 0.0 | 12.0 | 15.0 | 35.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 75% | 11.25 | 0.0 | 15.0 | 26.0 | 44.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 |
| max | 44.5 | 1.0 | 18.0 | 55.0 | 64.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |


<pre>
Possible Answers

0.

5.

<b>6.</b>

</pre>

**There are 6 binary indicators.**
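Reading the min and max rows by eye works for ten columns, but the same check can be done programmatically. The sketch below is only an illustration: the `toy` frame and the `binary_indicator_columns` helper are hypothetical names, not part of the course material, and the check is slightly stricter than comparing min and max because it requires every value to be exactly 0 or 1.

```python
import pandas as pd

# Hypothetical miniature frame with the same flavour as the wages data
toy = pd.DataFrame({
    "wage_per_hour": [5.10, 4.95, 6.67, 4.00],
    "union": [0, 0, 1, 0],
    "education_yrs": [8, 9, 12, 12],
    "female": [1, 1, 0, 0],
})

def binary_indicator_columns(frame, target):
    """Return the predictor columns whose values are all 0 or 1."""
    predictors = frame.drop(columns=[target])
    is_binary = predictors.isin([0, 1]).all()  # one boolean per column
    return list(is_binary[is_binary].index)

binary_cols = binary_indicator_columns(toy, "wage_per_hour")
print(binary_cols, len(binary_cols))
```

Applied to the real `df` with `"wage_per_hour"` as the target, the same helper should list the indicator columns counted above.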

### Specifying a model

<div class=""><p>Now you'll get to work with your first model in Keras, and will immediately be able to run more complex neural network models on larger datasets compared to the first two chapters.</p>
<p>To start, you'll take the skeleton of a neural network and add a hidden layer and an output layer. You'll then fit that model and see Keras do the optimization so your model continually gets better.</p>
<p>As a start, you'll predict workers' wages based on characteristics like their industry, education and level of experience.  You can find the dataset in a pandas DataFrame called <code>df</code>.  For convenience, everything in <code>df</code> except for the target has been converted to a NumPy matrix called <code>predictors</code>. The target, <code>wage_per_hour</code>, is available as a NumPy matrix called <code>target</code>.</p>
<p>For all exercises in this chapter, we've imported the <code>Sequential</code> model constructor, the <code>Dense</code> layer constructor, and pandas.</p></div>

In [25]:
predictors = df.iloc[:, 1:].to_numpy()
target = df.iloc[:, 0].to_numpy()

Instructions
<ul>
<li>Store the number of columns in the <code>predictors</code> data to <code>n_cols</code>. This has been done for you.</li>
<li>Start by creating a <code>Sequential</code> model called <code>model</code>.</li>
<li>Use the <code>.add()</code> method on <code>model</code> to add a <code>Dense</code> layer.<ul>
<li>Add <code>50</code> units, specify <code>activation='relu'</code>, and the <code>input_shape</code> parameter to be the tuple <code>(n_cols,)</code> which means it has <code>n_cols</code> items in each row of data, and any number of rows of data are acceptable as inputs.</li></ul></li>
<li>Add another <code>Dense</code> layer. This should have <code>32</code> units and a <code>'relu'</code> activation.</li>
<li>Finally, add an output layer, which is a <code>Dense</code> layer with a single node. Don't use any activation function here.</li>
</ul>

In [26]:
# Import necessary modules
import keras
from keras.layers import Dense
from keras.models import Sequential

# Save the number of columns in predictors: n_cols
n_cols = predictors.shape[1]

# Set up the model: model
model = Sequential()

# Add the first layer
model.add(Dense(50, activation='relu', input_shape=(n_cols,)))

# Add the second layer
model.add(Dense(32, activation='relu'))

# Add the output layer
model.add(Dense(1))

**Now that you've specified the model, the next step is to compile it.**
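To make the architecture concrete, here is a rough NumPy sketch of what the three `Dense` layers compute on a batch of rows. It is not how Keras is implemented: the weights are random placeholders (Keras initialises and then learns them), the `forward` helper is a hypothetical name, and `n_cols = 9` is assumed from the nine predictor columns in the `describe()` output.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(x, 0.0)

n_cols = 9                        # assumed: nine predictors in the wages data
layer_sizes = [n_cols, 50, 32, 1]  # input, two hidden layers, one output node

# Placeholder parameters: one weight matrix and one bias vector per layer
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Apply relu on the hidden layers and no activation on the output."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ w + b)
    return h @ weights[-1] + biases[-1]

batch = rng.normal(size=(4, n_cols))  # any number of rows is acceptable
out = forward(batch)
n_params = sum(w.size + b.size for w, b in zip(weights, biases))
print(out.shape, n_params)
```

The output has one value per input row, and the parameter count is the sum of each layer's weights and biases, which is the kind of figure `model.summary()` reports for a model like this.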

## Compiling and fitting a model

### Compiling the model

<div class=""><p>You're now going to compile the model you specified earlier. To compile the model, you need to specify the optimizer and loss function to use. In the video, Dan mentioned that the Adam optimizer is an excellent choice. You can read more about it as well as other keras optimizers <a href="https://keras.io/optimizers/#adam" target="_blank" rel="noopener noreferrer">here</a>, and if you are really curious to learn more, you can read the <a href="https://arxiv.org/abs/1412.6980v8" target="_blank" rel="noopener noreferrer">original paper</a> that introduced the Adam optimizer.</p>
<p>In this exercise, you'll use the Adam optimizer and the mean squared error loss function. Go for it!</p></div>

Instructions
<ul>
<li>Compile the model using <code>model.compile()</code>.  Your <code>optimizer</code> should be <code>'adam'</code> and the <code>loss</code> should be <code>'mean_squared_error'</code>.</li>
</ul>

In [27]:
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Verify that model contains information from compiling
print("Loss function: " + model.loss)

Loss function: mean_squared_error


**All that's left now is to fit the model!**
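For reference, the `'mean_squared_error'` loss is the average of the squared differences between targets and predictions. A minimal NumPy sketch, with made-up wage values and a hypothetical helper name, looks like this:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up hourly wages and predictions, for illustration only
mse = mean_squared_error([5.0, 7.5, 10.0], [6.0, 7.5, 8.0])
print(mse)
```

Squaring means that a prediction that is off by 2 contributes four times as much to the loss as one that is off by 1.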

### Fitting the model

<p>You're at the most fun part. You'll now fit the model. Recall that the data to be used as predictive features is loaded in a NumPy matrix called <code>predictors</code> and the data to be predicted is stored in a NumPy matrix called <code>target</code>. Your <code>model</code> is pre-written and it has been compiled with the code from the previous exercise.</p>

Instructions
<ul>
<li>Fit the <code>model</code>.  Remember that the first argument is the predictive features (<code>predictors</code>), and the data to be predicted (<code>target</code>) is the second argument.</li>
</ul>

In [31]:
# Fit the model
model.fit(predictors, target, epochs=10);

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


**You now know how to specify, compile, and fit a deep learning model using keras!**
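Each `Epoch i/10` line corresponds to one full pass over the data, made in mini-batches. The sketch below only illustrates the bookkeeping, assuming Keras's default `batch_size` of 32 and default shuffling because `fit()` was called without those arguments; `iterate_batches` is a hypothetical helper, not a Keras function.

```python
import math
import numpy as np

n_rows = 534      # rows in the wages data
batch_size = 32   # Keras's default when fit() is not given batch_size
epochs = 10

# Number of weight updates in one epoch (the last batch may be smaller)
steps_per_epoch = math.ceil(n_rows / batch_size)

def iterate_batches(n, size, rng):
    """Yield index arrays covering all n rows once, in shuffled order."""
    order = rng.permutation(n)
    for start in range(0, n, size):
        yield order[start:start + size]

rng = np.random.default_rng(0)
batch_lengths = [len(idx) for idx in iterate_batches(n_rows, batch_size, rng)]
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, batch_lengths[-1], total_updates)
```

Under these assumptions, ten epochs on 534 rows amount to 170 gradient updates, with the final batch of each epoch holding the 22 leftover rows.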

## Classification models

### Understanding your classification data

<div class=""><p>Now you will start modeling with a new dataset for a classification problem. This data includes information about passengers on the Titanic.  You will use predictors such as <code>age</code>, <code>fare</code> and where each passenger embarked from to predict who will survive.  This data is from <a href="https://www.kaggle.com/c/titanic" target="_blank" rel="noopener noreferrer">a tutorial on data science competitions</a>.  Look <a href="https://www.kaggle.com/c/titanic/data" target="_blank" rel="noopener noreferrer">here</a> for descriptions of the features.</p>
<p>The data is pre-loaded in a pandas DataFrame called <code>df</code>.</p>
<p>It's smart to review the maximum and minimum values of each variable to ensure the data isn't misformatted or corrupted.  What was the maximum age of passengers on the Titanic? Use the <code>.describe()</code> method in the IPython Shell to answer this question.</p></div>

In [42]:
df = pd.read_csv('https://github.com/lnunesAI/Datacamp/raw/main/2-machine-learning-scientist-with-python/15-introduction-to-deep-learning-in-python/datasets/titanic_891x11.csv')

<pre>
Possible Answers

29.699.

<b>80.</b>

891.

It is not listed.

</pre>

In [33]:
df.describe()

| | survived | pclass | age | sibsp | parch | fare | male | embarked_from_cherbourg | embarked_from_queenstown | embarked_from_southampton |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 891.0 | 891.0 | 891.0 | 891.0 | 891.0 | 891.0 | 891.0 | 891.0 | 891.0 | 891.0 |
| mean | 0.383838 | 2.308642 | 29.699118 | 0.523008 | 0.381594 | 32.204208 | 0.647587 | 0.188552 | 0.08642 | 0.722783 |
| std | 0.486592 | 0.836071 | 13.002015 | 1.102743 | 0.806057 | 49.693429 | 0.47799 | 0.391372 | 0.281141 | 0.447876 |
| min | 0.0 | 1.0 | 0.42 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 0.0 | 2.0 | 22.0 | 0.0 | 0.0 | 7.9104 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% | 0.0 | 3.0 | 29.699118 | 0.0 | 0.0 | 14.4542 | 1.0 | 0.0 | 0.0 | 1.0 |
| 75% | 1.0 | 3.0 | 35.0 | 1.0 | 0.0 | 31.0 | 1.0 | 0.0 | 0.0 | 1.0 |
| max | 1.0 | 3.0 | 80.0 | 8.0 | 6.0 | 512.3292 | 1.0 | 1.0 | 1.0 | 1.0 |


**The maximum age in the data is 80.**
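If you only care about the value ranges, you can pull the `min` and `max` rows out of the `describe()` result instead of scanning the whole table. This is a small sketch on a hypothetical `toy` frame with made-up values; on the real data you would call the same methods on `df`.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic frame, with made-up rows
toy = pd.DataFrame({
    "age": [22.0, 38.0, 80.0, 0.42],
    "fare": [7.25, 71.28, 30.0, 8.52],
})

# Keep only the rows of describe() that show each column's range
value_range = toy.describe().loc[["min", "max"]]
max_age = toy["age"].max()
print(value_range)
print(max_age)
```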

### Last steps in classification models

<div class=""><p>You'll now create a classification model using the titanic dataset, which has been pre-loaded into a DataFrame called <code>df</code>. You'll take information about the passengers and predict which ones survived. </p>
<p>The predictive variables are stored in a NumPy array <code>predictors</code>. The target to predict is in <code>df.survived</code>, though you'll have to manipulate it for keras. The number of predictive features is stored in <code>n_cols</code>. </p>
<p>Here, you'll use the <code>'sgd'</code> optimizer, which stands for <a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent" target="_blank" rel="noopener noreferrer">Stochastic Gradient Descent</a>. You'll learn more about this in the next chapter!</p></div>

In [57]:
predictors = df.iloc[:, 1:].astype(np.float32).to_numpy()
n_cols = predictors.shape[1]

Instructions
<ul>
<li>Convert <code>df.survived</code> to a categorical variable using the <code>to_categorical()</code> function.</li>
<li>Specify a <code>Sequential</code> model called <code>model</code>.</li>
<li>Add a <code>Dense</code> layer with <code>32</code> nodes. Use <code>'relu'</code> as the <code>activation</code> and <code>(n_cols,)</code> as the <code>input_shape</code>.</li>
<li>Add the <code>Dense</code> output layer. Because there are two outcomes, it should have 2 units, and because it is a classification model, the <code>activation</code> should be <code>'softmax'</code>. </li>
<li>Compile the model, using <code>'sgd'</code> as the <code>optimizer</code>, <code>'categorical_crossentropy'</code> as the loss function, and <code>metrics=['accuracy']</code> to see the accuracy (what fraction of predictions were correct) at the end of each epoch.</li>
<li>Fit the model using the <code>predictors</code> and the <code>target</code>.</li>
</ul>
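The first instruction turns the 0/1 `survived` column into a two-column one-hot array, so that each row has a 1 in the column of its class. The NumPy sketch below mimics that idea for integer labels; `one_hot` is a hypothetical helper written for illustration and is not the Keras implementation of `to_categorical()`.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Encode integer class labels as rows with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = int(labels.max()) + 1
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=np.float32)[labels]

survived = [0, 1, 1, 0]  # made-up labels
encoded = one_hot(survived)
print(encoded)
```

Column 0 marks "did not survive" and column 1 marks "survived", which is why the output layer needs 2 units.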

In [63]:
# Convert the target to categorical: target
target = to_categorical(df.survived)

# Set up the model
model = Sequential()

# Add the first layer
model.add(Dense(32, activation='relu', input_shape=(n_cols,)))

# Add the output layer
model.add(Dense(2, activation='softmax'))

# Compile the model
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model
model.fit(predictors, target, epochs=10);

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


**This simple model is generating an accuracy of 68%!**
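The three ingredients named in the instructions (`'softmax'`, `'categorical_crossentropy'`, and the `'accuracy'` metric) can be sketched in a few lines of NumPy. The helper names and the input numbers below are made up for illustration, and the clipping constant is an assumption; this shows the idea rather than Keras's own code.

```python
import numpy as np

def softmax(logits):
    """Turn each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    """Fraction of rows whose most probable class matches the true class."""
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

# Made-up output-layer scores and one-hot targets for three passengers
logits = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 0.5]])
y_true = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
acc = accuracy(y_true, probs)
print(probs, loss, acc)
```

In this toy batch the third row is misclassified, so the accuracy is two out of three.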

## Using models

### Making predictions

<div class=""><p>The trained network from your previous coding exercise is now stored as <code>model</code>. New data to make predictions is stored in a NumPy array as <code>pred_data</code>.  Use <code>model</code> to make predictions on your new data.</p>
<p>In this exercise, your predictions will be probabilities, which is the most common way for data scientists to communicate their predictions to colleagues.</p></div>

In [74]:
pred_data = np.array([[2, 34.0, 0, 0, 13.0, 1, False, 0, 0, 1],
       [2, 31.0, 1, 1, 26.25, 0, False, 0, 0, 1],
       [1, 11.0, 1, 2, 120.0, 1, False, 0, 0, 1],
       [3, 0.42, 0, 1, 8.5167, 1, False, 1, 0, 0],
       [3, 27.0, 0, 0, 6.975, 1, False, 0, 0, 1],
       [3, 31.0, 0, 0, 7.775, 1, False, 0, 0, 1],
       [1, 39.0, 0, 0, 0.0, 1, False, 0, 0, 1],
       [3, 18.0, 0, 0, 7.775, 0, False, 0, 0, 1],
       [2, 39.0, 0, 0, 13.0, 1, False, 0, 0, 1],
       [1, 33.0, 1, 0, 53.1, 0, False, 0, 0, 1],
       [3, 26.0, 0, 0, 7.8875, 1, False, 0, 0, 1],
       [3, 39.0, 0, 0, 24.15, 1, False, 0, 0, 1],
       [2, 35.0, 0, 0, 10.5, 1, False, 0, 0, 1],
       [3, 6.0, 4, 2, 31.275, 0, False, 0, 0, 1],
       [3, 30.5, 0, 0, 8.05, 1, False, 0, 0, 1],
       [1, 29.69911764705882, 0, 0, 0.0, 1, True, 0, 0, 1],
       [3, 23.0, 0, 0, 7.925, 0, False, 0, 0, 1],
       [2, 31.0, 1, 1, 37.0042, 1, False, 1, 0, 0],
       [3, 43.0, 0, 0, 6.45, 1, False, 0, 0, 1],
       [3, 10.0, 3, 2, 27.9, 1, False, 0, 0, 1],
       [1, 52.0, 1, 1, 93.5, 0, False, 0, 0, 1],
       [3, 27.0, 0, 0, 8.6625, 1, False, 0, 0, 1],
       [1, 38.0, 0, 0, 0.0, 1, False, 0, 0, 1],
       [3, 27.0, 0, 1, 12.475, 0, False, 0, 0, 1],
       [3, 2.0, 4, 1, 39.6875, 1, False, 0, 0, 1],
       [3, 29.69911764705882, 0, 0, 6.95, 1, True, 0, 1, 0],
       [3, 29.69911764705882, 0, 0, 56.4958, 1, True, 0, 0, 1],
       [2, 1.0, 0, 2, 37.0042, 1, False, 1, 0, 0],
       [3, 29.69911764705882, 0, 0, 7.75, 1, True, 0, 1, 0],
       [1, 62.0, 0, 0, 80.0, 0, False, 0, 0, 0],
       [3, 15.0, 1, 0, 14.4542, 0, False, 1, 0, 0],
       [2, 0.83, 1, 1, 18.75, 1, False, 0, 0, 1],
       [3, 29.69911764705882, 0, 0, 7.2292, 1, True, 1, 0, 0],
       [3, 23.0, 0, 0, 7.8542, 1, False, 0, 0, 1],
       [3, 18.0, 0, 0, 8.3, 1, False, 0, 0, 1],
       [1, 39.0, 1, 1, 83.1583, 0, False, 1, 0, 0],
       [3, 21.0, 0, 0, 8.6625, 1, False, 0, 0, 1],
       [3, 29.69911764705882, 0, 0, 8.05, 1, True, 0, 0, 1],
       [3, 32.0, 0, 0, 56.4958, 1, False, 0, 0, 1],
       [1, 29.69911764705882, 0, 0, 29.7, 1, True, 1, 0, 0],
       [3, 20.0, 0, 0, 7.925, 1, False, 0, 0, 1],
       [2, 16.0, 0, 0, 10.5, 1, False, 0, 0, 1],
       [1, 30.0, 0, 0, 31.0, 0, False, 1, 0, 0],
       [3, 34.5, 0, 0, 6.4375, 1, False, 1, 0, 0],
       [3, 17.0, 0, 0, 8.6625, 1, False, 0, 0, 1],
       [3, 42.0, 0, 0, 7.55, 1, False, 0, 0, 1],
       [3, 29.69911764705882, 8, 2, 69.55, 1, True, 0, 0, 1],
       [3, 35.0, 0, 0, 7.8958, 1, False, 1, 0, 0],
       [2, 28.0, 0, 1, 33.0, 1, False, 0, 0, 1],
       [1, 29.69911764705882, 1, 0, 89.1042, 0, True, 1, 0, 0],
       [3, 4.0, 4, 2, 31.275, 1, False, 0, 0, 1],
       [3, 74.0, 0, 0, 7.775, 1, False, 0, 0, 1],
       [3, 9.0, 1, 1, 15.2458, 0, False, 1, 0, 0],
       [1, 16.0, 0, 1, 39.4, 0, False, 0, 0, 1],
       [2, 44.0, 1, 0, 26.0, 0, False, 0, 0, 1],
       [3, 18.0, 0, 1, 9.35, 0, False, 0, 0, 1],
       [1, 45.0, 1, 1, 164.8667, 0, False, 0, 0, 1],
       [1, 51.0, 0, 0, 26.55, 1, False, 0, 0, 1],
       [3, 24.0, 0, 3, 19.2583, 0, False, 1, 0, 0],
       [3, 29.69911764705882, 0, 0, 7.2292, 1, True, 1, 0, 0],
       [3, 41.0, 2, 0, 14.1083, 1, False, 0, 0, 1],
       [2, 21.0, 1, 0, 11.5, 1, False, 0, 0, 1],
       [1, 48.0, 0, 0, 25.9292, 0, False, 0, 0, 1],
       [3, 29.69911764705882, 8, 2, 69.55, 0, True, 0, 0, 1],
       [2, 24.0, 0, 0, 13.0, 1, False, 0, 0, 1],
       [2, 42.0, 0, 0, 13.0, 0, False, 0, 0, 1],
       [2, 27.0, 1, 0, 13.8583, 0, False, 1, 0, 0],
       [1, 31.0, 0, 0, 50.4958, 1, False, 0, 0, 1],
       [3, 29.69911764705882, 0, 0, 9.5, 1, True, 0, 0, 1],
       [3, 4.0, 1, 1, 11.1333, 1, False, 0, 0, 1],
       [3, 26.0, 0, 0, 7.8958, 1, False, 0, 0, 1],
       [1, 47.0, 1, 1, 52.5542, 0, False, 0, 0, 1],
       [1, 33.0, 0, 0, 5.0, 1, False, 0, 0, 1],
       [3, 47.0, 0, 0, 9.0, 1, False, 0, 0, 1],
       [2, 28.0, 1, 0, 24.0, 0, False, 1, 0, 0],
       [3, 15.0, 0, 0, 7.225, 0, False, 1, 0, 0],
       [3, 20.0, 0, 0, 9.8458, 1, False, 0, 0, 1],
       [3, 19.0, 0, 0, 7.8958, 1, False, 0, 0, 1],
       [3, 29.69911764705882, 0, 0, 7.8958, 1, True, 0, 0, 1],
       [1, 56.0, 0, 1, 83.1583, 0, False, 1, 0, 0],
       [2, 25.0, 0, 1, 26.0, 0, False, 0, 0, 1],
       [3, 33.0, 0, 0, 7.8958, 1, False, 0, 0, 1],
       [3, 22.0, 0, 0, 10.5167, 0, False, 0, 0, 1],
       [2, 28.0, 0, 0, 10.5, 1, False, 0, 0, 1],
       [3, 25.0, 0, 0, 7.05, 1, False, 0, 0, 1],
       [3, 39.0, 0, 5, 29.125, 0, False, 0, 1, 0],
       [2, 27.0, 0, 0, 13.0, 1, False, 0, 0, 1],
       [1, 19.0, 0, 0, 30.0, 0, False, 0, 0, 1],
       [3, 29.69911764705882, 1, 2, 23.45, 0, True, 0, 0, 1],
       [1, 26.0, 0, 0, 30.0, 1, False, 1, 0, 0],
       [3, 32.0, 0, 0, 7.75, 1, False, 0, 1, 0]], dtype=np.float32)

Instructions
<ul>
<li>Create your predictions using the model's <code>.predict()</code> method on <code>pred_data</code>.</li>
<li>Use NumPy indexing to find the column corresponding to predicted probabilities of survival being True. This is the second <em>column</em> (index <code>1</code>) of <code>predictions</code>. Store the result in <code>predicted_prob_true</code> and print it.</li>
</ul>

In [75]:
# Calculate predictions: predictions
predictions = model.predict(pred_data)

# Calculate predicted probability of survival: predicted_prob_true
predicted_prob_true = predictions[:,1]

# print predicted_prob_true
print(predicted_prob_true)

[0.3636128  0.4642847  0.89648247 0.44344747 0.27284613 0.24466737
 0.05226173 0.38314727 0.31875077 0.59537274 0.32326803 0.3873374
 0.28885946 0.43714145 0.26184428 0.07697893 0.35285476 0.54077506
 0.10991497 0.471074   0.69401085 0.32706165 0.05579389 0.38805586
 0.49784222 0.22459869 0.5780343  0.6327546  0.24931833 0.6614537
 0.48113307 0.50135946 0.21434593 0.3414769  0.38206545 0.73016024
 0.36361426 0.25401694 0.58288765 0.5114947  0.36100915 0.43345025
 0.55804276 0.15186702 0.39526019 0.13278006 0.4971881  0.181598
 0.5326694  0.80516404 0.4273352  0.01515844 0.5105397  0.62301004
 0.39082706 0.40628543 0.96102506 0.38467184 0.46937194 0.21434593
 0.28413895 0.3918298  0.41083246 0.5222768  0.40713668 0.31658205
 0.38481817 0.5780905  0.3037875  0.4463112  0.32357866 0.589653
 0.17968316 0.12169021 0.4738799  0.39387208 0.3878369  0.36846337
 0.24905227 0.6689958  0.5308471  0.2218818  0.3914006  0.37333944
 0.30684435 0.49534294 0.39389095 0.5524359  0.42888433 0.5375582
 0

**You're now ready to begin learning how to fine-tune your models.**
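If you need hard class labels rather than probabilities, one common approach is to threshold the survival probability. The sketch below uses a hypothetical `predictions` array and an assumed cut-off of 0.5; a different threshold may suit other problems better.

```python
import numpy as np

# Hypothetical softmax output: column 0 = did not survive, column 1 = survived
predictions = np.array([[0.64, 0.36],
                        [0.54, 0.46],
                        [0.10, 0.90]])

# Second column (index 1) holds the predicted probability of survival
predicted_prob_true = predictions[:, 1]

# Assumed threshold of 0.5: label 1 when survival is the more likely class
predicted_label = (predicted_prob_true > 0.5).astype(int)
print(predicted_label)
```

With two classes, thresholding column 1 at 0.5 gives the same labels as `predictions.argmax(axis=1)` whenever the two probabilities are not tied.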