<a href="https://colab.research.google.com/github/zerotodeeplearning/ztdl-masterclasses/blob/master/solutions_do_not_open/Introduction_to_Deep_Learning_with_Keras_solution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Learn with us: www.zerotodeeplearning.com

Copyright © 2021: Zero to Deep Learning ® Catalit LLC.

In [None]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Introduction to Deep Learning with Keras

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

In [None]:
url = 'https://raw.githubusercontent.com/zerotodeeplearning/ztdl-masterclasses/master/data/'

In [None]:
df = pd.read_csv(url + 'geoloc_elev.csv')

# we only use the 2 features that matter
X = df[['lat', 'lon']].values
y = df['target'].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

In [None]:
df.plot.scatter(x='lat', y='lon',
                c='target', cmap='bwr');

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD, Adam

In [None]:
model = Sequential([
    Dense(1, input_shape=(2,), activation='sigmoid')
])

model.compile(SGD(learning_rate=0.5), 'binary_crossentropy', metrics=['accuracy'])

h = model.fit(X_train, y_train, epochs=10, validation_split=0.1)

In [None]:
pd.DataFrame(h.history).plot(ylim=(-0.05, 1.05));

In [None]:
def score(model):
  # benchmark: accuracy obtained by always predicting the most frequent class
  bm_score = pd.Series(y).value_counts().max() / len(y)
  train_score = model.evaluate(X_train, y_train, verbose=0)[1]
  test_score = model.evaluate(X_test, y_test,  verbose=0)[1]

  print("""Accuracy scores:
    Benchmark:\t{:0.3}
    Train:\t{:0.3}
    Test:\t{:0.3}""".format(bm_score, train_score, test_score))

In [None]:
score(model)

In [None]:
def plot_decision_boundary(model):
  hticks = np.linspace(-2, 2, 101)
  vticks = np.linspace(-2, 2, 101)
  aa, bb = np.meshgrid(hticks, vticks)
  ab = np.c_[aa.ravel(), bb.ravel()]

  c = model.predict(ab)
  cc = c.reshape(aa.shape)

  ax = df.plot(kind='scatter', c='target', x='lat', y='lon', cmap='bwr')
  ax.contourf(aa, bb, cc, cmap='bwr', alpha=0.5);

In [None]:
plot_decision_boundary(model)

### Exercise 1: Deep network

- Extend the neural network defined above by adding a few inner layers.
    - add a few more nodes to the first layer
    - change the activation function of the first layer from `sigmoid` to something else
    - remember that you only need to specify the `input_shape` in the first layer; the others infer it automatically
    - insert at least one more layer after the first one
    - regardless of how many layers you have, the last (output) layer should have a single node and a `sigmoid` activation function

Your model should look like:

```python
model = Sequential([
    Dense(...),
    ...
    ...
])
```

- Retrain the model for 20 epochs. Does your model learn to separate the two classes?
- Display the history as done above.
- Evaluate the score using the `score` function defined above.
- Display the decision boundary using the `plot_decision_boundary` function.
- Bonus points if you also calculate the confusion matrix. (hint: the `model.predict` method returns probabilities, so you will need to round the results to the nearest integer before comparing them with the labels)
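
The hint in the last bullet can be sketched with NumPy alone. The probabilities and labels below are made-up values for illustration, not output from the model above, and `y_prob`, `y_true` and `confusion_counts` are hypothetical names. The layout (rows are true classes, columns are predicted classes) is assumed to match the one `sklearn.metrics.confusion_matrix` uses.

```python
import numpy as np

# Made-up sigmoid outputs, shaped (n_samples, 1) like the result of model.predict
y_prob = np.array([[0.92], [0.08], [0.61], [0.35], [0.77]])
# Made-up ground-truth labels for the same five samples
y_true = np.array([1, 0, 0, 1, 1])

# Round each probability to the nearest integer and flatten to shape (n_samples,)
y_hat = y_prob.round().astype(int).ravel()


def confusion_counts(y_true, y_hat):
    """Return a 2x2 array: rows are true classes, columns are predicted classes."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_hat):
        cm[t, p] += 1
    return cm


cm = confusion_counts(y_true, y_hat)
# Correct predictions sit on the diagonal
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(accuracy)
```

The off-diagonal entries count the two kinds of mistakes separately, which a single accuracy number hides.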

In [None]:
model = Sequential([
    Dense(4, input_shape=(2,), activation='tanh'),
    Dense(4, activation='tanh'),
    Dense(1, activation='sigmoid')
])

model.compile(SGD(learning_rate=0.5), 'binary_crossentropy', metrics=['accuracy'])

h = model.fit(X_train, y_train, epochs=20, validation_split=0.1)

In [None]:
pd.DataFrame(h.history).plot(ylim=(-0.05, 1.05));

In [None]:
score(model)

In [None]:
plot_decision_boundary(model)

In [None]:
from sklearn.metrics import confusion_matrix

In [None]:
y_pred_proba = model.predict(X_test)

In [None]:
y_pred_proba[:4]

In [None]:
y_pred = y_pred_proba.round(0).astype(int)

In [None]:
cm = confusion_matrix(y_test, y_pred)

pd.DataFrame(cm,
             index=["Miss", "Hit"],
             columns=['pred_Miss', 'pred_Hit'])


### Exercise 2: Regression

In this exercise we will perform a nonlinear regression of a function with 5 input features and 1 output. The function is described in detail in the [scikit-learn documentation for `make_friedman1`](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_friedman1.html).
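
For orientation, the function documented at the link can be written out directly. The sketch below reimplements the noise-free Friedman #1 formula in NumPy, assuming features drawn uniformly from [0, 1] as the scikit-learn documentation describes. `friedman1`, `X_demo` and `y_demo` are hypothetical names, and the sketch will not reproduce the exact samples that `make_friedman1` generates.

```python
import numpy as np


def friedman1(X):
    """Noise-free Friedman #1 target for X of shape (n_samples, 5)."""
    return (10 * np.sin(np.pi * X[:, 0] * X[:, 1])
            + 20 * (X[:, 2] - 0.5) ** 2
            + 10 * X[:, 3]
            + 5 * X[:, 4])


rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(1000, 5))  # hypothetical sample, uniform on [0, 1]
y_demo = friedman1(X_demo)

# Every term is non-negative on [0, 1], and the maximum is 10 + 5 + 10 + 5 = 30
print(y_demo.min(), y_demo.max())
```

The first two features interact through the sine term and the third enters quadratically, so a purely linear model cannot fit this target.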

- The data is generated for your convenience and added to a Pandas DataFrame
- Use `sns.pairplot` to visualize the data. It may be convenient to specify the `x_vars` and `y_vars` arguments so that you only plot the target as a function of the features
- Define a deep neural network that will be able to learn this function.
    - Specify the `input_shape` in the first layer
    - Use at least 2 layers and a few nodes to allow the network to learn nonlinear relations
    - The output layer should have a single node and no activation function, as we are doing a regression

    ```python
    model = Sequential([
        Dense(...),
        ...
    ])
    ```
- Compile the model
    - use an optimizer of your choice
    - make sure to select the appropriate loss function for a regression
    - since it's a regression, accuracy is not a meaningful metric, so there is no need to track it
- Fit the model for at least 100 epochs with `validation_split=0.1`
- Plot the history. Does the loss go to zero?
- Bonus points if you plot the predictions of the trained model against the true values using a scatter plot
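
As a rough guide to reading the loss curve, the sketch below computes the mean squared error by hand and compares it with a constant baseline that always predicts the mean. The arrays are made-up values, and `mse` is a hypothetical helper rather than the Keras implementation. It also flattens both inputs, since a prediction array of shape `(n, 1)` subtracted from labels of shape `(n,)` would otherwise broadcast to an `(n, n)` matrix.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error after flattening both inputs to 1-D."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return np.mean((y_true - y_pred) ** 2)


# Made-up targets and predictions; predictions are shaped (n, 1) like model.predict
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([[1.5], [2.0], [2.5], [4.0]])

loss = mse(y_true, y_pred)
# Baseline: always predict the mean of the targets (this equals their variance)
baseline = mse(y_true, np.full_like(y_true, y_true.mean()))
print(loss, baseline)
```

A trained network is only useful if its loss falls well below this baseline, so the baseline is a more informative reference than zero.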

In [None]:
from sklearn.datasets import make_friedman1

X, y = make_friedman1(n_samples=1000, n_features=5, noise=0., random_state=0)
features = ['x0', 'x1', 'x2', 'x3', 'x4']
df = pd.DataFrame(X, columns=features)
df['target'] = y/10.0

df.head()

In [None]:
sns.pairplot(df, x_vars=features, y_vars='target');

In [None]:
model = Sequential([
    Dense(50, input_shape=(5,), activation='tanh'),
    Dense(50, activation='tanh'),
    Dense(10, activation='tanh'),
    Dense(1)
])

model.compile('adam', 'mse')

h = model.fit(X, y, epochs=200, validation_split=0.1)

In [None]:
pd.DataFrame(h.history).plot();

In [None]:
y_pred = model.predict(X)

In [None]:
plt.scatter(y_pred, y);

### Exercise 3: Multi-class classification

In this exercise we extend a neural network to work with more than 2 classes. The data is the well-known Iris dataset, which has 3 classes.

- Plot the data using a pairplot
- Define and train a deep neural network model
    - The number of output nodes should match the number of classes
    - Choose the correct output activation function
    - Choose the correct loss for a multi-class classification with class index labels
    - Use a `validation_split=0.2`
- Experiment with different network architectures, add and remove layers and nodes.
- Experiment with different values for the learning rate. This dataset is a bit tricky.
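
To make the output activation and loss choices concrete, here is a minimal NumPy sketch of a softmax followed by cross-entropy on integer class labels. The logits and labels are made-up values, and both helper functions are hypothetical illustrations of the idea, not the Keras implementations.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean negative log-probability assigned to the true class index."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(np.clip(picked, eps, 1.0)))


# Made-up raw outputs for 3 samples and 3 classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0],
                   [-1.0, 1.0, 3.0]])
# Made-up integer class labels, one per sample
labels = np.array([0, 1, 2])

probs = softmax(logits)          # each row sums to 1
loss = sparse_categorical_crossentropy(labels, probs)
pred = probs.argmax(axis=1)      # predicted class index per sample
print(probs.round(3))
print(loss, pred)
```

Because the labels are class indices rather than one-hot vectors, the loss only needs to look up one probability per row.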

In [None]:
df = pd.read_csv(url + 'iris.csv')
X = df.drop('species', axis=1)
y = df['species'].map({"setosa": 0, "versicolor": 1 , "virginica": 2})

In [None]:
sns.pairplot(df, hue="species");

In [None]:
model = Sequential([
    Dense(20, input_shape=(4,), activation='tanh'),
    Dense(10, activation='tanh'),
    Dense(3, activation='softmax')
])

model.compile(Adam(learning_rate=0.001),
              'sparse_categorical_crossentropy',
              metrics=['accuracy'])

h = model.fit(X, y,
              epochs=200,
              validation_split=0.2,
              verbose=0)

pd.DataFrame(h.history).plot();