# Course 4 - Project - Part 7: Dense network

<a name="top-7"></a>
This notebook is concerned with *Part 7: Dense network*.

**Contents:**
* [Step 0: Loading data](#step-7.0)
* [Step 1: 1-layer dense network](#step-7.1)
* [Step 2: 2-layer dense network](#step-7.2)

## Step 0: Loading data<a name="step-7.0"></a> ([top](#top-7))
---

We load the training, validation and test sets, each with the extracted high-level features.

In [1]:
# Standard library.
import os
import pathlib
import typing as T

# 3rd party.
import numpy as np

# Project.
import utils

In [2]:
separator = ''.center(80, '-')

path_train = pathlib.Path.cwd() / 'data' / 'swissroads-features-train.npz'
data_train = utils.load(path_train)
print(separator)
print(f'Dataset: train\n{utils.info(data_train)}')

path_valid = pathlib.Path.cwd() / 'data' / 'swissroads-features-valid.npz'
data_valid = utils.load(path_valid)
print(separator)
print(f'Dataset: valid\n{utils.info(data_valid)}')

path_test = pathlib.Path.cwd() / 'data' / 'swissroads-features-test.npz'
data_test = utils.load(path_test)
print(separator)
print(f'Dataset: test\n{utils.info(data_test)}')

--------------------------------------------------------------------------------
Dataset: train
data: shape=(280, 224, 224, 3), dtype=float32
label_idxs: shape=(280,), dtype=int64
label_strs: shape=(6,), dtype=<U10
names: shape=(280,), dtype=<U19
features: shape=(280, 1280), dtype=float32
--------------------------------------------------------------------------------
Dataset: valid
data: shape=(139, 224, 224, 3), dtype=float32
label_idxs: shape=(139,), dtype=int64
label_strs: shape=(6,), dtype=<U10
names: shape=(139,), dtype=<U19
features: shape=(139, 1280), dtype=float32
--------------------------------------------------------------------------------
Dataset: test
data: shape=(50, 224, 224, 3), dtype=float32
label_idxs: shape=(50,), dtype=int64
label_strs: shape=(6,), dtype=<U10
names: shape=(50,), dtype=<U19
features: shape=(50, 1280), dtype=float32
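The helpers `utils.load` and `utils.info` belong to the project and are not shown in this notebook. As a rough sketch of what such helpers might do, assuming they are thin wrappers around `np.load` (the names `load_npz` and `describe` are hypothetical stand-ins, not the project's API):

```python
import pathlib
import tempfile

import numpy as np


def load_npz(path):
    """Load all arrays of an .npz archive into a plain dict (hypothetical helper)."""
    with np.load(path, allow_pickle=False) as archive:
        return {key: archive[key] for key in archive.files}


def describe(data):
    """Return one 'name: shape=..., dtype=...' line per array (hypothetical helper)."""
    return '\n'.join(
        f'{key}: shape={value.shape}, dtype={value.dtype}'
        for key, value in data.items()
    )


# Round trip on a tiny synthetic archive (not the real data set).
with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / 'example.npz'
    np.savez(
        path,
        features=np.zeros((4, 1280), dtype=np.float32),
        label_idxs=np.arange(4, dtype=np.int64),
    )
    example = load_npz(path)

print(describe(example))
```

Copying the arrays into a dict inside the `with` block means the archive file can be closed right away.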


In [3]:
label_strs = data_train['label_strs']  # Same for all data sets.
assert (
    np.all(data_train['label_strs'] == data_valid['label_strs']) and
    np.all(data_train['label_strs'] == data_test['label_strs'])
)

X_train = data_train['data']
y_train = data_train['label_idxs']
F_train = data_train['features']
N_train = data_train['names']

X_valid = data_valid['data']
y_valid = data_valid['label_idxs']
F_valid = data_valid['features']
N_valid = data_valid['names']

X_test = data_test['data']
y_test = data_test['label_idxs']
F_test = data_test['features']
N_test = data_test['names']

## Step 1: 1-layer dense network<a name="step-7.1"></a> ([top](#top-7))
---

**TODO:**
* Grid search. Parameters: dropout rate, learning rate.
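One possible way to organize this grid search is sketched below. It only shows the enumeration of the combinations; `grid_search`, `dummy_score` and the candidate values are hypothetical, and the scoring function is a placeholder for "build the model, fit it, and return the validation accuracy".

```python
import itertools


def grid_search(param_grid, train_and_score):
    """Score every parameter combination and return the best one."""
    names = sorted(param_grid)
    results = []
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, train_and_score(**params)))
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results


# Hypothetical candidate values for the two parameters named in the TODO.
param_grid = {
    'dropout_rate': [0.0, 0.2, 0.5],
    'learning_rate': [1e-4, 1e-3, 1e-2],
}


def dummy_score(dropout_rate, learning_rate):
    # Placeholder score; a real version would train the network
    # and return its accuracy on the validation set.
    return -abs(dropout_rate - 0.2) - abs(learning_rate - 1e-3)


best_params, best_score, results = grid_search(param_grid, dummy_score)
print(best_params)
```

With 3 x 3 candidate values, the network would be trained 9 times, so a coarse grid is probably a sensible starting point.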

In [96]:
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import activations
from tensorflow.keras import initializers

RANDOM_SEED = 0

n_classes = len(label_strs)

# Create model
model = keras.Sequential()

# Add drop-out layer.
model.add(keras.layers.Dropout(0.2, input_shape=(1280,), seed=RANDOM_SEED))

# Add output layer.
model.add(keras.layers.Dense(
    units=n_classes, activation=activations.softmax,
    kernel_initializer=initializers.VarianceScaling(scale=1.0, seed=RANDOM_SEED),
    kernel_regularizer=keras.regularizers.l2(0.01)  # Positional: the keyword name differs between Keras versions.
))

# Print network summary
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dropout_14 (Dropout)         (None, 1280)              0         
_________________________________________________________________
dense_16 (Dense)             (None, 6)                 7686      
Total params: 7,686
Trainable params: 7,686
Non-trainable params: 0
_________________________________________________________________
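The parameter count in the summary can be checked by hand: the dropout layer has no parameters, and the dense layer has one weight per (feature, class) pair plus one bias per class.

```python
n_features = 1280  # Length of each feature vector.
n_classes = 6      # Number of labels.

dropout_params = 0                                 # Dropout has nothing to learn.
dense_params = n_features * n_classes + n_classes  # Weights + biases.

print(dropout_params + dense_params)  # 7686
```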


In [97]:
from tensorflow.keras import losses
from tensorflow.keras import metrics
from tensorflow.keras import optimizers

# TODO: 1-hot encode labels.

# Compile the model.
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.001),  # `lr` is a deprecated alias.
    loss=losses.sparse_categorical_crossentropy,
    metrics=[metrics.sparse_categorical_accuracy]
)
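Regarding the TODO on 1-hot encoding: the sparse loss works directly on integer class indices and should give the same values as the regular categorical cross-entropy applied to 1-hot labels. The NumPy sketch below illustrates this on made-up probabilities; it is a simplified version that ignores the numerical clipping a real implementation would apply.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs):
    """Per-sample loss from integer class indices."""
    return -np.log(probs[np.arange(len(y_true)), y_true])


def categorical_crossentropy(y_onehot, probs):
    """Per-sample loss from 1-hot encoded labels."""
    return -np.sum(y_onehot * np.log(probs), axis=1)


# Made-up predicted probabilities for 2 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
y_true = np.array([0, 2])
y_onehot = np.eye(probs.shape[1])[y_true]

loss_sparse = sparse_categorical_crossentropy(y_true, probs)
loss_onehot = categorical_crossentropy(y_onehot, probs)
print(np.allclose(loss_sparse, loss_onehot))  # True
```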



In [98]:
# End training when the validation loss stops improving (optional).
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, verbose=1)
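The patience rule can be illustrated in plain Python. This is a simplified sketch of the idea (no minimum delta, no weight restoring), not the Keras implementation, and the loss values are made up:

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float('inf')
    wait = 0  # Number of consecutive epochs without improvement.
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the best value is reached at epoch 3.
val_losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(stopping_epoch(val_losses, patience=3))  # 6
```

Epochs 4 to 6 all fail to beat the loss from epoch 3, so with `patience=3` training stops at epoch 6.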


In [99]:
# Fit model
history = model.fit(
    x=F_train, y=y_train,
    validation_data=(F_valid, y_valid), batch_size=32, epochs=100,
    shuffle=True,
    callbacks=[early_stopping]
)

Train on 280 samples, validate on 139 samples
Epoch 1/100
[...]
Epoch 74/100
[...]

**Comment:** The best result is 91.3 % accuracy on the test set.

We want to use a random forest classifier.

In this part it makes sense to merge the training and the validation sets for cross-validation (since we would not make use of the validation set otherwise).
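A minimal sketch of the merge, assuming the merged arrays are the `F_train_large` and `y_train_large` used in the cells below. Synthetic arrays with the same shapes as the real data stand in for the features and labels here:

```python
import numpy as np

# Synthetic stand-ins with the same shapes as the real arrays.
F_train = np.zeros((280, 1280), dtype=np.float32)
y_train = np.arange(280) % 6
F_valid = np.ones((139, 1280), dtype=np.float32)
y_valid = np.arange(139) % 6

# Stack the samples of both sets along the first axis.
F_train_large = np.concatenate([F_train, F_valid], axis=0)
y_train_large = np.concatenate([y_train, y_valid], axis=0)

print(F_train_large.shape, y_train_large.shape)  # (419, 1280) (419,)
```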

## Step 2: Try with SVMs<a name="step-6.2"></a> ([top](#top-6))
---

In [None]:
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC
from sklearn.svm import SVC

We want to tune the hyperparameters of the SVM classifiers (regularization parameter `C`, kernel, `gamma` and class weights) with cross-validated grid search.

**Note:** We have imbalanced classes (e.g. 22.63% bike vs. 8.83% van).

In [None]:
import pandas as pd

# Sort by class index so that the counts line up with `label_strs`.
df_counts = (pd
 .DataFrame({'count': pd.Series(data=y_train_large).value_counts().sort_index()})
 .set_index(label_strs)
)
df_counts['fraction'] = df_counts['count'] / df_counts['count'].sum()
df_counts.style.format({'fraction': '{:.2%}'})
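Because of this imbalance, the grid search below also tries `class_weight='balanced'`. According to the scikit-learn documentation, this mode weights each class by `n_samples / (n_classes * np.bincount(y))`. The sketch below applies that formula to made-up labels and assumes that every class index occurs at least once:

```python
import numpy as np


def balanced_class_weights(y):
    """Weights inversely proportional to the class frequencies."""
    counts = np.bincount(y)
    return len(y) / (len(counts) * counts)


# Made-up labels: class 0 is twice as frequent as classes 1 and 2.
y = np.array([0, 0, 0, 0, 1, 1, 2, 2])
weights = balanced_class_weights(y)
print(weights)
```

Rare classes get larger weights, so misclassifying them costs more during training.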

In [None]:
# Create the estimator.
svm_pipe = Pipeline([
    ('svm', LinearSVC(random_state=RANDOM_SEED)),
])

In [None]:
Cs = np.logspace(-4, 4, num=2 * 8 + 1)  # C defaults to 1.0.
gammas = [0.01, 0.1, 1.0, 10.0, 'scale']

# Setup the cross-validated grid search.
grid = [
    # LinearSVC (minimize: squared hinge loss, strategy: one-vs-rest)
    {
        'svm__C': Cs,
        'svm__class_weight':[None, 'balanced']
    },
    # SVC (kernel: linear, minimize: hinge loss, strategy: one-vs-one)
    {
        'svm': [SVC(random_state=RANDOM_SEED)],
        'svm__kernel': ['linear'],
        'svm__C': Cs,
        'svm__class_weight':[None, 'balanced']
    },
    # SVC (kernel: RBF, minimize: hinge loss, strategy: one-vs-one)
    {
        'svm': [SVC(random_state=RANDOM_SEED)],
        'svm__kernel': ['rbf'],
        'svm__C': Cs,
        'svm__gamma': gammas,
        'svm__class_weight':[None, 'balanced']
    }
]

# Shuffle so that `random_state` takes effect (recent scikit-learn versions require it).
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=RANDOM_SEED)
svm_gscv = GridSearchCV(svm_pipe, grid, n_jobs=-1, refit=True, cv=cv, return_train_score=True)
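To get a feeling for the cost of this search, the number of candidates in the grid above can be counted by hand. The sketch re-creates the value lists from the cell above and multiplies their lengths per sub-grid:

```python
import numpy as np

Cs = np.logspace(-4, 4, num=2 * 8 + 1)  # 17 values, two per decade, from 1e-4 to 1e4.
gammas = [0.01, 0.1, 1.0, 10.0, 'scale']
class_weights = [None, 'balanced']
n_splits = 10

n_linear_svc = len(Cs) * len(class_weights)              # 34
n_svc_linear = len(Cs) * len(class_weights)              # 34
n_svc_rbf = len(Cs) * len(gammas) * len(class_weights)   # 170

n_candidates = n_linear_svc + n_svc_linear + n_svc_rbf
n_fits = n_candidates * n_splits

print(n_candidates, n_fits)  # 238 2380
```

With `refit=True`, one more fit on the full data follows for the best candidate.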

In [None]:
# Fit/evaluate the estimator.
svm_gscv.fit(F_train_large, y_train_large);

In [None]:
# Collect results in a data frame.
df_results = (pd
    .DataFrame({
        'svm': svm_gscv.cv_results_['param_svm'],
        'kernel': svm_gscv.cv_results_['param_svm__kernel'],
        'C': svm_gscv.cv_results_['param_svm__C'],
        'gamma': svm_gscv.cv_results_['param_svm__gamma'],
        'class_weight': svm_gscv.cv_results_['param_svm__class_weight'],
        'mean_train_score': svm_gscv.cv_results_['mean_train_score'],
        'mean_test_score': svm_gscv.cv_results_['mean_test_score'],
        'std_test_score': svm_gscv.cv_results_['std_test_score'],
        'params': svm_gscv.cv_results_['params']
    })
    .sort_values(by='mean_test_score', ascending=False)
)

In [None]:
df_results.head()