<h1>Chapter 4 - Fundamentals of Machine Learning</h1>

<h2>This chapter covers</h2>
<ul>
<li>Forms of machine learning beyond classification and regression</li>
<li>Formal evaluation procedures for machine learning models</li>
<li>Preparing Data for Deep Learning</li>
<li>Feature engineering</li>
<li>Tackling Overfitting</li>
<li>The universal workflow for approaching machine learning problems</li>
</ul>

<h3>4.1.1 Supervised learning</h3>
<p>
Almost all applications of deep learning that are in the spotlight these days belong in this category,
such as:
</p>

<ul>
<li>Optical Character Recognition</li>
<li>Speech Recognition</li>
<li>Image Classification</li>
<li>Language Translation</li>
<li>Sequence Generation</li>
<li>Syntax Tree Prediction</li>
<li>Object Detection</li>
<li>Image Segmentation</li>
</ul>

<h3>4.1.2 Unsupervised learning</h3>
<p>
Unsupervised learning consists of finding interesting transformations of the input data without the help of any targets,
typically for one of the purposes listed below, or more generally to better understand the correlations present in the data at hand:
</p>

<ul>
<li>Data Visualization</li>
<li>Data Compression</li>
<li>Data Denoising</li>
<li>Dimensionality Reduction</li>
<li>Clustering</li>
</ul>

<h3>Classification and Regression Glossary</h3>

<ul>
<li><i style="color: yellow">Sample or Input</i> - One Data point that goes into your model</li>
<li><i style="color: yellow">Prediction or Output</i> - What comes out of your model</li>
<li><i style="color: yellow">Target</i> - The truth. What your model should ideally have predicted, according to an external source of data</li>
<li><i style="color: yellow">Prediction Error or Loss Value</i> - A measure of the distance between your model's prediction and the target</li>
<li><i style="color: yellow">Classes</i> - A set of possible labels to choose from in a classification problem</li>
<li><i style="color: yellow">Label</i> - A specific instance of a class annotation in a classification problem. <br>For instance, if picture #123 is annotated as containing the class 'dog', then 'dog' is a label of picture #123</li>
<li><i style="color: yellow">Ground-truth or Annotations</i> - All targets for a dataset, typically collected by humans</li>
<li><i style="color: yellow">Binary classification</i> - A classification task where each input sample should be categorized into two exclusive categories</li>
<li><i style="color: yellow">Multiclass classification</i> - A classification task where each input sample should be categorized <br> into more than two categories: for instance, classifying handwritten digits.</li>
<li><i style="color: yellow">Multilabel classification</i> - A classification task where each input sample can be assigned multiple labels. <br> For instance, a given image may contain both a cat and a dog and should be annotated with both the cat label and the dog label.</li>
<li><i style="color: yellow">Scalar regression</i> - A task where the target is a continuous scalar value. Predicting house prices is a good example.</li>
<li><i style="color:yellow">Vector regression</i> - A task where the target is a set of continuous values: for example, a continuous vector. <br> If you're doing regression against multiple values (such as the coordinates of a bounding box in an image), then you're doing vector regression.</li>
</ul>
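
<p>
The target types in the glossary above can be made concrete with a small NumPy sketch. The class list, the helper names <code>one_hot</code> and <code>multi_hot</code>, and all values are assumptions chosen for illustration; they do not come from a specific dataset.
</p>

```python
import numpy as np

# Hypothetical class list, for illustration only.
classes = ["cat", "dog", "bird"]
class_index = {name: i for i, name in enumerate(classes)}


def one_hot(label):
    """Multiclass target: exactly one class per sample."""
    vec = np.zeros(len(classes), dtype="float32")
    vec[class_index[label]] = 1.0
    return vec


def multi_hot(labels):
    """Multilabel target: any number of classes per sample."""
    vec = np.zeros(len(classes), dtype="float32")
    for label in labels:
        vec[class_index[label]] = 1.0
    return vec


multiclass_target = one_hot("dog")             # one entry set to 1
multilabel_target = multi_hot(["cat", "dog"])  # two entries set to 1

# Regression targets are continuous values rather than class indicators.
scalar_regression_target = np.float32(250000.0)                   # e.g. a house price
vector_regression_target = np.array([10.0, 20.0, 110.0, 220.0])   # e.g. bounding-box coordinates
```

<p>
A binary classification target is the special case with two exclusive classes, usually stored as a single 0 or 1 per sample.
</p>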

<h4>4.1 Hold-out validation</h4>

In [56]:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.datasets import boston_housing
import numpy as np
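
<p>
Hold-out validation sets apart some fraction of the data as a validation set, trains on the rest, and evaluates on the held-out part. The sketch below shows only the split, using NumPy; the function name <code>hold_out_split</code>, the seed, and the toy arrays are assumptions for illustration, and model training and evaluation are left out.
</p>

```python
import numpy as np


def hold_out_split(data, targets, num_validation_samples, seed=0):
    """Shuffle the samples, then reserve the first ones for validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(data))  # shuffled sample indices
    val_idx = indices[:num_validation_samples]
    train_idx = indices[num_validation_samples:]
    return (data[train_idx], targets[train_idx]), (data[val_idx], targets[val_idx])


# Toy data: 10 samples with 2 features each; sample i has target i.
data = np.arange(20, dtype="float32").reshape(10, 2)
targets = np.arange(10)

(train_x, train_y), (val_x, val_y) = hold_out_split(
    data, targets, num_validation_samples=3
)
```

<p>
Shuffling samples and targets with the same index permutation keeps each sample paired with its own target.
</p>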

<h4>4.3 Original Model</h4>

In [10]:
normal = Sequential()
normal.add(Dense(16, activation='relu', input_shape=(1000, )))
normal.add(Dense(16, activation='relu'))
normal.add(Dense(1, activation='sigmoid'))

<h4>4.4 Version of the model with lower capacity</h4>

In [11]:
lower = Sequential()
lower.add(Dense(4, activation='relu', input_shape=(1000, )))
lower.add(Dense(4, activation='relu'))
lower.add(Dense(1, activation='sigmoid'))

<h4>4.5 Version of the model with higher capacity</h4>

In [13]:
higher = Sequential()
higher.add(Dense(512, activation='relu', input_shape=(1000, )))
higher.add(Dense(512, activation='relu'))
higher.add(Dense(1, activation='sigmoid'))
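
<p>
One rough way to compare the capacity of the three models above is to count their trainable parameters. The helper below is a hypothetical illustration, not a Keras function. It relies on the fact that a Dense layer with bias holds <code>inputs * units + units</code> parameters.
</p>

```python
def dense_param_count(input_dim, layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for units in layer_sizes:
        total += input_dim * units + units  # weight matrix plus bias vector
        input_dim = units                   # this layer's output feeds the next
    return total


normal_params = dense_param_count(1000, [16, 16, 1])
lower_params = dense_param_count(1000, [4, 4, 1])
higher_params = dense_param_count(1000, [512, 512, 1])
```

<p>
With a 1000-dimensional input, this gives 16,305 parameters for the original model, 4,029 for the lower-capacity version, and 775,681 for the higher-capacity version.
</p>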

<h4>4.6 Adding L2 weight regularization to the model</h4>

In [16]:
from keras import regularizers

L2 = Sequential()
L2.add(Dense(16, kernel_regularizer=regularizers.l2(0.001), activation='relu', input_shape=(1000,)))
L2.add(Dense(16, kernel_regularizer=regularizers.l2(0.001), activation='relu'))
L2.add(Dense(1,  activation='sigmoid'))

# l2(0.001) means every coefficient in the weight matrix of the layer will add
# 0.001 * weight_coefficient_value ** 2 to the total loss of the network.
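
<p>
The L2 penalty is the regularization factor times the sum of the squared weights. The NumPy sketch below computes that quantity by hand for a small weight matrix; the function name <code>l2_penalty</code> and the weight values are assumptions for illustration.
</p>

```python
import numpy as np


def l2_penalty(weights, factor=0.001):
    """Regularization term added to the loss: factor * sum of squared weights."""
    return factor * np.sum(np.square(weights))


# Made-up weights, for illustration only.
weights = np.array([[1.0, -2.0], [3.0, -4.0]])
penalty = l2_penalty(weights)  # 0.001 * (1 + 4 + 9 + 16) = 0.03
```

<p>
Because the weights are squared, large coefficients are penalized much more heavily than small ones, which pushes the network toward smaller weight values.
</p>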

<h3>4.4.3 Adding Dropout</h3>

In [55]:
layer_output = np.array([
  [0.3, 0.2, 1.5, 0.0],
  [0.6, 0.1, 0.0, 0.3],
  [0.2, 1.9, 0.3, 1.2],
  [0.7, 0.5, 1.0, 0.0]
  ])
layer_output

array([[0.3, 0.2, 1.5, 0. ],
       [0.6, 0.1, 0. , 0.3],
       [0.2, 1.9, 0.3, 1.2],
       [0.7, 0.5, 1. , 0. ]])

In [54]:
# At training time, drops out roughly 50% of the units in the output.
# The mask is random, so the zeroed entries differ from run to run.
layer_output *= np.random.randint(0, high=2, size=layer_output.shape)
layer_output  # displays the masked array; output omitted because it is random
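
<p>
Zeroing units at training time lowers the expected magnitude of the layer's output. A common variant, often called inverted dropout, compensates by scaling the surviving units up by <code>1 / (1 - rate)</code> during training so that nothing needs to change at test time. The sketch below assumes a 4 x 4 activation matrix; the function name <code>inverted_dropout</code> and the fixed seed are assumptions for illustration.
</p>

```python
import numpy as np


def inverted_dropout(activations, rate=0.5, seed=0):
    """Zero each unit with probability `rate` and rescale the survivors."""
    rng = np.random.default_rng(seed)
    keep_mask = rng.random(activations.shape) >= rate  # True where a unit is kept
    return activations * keep_mask / (1.0 - rate)


layer_output = np.array([
    [0.3, 0.2, 1.5, 0.0],
    [0.6, 0.1, 0.0, 0.3],
    [0.2, 1.9, 0.3, 1.2],
    [0.7, 0.5, 1.0, 0.0],
])

dropped = inverted_dropout(layer_output, rate=0.5)
```

<p>
With <code>rate=0.5</code>, every entry of <code>dropped</code> is either zero or twice the original value.
</p>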

In [57]:
dropout = Sequential()
dropout.add(Dense(16, activation='relu', input_shape=(1000,)))
dropout.add(Dropout(0.5))
dropout.add(Dense(16, activation='relu'))
dropout.add(Dropout(0.5))
dropout.add(Dense(1,  activation='sigmoid'))

|Problem Type| Last-layer activation | Loss function |
|------------|------------|------|
|Binary classification| sigmoid | binary_crossentropy |
|Multiclass, single-label classification | softmax | categorical_crossentropy |
|Multiclass, multilabel classification | sigmoid | binary_crossentropy |
|Regression to arbitrary values | None | mse |
|Regression to values between 0 and 1 | sigmoid | mse or binary_crossentropy |
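
<p>
The choice between softmax and sigmoid in the table above follows from what each activation produces. Softmax yields one probability distribution over mutually exclusive classes, while sigmoid scores each class independently, which suits binary and multilabel problems. The NumPy sketch below illustrates the difference on made-up raw scores; it is not the Keras implementation.
</p>

```python
import numpy as np


def sigmoid(x):
    """Squash each score independently into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    """Turn scores into a probability distribution that sums to 1."""
    shifted = x - np.max(x, axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


# Hypothetical raw scores for three classes.
logits = np.array([2.0, 1.0, -1.0])

softmax_probs = softmax(logits)  # entries compete: they sum to 1
sigmoid_probs = sigmoid(logits)  # entries are independent: no fixed sum
```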

<h3>4.5.6 Scaling up: Developing a model that overfits</h3>
<ol>
<li>Add layers</li>
<li>Make the layers bigger</li>
<li>Train for more epochs</li>
</ol>
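
<p>
A usual sign that the steps above have produced an overfitting model is that the validation loss stops improving while the training loss keeps falling. The sketch below finds the epoch with the lowest validation loss. The function name <code>best_epoch</code> and the loss values are made up for illustration and are not results from any model in this chapter.
</p>

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Made-up loss curves, for illustration only.
train_losses = [0.60, 0.45, 0.35, 0.27, 0.20, 0.15]
val_losses = [0.62, 0.50, 0.44, 0.46, 0.51, 0.58]

epoch = best_epoch(val_losses)
# If the best epoch is not the last one, validation loss rose afterwards,
# which suggests the model started to overfit.
overfits = epoch < len(val_losses)
```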

<h3>4.5.7 Regularizing your model and tuning your hyperparameters</h3>
<ol>
<li>Add Dropout</li>
<li>Try different architectures - add or remove layers</li>
<li>Add L1 and/or L2 regularization</li>
<li>Try different hyperparameters</li>
<li>Add new features, or remove features that don't seem to be informative</li>
</ol>