# Fundamentals

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lukeconibear/intro_ml/blob/main/docs/01_fundamentals.ipynb)

In [1]:
# if you're using colab, then install the required modules
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
    %pip install ...

Machine learning and deep learning are large and growing fields.

This course does not attempt to cover them in detail.

Instead, this course aims to provide high-level intuitions and practical guidance to get started.

To learn more, see the {ref}`Online Courses <online_courses>` below.

## Basic ideas

### Overview

Machine learning is a subset of Artificial Intelligence.

It is a range of methods that learn associations from data.

These can be useful for:

- Prediction problems (e.g., pattern recognition).
- Problems that you cannot program explicitly (e.g., image recognition).
- Faster approximations to problems that you can program (e.g., spam classification).

### Methods

Within machine learning, there are many different methods.

The main methods are:

- Classic
- Deep learning (i.e., neural networks)
- Reinforcement learning
- Ensembles

We'll focus on classic machine learning and deep learning in this course.

### Data

Data has inputs and outputs.

The inputs are what you provide to the model.

The outputs are what you're trying to predict.

The data is normally in the form of tensors.

Tensors are multi-dimensional arrays (e.g., vectors are 1D tensors and matrices are 2D tensors).
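
As a minimal sketch of tensors with different numbers of dimensions, the NumPy arrays below illustrate the idea. The variable names and the `images` shape are assumptions chosen for illustration, not part of any data set used in this course.

```python
import numpy as np

# Illustrative tensors of increasing dimensionality (names are hypothetical).
scalar = np.array(5.0)                        # 0D tensor: a single number
vector = np.array([1.0, 2.0, 3.0])            # 1D tensor
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2D tensor
images = np.zeros((10, 28, 28))               # 3D tensor: 10 greyscale images of 28x28 pixels

for name, tensor in [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("images", images)]:
    print(f"{name}: ndim={tensor.ndim}, shape={tensor.shape}")
```

The `ndim` attribute gives the number of dimensions and `shape` gives the size along each one.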

#### Supervised and unsupervised

- Supervised learning is when you provide labelled outputs to learn from.
- Unsupervised learning is when you don't provide any labels.

We'll focus on supervised learning in this course.

#### Classification and regression

- Classification problems are those that try to predict a category (e.g., cat or dog).
- Regression problems are those that try to predict a number (e.g., the number of beans in a jar).

#### Training, validation, and test splits

The data is normally split into training, validation, and test sets.

- The training set is for training the model.
- The validation set (optional) is for iteratively optimising the model during training.
- The test set is only for testing the model at the end.
    - This should remain untouched and single-use.

The size of each split depends on the size of the data set and the strength of the signal you're trying to predict (e.g., the smaller the signal, the larger the test set needs to be).

- For small data sets, a split of 60/20/20 for train/validation/test may be suitable.
- For large data sets, a split of 90/5/5 for train/validation/test may be suitable.
- For very large data sets, a split of 98/1/1 for train/validation/test may be suitable.
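
One possible way to make such a split is sketched below with NumPy. The helper `split_data`, its parameter names, and the toy arrays are hypothetical and only for illustration (libraries such as scikit-learn provide their own splitting utilities).

```python
import numpy as np

def split_data(inputs, outputs, validation_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle the samples, then split them into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    num_samples = len(inputs)
    indices = rng.permutation(num_samples)  # shuffled sample indices

    num_test = int(num_samples * test_fraction)
    num_validation = int(num_samples * validation_fraction)

    test_idx = indices[:num_test]
    validation_idx = indices[num_test:num_test + num_validation]
    train_idx = indices[num_test + num_validation:]

    return (
        (inputs[train_idx], outputs[train_idx]),
        (inputs[validation_idx], outputs[validation_idx]),
        (inputs[test_idx], outputs[test_idx]),
    )

# Toy data (an assumption): 50 samples with 2 features each.
inputs = np.arange(100).reshape(50, 2)
outputs = np.arange(50)

train, validation, test = split_data(inputs, outputs)
print(len(train[0]), len(validation[0]), len(test[0]))  # 30 10 10
```

Shuffling before splitting helps each set be representative of the whole, though this may not be appropriate for ordered data such as time series.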

### Models

...

#### Evaluation

...

#### Underfit

A model _underfits_ the data when it has _high bias_. 

This means the model is _too simple_ to capture the association.

You can tell that the model underfits because there are _both_ high training errors and high test errors.

To reduce this problem, try:

- Adding more features.
- Adding more complex features.
- Decreasing regularisation.

More training data is unlikely to help a model that underfits the data.

#### Overfit

A model _overfits_ the data when it has _high variance_. 

This means the model is _too complex_ and fits the noise in the training data rather than the underlying association.

You can tell that the model overfits because there are _low_ training errors _but_ high test errors.

To reduce this problem, try:

- Adding more data.
- Using fewer or simpler features.
- Increasing regularisation.
- A smaller neural network with fewer layers/parameters.
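
To make underfitting and overfitting more concrete, here is a rough sketch that fits polynomials of different degrees to noisy toy data. The data (a noisy sine curve), the degrees, and the helper names are all assumptions for illustration. A degree that is too low would be expected to show high training and test errors, while a degree that is too high will typically show a low training error but a higher test error.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(num_samples):
    """Toy data (an assumption): a sine curve plus Gaussian noise."""
    x = rng.uniform(0, 1, num_samples)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, num_samples)
    return x, y

def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

errors = {}
for degree in (1, 3, 12):
    # Least-squares polynomial fit of the given degree on the training set only.
    model = Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (
        mean_squared_error(y_train, model(x_train)),
        mean_squared_error(y_test, model(x_test)),
    )
    print(f"degree={degree:2d}  train MSE={errors[degree][0]:.3f}  test MSE={errors[degree][1]:.3f}")
```

Comparing the training and test errors side by side, as in the printed table, is the same check described above for diagnosing whether a model underfits or overfits.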

### Caveats

- Predictions are primarily based on associations, not explanations or causation.
- Predictions and models are specific to the data they were trained on.

- cost function
- gradient descent
- error analysis

R² (coefficient of determination):

- Can take any value up to 1, with no lower bound, as a model can be arbitrarily bad.
- 1 is a perfect fit.
- 0 means the model provides no more information than always predicting the mean.
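
The coefficient of determination described above can be computed by hand, as in this small sketch. The helper `r_squared` is hypothetical and written for illustration (scikit-learn provides its own implementation).

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual_sum_of_squares = np.sum((y_true - y_pred) ** 2)
    total_sum_of_squares = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - residual_sum_of_squares / total_sum_of_squares

y_true = np.array([1.0, 2.0, 3.0, 4.0])

print(r_squared(y_true, y_true))                          # 1.0 (perfect predictions)
print(r_squared(y_true, np.full(4, y_true.mean())))       # 0.0 (always predicting the mean)
print(r_squared(y_true, np.array([4.0, 3.0, 2.0, 1.0])))  # -3.0 (worse than predicting the mean)
```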

### Deep learning

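The steps to train a neural network are:

1. Provide the inputs.
2. Forward propagate to predict the outputs.
3. Compute the loss.
4. Backward propagate to compute the gradients.
5. Use gradient descent to update the weights and biases.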
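
The steps above can be sketched for the simplest possible case: a single linear layer trained with NumPy. The toy data, learning rate, and number of steps are assumptions chosen for illustration, and the gradients are derived by hand here, whereas deep learning libraries compute them automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption): y = 3x + 2 plus a little noise.
inputs = rng.uniform(-1, 1, size=(100, 1))
targets = 3 * inputs + 2 + rng.normal(0, 0.1, size=(100, 1))

weights = np.zeros((1, 1))
bias = np.zeros(1)
learning_rate = 0.1

for step in range(500):
    # Forward propagate to predict the outputs.
    predictions = inputs @ weights + bias

    # Compute the loss (mean squared error).
    errors = predictions - targets
    loss = np.mean(errors ** 2)

    # Backward propagate: gradients of the loss w.r.t. the weights and bias.
    grad_weights = 2 * inputs.T @ errors / len(inputs)
    grad_bias = 2 * errors.mean(axis=0)

    # Gradient descent: update the weights and biases.
    weights -= learning_rate * grad_weights
    bias -= learning_rate * grad_bias

print(f"weight={weights[0, 0]:.2f}, bias={bias[0]:.2f}, loss={loss:.4f}")
```

After training, the learned weight and bias should be close to the values used to generate the toy data (3 and 2).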


Scale is driving progress in deep learning:

- Bigger training data (i.e., larger labelled data sets, where _m_ is the number of examples).
- Bigger neural networks.

Investment and attention now drive it forward further.

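Neural networks (NN):

- Are useful for non-linear problems with a large number of features.
- Are easier to compare when you use a single evaluation metric.
- Have a loss function, which is the error on a single training example.
- Have a cost function, which is the average of the loss function over the whole training set.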
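
As a small sketch of the difference between the loss and the cost, the example below uses the squared error. The choice of squared error, the function names, and the toy values are assumptions for illustration.

```python
import numpy as np

def squared_error_loss(y_true, y_pred):
    """Loss: the error for each individual training example."""
    return (y_true - y_pred) ** 2

def cost(y_true, y_pred):
    """Cost: the average of the per-example losses over the whole set."""
    return np.mean(squared_error_loss(y_true, y_pred))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.5, 5.0])

per_example_losses = squared_error_loss(y_true, y_pred)  # one value per example: 0, 0.25, 4
total_cost = cost(y_true, y_pred)                        # a single number: (0 + 0.25 + 4) / 3

print(per_example_losses)
print(total_cost)
```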

## Tools

In [None]:
import numpy as np

### [scikit-learn](https://scikit-learn.org/stable/)

### [TensorFlow](https://www.tensorflow.org/)

In [21]:
import tensorflow as tf

In [22]:
if tf.config.list_physical_devices('GPU'):
    print('Using GPU')
else:
    print('Cannot find a GPU')

Cannot find a GPU


In [26]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

In [27]:
def preprocess_data(data):
    data_reshaped = data.reshape((data.shape[0], data.shape[1] * data.shape[2]))
    data_reshaped_normalised = data_reshaped.astype("float32") / 255
    return data_reshaped_normalised

In [28]:
train_images = preprocess_data(train_images)
test_images = preprocess_data(test_images)

In [29]:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])

In [30]:
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"]
)

In [33]:
model.fit(train_images, train_labels, epochs=1, batch_size=128);

 62/469 [==>...........................] - ETA: 1s - loss: 0.1219 - accuracy: 0.9657





In [34]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")

test_acc: 0.9714000225067139


### [PyTorch](https://pytorch.org/)

TODO: repeat the simple example above using PyTorch.

## Exercises

```{admonition} Exercise 1

...

```

## {ref}`Solutions <fundamentals>`

## Key Points

```{important}

- [x] _..._

```

## Further information

### Good practices

- ...

### Other options

- ...
 
### Resources

- [Machine Learning for Everyone](https://vas3k.com/blog/machine_learning/)

(online_courses)=
### Online courses

#### Machine learning

- [Machine learning](https://www.coursera.org/learn/machine-learning), Coursera, Andrew Ng.
    - CS229, Stanford University: [Video lectures](https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU).
- [Artificial Intelligence: A Modern Approach, 4th edition](http://aima.cs.berkeley.edu/), Stuart Russell and Peter Norvig, 2021, Pearson.
- [Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/), Aurélien Géron, 2019, O’Reilly Media, Inc.
    - [Jupyter notebooks](https://github.com/ageron/handson-ml2).
- [Artificial Intelligence: Principles and Techniques](https://www.youtube.com/playlist?list=PLoROMvodv4rO1NB9TD4iUZ3qghGEGtqNX), Percy Liang and Dorsa Sadigh, CS221, Stanford, 2019.
- [Machine Learning Yearning](https://www.deeplearning.ai/programs/), Andrew Ng.

#### Deep learning

- [Deep Learning Specialization](https://www.coursera.org/specializations/deep-learning), Coursera, DeepLearning.AI
    - CS230, Stanford University: [Video lectures](https://www.youtube.com/playlist?list=PLoROMvodv4rOABXSygHTsbvUz4G_YQhOb), [Syllabus](http://cs230.stanford.edu/syllabus/)
- [NYU Deep Learning](https://atcold.github.io/NYU-DLSP21/), Yann LeCun and Alfredo Canziani, NYU, 2021
    - [Video lectures](https://www.youtube.com/playlist?list=PLLHTzKZzVU9e6xUfG10TkTWApKSZCzuBI)
- [NVIDIA Deep Learning and Data Science with GPUs](https://web.cvent.com/event/5f037a53-5be6-4abf-9b48-dc94e8a8ee3a/summary?rt=QZKIZW0GWUGNA6QZE4e55Aummary)
- [Neural Networks for Machine Learning](https://www.youtube.com/playlist?list=PLLssT5z_DsK_gyrQ_biidwvPYCRNGI3iv), Geoffrey Hinton.
- [Deep Learning with Python, 2nd Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff), François Chollet, 2021, Manning.
    - [Jupyter notebooks](https://github.com/fchollet/deep-learning-with-python-notebooks).