# Machine Learning 101

__Narrow AI__: A system that can do just one or a few things well.

In normal programming, we make an algorithm by programming in a set of rules.

In ML, we create algorithms that learn complex functions (patterns) from data and make predictions based on those patterns.

1. Take in some data
2. Learn a pattern from it
3. Make predictions on new data based on these patterns
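
The three steps above can be sketched with a toy example. This is only an illustration: the fruit weights, the function names `learn_threshold` and `predict`, and the "midpoint between class averages" rule are all assumptions made up for this sketch, not a standard algorithm.

```python
def learn_threshold(examples):
    """Learn a weight cut-off: the midpoint between the two class averages."""
    apples = [w for w, label in examples if label == "apple"]
    oranges = [w for w, label in examples if label == "orange"]
    return (sum(apples) / len(apples) + sum(oranges) / len(oranges)) / 2

def predict(threshold, weight):
    """Predict a label from weight alone (assumes apples are the lighter fruit)."""
    return "apple" if weight < threshold else "orange"

# 1. Take in some data (hypothetical weights in grams).
training_data = [(150, "apple"), (160, "apple"), (200, "orange"), (210, "orange")]

# 2. Learn a pattern from it.
threshold = learn_threshold(training_data)

# 3. Make predictions on new data.
print(threshold)                # 180.0
print(predict(threshold, 155))  # apple
print(predict(threshold, 205))  # orange
```

Note that nobody wrote the rule "apples weigh less than 180 g" by hand; the cut-off came from the data.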

## Deep Learning

__Deep learning__: A technique for implementing machine learning.

E.g., deep neural networks.

Deep learning is a subset of machine learning, which is itself a subset of AI: {AI {Machine learning {Deep learning}}}

__Properties/Attributes__ (also called features): The properties of the thing you are trying to learn about. E.g., if you are trying to learn about a piece of fruit, its properties might be colour and weight.

Each property could be thought of as a dimension, and you could then plot the dimensions to classify things.

<img src="images/machine_learning_101/appleorange.jpg" alt="" style="width: 400px;"/>

This means it's important to choose features that will enable you to make good predictions.

## Dimensionality
Adding more dimensions can help to classify data better. However, too many dimensions can cause __overfitting__: the model knows the example data perfectly, but it can't generalize well enough to make predictions for new data.

<img src="images/machine_learning_101/dimensions.jpg" alt="" style="width: 400px;"/>
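
A rough way to see overfitting from extra dimensions is a model that simply memorizes combinations of feature values. The sketch below is a deliberately crude stand-in for a real learner: the fruit data, the feature choices, and the helper names `fit_lookup` and `accuracy` are all hypothetical.

```python
from collections import Counter, defaultdict

def fit_lookup(examples, n_features):
    """Memorize the majority label for each combination of the first n_features."""
    votes = defaultdict(Counter)
    for features, label in examples:
        votes[features[:n_features]][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

def accuracy(table, examples, n_features):
    """Fraction of examples labelled correctly (unseen combinations count as wrong)."""
    hits = sum(table.get(features[:n_features]) == label for features, label in examples)
    return hits / len(examples)

# Hypothetical fruit: (colour, weight in grams, number of scratches, sticker id).
train = [
    (("red", 150, 2, 17), "apple"),
    (("red", 160, 0, 4), "apple"),
    (("orange", 200, 1, 9), "orange"),
    (("orange", 210, 3, 12), "orange"),
]
new_fruit = [
    (("red", 155, 5, 30), "apple"),
    (("orange", 205, 2, 21), "orange"),
]

for n in (1, 4):
    table = fit_lookup(train, n)
    print(n, accuracy(table, train, n), accuracy(table, new_fruit, n))
# 1 1.0 1.0   (colour only: generalizes to new fruit)
# 4 1.0 0.0   (all four features: perfect on training data, useless on new data)
```

With four dimensions every training example becomes unique, so the model scores perfectly on the data it has seen and fails on anything new.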

## Training Machines

### Supervised Learning
When the training data is labelled, you can use supervised learning. E.g., you have a list of fruits, each with its colour and weight and a label saying which fruit it is.

### Unsupervised Learning
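When the training data is unlabelled, the machine has to find the structure by itself. Imagine you had 3 clusters on a graph. The machine must learn to recognize and categorize the 3 different clusters without being told which points belong together.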
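
One possible way to find such clusters is k-means, which these notes do not name; it is used here only as an example. The points and the hand-picked starting centroids are hypothetical, and a real implementation would usually choose starting centroids randomly.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabelled points that form 3 visible groups.
points = [
    (1, 1), (1, 2), (2, 1),
    (8, 8), (8, 9), (9, 8),
    (1, 8), (1, 9), (2, 8),
]
centroids, clusters = kmeans(points, [points[0], points[3], points[6]])
print([len(cluster) for cluster in clusters])  # [3, 3, 3]
```

No labels were supplied; the grouping comes purely from which points sit close together.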

### Reinforcement Learning
Learning by trial and error, with reward and punishment. Reward the program when it makes good moves, and punish it (or do nothing) when it makes bad moves.
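
The reward-and-punishment loop can be sketched in a heavily simplified form. Everything here is an assumption for illustration: the three named moves, their fixed rewards, and the `train` function. Real reinforcement learning also has to deal with states, delayed rewards, and random exploration, none of which appear below.

```python
def train(rewards, rounds=20, learning_rate=0.5):
    """Try every move repeatedly and nudge its estimated value toward the reward received."""
    values = {move: 0.0 for move in rewards}
    for _ in range(rounds):
        for move in rewards:        # trial...
            reward = rewards[move]  # ...and feedback from the environment
            values[move] += learning_rate * (reward - values[move])
    return values

# Hypothetical environment: +1 rewards a good move, -1 punishes a bad one, 0 does nothing.
rewards = {"good move": 1.0, "bad move": -1.0, "neutral move": 0.0}
values = train(rewards)
best = max(values, key=values.get)
print(best)  # good move
```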

## Approaches

### Regression
Imagine you have a plot and you draw a line of best fit through it. You could then predict new values using this line. The machine works out what the line is by itself.

Alternatively, we could use a line to divide categories of data. Then, given some new data point, we could just check which side of the line it falls on.
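
Both uses of a line can be sketched in a few lines of plain Python. The data and the helper names `fit_line` and `side_of_line` are made up for this example; the fit uses the ordinary least-squares formula, which is one common way to get a line of best fit.

```python
def fit_line(xs, ys):
    """Least-squares line of best fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

def side_of_line(slope, intercept, point):
    """Classify a point by whether it lies above or below y = slope * x + intercept."""
    x, y = point
    return "above" if y > slope * x + intercept else "below"

# Hypothetical data lying exactly on y = 2x + 1.
xs, ys = [0, 1, 2, 3, 4], [1, 3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)       # 2.0 1.0
print(slope * 5 + intercept)  # 11.0 (prediction for a new x)

# Using a line (here y = x) as a boundary between two categories.
print(side_of_line(1.0, 0.0, (2, 5)))  # above
print(side_of_line(1.0, 0.0, (5, 2)))  # below
```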

### k-nearest neighbour
Check the nearest points to classify your data. E.g., in the diagram below, if we put a new point in the bottom right, it would be classified as blue, since the majority of its nearest neighbours are blue.

<img src="images/machine_learning_101/knearest.jpg" alt="" style="width: 400px;"/>
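
A minimal sketch of the idea, assuming 2-D points and straight-line (Euclidean) distance. The coordinates are hypothetical and are not taken from the diagram; `knn_classify` is a name invented for this example.

```python
from collections import Counter
import math

def knn_classify(training, new_point, k=3):
    """Label new_point by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], new_point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical points: red in the bottom left, blue in the bottom right.
training = [
    ((1, 1), "red"), ((2, 1), "red"), ((1, 2), "red"),
    ((8, 1), "blue"), ((9, 2), "blue"), ((8, 3), "blue"),
]
print(knn_classify(training, (7, 2)))  # blue
print(knn_classify(training, (2, 2)))  # red
```

There is no training step as such: the "model" is just the stored examples plus a distance measure.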


### Neural Networks

#### Artificial neuron (perceptron)
An artificial neuron takes in a bunch of weighted inputs: take a bunch of input values (say, pixel colours), multiply each by its weight, and add the results together. I.e., take a weighted sum of the input values. Then add a bias. If weighted sum + bias is greater than some threshold, the neuron activates and gives an output. This output becomes the input for another neuron, and the process repeats. The strength of the output depends on the activation function.

<img src="images/machine_learning_101/neuron.jpg" alt="" style="width: 600px;"/>
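
In code, a single neuron of this kind might look like the sketch below. The input values and weights are hypothetical, and the all-or-nothing step activation is just one choice; other activation functions (such as a sigmoid) give a graded output instead of 0 or 1.

```python
def neuron(inputs, weights, bias, threshold=0.0):
    """Weighted sum of the inputs plus a bias, passed through a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > threshold else 0

# Hypothetical pixel intensities and weights.
pixels = [0.5, 1.0, 0.0]
weights = [0.4, 0.6, -0.9]
print(neuron(pixels, weights, bias=-0.5))  # 1  (0.2 + 0.6 + 0.0 - 0.5 is about 0.3, above 0)
print(neuron(pixels, weights, bias=-1.0))  # 0  (0.8 - 1.0 is about -0.2, not above 0)
```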

A network of these neurons might look something like this:

<img src="images/machine_learning_101/network.jpg" alt="" style="width: 600px;"/>

### Deep Neural Networks
Deep neural networks are networks with lots of hidden layers.
The hidden layers generally decrease in dimensionality in order to avoid overfitting.

For example, when recognizing faces, the first layer might identify edges, the second might identify face parts from the edges, and the third might identify whole faces from the face parts.

<img src="images/machine_learning_101/dnn.jpg" alt="" style="width: 400px;"/>
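
As a sketch of how data flows through such a network, the code below passes an input through layers of shrinking size (4 -> 3 -> 2 -> 1). The weights are hand-picked, hypothetical numbers rather than learned ones, and ReLU is assumed as the activation function purely to keep the arithmetic easy to follow.

```python
def relu(values):
    """ReLU activation: negative sums become 0, positive sums pass through."""
    return [max(0.0, float(v)) for v in values]

def layer(inputs, weights, biases):
    """One fully connected layer: each row of weights feeds one neuron."""
    return relu([
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, biases)
    ])

def forward(inputs, layers):
    """Pass the inputs through each layer in turn; outputs become the next inputs."""
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

# Hypothetical hand-picked (weights, biases) for a 4 -> 3 -> 2 -> 1 network.
network = [
    ([[1, 0, 0, 1], [0, 1, 1, 0], [1, -1, 1, -1]], [0, 0, 0]),
    ([[1, 1, 0], [0, 1, -1]], [0, 1]),
    ([[1, -1]], [0]),
]
print(forward([1, 2, 3, 4], network))  # [4.0]
```

The intermediate outputs are `[5.0, 5.0, 0.0]` after the first layer and `[10.0, 6.0]` after the second, so each layer works only with what the previous one produced.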

### Convolutional Neural Networks
Convolutional neural nets can match _parts_ of an image instead of the whole thing.

<img src="images/machine_learning_101/partmatch.jpg" alt="" style="width: 400px;"/>

__Features__ here are parts of the image, like the upward- and downward-slanting lines of an X.

<img src="images/machine_learning_101/ccnfeature.jpg" alt="" style="width: 400px;"/>

If you choose the right feature and put it in the right place, it matches the image exactly.

1. Line up the feature with the image (pixel grids)
2. Multiply each image pixel by the corresponding feature pixel
3. Add them all up
4. Divide by the total number of pixels in the feature

If you get a 1, the feature matches the image exactly at that position, so put a 1 there to record that the feature was found in this position. This process is known as __filtering__.

By moving the feature around to every position and checking like this, we do __convolution__.
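
The filtering steps and the sliding can be sketched as follows. This assumes pixels are encoded as +1 (on) and -1 (off), which is what makes an exact match average out to 1; the tiny image, the 2x2 feature, and the function names are hypothetical.

```python
def match_score(image, feature, top, left):
    """Multiply overlapping pixels, add them up, and divide by the pixel count."""
    size = len(feature)
    total = sum(
        image[top + r][left + c] * feature[r][c]
        for r in range(size)
        for c in range(size)
    )
    return total / (size * size)

def convolve(image, feature):
    """Slide the feature over every position and record the match score there."""
    size = len(feature)
    return [
        [match_score(image, feature, top, left) for left in range(len(image[0]) - size + 1)]
        for top in range(len(image) - size + 1)
    ]

# Hypothetical 4x4 image containing a diagonal stroke, and a 2x2 diagonal feature.
image = [
    [ 1, -1, -1, -1],
    [-1,  1, -1, -1],
    [-1, -1,  1, -1],
    [-1, -1, -1,  1],
]
feature = [
    [ 1, -1],
    [-1,  1],
]
for row in convolve(image, feature):
    print(row)
# [1.0, -0.5, 0.0]
# [-0.5, 1.0, -0.5]
# [0.0, -0.5, 1.0]
```

The 1.0 entries run down the diagonal, which is exactly where the diagonal feature appears in the image.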

(This is a whole topic in itself that is probably worth covering later.)

## ML algorithms

### Regression
Iteratively model the relationship between values, using some measure of error to refine the model.
Fitting a best-fit straight line is a form of linear regression.
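
One way to make the "iteratively ... using some measure of error" part concrete is gradient descent on the mean squared error. That specific choice of method and error measure is an assumption for this sketch, as are the data, the learning rate, and the step count.

```python
def fit_by_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = slope * x + intercept by repeatedly reducing the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs, ys = [0, 1, 2, 3, 4], [1, 3, 5, 7, 9]
slope, intercept = fit_by_gradient_descent(xs, ys)
print(round(slope, 3), round(intercept, 3))  # 2.0 1.0
```

Each pass measures how wrong the current line is and moves the slope and intercept a little in the direction that reduces the error.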

### Instance-based
A decision problem solved using _instances_ (examples) of training data that are deemed important to the model.

* Build up a database of example data
* Compare new data to the database using some similarity measure
* Find a best match and make a prediction

E.g., k-nearest neighbour.

### Bayesian
Use Bayes' Theorem to make predictions.
E.g., words used to classify an email as spam or not.
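
For a single word, the spam calculation might be sketched like this. The probabilities and the word "prize" are invented for illustration, and a real spam filter would combine evidence from many words rather than one.

```python
def posterior_spam(p_spam, p_word_given_spam, p_word_given_ham):
    """Bayes' Theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)."""
    p_ham = 1 - p_spam
    p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / p_word

# Hypothetical numbers: 40% of mail is spam; the word "prize" appears in
# 50% of spam and 10% of legitimate mail.
p = posterior_spam(p_spam=0.4, p_word_given_spam=0.5, p_word_given_ham=0.1)
print(round(p, 3))  # 0.769
```

Seeing the word raises the estimated chance of spam from 40% to about 77%.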

### Clustering
Organize data into groups based on commonality and then classify it.

### Association rules
Extract rules that explain relationships between variables.
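
Two measures commonly used to score such rules are support and confidence. They are not mentioned in these notes, so treat them, the shopping-basket data, and the function names below as assumptions for this sketch.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, if_items, then_items):
    """Of the transactions containing if_items, the fraction that also contain then_items."""
    return support(transactions, if_items | then_items) / support(transactions, if_items)

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "jam"},
]
print(support(baskets, {"bread", "butter"}))                 # 0.5
print(round(confidence(baskets, {"bread"}, {"butter"}), 2))  # 0.67
```

Here the candidate rule "if bread, then butter" holds in two of the three baskets that contain bread.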

### Neural networks/Deep learning
Deep learning is essentially using artificial neural networks (ANNs) to build deeper, more complex networks.

### Dimensionality reduction
Like clustering, this uses the inherent structure of the data.
It is unsupervised, and is used to summarize or describe data.

### Ensemble
Combine multiple weak models to make a prediction.
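
A simple way to combine models is a majority vote, sketched below. Voting is only one of several ensemble strategies, and the three single-property fruit models and their cut-offs are hypothetical.

```python
from collections import Counter

def majority_vote(models, sample):
    """Ask every model for a label and return the most common answer."""
    votes = Counter(model(sample) for model in models)
    return votes.most_common(1)[0][0]

# Three hypothetical weak models, each looking at a single property of a fruit.
def by_colour(fruit):
    return "apple" if fruit["colour"] == "red" else "orange"

def by_weight(fruit):
    return "apple" if fruit["weight"] < 180 else "orange"

def by_texture(fruit):
    return "apple" if fruit["texture"] == "smooth" else "orange"

models = [by_colour, by_weight, by_texture]
heavy_apple = {"colour": "red", "weight": 190, "texture": "smooth"}
print(by_weight(heavy_apple))              # orange (this model alone gets it wrong)
print(majority_vote(models, heavy_apple))  # apple
```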

## ML Output

### Continuous output
Output is a decimal number.

### Probability Estimation
Output is a decimal number between 0 and 1, indicating probability.
E.g., an output of 0.754 for the label 9 means a 75.4% chance that this is a 9.
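
One common way to turn a model's raw scores into probabilities like this is the softmax function. The notes do not name it, so it is an assumption here, as are the raw scores below.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities between 0 and 1 that sum to 1."""
    highest = max(scores.values())  # subtracted for numerical stability
    exps = {label: math.exp(s - highest) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical raw scores for which digit an image shows.
scores = {"4": 1.0, "7": 0.5, "9": 2.5}
probabilities = softmax(scores)
best = max(probabilities, key=probabilities.get)
print(best, round(probabilities[best], 3))  # the most likely label and its probability
```

Picking the label with the highest probability turns a probability estimate into a classification.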

### Classification
Output is a label. E.g.:

* Input: fur, yellow, barks
* Output: dog

