# What is machine learning?

The construction and study of algorithms/programs that can <b> learn from data </b>.

![](./images/ml-overview.jpg?raw=true)

# Traditional programming

Steps: formulate problem $\rightarrow$ design algorithm $\rightarrow$ write program $\rightarrow$ test

The program stays the same regardless of the input data, unless the programmer manually changes it.


# Example: minimum finding

Problem
* Given a sequence of numbers, find the smallest one

Algorithm
* Keep track of the current minimum, initialized to $\infty$
* Loop through the numbers one by one
 * If this number $<$ minimum, minimum $\leftarrow$ this number

In [1]:
# a simple Python program to find the minimum of a sequence of numbers
# note that it remains the same regardless of the input data

import math

# function
def findmin(numbers):
    answer = math.inf
    for value in numbers:
        if value < answer:
            answer = value
    return answer

# main
test = [3.14, 2.2, 8, -9.2, 100000, 0]

print(findmin(test))


-9.2


# Example: sorting

Problem
* Given a sequence of numbers, order them from small to large

Algorithm
* Pick one number (randomly) as anchor
* Go through each other number
 * If this number $\leq$ anchor, append it to the left sequence
 * Otherwise, append it to the right sequence
* Apply the same method recursively to the left and right sequences
* Concatenate left sequence, anchor, right sequence

In [2]:
# code is left as an exercise
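
For readers who want to check their own attempt, here is one possible sketch of the algorithm above (commonly known as quicksort). The function name `quicksort` and the use of `random.randrange` to pick the anchor are choices made for this illustration, not part of the original exercise.

```python
import random


def quicksort(numbers):
    """Return a new list with the numbers ordered from small to large."""
    # A sequence with zero or one element is already sorted.
    if len(numbers) <= 1:
        return list(numbers)

    # Pick one number (randomly) as anchor.
    anchor_index = random.randrange(len(numbers))
    anchor = numbers[anchor_index]

    # Go through each other number and split into left/right sequences.
    left, right = [], []
    for index, value in enumerate(numbers):
        if index == anchor_index:
            continue  # skip the anchor itself
        if value <= anchor:
            left.append(value)
        else:
            right.append(value)

    # Sort both sides recursively, then concatenate left, anchor, right.
    return quicksort(left) + [anchor] + quicksort(right)


test = [3.14, 2.2, 8, -9.2, 100000, 0]
print(quicksort(test))  # [-9.2, 0, 2.2, 3.14, 8, 100000]
```

As with `findmin`, this program is fixed: it sorts any input the same way and learns nothing from the data it sees.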

# Other examples
Think about the programs you have written in the past; how many of them can learn from data?

# Machine learning

The program can learn from data and change its structure/behavior

The programmer still writes (part of) the program, but it is not fixed
* Models with parameters that can change with input data, like brains
* Programmer selects and initializes the model; the model parameters change with data (training via optimization)
* The trained model deals with future situations
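
To make the three bullets above concrete, here is a minimal sketch assuming a very simple model: a line $y = wx + b$ with two parameters, trained by gradient descent on the mean squared error. The function name `train`, the data, the learning rate and the number of steps are all illustrative assumptions.

```python
def train(xs, ys, steps=2000, learning_rate=0.05):
    """Fit y = w * x + b to the data by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the programmer selects and initializes the model
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # The parameters change with the data (training via optimization).
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up training data that follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train(xs, ys)
print(w, b)        # close to 2 and 1
print(w * 10 + b)  # the trained model handles an input it has never seen
```

The code of `train` never mentions the numbers 2 or 1; they are recovered from the data. Feeding it different data yields a different model without touching the program.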

[<img src="https://www.wired.com/wp-content/uploads/2016/03/GW20160133774-1024x768.jpg">](http://www.wired.com/2016/03/final-game-alphago-lee-sedol-big-deal-humanity/)

# Why learning from data

Some algorithms/programs are hard or impossible to design and code explicitly by hand

The algorithms/programs might need to deal with unforeseeable situations





# Example: handwritten digit recognition

[<img src="https://camo.githubusercontent.com/d440ac2eee1cb3ea33340a2c5f6f15a0878e9275/687474703a2f2f692e7974696d672e636f6d2f76692f3051493378675875422d512f687164656661756c742e6a7067">](https://github.com/cazala/mnist/blob/master/README.md)

Problem
* Input: a digit represented by a $28 \times 28$ image (MNIST)
* Output: one of the digits in [0 .. 9]

Traditional programming?
* Give it a try :-)

Machine learning
* Collect data - pairs of images and labels
* Select and initialize a model; train the model (parameters) with the data
* The model, if properly trained, can recognize handwritten digits that are not in the training dataset
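
The three steps above can be sketched on a toy problem. Everything in this example is an assumption made for illustration: tiny $5 \times 3$ bitmaps of the digits 0, 1 and 7 stand in for $28 \times 28$ MNIST images, the "collected" data are one-pixel corruptions of those bitmaps, and the model is a simple nearest-mean classifier rather than anything used in practice for MNIST.

```python
from itertools import combinations

# Toy 5x3 bitmaps (row by row) standing in for real handwritten images.
TEMPLATES = {
    0: "111101101101111",
    1: "010010010010010",
    7: "111001001001001",
}


def to_pixels(bitmap):
    return [int(ch) for ch in bitmap]


def flip(pixels, positions):
    """Return a copy of the image with the given pixel positions inverted."""
    return [1 - p if i in positions else p for i, p in enumerate(pixels)]


def make_training_data():
    """Collect (image, label) pairs: every one-pixel corruption of each bitmap."""
    data = []
    for label, bitmap in TEMPLATES.items():
        pixels = to_pixels(bitmap)
        for i in range(len(pixels)):
            data.append((flip(pixels, {i}), label))
    return data


def train(data):
    """Learn one mean image per label; these means are the model parameters."""
    sums, counts = {}, {}
    for image, label in data:
        if label not in sums:
            sums[label] = [0.0] * len(image)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], image)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(model, image):
    """Return the label whose mean image is closest in squared distance."""
    def distance(mean):
        return sum((m - p) ** 2 for m, p in zip(mean, image))
    return min(model, key=lambda label: distance(model[label]))


model = train(make_training_data())

# Evaluate on images the model never saw: every two-pixel corruption.
correct = total = 0
for label, bitmap in TEMPLATES.items():
    pixels = to_pixels(bitmap)
    for i, j in combinations(range(len(pixels)), 2):
        total += 1
        correct += predict(model, flip(pixels, {i, j})) == label

print(f"{correct} of {total} unseen images recognized")
```

Note that no rule about what a "0" or a "7" looks like is written anywhere in the code; the model only sees labeled examples.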

Sometimes it is much easier to say what (example data) than how (algorithm)
* [Soon We Won’t Program Computers. We’ll Train Them Like Dogs](http://www.wired.com/2016/05/the-end-of-code/)

# Other applications

Self-driving cars

Language translation

Speech analysis & synthesis

Spam filtering

Recommendation systems

Fraud detection

Market prediction

# Types of machine learning

Supervised learning
* Given examples of inputs and corresponding desired outputs, predict outputs on future inputs: classification, regression, time series prediction

Unsupervised learning
* Given only inputs, automatically discover representations, features, structure, etc.: clustering, outlier detection, dimensionality reduction
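
As a small illustration of clustering, here is a sketch of k-means on one-dimensional data. The function name `kmeans_1d`, the data and the initial centers are assumptions chosen for this example; no labels are given to the program at any point.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around centers by alternating assignment and update steps."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (a center with an empty cluster stays where it is).
        centers = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centers)]
    return centers, clusters


# Made-up inputs with two obvious groups; only inputs, no desired outputs.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)  # [[1.0, 1.2, 0.8], [9.0, 9.5, 10.1]]
```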

Reinforcement learning
* Given an agent that chooses actions from a fixed set and receives feedback (rewards) from an environment, learn to select action sequences that maximise the expected reward: playing games, self-driving cars
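
A minimal sketch of reinforcement learning, under several assumptions made for illustration: a tiny corridor with four positions, a reward only when the rightmost position is reached, and tabular Q-learning as the learning rule. To keep the run deterministic, the agent tries every state/action pair in turn instead of exploring randomly; all names and constants are hypothetical.

```python
N_STATES = 4        # positions 0..3; position 3 is the goal (terminal)
ACTIONS = (-1, +1)  # fixed action set: move left or move right
GAMMA = 0.9         # discount factor for future rewards
ALPHA = 0.5         # learning rate


def step(state, action):
    """Environment: return the next state and the reward for one move."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(sweeps=200):
    """Learn action values Q(state, action) from observed transitions."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_STATES - 1):  # the terminal state needs no update
            for a in ACTIONS:
                s2, r = step(s, a)
                # Value of the best follow-up action (zero once the goal is reached).
                if s2 == N_STATES - 1:
                    best_next = 0.0
                else:
                    best_next = max(q[(s2, b)] for b in ACTIONS)
                # Move the estimate toward reward plus discounted future value.
                q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
    return q


q = q_learning()
# Greedy policy: in each non-terminal state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1]: always move right, toward the reward
```

The program is never told that "right" is the correct direction; it infers this from the rewards returned by the environment.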