# Machine Learning for Computer Vision
A very brief overview

[//]: # (Comments here)

## Computer Vision Tasks

* Digital Image Acquisition
* 3D Reconstruction
* Object Detection
* Recognition
* Semantic Segmentation
* Tracking
* ...

To extrapolate and make decisions, you will need **knowledge**.

How do we pass knowledge to machines? Human-enforced rules? Not a good idea ...

Better go for **Machine Learning**

## Machine Learning
### What is machine learning
Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

* Supervised learning
* Unsupervised learning
* Reinforcement learning
* ...

Let's just take a supervised image classification task (which is relevant to what we do!).

### How to do machine learning (supervised image classification task)

Think about how you would distinguish a sunflower from a red rose by looking at them.

How would you describe them?

What if we only consider colour and size as our **features**?

| Feature | **Colour** | **Size** |
| --- | --- | --- |
| **Sunflower** | Yellow | Big |
| **Red Rose** | Red | Small |

**Features in ML:** An individual measurable property or characteristic of a phenomenon being observed.

If we quantify our selected features, we can plot each sample / observation as a point in 2D space:

![SVM_1](./img/simple_2d_fpts.png)
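
One way to quantify the two features, as a minimal sketch: encode colour as a hue angle in degrees and size as a flower diameter in centimetres. The measurements and the names `Sample` and `to_point` are invented for illustration; they do not come from a real dataset.

```python
from typing import NamedTuple, Tuple


class Sample(NamedTuple):
    """One observation: a label plus two numeric features."""
    label: str
    hue_deg: float      # colour as a hue angle (0 = red, 60 = yellow)
    diameter_cm: float  # size as flower diameter


# Hypothetical measurements, invented for this example.
samples = [
    Sample("sunflower", 55.0, 20.0),
    Sample("sunflower", 60.0, 25.0),
    Sample("red rose", 0.0, 7.0),
    Sample("red rose", 5.0, 9.0),
]


def to_point(sample: Sample) -> Tuple[float, float]:
    """Return the sample as an (x, y) point in the 2D feature space."""
    return (sample.hue_deg, sample.diameter_cm)


points = [to_point(s) for s in samples]
for sample, point in zip(samples, points):
    print(f"{sample.label:10s} -> {point}")
```

Each tuple in `points` is one dot in a 2D scatter plot like the one above.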

Time to choose our **classifier**.

* Bayesian Model
* Support Vector Machines (SVMs)
* Random Forest
* Markov Chain
* Neural Network
* ...

By observing the distribution of the samples, we learned that they are **linearly separable**, meaning we can classify them by drawing a line.

In this case, we can comfortably choose a **Linear SVM** classifier to do the job. 

[//]: # (Explain SVM a little more with the figure)

[Link: Using SVM in OpenCV](https://docs.opencv.org/3.4/d1/d73/tutorial_introduction_to_svm.html) 

![SVM_2](https://upload.wikimedia.org/wikipedia/commons/thumb/2/2a/Svm_max_sep_hyperplane_with_margin.png/557px-Svm_max_sep_hyperplane_with_margin.png)
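
To make the idea of "drawing a line" concrete, here is a minimal sketch of a linear SVM trained with plain subgradient descent on the hinge loss. It is not the solver used by OpenCV in the link above. The toy data, the hyperparameters (`lam`, `lr`, `epochs`) and the function names are assumptions for illustration, and the features are assumed to be standardised to zero mean.

```python
# Toy 2D samples (hypothetical, standardised): +1 = sunflower, -1 = red rose.
X = [(2.0, 2.0), (3.0, 1.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -1.0), (-2.0, -3.0)]
y = [1, 1, 1, -1, -1, -1]


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=1000):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by subgradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for (x1, x2), label in zip(X, y):
            margin = label * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:  # sample is inside the margin or misclassified
                grad_w[0] -= label * x1 / n
                grad_w[1] -= label * x2 / n
                grad_b -= label / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


def predict(w, b, point):
    """Classify by the side of the line w.x + b = 0 the point falls on."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1


w, b = train_linear_svm(X, y)
predictions = [predict(w, b, p) for p in X]
print("w =", w, "b =", b, "predictions =", predictions)
```

The learned `w` and `b` define the separating line; the samples that end up closest to it play the role of the support vectors in the figure.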

However, in real life, data is messy and not always linearly separable. For example, what if we have data distributed in the form of two circles like this?

![two_circles](./img/non-linearly-separable-data.jpg)

Can we still use SVMs? The answer is YES!

To accomplish this, we use a technique called the **Kernel Trick**. What a kernel does is map the features to a higher-dimensional space in which they are linearly separable. In the two-circle case, let's add a third dimension using the equation
$z=e^{-\gamma(x^2+y^2)}$, which is a **Radial Basis Function (RBF)** of Gaussian form (the basis of the **Gaussian Kernel**). Then we plot the points with $z$ in a 3D space.

![two_circles_3d](./img/kernel-trick.jpg)

Now, they can be easily separated by a plane. Linear separability is achieved, and we can use SVMs again.
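
The lifting step can be sketched in a few lines. The radii, the value of `GAMMA`, and the helper names `lift` and `circle` are assumptions chosen for illustration.

```python
import math

GAMMA = 0.5  # assumed value; controls how fast z decays with the radius


def lift(x, y, gamma=GAMMA):
    """Add the third coordinate z = exp(-gamma * (x^2 + y^2))."""
    return (x, y, math.exp(-gamma * (x * x + y * y)))


def circle(radius, n=12):
    """n points evenly spaced on a circle centred at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


inner = [lift(x, y) for x, y in circle(1.0)]  # class A
outer = [lift(x, y) for x, y in circle(3.0)]  # class B

# In 3D, the horizontal plane z = threshold separates the two classes.
lowest_inner_z = min(p[2] for p in inner)
highest_outer_z = max(p[2] for p in outer)
threshold = (lowest_inner_z + highest_outer_z) / 2
print(f"inner z >= {lowest_inner_z:.3f}, outer z <= {highest_outer_z:.3f}")
```

Note that a kernel SVM does not normally compute the lifted coordinates explicitly; it only needs kernel values between pairs of samples. The explicit $z$ here is just to show why the lifted data becomes separable.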


### Feature vs. Classifier

* Nice and clean features + simple classifier
* Messy features + more advanced classifier

Real-world problems are a lot messier, and good features require careful engineering by domain experts and ML experts.
To work with messy features, we need classifiers that can handle more complex mappings.

### Neural Network
Let's have a look at a **Neural Network**, a computational graph that can approximate arbitrarily complex functions.

[Link: Neural nets can compute any function](http://neuralnetworksanddeeplearning.com/chap4.html)

![ANN](https://msatechnosoft.in/blog/wp-content/uploads/2018/05/Biological-vs-artificial-neuron-MSA-Technosoft.jpeg)

**How does a Neural Net work?**

![bin](./img/bin_activate.png)

![bin](./img/sigmod_activate.png)

![bin](./img/layers.png)

![bin](./img/layers_matrix.png)
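
As a minimal sketch of a forward pass: each layer computes `activation(W x + b)`, and layers are chained. The 2-3-1 layout and all weight values below are arbitrary assumptions, not a trained model.

```python
import math


def sigmoid(z):
    """Smooth activation that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def step(z):
    """Binary activation: the neuron either fires (1) or does not (0)."""
    return 1.0 if z >= 0 else 0.0


def dense(inputs, weights, biases, activation):
    """One layer: activation(W @ inputs + b), with W given as a list of rows."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hypothetical 2-3-1 network; the weights are arbitrary, not trained.
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -1.0, 0.5]]
b2 = [0.2]

x = [1.0, 2.0]
hidden = dense(x, W1, b1, sigmoid)       # 3 hidden activations
output = dense(hidden, W2, b2, sigmoid)  # 1 output in (0, 1)
print("hidden:", hidden, "output:", output)
```

Swapping `sigmoid` for `step` in the calls to `dense` gives the binary-activation version of the same network.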

A Neural Network warps the feature space, mapping the input data to another space where it is easier to handle.

![warp](./img/ann_warp.png)

What a Neural Net essentially learns is a mapping from the input space to the output space, a.k.a. **a function**.

It learns the mapping by adjusting the weights and biases through backpropagation.
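
For a single sigmoid neuron, backpropagation reduces to one application of the chain rule, which makes it a convenient place to see the weight and bias updates. The sketch below learns logical OR; the cross-entropy loss, the learning rate and the number of iterations are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Truth table of logical OR as a tiny training set.
inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 1.0]

w = [0.0, 0.0]
b = 0.0
learning_rate = 0.5  # assumed value

for _ in range(2000):
    grad_w = [0.0, 0.0]
    grad_b = 0.0
    for (x1, x2), t in zip(inputs, targets):
        out = sigmoid(w[0] * x1 + w[1] * x2 + b)  # forward pass
        # Backward pass: with cross-entropy loss and a sigmoid output,
        # the gradient with respect to the pre-activation is (out - t).
        delta = out - t
        grad_w[0] += delta * x1
        grad_w[1] += delta * x2
        grad_b += delta
    # Gradient descent step: move against the gradient.
    w[0] -= learning_rate * grad_w[0]
    w[1] -= learning_rate * grad_w[1]
    b -= learning_rate * grad_b

predictions = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for x1, x2 in inputs]
print("w =", w, "b =", b, "predictions =", predictions)
```

In a multi-layer network the same `delta` is propagated backwards through each layer, which is what the video below visualises.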

[Video: How Neural Networks bend the feature space](https://www.youtube.com/watch?v=vdqu6fvjc5c)

[Video: What is backpropagation really doing?](https://www.youtube.com/watch?v=Ilg3gGewQ5U)

BTW, think about a one-layer Neural Network vs. a Linear SVM: are they equivalent?

OK, now we know there are more advanced classifiers.

But...

The real world is messy. What if we don't know how to construct our features?
If a NN can learn a desired mapping, can it learn how to build the features too?
The answer is again YES!

### Convolutional Neural Network (CNN)

First, what is convolution? 
[Link: A Beginner's Guide To Understanding Convolutional Neural Networks](https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/)

More details on convolution: [A guide to convolution arithmetic for deep learning](https://arxiv.org/abs/1603.07285)

Consider convolution as a mapping from input data to feature vectors.

Pooling is a subsampling step that reduces the dimension of the feature vector.
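
Both operations fit in a few lines of plain Python. This is a minimal sketch: the "convolution" follows the usual CNN convention (no kernel flip, stride 1, no padding), and the toy image and kernel are made up for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[i * size + di][j * size + dj]
                 for di in range(size) for dj in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


# 6x6 toy image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Kernel that responds to dark-to-bright transitions from left to right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = conv2d(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)     # 2x2 after pooling
print(features)
print(pooled)
```

The feature map is non-zero only where the kernel overlaps the edge, and pooling halves each dimension while keeping the strongest response.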

A typical CNN architecture, from Yann LeCun's work on handwritten digit recognition (MNIST dataset):
![CNN](https://www.kdnuggets.com/wp-content/uploads/cnn-architecture.jpg)

[Video: Convolutional neural network kernels during training on MNIST dataset](https://www.youtube.com/watch?v=VUkFo6IXMJc)

Features vs. Classifiers (looks familiar?)

The basic model (LeNet) contains only a few thousand (single-digit thousands) free parameters to be tuned. Once trained, it can reach over 90% accuracy on the MNIST dataset.
[Link to MNIST dataset](http://yann.lecun.com/exdb/mnist/)

AlexNet, winner of the ImageNet Challenge in 2012, has 5 Convolutional Layers and 3 Fully Connected Layers, which gives over 60 million free parameters.
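
A rough way to see where a figure like 60 million comes from is to count weights and biases layer by layer. The layer shapes below are the commonly quoted ones for the original two-GPU AlexNet design, in which some convolutional layers only see half of the previous layer's channels; treat them as assumptions and check them against the paper.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def fc_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return in_features * out_features + out_features


# (kernel size, input channels seen by each filter, output channels)
# Assumed shapes; 48 and 192 reflect the split across two GPUs.
alexnet_conv = [(11, 3, 96), (5, 48, 256), (3, 256, 384), (3, 192, 384), (3, 192, 256)]
# (input features, output features)
alexnet_fc = [(6 * 6 * 256, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv_params(*layer) for layer in alexnet_conv)
fc_total = sum(fc_params(*layer) for layer in alexnet_fc)
total = conv_total + fc_total
print(f"conv: {conv_total:,}  fc: {fc_total:,}  total: {total:,}")
```

Under these assumptions almost all of the parameters sit in the fully connected layers, not in the convolutions.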








* What to do with a small training dataset
* How to construct data for training / validation
* Tricks in ...
* How the classifier fits the data
* How to select your classifier
* How to leverage deep learning frameworks
* What is transfer learning
* Why transfer learning could work

* When to stop training
* Training, validation, testing: typical splits are 70 / 20 / 10 or 60 / 20 / 20 (in percent)

Training set: A set of examples used for learning, that is to fit the parameters [i.e., weights] of the classifier.

Validation set: A set of examples used to tune the parameters [i.e., architecture, not weights] of a classifier, for example to choose the number of hidden units in a neural network.

Test set: A set of examples used only to assess the performance [generalization] of a fully specified classifier.
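
A 70 / 20 / 10 split can be sketched as follows. The function name `split_dataset`, the fixed seed and the stand-in data are assumptions for illustration.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle once, then cut into training / validation / test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever is left over
    return train, val, test


samples = list(range(100))  # stand-in for 100 labelled images
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

Shuffling before cutting matters when the data is stored class by class; the test part should then be left untouched until the classifier is fully specified.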


Applied deep learning research is much more about taming your problem (understanding the inputs and outputs), casting the problem as a supervised learning problem, and hammering it with ample data and ample experiments.





[Link: Deep Learning Review](http://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf)

[The Deep Learning Book](http://www.deeplearningbook.org/)

## Computer Vision
### Reconstruction
Perception: perceive the world.

The very native and ... task for computer vision.
![SfM](http://openmvg.readthedocs.org/en/latest/_images/structureFromMotion.png "Structure from Motion")
![SfM](http://www.cs.cornell.edu/~snavely/bundler/images/Colosseum.jpg "Structure from Motion")

3D reconstruction

tracking

### Recognition
A higher-level task ...

* Recognition
* Classification
* Detection
* Segmentation, semantic segmentation
* Tracking

Logical


Most contemporary AI researchers agree that logic-based AI is dead.


## Machine Learning (Excluding Deep Learning)

Relation between machine learning and deep learning.
![](https://cdn-images-1.medium.com/max/1200/1*TiORvHgrJPme_lEiX3olVA.png)

### Feature / Descriptor


[From feature descriptors to deep learning: 20 years of computer vision](http://www.computervisionblog.com/2015/01/from-feature-descriptors-to-deep.html)


### Classifier

* SVM
* Random Forest
* Neural Networks


### Supervised and Unsupervised Learning


## Deep Learning (the hottest part now in ML)


Big data + deep model
![](http://3.bp.blogspot.com/-zQlQvmK9U9g/VT_Hk6yKlmI/AAAAAAAAODQ/nNNcpVM4UPM/s1600/bg_pipeline-01.png)


### Model

* Neuron
* Activation function
  * Sigmoid
  * tanh
  * ReLU
  * Maxout
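
The activation functions listed above can be written down directly. This is a minimal sketch for scalar inputs; the maxout unit is shown only as the maximum over several pre-activations that are assumed to be computed elsewhere.

```python
import math


def sigmoid(z):
    """Squashes z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes z into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Passes positive values through and clips negatives to 0."""
    return max(0.0, z)


def maxout(pre_activations):
    """Maxout unit: the maximum over several linear pre-activations."""
    return max(pre_activations)


for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.3f}  tanh={tanh(z):.3f}  relu={relu(z):.1f}")
print("maxout([-1.0, 0.5, 0.2]) =", maxout([-1.0, 0.5, 0.2]))
```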



CNN (Convolutional Neural Network)

* What is convolution
  * Convolution in math
  * Kernel / filter
  * Convolution as a matrix operation
* An example: Canny filter for edges
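
To illustrate convolution as a matrix operation, the sketch below flattens every image patch into a row (often called im2col) and multiplies the resulting matrix by the flattened kernel. The kernel is a Sobel filter, a gradient filter of the kind commonly used inside Canny edge detection, not the full Canny pipeline. As in most CNN code the kernel is not flipped; the toy image and helper names are assumptions.

```python
def im2col(image, kh, kw):
    """Flatten every kh x kw patch of the image into one row."""
    rows = []
    for i in range(len(image) - kh + 1):
        for j in range(len(image[0]) - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Sobel kernel for horizontal gradients (responds to vertical edges).
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# 4x5 toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 9, 9, 9] for _ in range(4)]

patches = im2col(image, 3, 3)                      # 6 patches x 9 values
flat_kernel = [v for row in sobel_x for v in row]  # 9 values
response = matvec(patches, flat_kernel)            # 6 values = a 2x3 map
print(response)
```

Writing convolution this way is what lets frameworks hand the heavy lifting to optimised matrix-multiplication routines.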


How a NN learns

* Backpropagation
* Gradient descent
* How a NN warps and transforms the feature space
* A neuron as a linear classifier



[Neural nets can compute any function](http://neuralnetworksanddeeplearning.com/chap4.html)


### Implementation / Coding

* [SVM OpenCV](https://docs.opencv.org/3.4/d1/d73/tutorial_introduction_to_svm.html)
* Random Forest OpenCV




### Frameworks / Libraries

![](https://agi.io/wp-content/uploads/2018/02/frameworks-new.png)


Trend
![](https://pbs.twimg.com/media/DX5I8r_VwAACbmo.jpg:large)

Use TensorFlow + Keras

bigger data + deeper model


For data, how big is big enough?





![](http://wordpress.viu.ca/ciel/files/2013/01/134992626.jpg)

## Reference

