# Machine Learning Tutorial: Making sense of a messy world

## Introduction
[Brief Introduction](https://www.youtube.com/watch?v=l95h4alXfAA)  
[What is Machine Learning?](https://www.youtube.com/watch?v=jmMcJ4XlrWM&index=2&list=PLZ9qNFMHZ-A4rycgrgOYma6zxF4BZGGPW)

## Course Project
[Kaggle Project](https://www.kaggle.com/c/grupo-bimbo-inventory-demand)

In [None]:
## Sample function
from __future__ import print_function
from sklearn.ensemble import RandomForestRegressor

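def get_rfg_result(training_data, test_data, output):
    """Train a random forest and write its predictions to a CSV file.

    training_data: 2-D array; columns 0-5 are features, the last column is the target.
    test_data: 2-D array; column 0 is the row id, the remaining columns are features.
    output: path of the CSV file to write.
    """
    rfg_model = RandomForestRegressor()                       # Initializing the model
    rfg_model.fit(training_data[:,0:6], training_data[:,-1])  # Training the model
    predict_value = rfg_model.predict(test_data[:,1:])        # Getting predictions

    # Storing the result in file, one "id,prediction" row per test sample
    with open(output, 'w') as out:
        print("id", "Demanda_uni_equil", sep=',', file=out)
        for i in range(len(predict_value)):
            print(int(test_data[i,0]), int(round(predict_value[i])), sep=',', file=out)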

## Probability

### Sample Space  
Universal set of all possible results of an experiment.

### Event
A set of one or more possible outcomes of an experiment, that is, a subset of the sample space.

#### Examples
_Experiment 1:_ Single Coin Toss  
Sample Space: {H,T}  
Event: {H}

_Experiment 2:_ Rolling a die  
Sample Space: {1,2,3,4,5,6}  
Event: {1}

#### Exercise:  
What is the sample space of two coin tosses?

### What is Probability?  
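The chance of occurrence of an event, given a sample space. When all outcomes are equally likely, it is the number of outcomes in the event divided by the number of outcomes in the sample space.  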

#### Example
What is the probability of head in a single toss of a fair coin?    
Sample space: {H,T} and Event: {H}  
Probability of head or P(H) = 0.5
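
The coin example can be checked with a few lines of Python. This is a minimal sketch that assumes every outcome in the sample space is equally likely; the helper name `classical_probability` is made up for illustration and is not part of any library.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(event) when all outcomes in sample_space are equally likely (assumption)."""
    event = set(event)
    sample_space = set(sample_space)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

# Experiment 1: single toss of a fair coin, event "head"
p_head = classical_probability({"H"}, {"H", "T"})
print(p_head)         # 1/2
print(float(p_head))  # 0.5

# Experiment 2: rolling a fair die, event "the roll is 1"
p_one = classical_probability({1}, {1, 2, 3, 4, 5, 6})
print(p_one)          # 1/6
```

Using `Fraction` keeps the results exact, which makes it easier to compare them with hand calculations.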



### Set Theory

<img src="images/Venn_diagram.png" width=50%>



#### Relation between union and intersection of sets
$|A\cup B| = |A| + |B| - |A\cap B|$ (here $|A|$ denotes the number of elements in A)

#### Exercise
S = {1,2,3,4,5,6}  
A = {2,4}  
B = {1,2,3}  
What is $A\cup B$, $A\cap B$, $A-B$, $A^c$ and $B^c$?

In [1]:
S = {1, 2, 3, 4, 5, 6}
A = {2, 4}
B = {1, 2, 3}
# sorted() lists each result set in ascending order
print("A U B:", sorted(A | B))
print("A intersection B:", sorted(A & B))
print("A - B:", sorted(A - B))
print("A':", sorted(S - A))   # complement of A relative to S
print("B':", sorted(S - B))   # complement of B relative to S

A U B: [1, 2, 3, 4]
A intersection B: [2]
A - B: [4]
A': [1, 3, 5, 6]
B': [4, 5, 6]


#### Exercise
$A\cup B = \{1,2,4,5\}$  
$A\cap B = \{2,4\}$  
What are A and B? (Hint: Multiple solutions are possible)

### Probability rules
$P(S) = 1$ (Here S denotes the sample space)

$P(A\cup B) = P(A) + P(B) - P(A\cap B)$

$P(A^c) = P(S-A) = P(S) - P(A) = 1 - P(A)$

$P(A\cap B) = P(A) \times P(B)$   (Only when A and B are independent)
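
The rules above can be verified numerically on a fair die. This is a sketch that assumes equally likely outcomes; the helper `P` and the extra event `C` are introduced here only for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space of a fair die

def P(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4}
B = {1, 2, 3}
C = {2, 4, 6}  # hypothetical extra event: "the roll is even"

assert P(S) == 1                           # P(S) = 1
assert P(A | B) == P(A) + P(B) - P(A & B)  # union rule
assert P(S - A) == 1 - P(A)                # complement rule

# A and B happen to be independent: the product rule holds
print(P(A & B), P(A) * P(B))  # 1/6 1/6

# A and C are not independent: the product rule fails
print(P(A & C), P(A) * P(C))  # 1/3 1/6
```

The last two lines show why the independence condition matters: the union and complement rules always hold, but the product rule only holds for some pairs of events.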

## [Conditional Probability, Law of Total Probability and Bayes Theorem](http://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/class-slides/MIT18_05S14_class3slides.pdf)

### Expectation
<img src="images/Expectation.png" width=50%>

#### Exercise
S = [0,1,2,3,4]  
P(X=0) = 0.3  
P(X=1) = 0.1  
P(X=2) = 0.2  
P(X=3) = 0.1  
P(X=4) = 0.3  

What is E(X)?

### Variance
<img src="images/Var.png" width=50%>
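
As a worked example, the sketch below computes the expectation and variance of a fair die. It assumes the standard definitions for a discrete random variable, $E(X)=\sum_x x\,P(X=x)$ and $Var(X)=E[(X-E(X))^2]$; the function names are illustrative only.

```python
from fractions import Fraction

def expectation(pmf):
    """E(X) = sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - E(X))^2]."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Probability mass function of a fair die: each face has probability 1/6
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean_die = expectation(die_pmf)
var_die = variance(die_pmf)
print(mean_die)  # 7/2
print(var_die)   # 35/12
```

A probability mass function is stored here as a dictionary from value to probability, so the same two functions can be reused for any finite discrete distribution.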

### Properties of Expectation and Variance
<img src="images/Exp_Var_prop.png" width=50%>

### Law of Large Numbers
<img src="images/LLN.png" width=50%>
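
The Law of Large Numbers can be illustrated with a small simulation. The sketch below assumes the usual statement, that the sample average of independent draws approaches the expected value as the number of draws grows, and uses simulated rolls of a fair die (expected value 3.5). The seed and sample sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)       # fixed seed so the run is repeatable
rolls = rng.integers(1, 7, size=100_000)  # fair die: integers 1..6 (upper bound is exclusive)

# running_mean[n - 1] is the average of the first n rolls
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)

for n in (10, 1_000, 100_000):
    print(n, running_mean[n - 1])
```

The printed averages should drift toward 3.5 as `n` increases, although any single run can still fluctuate for small `n`.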

### Probability Distributions
<img src="images/Uniform_dist.png">

<img src="images/Normal_dist.png" width=50%>
<img src="images/Standard_Normal_dist.png" width=50%>

## Short Break
For anyone fearing that AI will take over humans, this might cheer you up. [Compilation of Robots falling in DARPA competition](https://www.youtube.com/watch?v=g0TaYhjpOfo)

## Statistics

### Sample Average (or Mean) 
$$\bar{X}=\frac{x_1+x_2+x_3+...+x_N}{N}$$  

Function: [numpy.mean](http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html)

### Standard Deviation
$$\sigma=\sqrt{\frac{(x_1-\bar{X})^2 + (x_2-\bar{X})^2 + \dots + (x_N-\bar{X})^2}{N}}$$  

Function: [numpy.std](http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html)
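
The two formulas above can be compared directly with the NumPy functions. This is a small sketch on made-up data; note that `numpy.std` divides by N by default (`ddof=0`), which matches the formula above, while `ddof=1` divides by N − 1.

```python
import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)  # illustrative data
N = x.size

# Sample mean and standard deviation written out as in the formulas
mean_manual = x.sum() / N
std_manual = np.sqrt(((x - mean_manual) ** 2).sum() / N)

print(mean_manual, np.mean(x))  # 5.0 5.0
print(std_manual, np.std(x))    # 2.0 2.0

# Dividing by N - 1 instead of N gives a slightly larger value
std_ddof1 = np.std(x, ddof=1)
print(std_ddof1)
```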

### Median
Sort the values in ascending order so that $x[1] \le x[2] \le \dots \le x[N]$.  
If N is odd then Median = $x\Big[\frac {N+1}{2}\Big]$  
If N is even then Median = Mean of $x\Big[\frac {N}{2}\Big] \text{ and } x\Big[\frac {N}{2} + 1\Big]$

**Note:** Median is robust to outliers  

Function: [numpy.median](http://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html)
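
The sketch below implements the median rule by hand and shows what robustness to outliers means in practice. The helper `median_sorted` and the data are illustrative; the rule uses 1-based positions, so the code shifts them to Python's 0-based indexing.

```python
import numpy as np

def median_sorted(values):
    """Median via the odd/even rule, applied to the sorted values."""
    xs = sorted(values)
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]                  # 1-based position (N+1)/2
    return (xs[mid - 1] + xs[mid]) / 2  # mean of 1-based positions N/2 and N/2 + 1

data = [3, 1, 4, 2, 5]
with_outlier = [3, 1, 4, 2, 500]  # same data with one extreme value

print(median_sorted(data), np.median(data))    # 3 3.0
print(np.mean(data), np.mean(with_outlier))    # 3.0 102.0
print(np.median(with_outlier))                 # 3.0
```

Replacing 5 with 500 moves the mean from 3.0 to 102.0, while the median stays at 3.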


## Vectors
In linear algebra, a vector (or column vector) is an N × 1 matrix, that is, a matrix consisting of a single column of N elements.  
$$x=\begin{bmatrix}
           x_{1} \\
           x_{2} \\
           \vdots \\
           x_{N}
         \end{bmatrix}$$
         
### p-norm of Vector

$$||x||_p=\Bigg(\displaystyle\sum_{i=1}^{N} |x_i|^p\Bigg)^{\frac{1}{p}}$$

#### 1-norm
$$||x||_1=\displaystyle\sum_{i=1}^{N} |x_i|$$

#### 2-norm
$$||x||_2=\Bigg(\displaystyle\sum_{i=1}^{N} |x_i|^2\Bigg)^{\frac{1}{2}}$$

#### 0-norm
$$||x||_0=\text{Number of non-zero terms in x}$$

#### infinity-norm
$$||x||_\infty=\displaystyle\max_i |x_i|$$  

Function: [numpy.linalg.norm](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html)
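
The norms above can be computed with the `ord` argument of `numpy.linalg.norm`. The sketch below also writes out the p-norm formula by hand for comparison; `p_norm` is an illustrative helper, and the vector is made up.

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])  # illustrative vector

def p_norm(x, p):
    """p-norm written out from the formula, for finite p >= 1."""
    return (np.abs(x) ** p).sum() ** (1.0 / p)

print(p_norm(x, 1), np.linalg.norm(x, 1))  # 7.0 7.0
print(p_norm(x, 2), np.linalg.norm(x))     # 5.0 5.0  (the 2-norm is the default)
print(np.linalg.norm(x, 0))                # 2.0  (number of non-zero entries)
print(np.linalg.norm(x, np.inf))           # 4.0  (largest absolute value)
```

The "0-norm" is a count of non-zero entries rather than a norm in the strict mathematical sense, but NumPy supports it for vectors through `ord=0`.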

### Correlation
Correlation is a statistical technique that can show whether and how strongly pairs of variables are related. For example, height and weight are related; taller people tend to be heavier than shorter people.  
$$\text{Pearson's Correlation}(x,y) = \frac {E[(X-\mu_x)(Y-\mu_y)]}{\sigma_x \sigma_y}$$  
$$\text{Pearson's Sample Correlation}(x,y) = \frac {\sum_{i}x_i y_i - N\bar{X}\bar{Y}}{N\sigma_x \sigma_y}$$  
It takes a value between −1 and +1 inclusive, where +1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.  

Function: [numpy.corrcoef](http://docs.scipy.org/doc/numpy/reference/generated/numpy.corrcoef.html)
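
The sample correlation formula can be checked against `numpy.corrcoef`. This sketch uses made-up data and takes $\sigma_x$ and $\sigma_y$ to be standard deviations that divide by N, as in the Standard Deviation section.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # illustrative data
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
N = x.size

# Pearson's sample correlation written out from the formula
r_manual = ((x * y).sum() - N * x.mean() * y.mean()) / (N * x.std() * y.std())

# numpy.corrcoef returns a 2 x 2 matrix; the off-diagonal entry is corr(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]
print(r_manual, r_numpy)

# A perfectly linear, decreasing relationship gives a correlation of -1
r_negative = np.corrcoef(x, -2 * x + 1)[0, 1]
print(r_negative)
```

The two values on the first printed line should agree up to floating-point rounding.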

### Maximum Likelihood Estimate (MLE)
[MLE examples](https://onlinecourses.science.psu.edu/stat414/node/191)

## Matrix Algebra
[Slides for Matrix Algebra](http://ibgwww.colorado.edu/~carey/p7291dir/handouts/matrix.algebra.pdf)

## Extra Videos
[AlphaGo and Google Deepmind](https://www.youtube.com/watch?v=TnUYcTuZJpM)