# Motivation: Why Machine Learning


## Two Views of Machine Learning

> + Machine Learning seeks to **learn models of data**: define a **space of possible
>   models; learn the parameters and structure of the models from data**; make
>   predictions and decisions
> + Machine Learning is a **toolbox of methods for processing data**: feed the data
>   into one of many possible methods; choose methods that have **good theoretical
>   or empirical performance**; make predictions and decisions
>
> _Zoubin Ghahramani, MLSS 2012_

## Data, Data; Everywhere! 
+ Social Networks 
+ Cloud-Age Businesses
+ LHC & SETI & Human Connectome Project

### Data is...
#### Messy
<img src="messy.png">
#### Noisy
<img src="noisy.png">
#### Yuge
<img src="yuge.png">

## What are Models?
Models represent the key aspects of the real world in mathematics.
#### Examples:
+ Odds Calculators for Sports Betting
+ Homo Economicus: The Rational Human
+ Logistic Population Growth Model
+ Two-body Gravitational System
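
To make one of these concrete, the logistic population growth model can be written as dN/dt = r N (1 - N/K) and solved in closed form. The sketch below is only an illustration: the parameter names `n0` (initial population), `r` (growth rate) and `k` (carrying capacity), and the numbers used, are assumptions and not part of the original notes.

```python
import math

def logistic_population(t, n0, r, k):
    """Closed-form solution of dN/dt = r * N * (1 - N / k) with N(0) = n0."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))

# Hypothetical parameters: 10 individuals, growth rate 0.5, capacity 1000.
for t in (0, 5, 10, 20, 40):
    print(t, round(logistic_population(t, n0=10, r=0.5, k=1000), 1))
```

The population starts at `n0`, grows almost exponentially at first, and levels off near the carrying capacity `k`.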

#### Modern Requirements:
Good Models:
+ represent uncertainty in their structure and parameters 
+ adapt to the data and the problem statement automatically
+ are robust against critical failure modes
+ are capable of scaling to yuge data

#### Commonalities:
+ Data: Lives in a Space and is gathered from the real world
+ Predictions and Decisions: (if x=k, y=?) or (if x_0 = b, x_t = ?)
+ Magic Components:
  - Learning, Parameters and Structure 
    + (Structure: DAG, Tree, Chain) 
    + (Learning: Gradient Descent, Bayes' Rule, (What do SVMs use? What do Random Forests use?)) 
    + (Parameters: Matrices smashing into each other, Tree Nodes, Vector-Space Kernels)
  - Theoretical and Empirical Performance 
    + theoretical: Asymptotic Bounds 
    + empirical: Past Error Rates
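
As a minimal sketch of one learning method named above, gradient descent, the code below fits the two parameters of a straight-line model to data. The toy data, the learning rate, the step count, and the function name `fit_line` are all made up for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1 with no noise, so the fit should recover it.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Here the structure is a line, the parameters are `w` and `b`, and a prediction of the "(if x=k, y=?)" kind is simply `w * k + b`.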

## Problems, Problems; Everywhere! 

### What Problems is ML good at?
+ YouTube Recommendations!
+ Ad Clicks!
+ Auto-Correct!
+ Go & Chess & Pong!
+ Traffic Jams!
+ Cancer Diagnoses!


### What Problems does ML struggle with?
+ Legal Judgements

### From iRobot to MyRobot
> Probability theory is nothing but common sense reduced to calculation.
>
> _Laplace, 1819_


#### The Rules of Reasonable Thought:
_HT: "Probability Theory: The Logic of Science" by E. T. Jaynes_

##### Logical Operations:
+ Conjunction (AND)
    **A . B**    
+ Disjunction (OR)
    **A + B**
+ Negation (NOT)
    **~A**
+ Implication (IF THEN)
    **=>**
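
These operations map directly onto Boolean operators. The sketch below prints a truth table for all four; it reads implication as material implication (false only when A is true and B is false), which is an assumption about the intended reading, and the helper name `implies` is hypothetical.

```python
from itertools import product

def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

print("A     B     A.B   A+B   ~A    A=>B")
for a, b in product([True, False], repeat=2):
    row = [a, b, a and b, a or b, not a, implies(a, b)]
    print(" ".join(f"{str(v):<5}" for v in row))
```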


+ Degrees of plausibility are represented by real numbers.

+ Qualitative correspondence with common sense:
  - Conditional Probability: Statement A is only as plausible as the evidence B makes it
    + (A | B)
  - Plausibility relationships between statements can take many forms
    + (A | BC) or (A + B|CD )
  - Metric Comparability
    + (A | B) > (C | B)
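  - Evidence that makes A more plausible, without changing how plausible B is given A, cannot make "A and B" less plausible and must make "not A" less plausible:
    + (A | C_1) > (A | C_0) AND (B | A C_1) == (B | A C_0) =>
    (A B | C_1) >= (A B | C_0)
    AND
    (~A | C_1) < (~A | C_0)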
    
+ **Consistent** Reasoning:
  - If a conclusion can be reasoned out in more than one way, 
  then every possible way must lead to the same result.
  - A conclusion must be based on **all** of the evidence that is
  relevant to a question. Arbitrarily ignoring some of
  the information leads to biased, ideological conclusions.
  - Equivalent **states of knowledge** are represented by equivalent plausibility assignments. 
  That is, if in two problems a robot’s state of knowledge is the same (except perhaps for 
  the labeling of the propositions), then it must assign the same 
  plausibilities in both.
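
The first consistency requirement can be checked numerically on a toy example. In the sketch below, the plausibility of "A and B" is computed by two routes, as (A)(B | A) and as (B)(A | B), and both routes agree. The joint table is made up for illustration, and ordinary probabilities stand in for plausibilities.

```python
# Hypothetical joint distribution over two binary propositions A and B,
# keyed as (value of A, value of B).
joint = {
    (True, True): 0.30,
    (True, False): 0.20,
    (False, True): 0.10,
    (False, False): 0.40,
}

def p_a(a):
    """Marginal probability that A takes the value a."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def p_b(b):
    """Marginal probability that B takes the value b."""
    return sum(p for (_, bi), p in joint.items() if bi == b)

def p_b_given_a(b, a):
    """Conditional probability of B = b given A = a."""
    return joint[(a, b)] / p_a(a)

def p_a_given_b(a, b):
    """Conditional probability of A = a given B = b."""
    return joint[(a, b)] / p_b(b)

route_1 = p_a(True) * p_b_given_a(True, True)  # reason about A first, then B
route_2 = p_b(True) * p_a_given_b(True, True)  # reason about B first, then A
print(route_1, route_2)
```

Up to floating-point rounding, both routes give the same number, the joint entry for A and B both true.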

## Bayes' Theorem

$$ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} $$

$$ \text{Prior} = P(A) $$

$$ \text{Likelihood} = P(B \mid A) $$

$$ \text{Evidence} = P(B) $$

$$ \text{Posterior} = P(A \mid B) $$
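
A worked example may help fix the formula. Take A = "the patient has the disease" and B = "the test is positive". The numbers below (1% prevalence, 90% of sick patients test positive, 5% of healthy patients test positive) are invented for illustration and do not describe any real test; the function name `posterior` is likewise hypothetical.

```python
def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) via Bayes' theorem for a binary proposition A."""
    # Total probability: P(B) = P(B | A) P(A) + P(B | ~A) P(~A)
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return p_b_given_a * p_a / p_b

p = posterior(p_a=0.01, p_b_given_a=0.90, p_b_given_not_a=0.05)
print(f"P(disease | positive test) = {p:.3f}")
```

With these made-up numbers the result is only about 15%: a positive test raises the plausibility of disease a great deal from 1%, but P(A) is so small that most positive tests still come from healthy patients.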


#### Components of Understanding 
+ Representation
+ Visualization
+ Modelling of Characteristics and Relationships

In [None]:
import tensorflow as tf

In [None]:
def create_probability_distribution():
    """Placeholder: the body has not been written yet."""
    return None