### Problem Motivation

Given a dataset {$x^{(1)}, x^{(2)}, \dots, x^{(m)}$}, is a new example $x_{test}$ anomalous?    
- Model $p(x)$ 
- $p(x) < \epsilon \to$ flag anomaly
- $p(x) \geq \epsilon \to$ ok

Examples   

Fraud detection:   
    $x^{(i)}$ = features of user $i$'s activities          
    Model $p(x)$ from data    
    Identify unusual users by checking which have $p(x) < \epsilon$  
    
Manufacturing

Monitoring computers in a data center:    
$x^{(i)}$ = features of machine $i$    
$x_1$ = memory use    
$x_2$ = number of disk accesses/sec       
$x_3$ = CPU load     
$x_4$ = CPU load/network traffic     
...




### Gaussian distribution

Say $x \in \mathbb{R}$. If $x$ is Gaussian-distributed with mean $\mu$ and variance $\sigma^2$ (standard deviation $\sigma$), we write:    

$x \sim \mathcal{N}(\mu, \sigma^{2})$ (normal distribution)

$p(x; \mu,\sigma^{2}) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

- $\mu = 3, \sigma = 1$: centered at 3
- $\mu = 0, \sigma = 1$: centered at zero
- $\mu = 0, \sigma = 0.5$: centered at zero, the curve is narrower and taller
- $\mu = 0, \sigma = 2$: centered at zero, the curve is wider and flatter
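
As a quick check on how $\sigma$ changes the shape, the density formula can be evaluated directly. This is a minimal sketch using only the standard library; the function name `gaussian_pdf` is an assumption made for illustration, not something defined in these notes. It prints the peak height (the density at $x = \mu$) for the four parameter settings listed above.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Peak height (density at the mean) for each (mu, sigma) pair.
# A smaller sigma gives a taller peak because the area under the curve stays 1.
for mu, sigma in [(3, 1), (0, 1), (0, 0.5), (0, 2)]:
    peak = gaussian_pdf(mu, mu, sigma)
    print(f"mu={mu}, sigma={sigma}: peak height = {peak:.4f}")
```

Note that the first two settings give the same peak height: changing $\mu$ only shifts the curve, while changing $\sigma$ rescales it.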


Parameter estimation (maximum likelihood estimates from the dataset):    
$\mu = \frac{1}{m}\sum_{i=1}^{m}x^{(i)}$    
$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m}(x^{(i)} - \mu)^2$
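
The two estimates above translate almost line for line into code. The sketch below assumes a one-dimensional dataset stored as a plain Python list; the function name `estimate_gaussian` and the sample values are made up for illustration. It uses the $\frac{1}{m}$ normalization from the formulas rather than $\frac{1}{m-1}$.

```python
def estimate_gaussian(samples):
    """Return (mu, sigma2) for a list of real-valued samples."""
    m = len(samples)
    mu = sum(samples) / m
    # Variance with 1/m normalization, matching the formula in the notes.
    sigma2 = sum((x - mu) ** 2 for x in samples) / m
    return mu, sigma2

# Hypothetical data, chosen so the arithmetic is easy to check by hand.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma2 = estimate_gaussian(data)
print(mu, sigma2)  # 5.0 4.0
```

For this data the sum is 40, so $\mu = 5$; the squared deviations sum to 32, so $\sigma^2 = 4$.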


### Algorithm

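Training set {$x^{(1)}, \dots, x^{(m)}$}, each example $x \in \mathbb{R}^n$.     
Model each feature $x_j$ with its own Gaussian and treat the features as independent, so that $p(x)$ factorizes into a product:     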


$x_1 \sim \mathcal{N}(\mu_1, \sigma_1^{2})$   
$x_2 \sim \mathcal{N}(\mu_2, \sigma_2^{2})$   
$x_3 \sim \mathcal{N}(\mu_3, \sigma_3^{2})$   
...     

$p(x) = p(x_1; \mu_1, \sigma_1^{2})p(x_2; \mu_2, \sigma_2^{2})\dots p(x_n; \mu_n, \sigma_n^{2})$        
$p(x) = \prod_{j=1}^{n}p(x_j; \mu_j, \sigma_j^{2})$

where   
$\mu_j = \frac{1}{m}\sum_{i=1}^{m}x_j^{(i)}$    
$\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}(x_j^{(i)} - \mu_j)^2$

#### Anomaly detection algorithm
1. Choose features $x_j$ that you think might be indicative of anomalous examples.
2. Fit parameters $\mu_1, \dots, \mu_n, \sigma_1^2,\dots, \sigma_n^2$
    - $\mu_j = \frac{1}{m}\sum_{i=1}^{m}x_j^{(i)}$ $\to$ vectorized version $\to \mu = \frac{1}{m}\sum_{i=1}^{m}x^{(i)}$, with $\mu \in \mathbb{R}^n$
    - $\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}(x_j^{(i)} - \mu_j)^2$
3. Given new example $x$, compute $p(x)$:
    - $p(x) = \displaystyle\prod_{j=1}^{n}p(x_j; \mu_j, \sigma_j^{2}) = \displaystyle\prod_{j=1}^{n}\frac{1}{\sqrt{2\pi}\sigma_j}\exp\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)$
    - Anomaly if $p(x) < \epsilon$
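
The three steps can be sketched end to end as follows. This is an illustrative sketch, not a reference implementation: the function names (`fit`, `p`, `is_anomaly`), the training data, and the value of $\epsilon$ are all assumptions made for the example. It also assumes every feature has nonzero variance, since a constant feature would cause a division by zero.

```python
import math

def fit(X):
    """Step 2: per-feature mean and variance (1/m normalization) from a list of rows."""
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2 = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, sigma2

def p(x, mu, sigma2):
    """Step 3: product of the univariate Gaussian densities, one per feature."""
    prob = 1.0
    for xj, mj, s2 in zip(x, mu, sigma2):
        prob *= math.exp(-((xj - mj) ** 2) / (2.0 * s2)) / math.sqrt(2.0 * math.pi * s2)
    return prob

def is_anomaly(x, mu, sigma2, epsilon):
    """Flag x when its density falls below the threshold epsilon."""
    return p(x, mu, sigma2) < epsilon

# Hypothetical training set with two features per example.
X_train = [[1.0, 10.0], [1.2, 9.5], [0.8, 10.5], [1.1, 10.2], [0.9, 9.8]]
mu, sigma2 = fit(X_train)

epsilon = 1e-3  # assumed threshold; in practice it has to be tuned for the data
print(is_anomaly([1.0, 10.0], mu, sigma2, epsilon))  # point at the feature means
print(is_anomaly([3.0, 14.0], mu, sigma2, epsilon))  # point far from the training data
```

Because $p(x)$ is a density rather than a probability, its value can exceed 1 when the variances are small, so $\epsilon$ only has meaning relative to the scale of the fitted model. With many features the product can underflow; summing log-densities and comparing against $\log \epsilon$ is a common workaround.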