# Concept
- Distance-based methods
    * Partitioning algorithms: K-Means, K-Medoids
    * Hierarchical algorithms: agglomerative vs. divisive methods
- Probabilistic and generative models
- Density-based and grid-based methods

# Similarity measures


- **Measuring similarity of objects**
	* High intra-class similarity: cohesive within clusters
	* Low inter-class similarity: distinctive between clusters

<br>

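- **Distance on numeric data: Minkowski distance**

    * General form, for two objects $i$ and $j$ described by $l$ numeric attributes and an order $p \geq 1$
        $$ d(i, j) = \left( \left | X_{i1} - X_{j1} \right |^{p} + \left | X_{i2}-X_{j2} \right |^{p} + ... + \left | X_{il}-X_{jl} \right |^{p} \right)^{1/p} $$

    * p = 1: Manhattan distance (L1 norm)
        $$ d(i, j) = \left | X_{i1} - X_{j1} \right | + \left | X_{i2}-X_{j2} \right | + ... + \left | X_{il}-X_{jl} \right | $$

    * p = 2: Euclidean distance (L2 norm)
        $$ d(i, j) = \sqrt{\left | X_{i1} - X_{j1} \right |^{2} + \left | X_{i2}-X_{j2} \right |^{2} + ... + \left | X_{il}-X_{jl} \right |^{2}} $$

    * $p \to \infty$: supremum distance (L∞ norm, also called Chebyshev distance), the largest difference over all attributes
        $$ d(i, j) = \max_{f} \left | X_{if} - X_{jf} \right | $$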
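
The three cases above can be computed by one small function. This is a minimal sketch in plain Python; the name `minkowski` and its signature are assumptions made for illustration, not taken from any particular library.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric sequences.

    Pass p=math.inf to get the supremum distance.
    """
    if len(a) != len(b):
        raise ValueError("both points must have the same number of attributes")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        # Limit case: only the largest per-attribute difference survives.
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

a, b = (1, 2), (3, 5)
print(minkowski(a, b, 1))         # Manhattan: |1-3| + |2-5| = 5.0
print(minkowski(a, b, 2))         # Euclidean: sqrt(4 + 9), about 3.61
print(minkowski(a, b, math.inf))  # supremum: max(2, 3) = 3
```
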

<br>

- **Distance calculation example**

    |point|attribute1|attribute2|
    |---|---|---|
    |x1|1|2|
    |x2|3|5|
    |x3|2|0|
    |x4|4|5|

    * Manhattan (L1)

        ||x1|x2|x3|x4|
        |-|-|-|-|-|
        |x1|0||||
        |x2|5|0|||
        |x3|3|6|0||
        |x4|6|1|7|0|

    * Euclidean (L2)

        ||x1|x2|x3|x4|
        |-|-|-|-|-|
        |x1|0||||
        |x2|3.61|0|||
        |x3|2.24|5.1|0||
        |x4|4.24|1|5.39|0|

    * Supremum (L∞)

        ||x1|x2|x3|x4|
        |-|-|-|-|-|
        |x1|0||||
        |x2|3|0|||
        |x3|2|5|0||
        |x4|3|1|5|0|
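
The lower-triangular matrices above can be reproduced with a few lines of plain Python. This is only a sketch: the helper names (`manhattan`, `euclidean`, `supremum`, `lower_triangle`) are made up for this example, and values are rounded to two decimals to match the tables.

```python
import math

# The four points from the example table.
points = {"x1": (1, 2), "x2": (3, 5), "x3": (2, 0), "x4": (4, 5)}

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def supremum(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def lower_triangle(dist):
    """Map each point name to its distances to all points up to and including itself."""
    names = list(points)
    return {
        row: [round(dist(points[row], points[col]), 2) for col in names[: i + 1]]
        for i, row in enumerate(names)
    }

for label, dist in [("Manhattan", manhattan), ("Euclidean", euclidean), ("Supremum", supremum)]:
    print(label)
    for row, values in lower_triangle(dist).items():
        print(" ", row, values)
```
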
  
<br>  
  
- **Proximity measures for symmetric vs. asymmetric binary variables**

    Contingency table for two objects i (rows) and j (columns) over p binary attributes:

    ||1|0|sum|
    |-|-|-|-|
    |1|q|r|q+r|
    |0|s|t|s+t|
    |sum|q+s|r+t|p|

    * Symmetric binary variable (both states, 1 and 0, carry equal weight)
        $$ d(i, j) = \frac{r+s}{q+r+s+t} $$

    * Asymmetric binary variable (the 1 state matters more than the 0 state, so the number of 0-0 matches, t, is ignored; loosely analogous to the F1 score, which likewise ignores true negatives)
        $$ d(i, j) = \frac{r+s}{q+r+s} $$
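
To make the two formulas concrete, here is a small sketch that counts q, r, s, and t for two 0/1 vectors and then applies each formula. The function names and the sample vectors are hypothetical. Returning 0.0 when the denominator is zero is a choice made for this example, not part of the definition.

```python
def binary_counts(i, j):
    """Return (q, r, s, t): counts of (1,1), (1,0), (0,1), (0,0) attribute pairs."""
    q = r = s = t = 0
    for a, b in zip(i, j):
        if a == 1 and b == 1:
            q += 1
        elif a == 1 and b == 0:
            r += 1
        elif a == 0 and b == 1:
            s += 1
        else:
            t += 1
    return q, r, s, t

def symmetric_distance(i, j):
    q, r, s, t = binary_counts(i, j)
    total = q + r + s + t
    return (r + s) / total if total else 0.0

def asymmetric_distance(i, j):
    q, r, s, _ = binary_counts(i, j)  # t (0-0 matches) is deliberately ignored
    total = q + r + s
    # Assumption: two all-zero objects are treated as identical (distance 0.0).
    return (r + s) / total if total else 0.0

# Hypothetical objects with six binary attributes each.
obj_i = [1, 0, 1, 0, 0, 0]
obj_j = [1, 0, 0, 0, 0, 0]
print(binary_counts(obj_i, obj_j))        # (1, 1, 0, 4)
print(symmetric_distance(obj_i, obj_j))   # 1/6, about 0.17
print(asymmetric_distance(obj_i, obj_j))  # 1/2 = 0.5
```

The many 0-0 matches shrink the symmetric distance, while the asymmetric one looks only at attributes where at least one object has a 1.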

<br>

- **Distance between categorical attributes, ordinal attributes, and mixed types**

    * Categorical variables (simple matching): m is the number of attributes on which the two objects match, p is the total number of attributes
$$ d(i, j) = \frac{p-m}{p} $$
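
A minimal sketch of the matching-based distance for categorical attributes follows. The function name and the sample records are invented for illustration.

```python
def categorical_distance(a, b):
    """Simple-matching distance: fraction of attributes on which a and b differ."""
    if len(a) != len(b) or not a:
        raise ValueError("objects need the same, non-zero number of attributes")
    p = len(a)                                  # total number of attributes
    m = sum(1 for x, y in zip(a, b) if x == y)  # number of matches
    return (p - m) / p

# Hypothetical objects described by (colour, size, shape).
obj_i = ("red", "small", "round")
obj_j = ("red", "large", "round")
print(categorical_distance(obj_i, obj_j))  # one mismatch out of three attributes
```
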
   
<br>

- **Proximity Measure between Two vectors: Cosine Similarity**

$$ \cos(d_{1}, d_{2}) = \frac{d_{1} \bullet d_{2}}{\left \| d_{1} \right\| \left \| d_{2} \right \|} $$
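
The formula translates directly into code: a dot product divided by the product of the two vector lengths. The sketch below uses plain Python; `cosine_similarity` is a name chosen for this example, and raising an error for a zero vector is a design choice here, since the cosine is undefined in that case.

```python
import math

def cosine_similarity(d1, d2):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(d1, d2))
    norm1 = math.sqrt(sum(x * x for x in d1))
    norm2 = math.sqrt(sum(y * y for y in d2))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm1 * norm2)

print(cosine_similarity((1, 0, 1), (1, 1, 0)))  # about 0.5
print(cosine_similarity((1, 2), (2, 4)))        # parallel vectors: about 1.0
print(cosine_similarity((1, 0), (0, 1)))        # orthogonal vectors: 0.0
```

Because only the direction of the vectors matters, scaling either vector by a positive constant leaves the result unchanged.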

<br>

- **Covariance and Correlation Coefficient**

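$$ Var(x) = E[(x-\mu)^{2}] = E[x^{2}] - (E[x])^{2} $$

$$ cov(x_{1},x_{2}) = E[(x_{1}-\mu_{1})(x_{2}-\mu_{2})] = E[x_{1}x_{2}] - E[x_{1}]E[x_{2}] $$

With $\sigma_{12} = cov(x_{1},x_{2})$ and $\sigma_{k}^{2} = Var(x_{k})$, the correlation coefficient is

$$ \rho_{12} = \frac{\sigma_{12}}{\sqrt{\sigma_{1}^{2} \sigma_{2}^{2}}} $$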
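
These definitions can be checked numerically. The sketch below uses population statistics (dividing by n), which matches the expectation-based formulas above. The helper names and the sample values are illustrative assumptions.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(x1, x2):
    """Population covariance E[(x1 - mu1)(x2 - mu2)]."""
    m1, m2 = mean(x1), mean(x2)
    return sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / len(x1)

def variance(x):
    # Variance is the covariance of a variable with itself.
    return covariance(x, x)

def correlation(x1, x2):
    return covariance(x1, x2) / math.sqrt(variance(x1) * variance(x2))

# Illustrative values for two variables observed five times.
x1 = [2, 3, 5, 4, 6]
x2 = [5, 8, 10, 11, 14]

# The shortcut E[x1*x2] - E[x1]*E[x2] should agree with the definition.
shortcut = mean([a * b for a, b in zip(x1, x2)]) - mean(x1) * mean(x2)

print(covariance(x1, x2))   # about 4.0
print(shortcut)             # about 4.0 as well
print(correlation(x1, x2))  # about 0.94
```

A correlation close to 1 says the two variables rise together almost linearly. The covariance alone does not show this, because its magnitude depends on the scale of each variable.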