# Instance-based classification

## Idea
* Similar objects lie close to each other in some ‘feature’ space
* The distances to the other data points are the key to predicting a class. **We only need the distances**

<img src='imgs/ibc1.png'/>

## Popular metrics

* **Euclidean distance** - $\rho(x, y) = \sqrt{\sum (x_i - y_i)^2}$
* **Minkowski distance** - $\rho(x, y) = \left(\sum |x_i - y_i|^p\right)^{\frac 1p}$
* **Cosine distance** - $\rho(x, y) = 1 - \frac{\sum x_i y_i}{|x||y|}$ (one minus the cosine similarity)
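
A minimal NumPy sketch of the three metrics above. The function names are illustrative, and the cosine distance is taken here as one minus the cosine similarity, which is an assumption about the intended convention.

```python
import numpy as np

def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((x - y) ** 2)))

def minkowski(x, y, p=2.0):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def cosine_distance(x, y):
    """One minus the cosine similarity; undefined for zero vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    similarity = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(1.0 - similarity)
```

Parallel vectors have cosine distance 0 whatever their lengths, so this metric compares directions only.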

## General approach

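* $h(x) = \arg\max_y \sum\limits_{y_i=y}w(x_i, x)$, where the sum runs over training objects $x_i$ whose label $y_i$ equals $y$
* $w(x_i, x)$ is the weight of training object $x_i$ for the query object $x$
* affinity of $x$ to class $y$ - $\Gamma_y(x) = \sum\limits_{y_i=y}w(x_i, x)$
* No learning: the training set itself is the model
* Models vary by the choice of $w(x_i, x)$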
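
The general rule can be sketched as a weighted vote. The helper names and the inverse-distance weight below are hypothetical, chosen only to make the example runnable. Any other $w$ can be passed in.

```python
from collections import defaultdict
import numpy as np

def inverse_distance_weight(x_i, x):
    """Hypothetical weight: decreases as x_i moves away from x."""
    return 1.0 / (1.0 + np.linalg.norm(np.asarray(x_i, float) - np.asarray(x, float)))

def affinities(x, X_train, y_train, weight):
    """Sum the weights of the training objects separately for each class."""
    scores = defaultdict(float)
    for x_i, y_i in zip(X_train, y_train):
        scores[y_i] += weight(x_i, x)
    return dict(scores)

def predict(x, X_train, y_train, weight):
    """Return the class with the largest affinity."""
    scores = affinities(x, X_train, y_train, weight)
    return max(scores, key=scores.get)

X_train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
y_train = ["a", "a", "b"]
label = predict([0.0, 0.5], X_train, y_train, inverse_distance_weight)
```

Nothing is fitted: `predict` reads the stored training set at query time.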

## k Nearest Neighbors (kNN)

* kNN: $w(x_i, x)=1$ if $x_i$ is one of the $k$ nearest neighbors of $x$, and $0$ otherwise
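
A minimal kNN sketch, assuming Euclidean distance and a brute-force search over the training set. The function name is illustrative.

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=3):
    """Majority vote among the k training points closest to x."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]  # indices of the k smallest distances
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    # On a tie, argmax returns the first (smallest) label
    return labels[np.argmax(counts)]
```

Each query costs one distance computation per training point; tree or hashing indexes can reduce this, but they are outside this sketch.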

<img src='imgs/ibc2.png'/>

## No neighbors

* fall back to the majority class
* or, depending on the task, pick the class that minimizes some risk
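
A sketch of the majority-class fallback only, assuming the query is classified by the points within a radius (the RadiusNeighbors rule from these notes) and that "majority class" means the most frequent label in the whole training set. The function name is illustrative. The risk-based option is not shown.

```python
import numpy as np

def radius_predict(x, X_train, y_train, radius):
    """Vote among points closer than `radius`; fall back to the global majority."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    inside = y_train[dists < radius]
    # No neighbors: vote over the whole training set instead
    pool = inside if inside.size > 0 else y_train
    labels, counts = np.unique(pool, return_counts=True)
    return labels[np.argmax(counts)]
```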

<img src='imgs/ibc3.png'/>

## Bayesian view

* $P(y|x) = \frac{P(x|y)P(y)}{P(x)}$
* Estimate the probabilities in a small area around $x$: $n$ is the size of the dataset, $n'$ the number of points with label $y$, $k$ the number of points in the small area, $k'$ the number of points in the small area with label $y$
* $P(x)\sim \frac kn$
* $P(y)\sim \frac {n'}{n}$
* $P(x|y)\sim \frac {k'}{n'}$
* $P(y|x) = \frac{\frac {n'}n \frac {k'}{n'}}{\frac kn} = \frac {k'}k$
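
A small numeric check of the derivation above, with made-up counts ($n=100$, $n'=40$, $k=10$, $k'=7$). The function name is illustrative. Exact fractions are used so the cancellation is visible.

```python
from fractions import Fraction

def local_posterior(n, n_y, k, k_y):
    """Plug the local count estimates into Bayes' rule."""
    p_x = Fraction(k, n)              # P(x)   ~ k / n
    p_y = Fraction(n_y, n)            # P(y)   ~ n' / n
    p_x_given_y = Fraction(k_y, n_y)  # P(x|y) ~ k' / n'
    return p_x_given_y * p_y / p_x

posterior = local_posterior(n=100, n_y=40, k=10, k_y=7)
print(posterior)  # 7/10
```

The result equals $k'/k$, the fraction of the points in the small area that carry label $y$, which is the kNN vote.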

## Other methods

* **RadiusNeighbors** all neighbors within radius $R$ - $w(x_i, x)=1$ if $\rho(x_i, x)<R$, and $0$ otherwise

* **Weighted kNN** bigger weights to the nearest neighbors - $w(x_i, x)=\max\left(0, \frac{r-\rho(x_i, x)}{r}\right)$

* **Parzen window** use some kernel function $K$
    * constant window  $w(x_i, x)= K\left(\frac{\rho(x, x_i)}{r}\right)$, with a fixed width $r$
    * variable window  $w(x_i, x)= K\left(\frac{\rho(x, x_i)}{\rho(x, x_{k+1})}\right)$, where $x_{k+1}$ is the $(k+1)$-th nearest neighbor of $x$
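
The weight functions above can be sketched as functions of a precomputed distance. The function names are illustrative and the Gaussian kernel is an assumption, since the notes leave $K$ open.

```python
import numpy as np

def radius_weight(dist, R):
    """RadiusNeighbors: weight 1 inside the radius, 0 outside."""
    return 1.0 if dist < R else 0.0

def linear_weight(dist, r):
    """Weighted kNN: decays linearly from 1 at distance 0 to 0 at distance r."""
    return max(0.0, (r - dist) / r)

def gaussian_kernel(u):
    """Assumed kernel; any non-increasing function of u >= 0 could be used."""
    return float(np.exp(-0.5 * u ** 2))

def parzen_constant(dist, r, kernel=gaussian_kernel):
    """Constant window: the same width r for every query point."""
    return kernel(dist / r)

def parzen_variable(dist, sorted_dists, k, kernel=gaussian_kernel):
    """Variable window: the width is the distance to the (k+1)-th neighbor.

    sorted_dists holds the distances from the query to all training
    points in ascending order, so index k is the (k+1)-th neighbor.
    """
    return kernel(dist / sorted_dists[k])
```

The variable window adapts to local density: the width shrinks in dense regions and grows in sparse ones.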
    


<img src='imgs/ibc4.png'/>


* **Potential energy method** - $h(x) = \arg\max_y \sum\limits_{y_i=y}\gamma_i w(x_i, x)$
    * init $\gamma_i=1$ for every training object
    * if $h(x_i)\neq y_i \rightarrow \gamma_i=\gamma_i+1$
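
A sketch of the update rule above. The Gaussian weight with width `r`, the repeated passes over the training set, and the `max_epochs` cap are assumptions. The notes only give the initialization and the increment.

```python
import numpy as np

def predict(x, X, y, gamma, r=1.0):
    """Class with the largest gamma-weighted affinity (Gaussian weight assumed)."""
    w = gamma * np.exp(-0.5 * (np.linalg.norm(X - x, axis=1) / r) ** 2)
    classes = np.unique(y)
    scores = [w[y == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]

def fit_potentials(X, y, r=1.0, max_epochs=100):
    """Start with gamma_i = 1 and add 1 to gamma_i whenever x_i is misclassified."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    gamma = np.ones(len(X))
    for _ in range(max_epochs):
        changed = False
        for i in range(len(X)):
            if predict(X[i], X, y, gamma, r) != y[i]:
                gamma[i] += 1.0
                changed = True
        if not changed:  # a full pass without errors: stop
            break
    return gamma
```

Objects that keep being misclassified accumulate a larger $\gamma_i$, so isolated points of a class gain influence in their neighborhood.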

* **Prototype selection** - $h(x) = \arg\max_y \sum\limits_{x_i\in \Omega, y_i=y}\gamma_i w(x_i, x)$, where $\Omega$ is the retained subset of the training set
    * Edition - remove noisy points on the border
    * Condensation - remove redundant points inside the class
    
* **DROP5** (decremental reduction optimization procedure)
    * start with the full dataset
    * sort by affinity to the closest incorrect class
    * go in ascending order
    * delete $x$ if the leave-one-out (LOO) error doesn't increase
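
A simplified sketch that follows the four steps listed above. It is not the published DROP5 algorithm. Two assumptions are made: the sorting key is the distance to the nearest point with a different label, and the LOO error is that of a 1-NN classifier whose prototypes are the retained points. The function names are illustrative.

```python
import numpy as np

def loo_error(D, y, keep):
    """Count the points misclassified by 1-NN over the kept prototypes,
    never letting a point vote for itself."""
    errors = 0
    for i in range(len(y)):
        candidates = keep.copy()
        candidates[i] = False  # leave the point itself out
        idx = np.flatnonzero(candidates)
        if idx.size == 0:
            errors += 1
            continue
        nearest = idx[np.argmin(D[i, idx])]
        errors += int(y[nearest] != y[i])
    return errors

def decremental_reduction(X, y):
    """Return the indices of the retained prototypes (needs at least two classes)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Assumed sorting key: distance to the nearest differently labeled point
    enemy_dist = np.array([D[i, y != y[i]].min() for i in range(len(y))])
    keep = np.ones(len(y), dtype=bool)  # start with the full dataset
    best = loo_error(D, y, keep)
    for i in np.argsort(enemy_dist):    # ascending order
        keep[i] = False
        err = loo_error(D, y, keep)
        if err <= best:
            best = err                  # deletion accepted
        else:
            keep[i] = True              # deletion would hurt: restore the point
    return np.flatnonzero(keep)
```

Recomputing the LOO error after every candidate deletion is costly on large datasets; the sketch favors readability over speed.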

<img src='imgs/ibc5.png'/>