# 04 Predicting from Data: Machine Learning 

<center><img src="figs/04_clustering.jpg" alt="default" width=650px/></center>


<center><img src="figs/04_mnist.png" alt="default" width=450px/></center>


#### Unit 1: Vectors, Textbook Ch. 1-5
- 01 Vectors
- 02 Linear Functions
- 03 Norms and Distances
- **_04 Clustering_**
- 05 Linear Independence

#### Unit 2: Matrices, Textbook Ch. 6-11
#### Unit 3: Least Squares, Textbook Ch. 12-14

##### Outline: 04 Clustering

- [Overview of Machine Learning](#Clustering)
- [Clustering](#Clustering)
- [K-means Algorithm](#Algorithm)
- [Applications](#Applications)

##### Outline: 04 Clustering

- **[Overview of Machine Learning](#Clustering)**
- [Clustering](#Clustering)
- [K-means Algorithm](#Algorithm)
- [Applications](#Applications)

$\color{#EF5645}{\text{Recall}}$:
- In **02 Linear functions**:
  - We predicted the price of a house:
    - quantitative prediction.
- In **03 Norms and Distances**:
  - We predicted if a subject was schizophrenic:
    - categorical prediction: yes/no.
    
We say that we performed _regression_ and _classification_, respectively.

_Regression_ and _classification_ are examples of ML methods.

<center><img src="figs/04_ai.png" alt="default" width=1800px/></center>

Today, we see another method: _clustering_.

##### Outline: 04 Clustering

- [Overview of Machine Learning](#Clustering)
- **[Clustering](#Clustering)**
- [K-means Algorithm](#Algorithm)
- [Applications](#Applications)

### Clustering: Goal (Intuition)
- Given: (i) $N$ $n$-vectors $x_1, \ldots, x_N$, (ii) integer $k$.
- $\color{#EF5645}{\text{Goal}}$:
  - Group/cluster the $N$ $n$-vectors into $k$ groups/clusters
  - _such that_: vectors in the same group are "close" to each other.

<center><img src="figs/04_clustering.jpg" alt="default" width=380px/></center>

$\color{#047C91}{\text{Exercise}}$: What is $k$ in the figure above? $n$? $N$?



### Clustering in ECE

- patient clustering 
  - $x_i$ are test results, symptoms of patient $i$
- customer market "segmentation" (clustering)
  - $x_i$ is purchase history of customer $i$ 
- financial sectors clustering
  - $x_i$ are financial attributes of company $i$


### Clustering: Goal (Math)
- For each group/cluster $j=1, \ldots, k$: 
  - Group $G_j$: set of indices $i \in \{1, \ldots, N\}$ of the data points in the group.
  - Representative $z_j$: an $n$-vector that is a typical element of $G_j$.
- For each data point $i=1, \ldots, N$: 
  - Assignment $c_i$: the index of the group of $x_i$, i.e. $x_i$ is in $G_{c_i}$.

$\color{#EF5645}{\text{Goal}}$: Find $c_i$, $z_j$ to minimize
$$J = \frac{1}{N}\sum_{i=1}^N \|x_i - z_{c_i}\|^2,$$
i.e. the mean square distance from vectors to their representatives.
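
As a quick illustration of the objective, here is a minimal sketch that evaluates $J$ with NumPy. The function name `clustering_objective`, the array layout (one data vector per row), and the toy data are assumptions made for this example. Note that the code uses 0-based group indices, whereas the slides count groups from 1.

```python
import numpy as np

def clustering_objective(X, Z, c):
    """Mean square distance J from each x_i to its representative z_{c_i}.

    X: (N, n) array, one data vector per row.
    Z: (k, n) array, one representative per row.
    c: length-N integer array of assignments (0-based group indices).
    """
    diffs = X - Z[c]                      # row i is x_i - z_{c_i}
    return np.mean(np.sum(diffs**2, axis=1))

# Toy data (made up for illustration): N = 4 points in the plane, k = 2.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
Z = np.array([[0.0, 1.0], [10.0, 1.0]])
c = np.array([0, 0, 1, 1])

J = clustering_objective(X, Z, c)
print(J)  # every point is at distance 1 from its representative, so J = 1.0
```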


##### Outline: 04 Clustering

- [Overview of Machine Learning](#Clustering)
- [Clustering](#Clustering)
- **[K-means Algorithm](#Algorithm)**
- [Applications](#Applications)

### K-means algorithm


- Alternate between:
  - (i) update groups, i.e. assignments $c_1, ..., c_N$, 
  - (ii) update representatives $z_1, ..., z_k$.
  
- Such that the objective $J$ never increases from one step to the next.

<center><img src="figs/04_it0.png" alt="default"/></center>

<center><img src="figs/04_it1.png" alt="default"/></center>

<center><img src="figs/04_it2.png" alt="default"/></center>

<center><img src="figs/04_it3.png" alt="default"/></center>

<center><img src="figs/04_it5.png" alt="default"/></center>

### (i) Update the groups

- Given: representatives $z_1, ..., z_k$
- $\color{#EF5645}{\text{Goal for (i)}}$: Assign to groups, i.e. choose $c_1, ..., c_N$
  - Assign each $x_i$ to its nearest representative. Justification:
    - Observe: 
      - $c_i$ only appears in $||x_i - z_{c_i}||^2$ in $J$
    - Conclude: to minimize over $c_i$, choose $c_i$ so $\|x_i - z_{c_i}\|^2 = \min_{j \in \{1, \ldots, k\}} \|x_i - z_j\|^2$.
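
The assignment step above can be sketched in a few lines of NumPy. The function name `update_groups` and the toy data are assumptions for illustration. Group indices are 0-based in the code, and ties are broken in favor of the lowest index, which is one possible convention.

```python
import numpy as np

def update_groups(X, Z):
    """Step (i): assign each x_i to its nearest representative.

    X: (N, n) array of data vectors, Z: (k, n) array of representatives.
    Returns a length-N array c of 0-based group indices.
    """
    # dists[i, j] = ||x_i - z_j||^2, computed with broadcasting
    dists = np.sum((X[:, None, :] - Z[None, :, :])**2, axis=2)
    # np.argmin returns the first minimizer, so ties go to the lowest index
    return np.argmin(dists, axis=1)

# Toy data (made up for illustration): 4 points on a line, k = 2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [9.0, 0.0], [10.0, 0.0]])
Z = np.array([[0.0, 0.0], [10.0, 0.0]])

c = update_groups(X, Z)
print(c)  # the first two points go to group 0, the last two to group 1
```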


### (ii) Update the representatives

- Given the partition $G_1, \ldots, G_k$
- $\color{#EF5645}{\text{Goal for (ii)}}$: Choose representatives $z_1, \ldots, z_k$
  - Choose $z_j$ = mean of the points in group $j$. Justification:
    - Observe: $J$ splits as $J = J_1 + \cdots + J_k$ with:
$$J_j = \frac{1}{N} \sum_{i \in G_j} \|x_i - z_j\|^2$$
    - Conclude: Choose $z_j$ to minimize its $J_j$: $z_j = \frac{1}{|G_j|} \sum_{i \in G_j} x_i$ = mean/center/centroid.
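
A minimal sketch of the representative update follows. The function name `update_representatives` and the toy data are assumptions for illustration. The slides do not say what to do with an empty group, so keeping the previous representative in that case is a choice made here, not part of the algorithm as stated.

```python
import numpy as np

def update_representatives(X, c, Z_old):
    """Step (ii): set each z_j to the mean of the points assigned to group j.

    X: (N, n) data, c: length-N 0-based assignments,
    Z_old: (k, n) previous representatives (used only for empty groups).
    """
    Z = Z_old.astype(float).copy()
    for j in range(len(Z_old)):
        members = X[c == j]               # rows of X with c_i == j
        if len(members) > 0:              # assumption: empty groups keep their old z_j
            Z[j] = members.mean(axis=0)
    return Z

# Toy data (made up for illustration).
X = np.array([[0.0, 0.0], [1.0, 0.0], [9.0, 0.0], [10.0, 0.0]])
c = np.array([0, 0, 1, 1])
Z_old = np.array([[0.0, 0.0], [10.0, 0.0]])

Z = update_representatives(X, c, Z_old)
print(Z)  # rows are the group means [0.5, 0.0] and [9.5, 0.0]
```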



### Convergence of K-means

- How many times do we iterate these steps?
  - Until the $z_j$’s stop changing: 
    - = "convergence" of the algorithm.

### Pseudo-code



<center><img src="figs/04_kmeans.jpg" alt="default"/></center>
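
For readers who prefer code, here is a minimal NumPy sketch of the two alternating steps. It is not a transcription of the figure. Several details are assumptions: the function name `kmeans`, the `max_iter` and `seed` parameters, initializing the representatives with $k$ randomly chosen data points, keeping the old representative when a group is empty, and testing convergence with `np.allclose`.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Alternate steps (i) and (ii) until the representatives stop changing."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    Z = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # (i) update groups: nearest representative for each x_i
        dists = np.sum((X[:, None, :] - Z[None, :, :])**2, axis=2)
        c = np.argmin(dists, axis=1)
        # (ii) update representatives: mean of each non-empty group
        Z_new = Z.copy()
        for j in range(k):
            members = X[c == j]
            if len(members) > 0:
                Z_new[j] = members.mean(axis=0)
        if np.allclose(Z_new, Z):         # representatives stopped changing
            break
        Z = Z_new
    J = np.mean(np.sum((X - Z[c])**2, axis=1))
    return Z, c, J

# Toy data (made up for illustration): two well-separated groups of 3 points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
Z, c, J = kmeans(X, k=2)
print(Z, c, J)
```

On data this well separated, the sketch should recover the two groups. On harder data the result depends on the initial representatives.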


<center><img src="figs/04_conv.png" alt="default"/></center>

- $\color{#EF5645}{\text{Remarks}}$:
  - $J$ never increases from one step to the next,
  - but the final clustering might not minimize $J$
    - it might only be a _local_ minimum.
    
- $\color{#EF5645}{\text{Recommendation}}$:
  - Run $k$-means 10 times, with different initial representatives
  - Take as final partition the one with smallest $J$
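
The recommendation above can be sketched as follows. Everything here is illustrative: the helper names `kmeans_once` and `kmeans_restarts`, the random initialization from data points, the handling of empty groups, and the synthetic data are assumptions, not the course's reference implementation.

```python
import numpy as np

def kmeans_once(X, k, rng, max_iter=100):
    """One run of k-means from a random initialization drawn with rng."""
    Z = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # (i) nearest representative for each point
        c = np.argmin(((X[:, None, :] - Z[None, :, :])**2).sum(axis=2), axis=1)
        # (ii) group means; an empty group keeps its old representative
        Z_new = np.array([X[c == j].mean(axis=0) if np.any(c == j) else Z[j]
                          for j in range(k)])
        if np.allclose(Z_new, Z):
            break
        Z = Z_new
    J = np.mean(((X - Z[c])**2).sum(axis=1))
    return Z, c, J

def kmeans_restarts(X, k, n_runs=10, seed=0):
    """Run k-means n_runs times and keep the run with the smallest J."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, rng) for _ in range(n_runs)]
    return min(runs, key=lambda run: run[2])

# Synthetic data (made up for illustration): 3 noisy groups of 20 points each.
data_rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.repeat(centers, 20, axis=0) + data_rng.normal(size=(60, 2))

Z, c, J = kmeans_restarts(X, k=3)
print(J)
```

Sharing one random generator across the runs gives each run a different initialization while keeping the whole experiment reproducible from a single seed.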

##### Outline: 04 Clustering

- [Overview of Machine Learning](#Clustering)
- [Clustering](#Clustering)
- [K-means Algorithm](#Algorithm)
- **[Applications](#Applications)**

### MNIST Dataset: Find Digits

- MNIST images of handwritten digits (via Yann LeCun)
- 60,000 images of size 28 × 28, reshaped as 784-vectors $x_i$

<center><img src="figs/04_mnist.png" alt="default" width=250px/></center>

- $\color{#EF5645}{\text{Goal}}$: Group these images into groups of same digit.
- $\color{#047C91}{\text{Exercise}}$: What are $k, N, n$?
- How do we implement it in practice? This will be in your next homework!
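
As a small preparation step, the sketch below shows how a stack of 28 × 28 images can be reshaped into 784-vectors with NumPy. The array `images` is a placeholder filled with zeros, not real MNIST data, and the variable names are assumptions. Loading the actual dataset is left out on purpose.

```python
import numpy as np

# Placeholder stack of 5 blank images (NOT real MNIST data), shape (5, 28, 28).
images = np.zeros((5, 28, 28))

# Flatten each 28 x 28 image into one row of length 28 * 28 = 784.
X = images.reshape(len(images), -1)

print(X.shape)  # (5, 784): one 784-vector per image
```

Each row of `X` then plays the role of one vector $x_i$ in the clustering problem.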

Resources: Textbook, Ch. 4