# MNIST Unsupervised Classification

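How to reach a V-measure of 0.92 when clustering handwritten digits with [sklearn](https://scikit-learn.org/stable/) and [sknetwork](https://scikit-network.readthedocs.io/en/latest/). The data is scikit-learn's built-in digits set (`load_digits`: 1,797 images of 8×8 pixels), a small MNIST-like dataset rather than the full MNIST.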

For comparison, in this [scikit-learn demo](https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_digits.html) on the same dataset, PCA + KMeans yields a V-measure of 0.69.
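
A PCA + KMeans baseline of that kind can be sketched as follows. This is a minimal sketch, not the demo's exact configuration: the scaling step, `n_components=10`, `n_init=4` and `random_state=0` are assumptions chosen for illustration, so the score it produces may differ from the 0.69 quoted above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import v_measure_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

digits = load_digits()

# Assumed settings (not taken from the demo): 10 PCA components, 10 clusters.
baseline = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    KMeans(n_clusters=10, n_init=4, random_state=0),
)

# fit_predict runs the transformers, then clusters the reduced data.
baseline_labels = baseline.fit_predict(digits.data)
baseline_score = v_measure_score(digits.target, baseline_labels)
print(f"PCA + KMeans V-measure: {baseline_score:.2f}")
```

KMeans needs the number of clusters up front, whereas the Louvain approach below infers it from the graph.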

In [20]:
import numpy as np

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import v_measure_score

from sknetwork.clustering import Louvain
from sknetwork.utils import KNeighborsTransformer

In [22]:
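EMB_DIM = 32  # number of PCA components kept for the embedding
N_NEIGH = 10  # neighbors per node in the k-nearest-neighbors graph
γ = 1.  # Louvain resolution parameter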

In [12]:
digits = load_digits()

### Dimension reduction with PCA

In [13]:
pca = PCA(n_components=EMB_DIM)
embedding = pca.fit_transform(digits.data)
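
One way to sanity-check the choice of 32 components is to look at how much variance the projection retains. The sketch below refits PCA on its own so that it is self-contained; the variable names `retained` and `cumulative` are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()

# Same embedding dimension as EMB_DIM in the notebook.
pca_check = PCA(n_components=32).fit(digits.data)

# Fraction of total variance explained by each kept component, then accumulated.
cumulative = np.cumsum(pca_check.explained_variance_ratio_)
retained = cumulative[-1]
print(f"Variance retained by 32 of 64 dimensions: {retained:.3f}")
```

If the retained fraction is already close to 1 with fewer components, a smaller embedding may work equally well.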

### Nearest Neighbors graph in embedding space

In [14]:
knn = KNeighborsTransformer(n_neighbors=N_NEIGH, n_jobs=-1, undirected=True)
adjacency = knn.fit_transform(embedding)
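
For readers who want to see what this step builds, here is a rough stand-in that uses only scikit-learn. It is a sketch under assumptions: unweighted edges and symmetrization by element-wise maximum are choices made here, and the result is not guaranteed to be identical to what sknetwork's `KNeighborsTransformer(undirected=True)` returns.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import kneighbors_graph

embedding_sk = PCA(n_components=32).fit_transform(load_digits().data)

# Directed kNN graph: row i has a 1 for each of the 10 nearest neighbors of i.
directed = kneighbors_graph(
    embedding_sk, n_neighbors=10, mode="connectivity", include_self=False
)

# Make it undirected: keep an edge if either endpoint lists the other.
adjacency_sk = directed.maximum(directed.T).tocsr()
print(adjacency_sk.shape, adjacency_sk.nnz)
```

The result is a sparse symmetric matrix with at least 10 nonzeros per row, which is the kind of input a graph clustering algorithm expects.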

### Louvain clustering

In [23]:
louvain = Louvain(resolution=γ)
labels = louvain.fit_transform(adjacency)
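
Louvain greedily maximizes modularity, and the resolution γ scales the null-model term that penalizes large clusters. The sketch below computes the standard generalized modularity of an undirected graph with plain NumPy on a toy graph; the function name `modularity` and the toy data are illustrative, and it is an assumption that sknetwork's objective matches this formula term for term.

```python
import numpy as np

def modularity(adjacency, labels, resolution=1.0):
    """Q = (1/w) * sum_ij (A_ij - resolution * d_i * d_j / w) * [c_i == c_j]."""
    adjacency = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    total_weight = adjacency.sum()   # w: sum of all entries of A
    degrees = adjacency.sum(axis=1)  # d_i: weighted degree of node i
    same_cluster = labels[:, None] == labels[None, :]
    null_model = resolution * np.outer(degrees, degrees) / total_weight
    return ((adjacency - null_model) * same_cluster).sum() / total_weight

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
toy = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    toy[i, j] = toy[j, i] = 1.0

two_triangles = [0, 0, 0, 1, 1, 1]
one_cluster = [0, 0, 0, 0, 0, 0]

q_split = modularity(toy, two_triangles, resolution=1.0)
q_merged = modularity(toy, one_cluster, resolution=1.0)
q_split_high = modularity(toy, two_triangles, resolution=2.0)
print(q_split, q_merged, q_split_high)
```

With γ = 1 the two-triangle split scores higher than a single cluster. Raising γ lowers the score of every partition, and it penalizes large clusters most, so higher resolutions tend to produce more, smaller clusters.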

### Results

In [24]:
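# Signature is v_measure_score(labels_true, labels_pred); the measure is symmetric, so the value is unchanged.
v_measure_score(digits.target, labels)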

0.9185069468307052
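
The V-measure is the harmonic mean of homogeneity (each cluster contains a single class) and completeness (each class falls in a single cluster). The sketch below uses small made-up label arrays, not the notebook's data, to show how splitting one class across two clusters leaves homogeneity intact while lowering completeness.

```python
from sklearn.metrics import homogeneity_completeness_v_measure, v_measure_score

# Toy labels (illustrative only): class 2 is split across clusters 2 and 3.
true_toy = [0, 0, 1, 1, 2, 2]
pred_toy = [0, 0, 1, 1, 2, 3]

h, c, v = homogeneity_completeness_v_measure(true_toy, pred_toy)

# With the default beta=1, V is the harmonic mean of the two components.
v_manual = 2 * h * c / (h + c)

# Swapping the arguments swaps homogeneity and completeness but leaves V unchanged.
v_swapped = v_measure_score(pred_toy, true_toy)
print(h, c, v)
```

Over-segmentation of this kind costs completeness only, which is why a clustering with more clusters than classes can still reach a high V-measure.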

In [25]:
np.unique(labels, return_counts=True)

(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),
 array([186, 182, 180, 178, 177, 177, 173, 167, 146,  97,  65,  42,  27]))
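
Louvain returns 13 clusters for 10 digit classes, so some digits are split across several clusters. A contingency table shows which class dominates each cluster. The helper below is a hypothetical sketch demonstrated on toy labels; in the notebook it could be called as `dominant_class_per_cluster(digits.target, labels)`.

```python
import numpy as np
from sklearn.metrics.cluster import contingency_matrix

def dominant_class_per_cluster(true_labels, cluster_labels):
    """Return, per cluster (in sorted cluster-id order), its majority class and purity."""
    classes = np.unique(true_labels)
    # Rows are true classes, columns are clusters, entries are sample counts.
    table = contingency_matrix(true_labels, cluster_labels)
    dominant = classes[table.argmax(axis=0)]
    purity = table.max(axis=0) / table.sum(axis=0)
    return dominant, purity

# Toy labels (illustrative only): class 1 is spread over clusters 1 and 2.
true_toy = np.array([0, 0, 0, 1, 1, 1, 1, 1])
cluster_toy = np.array([0, 0, 1, 1, 1, 2, 2, 2])

dominant, purity = dominant_class_per_cluster(true_toy, cluster_toy)
print(dominant, purity)
```

Clusters that share a dominant class are candidates for merging, and low-purity clusters indicate digits that the graph does not separate well.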