This example shows how to use MiniSom to solve a classification problem. The classification mechanism is implemented with MiniSom, and the evaluation uses sklearn.

First, let's load a dataset (in this case the famous Iris dataset) and apply normalization:

In [1]:
from sklearn.datasets import load_iris
from minisom import MiniSom
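import numpy as np

data, labels = load_iris(return_X_y=True)
# NumPy arrays throughout: MiniSom operates on NumPy data, and NumPy scalars
# hash by value, which the per-neuron label counting in classify relies on
labels = labels.astype(np.float32)
# scale each sample (row) to unit Euclidean norm
data = np.apply_along_axis(lambda x: x/np.linalg.norm(x), 1, data).astype(np.float32)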
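The normalization step divides each sample by its Euclidean norm, so every row ends up with length 1 and only the proportions between the four measurements remain. As a minimal sketch in plain Python (the helper name `unit_norm` is hypothetical and not part of MiniSom or sklearn), applied to a single example row:

```python
import math

def unit_norm(row):
    """Divide a sequence of numbers by its Euclidean (L2) norm."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]

# one example row with four measurements
sample = [5.1, 3.5, 1.4, 0.2]
scaled = unit_norm(sample)

# the rescaled row has (numerically) unit length
length = math.sqrt(sum(v * v for v in scaled))
print(scaled, length)
```

The ratios between features are preserved, which is why this kind of scaling is sometimes used before distance-based methods such as a SOM.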

Here's a naive classification function that classifies each sample in `data` using the label assigned to its winning neuron. A label $c$ is associated with a neuron if the majority of the samples mapped to that neuron have label $c$. If a sample is mapped to a neuron to which no class has been assigned, the function falls back to the most common label in the dataset.

In [2]:
def classify(som, data, label):
    """Classifies each sample in data into one of the classes defined
    using the method labels_map.
    Returns a list of the same length as data where the i-th element
    is the class assigned to data[i].
    """
    winmap = som.labels_map(data, label)
    default_class = np.sum(list(winmap.values())).most_common()[0][0]
    result = []
    for d in data:
        win_position = som.winner(d)
        if win_position in winmap:
            result.append(winmap[win_position].most_common()[0][0])
        else:
            result.append(default_class)
    return result
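
The majority-vote rule used by `classify` can be illustrated without a SOM. The sketch below is only an illustration under stated assumptions: the neuron coordinates and labels are made up, and `build_label_map` and `predict` are hypothetical helpers that mimic the kind of per-neuron label counts that `labels_map` provides.

```python
from collections import Counter, defaultdict

def build_label_map(positions, labels):
    """Count, for each neuron position, the labels of the samples mapped to it."""
    label_map = defaultdict(Counter)
    for pos, lab in zip(positions, labels):
        label_map[pos][lab] += 1
    return label_map

# made-up winning-neuron coordinates for six samples, and their labels
positions = [(0, 0), (0, 0), (0, 0), (2, 1), (2, 1), (4, 3)]
labels = [0, 0, 1, 1, 1, 2]

label_map = build_label_map(positions, labels)

# each neuron takes the label held by the majority of its samples
neuron_label = {pos: counts.most_common(1)[0][0]
                for pos, counts in label_map.items()}

# most common label overall, used for neurons with no assigned class
default_label = sum(label_map.values(), Counter()).most_common(1)[0][0]

def predict(pos):
    """Return the neuron's label, or the default for an unlabeled neuron."""
    return neuron_label.get(pos, default_label)

print(neuron_label)                      # {(0, 0): 0, (2, 1): 1, (4, 3): 2}
print(predict((0, 0)), predict((6, 6)))  # 0 1
```

Neuron `(6, 6)` received no samples, so it falls back to label `1`, the most frequent label in this toy data.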

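Now we can 1) split the data into train and test sets, 2) train the SOM, and 3) print the classification report, which contains the metrics needed to evaluate the classification. Note that `classify` builds its label map from the samples and labels it is given, so the call below derives the neuron labels from the test labels themselves. The reported scores therefore show how cleanly the trained map separates the classes, not how well it predicts labels it has never seen.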

In [3]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X_train, X_test, y_train, y_test = train_test_split(data, labels, stratify=labels, random_state=0)

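# 7x7 map with 4 input features (one per Iris measurement)
som = MiniSom(7, 7, 4, sigma=3, learning_rate=0.5,
              neighborhood_function='triangle', random_seed=0)
# initialize the weights from the principal components of the training data
som.pca_weights_init(X_train)
# 500 iterations, each on a randomly picked training sample
som.train_random(X_train, 500, verbose=False)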

print(classification_report(y_test, classify(som, X_test, y_test)))

              precision    recall  f1-score   support

         0.0       1.00      1.00      1.00        13
         1.0       1.00      1.00      1.00        13
         2.0       1.00      1.00      1.00        12

    accuracy                           1.00        38
   macro avg       1.00      1.00      1.00        38
weighted avg       1.00      1.00      1.00        38
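
The call above builds the label map from `X_test` and `y_test`, so the test labels take part in the prediction. For an estimate on genuinely held-out labels, the map can instead be built from the training split. The following is a minimal sketch of that variant, assuming only that `som` exposes `labels_map` and `winner` as used earlier; the name `classify_heldout` is hypothetical.

```python
from collections import Counter

def classify_heldout(som, train_data, train_labels, new_data):
    """Label new_data using neuron labels derived only from the training split."""
    # per-neuron label counts come from the training samples alone
    winmap = som.labels_map(train_data, train_labels)
    # most common training label, used when a neuron has no assigned class
    default_class = sum(winmap.values(), Counter()).most_common(1)[0][0]
    result = []
    for d in new_data:
        win_position = som.winner(d)
        if win_position in winmap:
            result.append(winmap[win_position].most_common(1)[0][0])
        else:
            result.append(default_class)
    return result
```

With the variables defined earlier, it would be called as `classification_report(y_test, classify_heldout(som, X_train, y_train, X_test))`. The resulting scores can differ from the report above, because test samples may land on neurons that are unlabeled or labeled differently by the training data.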

