In [1]:
import numpy as np
import pandas as pd
import timm
import torch
import cv2
from tqdm import tqdm
from torch import nn
from torch.utils.data import DataLoader
from torchvision import transforms
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

## Legend

## Slava's glass. Pt. 3

Slava walked down the corridor, clutching his trophy — the Glass. The twisted halls were behind him, traps dismantled, Bayesian illusions avoided. Ahead — a heavy wooden door with a sign that read "Exit." Freedom. Victory. Light.

But just as he reached for the handle, the air shimmered. The walls groaned, and the scent of chalk filled the space, as if someone had just solved an infinite system of equations.
Out of the shadows stepped Roman — draped in a black robe, a scroll under his arm, and eyes sharp enough to slice through any neural net.

Slava knew him. Knew him from the university days.
They said Roman could prove a theorem before it was even formulated.
They said he could see the loss function of reality.
The Mathematical Mage.

— “Wanna play a game?” Roman asked, and with a flick of his wrist, the Glass vanished from Slava’s hand and reappeared in Roman’s.

— “Not bad. You escaped my labyrinth. But if you want to leave this place with your precious...”
He tossed the glass and caught it casually.

— “You’ll have to impress me.”

From his sleeve, Roman pulled out a deck of old cards. He snapped his fingers — and the deck exploded into glowing fragments, scattering through the room like data points in chaos.

— “Bring me this deck — whole, complete, and intact. Not just gathered. Understood.”

Slava took a deep breath. He knew this was the final challenge. The hardest one.
He opened his laptop, put on his headphones — and in his ears began the final lines of Al Pacino’s legendary speech:

“I mean one-half a step too late, or too early, and you don’t quite make it. One-half second too slow, too fast, you don’t quite catch it. The inches we need are everywhere around us. They’re in every break of the game, every minute, every second. On this team, we fight for that inch. On this team, we tear ourselves and everyone else around us to pieces for that inch. We claw with our fingernails for that inch, because we know when we add up all those inches that’s gonna make the f**** difference between winning and losing! Between livin' and dyin'!”

He hit Enter.

🧩 Your Task:
You are given a set of arrays, where each array represents an image. The images have been augmented. It is known that all data comes from 32 original images, each one followed by several transformed versions.
Your goal is to cluster these arrays.

Slava looked at the scattered card fragments.

— “The game’s not over yet.”

## Overview

You are given two arrays, **X₁** and **X₂**. Row *i* of each array describes the **same image** (the two arrays hold different feature representations of it). Your task is to cluster the images. It is known that there are **32 clusters**: each original image was **augmented multiple times**, and the augmented versions were added to the dataset.

## Metric

[adjusted_rand_score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html).
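
To make the metric concrete, here is a minimal pure-Python sketch of the standard adjusted Rand index formula. The function name `adjusted_rand_index` is hypothetical and used only for illustration; actual scoring uses scikit-learn's `adjusted_rand_score`, which this sketch is meant to mirror, not replace.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pure-Python ARI for two equal-length label sequences (illustrative sketch)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)

    # Contingency counts: how many samples share each (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of sample pairs grouped together in both labelings / in each labeling.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Same partition under different label names vs. a partition that splits every true group.
same_partition = adjusted_rand_index([0, 0, 1, 1], [7, 7, 3, 3])
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_partition, crossed_partition)
```

The score depends only on which samples are grouped together, so the cluster IDs you assign do not need to match any particular numbering. A partition that agrees with the truth no better than chance scores around zero, and a worse-than-chance one can go negative.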

## Restriction

If you want to use a neural network, you may use only the one defined below, `EmbNet`. You can't change anything inside the class, and you **can't fine-tune the model** or otherwise change its weights.

## `sample_submission_cluster.csv`

`sample_submission_cluster.csv` consists of an `id` column, whose value is the row number in the dataset, and a `target` column with your predicted cluster labels.
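
As a rough sanity check before submitting, the format described above can be verified with the standard library alone. This is only a sketch: `check_submission` is a hypothetical helper, the lowercase column names `id` and `target` and the zero-based ids are taken from the `generate_submit` function in this notebook, and the limit of 32 labels comes from the task statement.

```python
import csv
import io


def check_submission(csv_text, n_rows, n_clusters=32):
    """Return a list of problems found in a submission CSV (empty list means OK)."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumption: the header is exactly the two columns written by generate_submit.
    if reader.fieldnames != ["id", "target"]:
        return [f"unexpected header: {reader.fieldnames}"]
    rows = list(reader)
    if len(rows) != n_rows:
        problems.append(f"expected {n_rows} rows, found {len(rows)}")
    # Assumption: ids are the zero-based row numbers, in order.
    ids = [int(r["id"]) for r in rows]
    if ids != list(range(len(rows))):
        problems.append("ids are not the consecutive row numbers 0..n-1")
    n_labels = len({r["target"] for r in rows})
    if n_labels > n_clusters:
        problems.append(f"{n_labels} distinct labels, more than {n_clusters} clusters")
    return problems


example = "id,target\n0,24\n1,1\n2,30\n"
print(check_submission(example, n_rows=3))
```

For the real file, you would pass the text of the generated `submit_*.csv` and `n_rows=3840`.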

===

When you submit, make a Quick Save of the notebook; otherwise we may reject your solution.

You must solve this task on KAGGLE (YOU CAN'T USE CLOUD.RU).

In [2]:
class EmbNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = timm.create_model('tiny_vit_5m_224.dist_in22k_ft_in1k', pretrained=True, num_classes=0)

    def forward(self, image):
        x = self.model(image)
        return x

In [3]:
def generate_submit(pred_cluster):
    import hashlib
    
    sub = pd.DataFrame()
    sub['id'] = np.arange(len(pred_cluster))
    sub['target'] = pred_cluster
    
    hsh = hashlib.sha256(sub.to_csv(index=False).encode('utf-8')).hexdigest()[:8]
    submit_path = f"submit_{hsh}.csv"
    
    print(f"SUBMIT_NAME: {submit_path}")
    print(sub.head(10))
    sub.to_csv(submit_path, index=False)  # same serialization as the string hashed above

---

In [4]:
X_1 = np.load('data_1.npz')
X_1 = X_1.f.arr_0
X_2 = np.load('data_2.npz')
X_2 = X_2.f.arr_0

X_1.shape, X_2.shape

((3840, 128, 4), (3840, 4, 128))

In [5]:
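X_tensor = []
for x_1, x_2 in tqdm(zip(X_1, X_2), total=len(X_1)):
    # Build a 3-channel pseudo-image so the pretrained network can consume the features;
    # each channel is resized to the 224x224 input size.
    pseudo_img = np.stack([
        cv2.resize(x_1, (224, 224)),          # (128, 4) features
        cv2.resize(x_2, (224, 224)),          # (4, 128) features
        cv2.resize((x_1 @ x_2), (224, 224))   # (128, 128) product of the two
    ], axis=0)
    X_tensor.append(torch.tensor(pseudo_img))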

X_tensor = torch.stack(X_tensor, dim=0)
X_tensor.shape

100%|████████████████████████████████████████████████████████████████████████████| 3840/3840 [00:03<00:00, 1226.94it/s]


torch.Size([3840, 3, 224, 224])

In [6]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = EmbNet().to(device)
net.model.eval();

dl_X = DataLoader(X_tensor.to(device), batch_size=16)

In [7]:
X_emb = []
with torch.no_grad():
    for batch_X in tqdm(dl_X):
        batch_X_emb = net(batch_X)
        X_emb.append(batch_X_emb.detach().cpu().numpy())

X_emb = np.vstack(X_emb)
X_emb.shape

100%|████████████████████████████████████████████████████████████████████████████████| 240/240 [00:08<00:00, 27.08it/s]


(3840, 320)

In [8]:
pca = PCA(10, random_state=42)
X_emb_pca = pca.fit_transform(X_emb)

In [9]:
km = KMeans(32, random_state=42)
pred_cluster = km.fit_predict(X_emb_pca)
np.unique(pred_cluster, return_counts=True)

(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]),
 array([199, 164,  81, 128, 134, 176, 118, 133, 134,  63, 138, 114, 107,
        122, 148,  38, 135,  67, 139, 127,  65, 166, 104,  93, 141, 142,
         73, 135, 122,  57, 200,  77], dtype=int64))

In [10]:
generate_submit(pred_cluster)

SUBMIT_NAME: submit_fff2b404.csv
   id  target
0   0      24
1   1       1
2   2      30
3   3       0
4   4      22
5   5      25
6   6       6
7   7       3
8   8       8
9   9      11


## Score

- Private: 0.07474
- Public: 0.07916

> Baseline: (`random_state=42`)
> - Private: 0.02911
> - Public: 0.02857
