# Hotel-ID 2021 FGVC8 — Plan

Goal: Build a medal-winning baseline fast; iterate with solid CV and GPU-accelerated training.

Plan:
- Environment check: verify GPU (nvidia-smi) and install correct CUDA 12.1 PyTorch if needed.
- Data audit:
  - Inspect train.csv schema, class counts, and image availability under train_images/*.
  - Verify test_images/ count and sample_submission format.
- Validation:
  - Stratified KFold by hotel_id; ensure no leakage. Use deterministic seeds.
- Baseline model:
  - ImageNet-pretrained backbone (e.g., timm tf_efficientnet_b0 or resnet50; if timm is not preinstalled, fall back to torchvision resnet50).
  - Image size 224-256; augmentations (flip, color jitter, random resized crop).
  - Optimizer AdamW, cosine schedule, label smoothing; focal loss optional.
  - FP16 mixed precision with GradScaler.
- Inference:
  - TTA (horizontal flip) and average logits.
  - Produce the top-5 hotel_id predictions per image, space-separated as in sample_submission.csv.
- Iterations:
  - If baseline is weak, switch to timm and larger backbones, mixup/cutmix; tune image size and epochs.
  - Cache datasets and OOF logits; try multiple seeds and blend.
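
The validation bullet above can be sketched without extra dependencies. This is a minimal, hypothetical helper (`assign_folds` is not part of the notebook) that spreads each hotel's images across folds with a fixed seed. A library splitter such as scikit-learn's `StratifiedKFold` would be the usual choice; this version only illustrates the idea.

```python
import random
from collections import defaultdict

def assign_folds(labels, n_splits=5, seed=42):
    """Return one fold index per sample, spreading each label's samples across folds."""
    rng = random.Random(seed)          # local RNG keeps the split deterministic
    by_label = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_label[lab].append(idx)
    folds = [0] * len(labels)
    for lab in sorted(by_label):       # sorted so the result does not depend on input order of labels
        idxs = by_label[lab]
        rng.shuffle(idxs)
        start = rng.randrange(n_splits)  # random offset so small classes do not all land in fold 0
        for k, idx in enumerate(idxs):
            folds[idx] = (start + k) % n_splits
    return folds

# Toy hotel_id column: one frequent, one medium, one rare hotel
labels = [36363] * 10 + [18807] * 5 + [60181] * 2
folds = assign_folds(labels, n_splits=5, seed=42)
```

A hotel with a single image lands in exactly one fold, so it is unseen during training whenever that fold is held out. That limitation applies to any per-class split.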

Requests to experts will be made after: plan, data audit, baseline CV, first LB score, and major changes.
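
Scoring the baseline CV needs a local metric. The sketch below assumes the leaderboard uses mean average precision at 5 (MAP@5), which the top-5 submission format suggests but this plan does not state. `map_at_5` is a hypothetical helper name.

```python
def map_at_5(y_true, y_pred):
    """MAP@5 for single-label targets.

    y_true: list of true labels.
    y_pred: list of ranked candidate lists (best first); only the first 5 count.
    """
    total = 0.0
    for truth, ranked in zip(y_true, y_pred):
        for rank, cand in enumerate(ranked[:5], start=1):
            if cand == truth:
                total += 1.0 / rank   # credit decays with the rank of the first hit
                break
    return total / len(y_true) if y_true else 0.0

# Hit at rank 1, hit at rank 2, miss -> (1 + 0.5 + 0) / 3
score = map_at_5(
    [1, 2, 3],
    [[1, 5, 6, 7, 8], [9, 2, 5, 6, 7], [7, 8, 9, 10, 11]],
)
```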

In [1]:
import os, sys, time, json, shutil, subprocess, random, textwrap
import pandas as pd
from pathlib import Path

print('=== GPU CHECK ===', flush=True)
try:
    out = subprocess.run(['bash','-lc','nvidia-smi || true'], capture_output=True, text=True, check=False)
    print(out.stdout)
except Exception as e:
    print('nvidia-smi failed:', e)

print('=== CWD and Files ===', flush=True)
cwd = Path('.').resolve()
print('CWD:', cwd)
for p in ['train_images','test_images','train.csv','sample_submission.csv','description.md']:
    print(p, 'exists:', Path(p).exists())

print('\n=== Data Audit ===', flush=True)
train_csv = pd.read_csv('train.csv')
print('train.csv shape:', train_csv.shape)
print('train.csv columns:', list(train_csv.columns))
print(train_csv.head(3))
if 'hotel_id' in train_csv.columns:
    print('n_unique hotel_id:', train_csv['hotel_id'].nunique())
    print('top hotel_id counts:\n', train_csv['hotel_id'].value_counts().head(5))

sample_sub = pd.read_csv('sample_submission.csv')
print('\nsample_submission shape:', sample_sub.shape)
print(sample_sub.head(3))

print('\n=== Images Count ===', flush=True)
test_dir = Path('test_images')
train_dir = Path('train_images')
def count_images(path: Path):
    exts = {'.jpg','.jpeg','.png','.bmp','.webp'}
    c = 0
    for root, dirs, files in os.walk(path):
        for f in files:
            if Path(f).suffix.lower() in exts:
                c += 1
    return c
print('test_images count:', count_images(test_dir))
print('train_images subdirs:', len([d for d in train_dir.iterdir() if d.is_dir()]))
print('train_images total images (may take a moment)...')
t0=time.time()
train_img_count = count_images(train_dir)
print('train_images count:', train_img_count, 'elapsed:', round(time.time()-t0,2),'s')

print('\n=== Verify file existence for sample rows ===', flush=True)
exts = {'.jpg','.jpeg','.png','.bmp','.webp'}
def find_train_path(row):
    # common patterns:
    # 1) train_images/<hotel_id>/<image>
    # 2) train_images/<image>
    img = str(row.get('image', row.get('image_id', '')))
    hid = str(row.get('hotel_id', ''))
    candidates = []
    if img:
        candidates.append(train_dir/img)
        if hid:
            candidates.append(train_dir/hid/img)
    for c in candidates:
        if c.exists():
            return str(c)
    # fallback: search only within the hotel_id folder if exists, try any extension
    if hid and (train_dir/hid).exists():
        for e in exts:
            p = train_dir/hid/(img if Path(img).suffix else img+e)
            if p.exists():
                return str(p)
    return None

samples = train_csv.sample(min(5, len(train_csv)), random_state=42)
found = 0
for _, r in samples.iterrows():
    p = find_train_path(r)
    print('image lookup:', {'row_image': r.get('image', r.get('image_id', '')), 'hotel_id': r.get('hotel_id', None), 'path': p})
    if p: found += 1
print(f'Found {found}/{len(samples)} sample image files')

print('\n=== Test sample ===', flush=True)
test_files = sorted([p.name for p in test_dir.iterdir() if p.is_file()])[:5]
print('test_images sample files:', test_files)

print('\nAudit complete.')

=== GPU CHECK ===


Sat Sep 27 05:06:15 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.06             Driver Version: 550.144.06     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A10-24Q                 On  |   00000002:00:00.0 Off |                    0 |
| N/A   N/A    P0             N/A /  N/A  |     182MiB /  24512MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

CWD: /var/lib/simon/agent_run_states/hotel-id-2021-fgvc8-20250927-045136
train_images exists: True
test_images exists: True
train.csv exists: True
sample_submission.csv exists: True
description.md exists: True

=== Data Audit ===


train.csv shape: (87798, 4)
train.csv columns: ['image', 'chain', 'hotel_id', 'timestamp']
                  image  chain  hotel_id            timestamp
0  d29287f52c2a871f.jpg      5     22408  2018-04-16 17:01:49
1  e9d067c249e4c2f9.jpg     70      2324  2016-07-08 22:26:21
2  cc9877a40a63ed93.jpg      4     47514  2017-04-14 02:28:56
n_unique hotel_id: 7770
top hotel_id counts:
 hotel_id
36363    86
18807    84
60181    81
53586    80
64314    77
Name: count, dtype: int64

sample_submission shape: (9756, 2)
                  image                       hotel_id
0  f1608c9f17fb6920.jpg  36363 53586 18807 64314 60181
1  c6c63939c67931e1.jpg  36363 53586 18807 64314 60181
2  83c214f3e90717ed.jpg  36363 53586 18807 64314 60181

=== Images Count ===


test_images count: 9756
train_images subdirs: 88
train_images total images (may take a moment)...


train_images count: 87797 elapsed: 0.18 s

=== Verify file existence for sample rows ===


image lookup: {'row_image': 'b3b34c5d6186b362.jpg', 'hotel_id': 61722, 'path': None}
image lookup: {'row_image': 'cdf882b7489fa4d0.jpg', 'hotel_id': 27853, 'path': None}
image lookup: {'row_image': '99c98c732e994ea5.jpg', 'hotel_id': 33691, 'path': None}
image lookup: {'row_image': 'c93609d63d82b1cf.jpg', 'hotel_id': 28303, 'path': None}
image lookup: {'row_image': 'b595482aaf544db6.jpg', 'hotel_id': 45003, 'path': None}
Found 0/5 sample image files

=== Test sample ===


test_images sample files: ['80196e6999ce63cf.jpg', '80296afd55d516ea.jpg', '802aab95d62b7daa.jpg', '802af4d04faf14df.jpg', '802b5ed622fd3587.jpg']

Audit complete.
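
The lookup above found none of the five sampled files, and the 88 subdirectories cannot correspond to the 7770 hotel IDs. One layout-agnostic way to resolve paths is to index every image by file name in a single directory walk. This is a sketch under the assumption that file names are unique across subfolders; `build_image_index` is a hypothetical helper, and the temporary directory only stands in for `train_images/`.

```python
import os
import tempfile
from pathlib import Path

IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp'}

def build_image_index(root):
    """Map image file name -> path under root, found in a single os.walk pass."""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if Path(name).suffix.lower() in IMAGE_EXTS:
                # keep the first hit; assumes file names are unique across subfolders
                index.setdefault(name, str(Path(dirpath) / name))
    return index

# Toy layout standing in for train_images/<subdir>/<image>
with tempfile.TemporaryDirectory() as tmp:
    for sub, name in [('5', 'a.jpg'), ('70', 'b.JPG'), ('70', 'notes.txt')]:
        (Path(tmp) / sub).mkdir(exist_ok=True)
        (Path(tmp) / sub / name).write_bytes(b'')
    demo_index = build_image_index(tmp)
    demo_parents = {k: Path(v).parent.name for k, v in demo_index.items()}
```

In the notebook, mapping the `image` column through such an index would give a path per row and show which parent folder each file actually lives in.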


In [2]:
from pathlib import Path
import pandas as pd

print('=== Build file paths using chain directory ===', flush=True)
train_df = pd.read_csv('train.csv')
train_df['chain'] = train_df['chain'].astype(str)
train_df['file_path'] = 'train_images/' + train_df['chain'] + '/' + train_df['image'].astype(str)
train_df['exists'] = train_df['file_path'].apply(lambda p: Path(p).exists())
coverage = train_df['exists'].mean()
missing = (~train_df['exists']).sum()
print(f'train path coverage: {coverage:.4f} ({len(train_df)-missing}/{len(train_df)}) missing: {missing}')
if missing > 0:
    print('Sample missing rows:')
    print(train_df.loc[~train_df['exists'], ['image','chain','hotel_id']].head(5))

train_df = train_df.loc[train_df['exists']].reset_index(drop=True)
print('Kept rows after drop missing:', len(train_df))

print('\n=== Build test file paths ===', flush=True)
test_dir = Path('test_images')
test_files = sorted([p.name for p in test_dir.iterdir() if p.is_file()])
test_df = pd.DataFrame({'image': test_files})
test_df['file_path'] = test_df['image'].apply(lambda x: str(test_dir / x))
print('Test count:', len(test_df))
print(test_df.head(3))

print('\nReady for embedding extraction next.')

=== Build file paths using chain directory ===


train path coverage: 1.0000 (87798/87798) missing: 0
Kept rows after drop missing: 87798

=== Build test file paths ===


Test count: 9756
                  image                         file_path
0  80196e6999ce63cf.jpg  test_images/80196e6999ce63cf.jpg
1  80296afd55d516ea.jpg  test_images/80296afd55d516ea.jpg
2  802aab95d62b7daa.jpg  test_images/802aab95d62b7daa.jpg

Ready for embedding extraction next.


In [3]:
import os, sys, subprocess, shutil
from pathlib import Path

def pip(*args):
    print('>', *args, flush=True)
    subprocess.run([sys.executable, '-m', 'pip', *args], check=True)

print('=== Install CUDA 12.1 PyTorch stack and deps ===', flush=True)
# Uninstall any preexisting torch stack to avoid conflicts
for pkg in ('torch','torchvision','torchaudio'):
    subprocess.run([sys.executable, '-m', 'pip', 'uninstall', '-y', pkg], check=False)

# Clean potential shadow dirs (idempotent)
for d in (
    '/app/.pip-target/torch',
    '/app/.pip-target/torchvision',
    '/app/.pip-target/torchaudio',
    '/app/.pip-target/torch-2.4.1.dist-info',
    '/app/.pip-target/torchvision-0.19.1.dist-info',
    '/app/.pip-target/torchaudio-2.4.1.dist-info',
):
    if os.path.exists(d):
        print('Removing', d, flush=True)
        shutil.rmtree(d, ignore_errors=True)

# Install exact cu121 wheels
pip('install',
    '--index-url', 'https://download.pytorch.org/whl/cu121',
    '--extra-index-url', 'https://pypi.org/simple',
    'torch==2.4.1', 'torchvision==0.19.1', 'torchaudio==2.4.1')

# Write constraints to pin torch versions for subsequent installs
Path('constraints.txt').write_text('torch==2.4.1\ntorchvision==0.19.1\ntorchaudio==2.4.1\n')

print('=== Install non-torch deps (timm, faiss-cpu, pillow, opencv) ===', flush=True)
pip('install', '-c', 'constraints.txt',
    'timm==1.0.9',
    'faiss-cpu==1.8.0.post1',
    'opencv-python-headless==4.10.0.84',
    'pillow==10.4.0',
    '--upgrade-strategy', 'only-if-needed')

import torch, torchvision
print('torch:', torch.__version__, 'CUDA build:', getattr(torch.version, 'cuda', None))
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0))

import timm, faiss
print('timm:', timm.__version__)
print('faiss:', getattr(faiss, '__version__', 'unknown'))
print('Install check complete.')

=== Install CUDA 12.1 PyTorch stack and deps ===
> install --index-url https://download.pytorch.org/whl/cu121 --extra-index-url https://pypi.org/simple torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1
Looking in indexes: https://download.pytorch.org/whl/cu121, https://pypi.org/simple
Collecting torch==2.4.1
  Downloading https://download.pytorch.org/whl/cu121/torch-2.4.1%2Bcu121-cp311-cp311-linux_x86_64.whl (799.0 MB)
Collecting torchvision==0.19.1
  Downloading https://download.pytorch.org/whl/cu121/torchvision-0.19.1%2Bcu121-cp311-cp311-linux_x86_64.whl (7.1 MB)
Collecting torchaudio==2.4.1
  Downloading https://download.pytorch.org/whl/cu121/torchaudio-2.4.1%2Bcu121-cp311-cp311-linux_x86_64.whl (3.4 MB)
Collecting nvidia-cuda-cupti-cu12==12.1.105
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
Collecting typing-extensions>=4.8.0
  Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Collecting nvidia-curand-cu12==10.3.2.106
  Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
Collecting filelock
  Downloading filelock-3.19.1-py3-none-any.whl (15 kB)
Collecting jinja2
  Downloading jinja2-3.1.6-py3-none-any.whl (134 kB)
Collecting nvidia-cusparse-cu12==12.1.0.106
  Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
Collecting nvidia-nvtx-cu12==12.1.105
  Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting triton==3.0.0
  Downloading triton-3.0.0-1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (209.4 MB)
Collecting networkx
  Downloading networkx-3.5-py3-none-any.whl (2.0 MB)
Collecting nvidia-cublas-cu12==12.1.3.1
  Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
Collecting nvidia-cufft-cu12==11.0.2.54
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
Collecting nvidia-cudnn-cu12==9.1.0.70
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)
Collecting nvidia-nccl-cu12==2.20.5
  Downloading nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
Collecting fsspec
  Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting sympy
  Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)
Collecting nvidia-cusolver-cu12==11.4.5.107
  Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
Collecting pillow!=8.3.*,>=5.3.0
  Downloading pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
Collecting numpy
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
Collecting nvidia-nvjitlink-cu12
  Downloading nvidia_nvjitlink_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (39.7 MB)
Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23 kB)
Collecting mpmath<1.4,>=1.1.0
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, typing-extensions, sympy, pillow, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, MarkupSafe, fsspec, filelock, triton, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch, torchvision, torchaudio
Successfully installed MarkupSafe-3.0.2 filelock-3.19.1 fsspec-2025.9.0 jinja2-3.1.6 mpmath-1.3.0 networkx-3.5 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.9.86 nvidia-nvtx-cu12-12.1.105 pillow-11.3.0 sympy-1.14.0 torch-2.4.1+cu121 torchaudio-2.4.1+cu121 torchvision-0.19.1+cu121 triton-3.0.0 typing-extensions-4.15.0


=== Install non-torch deps (timm, faiss-cpu, pillow, opencv) ===
> install -c constraints.txt timm==1.0.9 faiss-cpu==1.8.0.post1 opencv-python-headless==4.10.0.84 pillow==10.4.0 --upgrade-strategy only-if-needed
Collecting timm==1.0.9
  Downloading timm-1.0.9-py3-none-any.whl (2.3 MB)
Collecting faiss-cpu==1.8.0.post1
  Downloading faiss_cpu-1.8.0.post1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.0 MB)
Collecting opencv-python-headless==4.10.0.84
  Downloading opencv_python_headless-4.10.0.84-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (49.9 MB)
Collecting pillow==10.4.0
  Downloading pillow-10.4.0-cp311-cp311-manylinux_2_28_x86_64.whl (4.5 MB)
Collecting torchvision
  Downloading torchvision-0.19.1-cp311-cp311-manylinux1_x86_64.whl (7.0 MB)
Collecting huggingface_hub
  Downloading huggingface_hub-0.35.1-py3-none-any.whl (563 kB)
Collecting safetensors
  Downloading safetensors-0.6.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (485 kB)
Collecting pyyaml
  Downloading pyyaml-6.0.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (806 kB)
Collecting torch
  Downloading torch-2.4.1-cp311-cp311-manylinux1_x86_64.whl (797.1 MB)
Collecting packaging
  Downloading packaging-25.0-py3-none-any.whl (66 kB)
Collecting numpy<2.0,>=1.0
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
Collecting requests
  Downloading requests-2.32.5-py3-none-any.whl (64 kB)
Collecting hf-xet<2.0.0,>=1.1.3
  Downloading hf_xet-1.1.10-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
Collecting typing-extensions>=3.7.4.3
  Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Collecting tqdm>=4.42.1
  Downloading tqdm-4.67.1-py3-none-any.whl (78 kB)
Collecting filelock
  Downloading filelock-3.19.1-py3-none-any.whl (15 kB)
Collecting fsspec>=2023.5.0
  Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB)
Collecting nvidia-curand-cu12==10.3.2.106
  Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
Collecting nvidia-cusolver-cu12==11.4.5.107
  Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
Collecting nvidia-nccl-cu12==2.20.5
  Downloading nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-nvtx-cu12==12.1.105
  Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
Collecting nvidia-cusparse-cu12==12.1.0.106
  Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
Collecting triton==3.0.0
  Downloading triton-3.0.0-1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (209.4 MB)
Collecting nvidia-cufft-cu12==11.0.2.54
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
Collecting networkx
  Downloading networkx-3.5-py3-none-any.whl (2.0 MB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-cupti-cu12==12.1.105
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
Collecting jinja2
  Downloading jinja2-3.1.6-py3-none-any.whl (134 kB)
Collecting sympy
  Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)
Collecting nvidia-cublas-cu12==12.1.3.1
  Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
Collecting nvidia-cudnn-cu12==9.1.0.70
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)
Collecting nvidia-nvjitlink-cu12
  Downloading nvidia_nvjitlink_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (39.7 MB)
Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23 kB)
Collecting certifi>=2017.4.17
  Downloading certifi-2025.8.3-py3-none-any.whl (161 kB)
Collecting charset_normalizer<4,>=2
  Downloading charset_normalizer-3.4.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (150 kB)
Collecting idna<4,>=2.5
  Downloading idna-3.10-py3-none-any.whl (70 kB)
Collecting urllib3<3,>=1.21.1
  Downloading urllib3-2.5.0-py3-none-any.whl (129 kB)
Collecting mpmath<1.4,>=1.1.0
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, urllib3, typing-extensions, tqdm, sympy, safetensors, pyyaml, pillow, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, MarkupSafe, idna, hf-xet, fsspec, filelock, charset_normalizer, certifi, triton, requests, opencv-python-headless, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, faiss-cpu, nvidia-cusolver-cu12, huggingface_hub, torch, torchvision, timm
Successfully installed MarkupSafe-3.0.2 certifi-2025.8.3 charset_normalizer-3.4.3 faiss-cpu-1.8.0.post1 filelock-3.19.1 fsspec-2025.9.0 hf-xet-1.1.10 huggingface_hub-0.35.1 idna-3.10 jinja2-3.1.6 mpmath-1.3.0 networkx-3.5 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.9.86 nvidia-nvtx-cu12-12.1.105 opencv-python-headless-4.10.0.84 packaging-25.0 pillow-10.4.0 pyyaml-6.0.3 requests-2.32.5 safetensors-0.6.2 sympy-1.14.0 timm-1.0.9 torch-2.4.1 torchvision-0.19.1 tqdm-4.67.1 triton-3.0.0 typing-extensions-4.15.0 urllib3-2.5.0

torch: 2.4.1+cu121 CUDA build: 12.1
CUDA available: True
GPU: NVIDIA A10-24Q
timm: 1.0.9
faiss: 1.8.0
Install check complete.
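
The second pip run above downloaded a `torch-2.4.1` wheel without the `%2Bcu121` suffix, even though the final report still shows `2.4.1+cu121`. A cheap guard is to check the local version label before training. The sketch below is a hypothetical helper (`has_local_tag` is not part of the notebook) that inspects only the version string, so it runs without torch installed.

```python
def has_local_tag(version, tag):
    """True if a PEP 440 version string carries the given local label, e.g. '2.4.1+cu121'."""
    _public, sep, local = version.partition('+')
    return bool(sep) and local == tag

# Version strings as they would appear in torch.__version__ for different builds
checks = {v: has_local_tag(v, 'cu121') for v in ('2.4.1+cu121', '2.4.1', '2.4.1+cpu')}
```

In the notebook this could be used as `assert has_local_tag(torch.__version__, 'cu121')`, alongside the existing `torch.cuda.is_available()` check.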



In [4]:
import os, math, time, json, gc, sys
from pathlib import Path
import numpy as np
import pandas as pd
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from PIL import Image, ImageOps
import timm
from timm.data import create_transform
import faiss

print('=== Embedding extraction + FAISS retrieval (convnext_tiny, 384) ===', flush=True)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
IMG_SIZE = 384
BATCH_SIZE = 96  # safe for A10 24GB at 384
NUM_WORKERS = min(12, os.cpu_count() or 4)
USE_TTA = True  # hflip TTA
TOPK = 100
P_EXP = 1.5  # sim^p aggregation

# Load prepared dataframes (from previous cells)
train_df = pd.read_csv('train.csv')
train_df['chain'] = train_df['chain'].astype(str)
train_df['file_path'] = 'train_images/' + train_df['chain'] + '/' + train_df['image'].astype(str)
test_dir = Path('test_images')
test_files = sorted([p.name for p in test_dir.iterdir() if p.is_file()])
test_df = pd.DataFrame({'image': test_files})
test_df['file_path'] = test_df['image'].apply(lambda x: str(test_dir / x))

# Sanity
assert len(train_df) > 0 and len(test_df) > 0

class ImageDataset(Dataset):
    def __init__(self, file_paths, transform):
        self.file_paths = list(file_paths)
        self.transform = transform
    def __len__(self):
        return len(self.file_paths)
    def __getitem__(self, idx):
        fp = self.file_paths[idx]
        img = Image.open(fp)
        img = ImageOps.exif_transpose(img)  # apply EXIF orientation before any conversion
        img = img.convert('RGB')
        x = self.transform(img)
        return x

def build_transform(img_size=384):
    return create_transform(input_size=img_size, is_training=False)

@torch.no_grad()
def extract_embeddings(model, file_paths, tta=False, batch_size=BATCH_SIZE):
    model.eval()
    tfm = build_transform(IMG_SIZE)
    ds = ImageDataset(file_paths, tfm)
    dl = DataLoader(ds, batch_size=batch_size, shuffle=False, num_workers=NUM_WORKERS, pin_memory=True)
    all_embs = []
    t0 = time.time()
    n = len(ds)
    for bi, batch in enumerate(dl):
        x = batch.to(device, non_blocking=True)
        # torch.autocast replaces the deprecated torch.cuda.amp.autocast; disabled on CPU
        with torch.autocast(device_type=device.type, dtype=torch.float16, enabled=(device.type == 'cuda')):
            feats = model(x)
            if tta:
                # horizontal flip TTA: average features of the image and its mirror
                feats = 0.5 * (feats + model(torch.flip(x, dims=[3])))
        feats = F.normalize(feats.float(), dim=1).cpu().numpy()
        all_embs.append(feats)
        if (bi+1) % 50 == 0 or (bi+1)==len(dl):
            elapsed = time.time()-t0
            done = min((bi+1)*batch_size, n)  # the last batch may be partial
            print(f'  Emb {done}/{n} ({(done/n)*100:.1f}%) elapsed {elapsed:.1f}s', flush=True)
    embs = np.concatenate(all_embs, axis=0)
    return embs

# Create model (global_pool='avg' and num_classes=0 yields feature vectors)
print('Creating model...', flush=True)
model = timm.create_model('convnext_tiny', pretrained=True, num_classes=0, global_pool='avg')
model.to(device)
model.eval()
with torch.no_grad():  # probe the output width without building an autograd graph
    feat_dim = model(torch.zeros(1,3,IMG_SIZE,IMG_SIZE, device=device)).shape[1]
print('Feature dim:', feat_dim)

# Cache paths
train_emb_path = Path(f'train_emb_convnext_tiny_{IMG_SIZE}.npy')
test_emb_path = Path(f'test_emb_convnext_tiny_{IMG_SIZE}.npy')

# Train embeddings
if train_emb_path.exists():
    train_emb = np.load(train_emb_path)
    print('Loaded cached train embeddings:', train_emb.shape)
else:
    print('Extracting train embeddings...', flush=True)
    train_emb = extract_embeddings(model, train_df['file_path'].values, tta=USE_TTA)
    np.save(train_emb_path, train_emb)
    print('Saved train embeddings:', train_emb.shape)

# Test embeddings
if test_emb_path.exists():
    test_emb = np.load(test_emb_path)
    print('Loaded cached test embeddings:', test_emb.shape)
else:
    print('Extracting test embeddings...', flush=True)
    test_emb = extract_embeddings(model, test_df['file_path'].values, tta=USE_TTA)
    np.save(test_emb_path, test_emb)
    print('Saved test embeddings:', test_emb.shape)

del model
gc.collect()
torch.cuda.empty_cache()  # release cached GPU memory held by the freed model

# Ensure L2-normalized
def l2norm(x):
    x = x.astype('float32')
    faiss.normalize_L2(x)
    return x
train_emb = l2norm(train_emb)
test_emb = l2norm(test_emb)

print('Building FAISS index (IP over L2-normalized vectors -> cosine)...', flush=True)
index = faiss.IndexFlatIP(train_emb.shape[1])
index.add(train_emb)
print('Index ntotal:', index.ntotal)

print('Searching topK neighbors for all test...', flush=True)
t0=time.time()
sims, nbrs = index.search(test_emb, TOPK)
print('Search done in', round(time.time()-t0,2),'s')

# Aggregate per hotel_id
train_hids = train_df['hotel_id'].values.astype(np.int64)
preds_top5 = []
for i in range(len(test_df)):
    nb_idx = nbrs[i]
    nb_sim = sims[i]
    agg = {}
    for j in range(nb_idx.shape[0]):
        hid = int(train_hids[nb_idx[j]])
        w = float(nb_sim[j])
        if w <= 0:
            # cosine similarity can be negative; skip it, since a negative float
            # raised to the fractional P_EXP would yield a complex number
            continue
        agg[hid] = agg.get(hid, 0.0) + (w ** P_EXP)
    # sort by score desc
    top = sorted(agg.items(), key=lambda x: x[1], reverse=True)[:5]
    top_ids = [str(h) for (h,_) in top]
    # pad if less than 5 (unlikely)
    while len(top_ids) < 5:
        top_ids.append(top_ids[-1] if top_ids else '0')
    preds_top5.append(' '.join(top_ids[:5]))
    if (i+1) % 500 == 0 or (i+1)==len(test_df):
        print(f'  Aggregated {i+1}/{len(test_df)}', flush=True)

# Build submission following sample order
sample = pd.read_csv('sample_submission.csv')
name_to_pred = dict(zip(test_df['image'].values, preds_top5))
sub = sample.copy()
sub['hotel_id'] = sub['image'].map(name_to_pred)
assert sub['hotel_id'].notnull().all(), 'Missing predictions for some images'
sub.to_csv('submission.csv', index=False)
print('Saved submission.csv with shape', sub.shape)
print(sub.head())
print('Pipeline complete.')

=== Embedding extraction + FAISS retrieval (convnext_tiny, 384) ===


Creating model...


Feature dim: 768
Extracting train embeddings...
  Emb 4800/87798 (5.5%) elapsed 22.5s
  Emb 9600/87798 (10.9%) elapsed 39.5s
  Emb 14400/87798 (16.4%) elapsed 56.5s
  Emb 19200/87798 (21.9%) elapsed 73.6s
  Emb 24000/87798 (27.3%) elapsed 90.7s
  Emb 28800/87798 (32.8%) elapsed 107.8s
  Emb 33600/87798 (38.3%) elapsed 125.1s
  Emb 38400/87798 (43.7%) elapsed 142.3s
  Emb 43200/87798 (49.2%) elapsed 159.5s
  Emb 48000/87798 (54.7%) elapsed 176.8s
  Emb 52800/87798 (60.1%) elapsed 193.9s
  Emb 57600/87798 (65.6%) elapsed 211.1s
  Emb 62400/87798 (71.1%) elapsed 228.3s
  Emb 67200/87798 (76.5%) elapsed 245.5s
  Emb 72000/87798 (82.0%) elapsed 262.7s
  Emb 76800/87798 (87.5%) elapsed 279.9s
  Emb 81600/87798 (92.9%) elapsed 297.1s
  Emb 86400/87798 (98.4%) elapsed 314.3s
  Emb 49410/87798 (56.3%) elapsed 319.3s
Saved train embeddings: (87798, 768)
Extracting test embeddings...
  Emb 4800/9756 (49.2%) elapsed 20.2s
  Emb 9600/9756 (98.4%) elapsed 37.4s
  Emb 6120/9756 (62.7%) elapsed 38.0s
Saved test embeddings: (9756, 768)


Building FAISS index (IP over L2-normalized vectors -> cosine)...


Index ntotal: 87798
Searching topK neighbors for all test...


Search done in 28.66 s
  Aggregated 500/9756
  Aggregated 1000/9756
  Aggregated 1500/9756
  Aggregated 2000/9756
  Aggregated 2500/9756
  Aggregated 3000/9756
  Aggregated 3500/9756
  Aggregated 4000/9756
  Aggregated 4500/9756
  Aggregated 5000/9756
  Aggregated 5500/9756
  Aggregated 6000/9756
  Aggregated 6500/9756
  Aggregated 7000/9756
  Aggregated 7500/9756
  Aggregated 8000/9756
  Aggregated 8500/9756
  Aggregated 9000/9756
  Aggregated 9500/9756
  Aggregated 9756/9756


Saved submission.csv with shape (9756, 2)
                  image                       hotel_id
0  f1608c9f17fb6920.jpg       782 15319 40842 492 5441
1  c6c63939c67931e1.jpg   30544 27517 37807 9492 26349
2  83c214f3e90717ed.jpg  14201 53242 54525 34150 16137
3  a6b5dbe436592c82.jpg   39435 6150 36307 13918 50333
4  dd2f02f4354993cc.jpg  15115 58441 60918 17911 62628
Pipeline complete.
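
The per-image aggregation in cell 4 is similarity-weighted voting: each retrieved neighbour adds `sim ** P_EXP` to its hotel's score. A standalone sketch of that step is below. `vote_top5` is a hypothetical name, ties are broken by label for determinism, and non-positive similarities are skipped because a negative float raised to a fractional power is a complex number in Python.

```python
from collections import defaultdict

def vote_top5(neighbor_labels, neighbor_sims, p=1.5, k=5):
    """Rank labels by the sum of sim**p over their retrieved neighbours."""
    scores = defaultdict(float)
    for lab, sim in zip(neighbor_labels, neighbor_sims):
        if sim > 0:                      # negative cosine similarities carry no vote
            scores[lab] += sim ** p
    # highest score first; the label is the tie-breaker so the order is reproducible
    ranked = sorted(scores, key=lambda lab: (-scores[lab], lab))
    return ranked[:k]

# Hotel 10 has two good neighbours and outranks hotel 20's single better one;
# hotel 30 only has a negative similarity and is dropped.
top = vote_top5([10, 20, 10, 30], [0.9, 0.95, 0.8, -0.2])
```

Unlike the notebook loop, this sketch returns fewer than five labels when fewer distinct hotels are retrieved; padding for the submission format would still be needed.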
