# Hotel-ID 2021 FGVC8 – Plan

Objectives:
- Achieve medal: MAP@5 bronze ≥ 0.0216, silver ≥ 0.39, gold ≥ 0.7205.
- Build fast, reliable baseline; iterate with solid CV and logging.
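
Since the medal thresholds are stated in MAP@5, here is a minimal sketch of the metric for offline CV scoring. It assumes each image has exactly one true hotel_id, in which case the score per image reduces to 1/rank of the first correct prediction within the top 5. The function name `map_at_5` and the toy labels are illustrative, not taken from the competition code.

```python
def map_at_5(y_true, y_pred):
    """Mean average precision at 5 for single-label targets.

    y_true: list of true labels, one per image.
    y_pred: list of ranked prediction lists (best guess first).
    """
    total = 0.0
    for true_label, preds in zip(y_true, y_pred):
        # Only the first five predictions count; score is 1/rank of the hit.
        for rank, pred in enumerate(preds[:5], start=1):
            if pred == true_label:
                total += 1.0 / rank
                break
    return total / len(y_true)


# Toy check: hit at rank 1, hit at rank 2, miss.
y_true = [1, 2, 3]
y_pred = [[1, 7, 8, 9, 10], [9, 2, 7, 8, 10], [4, 5, 6, 7, 8]]
score = map_at_5(y_true, y_pred)
print(score)
```

The same function can be applied to OOF predictions once logits are converted to ranked hotel_id lists.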

Initial Actions:
1) Environment & GPU check (nvidia-smi, install PyTorch cu121 if needed).
2) Data audit: train.csv schema, label counts, image presence; test_images count; sample_submission format.
3) Validation: stratified K-fold on hotel_id, with group-aware leakage control if a hotel has multiple images.
4) Baseline model:
   - Image model: pretrained CNN (e.g., timm resnet50 / efficientnet_b0) with mixed precision & augmentations.
   - Input size 224; simple augmentations: RandomResizedCrop (RRC), flip, color jitter. Optimizer: AdamW; cross-entropy loss with label smoothing.
   - Quick smoke test: 1-2 folds, a few epochs to validate the pipeline; cache features if needed.
5) Full training:
   - 5-fold StratifiedKFold; train with the backbone frozen for 2 epochs, then unfrozen for 6, with a cosine schedule.
   - Save OOF logits and test logits per fold.
6) Inference:
   - TTA (e.g., hflip). Average fold logits → softmax → top5 per image.
   - Ensure exact submission format: columns image,hotel_id, where hotel_id holds five space-separated hotel IDs (as in sample_submission.csv).
7) Error analysis:
   - Inspect per-class performance and confusion among similar hotels; consider higher resolution or a stronger backbone (convnext_base, eca_nfnet_l0) if time allows.
8) Ensembling:
   - Blend diverse backbones/seeds if CV improves and time allows.
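
The inference step (6) can be sketched roughly as follows: average the per-fold logits, apply softmax, take the top 5 classes per image, and map class indices back to hotel IDs. This is a sketch under assumptions, not the final pipeline: `top5_submission_rows` and `class_to_hotel_id` are hypothetical names, and the logits and hotel IDs below are toy values.

```python
import numpy as np


def top5_submission_rows(fold_logits, class_to_hotel_id, image_names, k=5):
    """Turn per-fold logits into (image, 'id1 id2 id3 id4 id5') rows.

    fold_logits: list of arrays, each shaped (n_images, n_classes).
    class_to_hotel_id: sequence mapping class index -> hotel_id.
    """
    mean_logits = np.mean(np.stack(fold_logits, axis=0), axis=0)
    # Numerically stable softmax over the class axis.
    shifted = mean_logits - mean_logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Indices of the k most probable classes, best first.
    topk = np.argsort(-probs, axis=1)[:, :k]
    rows = []
    for name, idxs in zip(image_names, topk):
        ids = " ".join(str(class_to_hotel_id[i]) for i in idxs)
        rows.append((name, ids))
    return rows


# Toy example: 2 folds, 2 images, 6 classes (all values are made up).
fold_a = np.array([[0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
                   [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]])
fold_b = fold_a.copy()
hotel_ids = [10, 20, 30, 40, 50, 60]
rows = top5_submission_rows([fold_a, fold_b], hotel_ids, ["a.jpg", "b.jpg"])
print(rows)
```

Softmax does not change the ranking of the averaged logits, so it matters only if probabilities are blended with other models later.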

Checkpoints to request expert review:
- After plan (this cell).
- After EDA & CV decision.
- After baseline training results (OOF MAP@5).
- Before long trainings (architecture choice & epochs).
- After first LB score; adjust strategy.

Risk/Time Management:
- Use fast smoke runs; abort slow configs.
- Always log fold indices and elapsed times.
- Save and reuse fold splits and logits.
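
To make "save and reuse fold splits" concrete, here is a minimal sketch that assigns folds deterministically and round-trips them through JSON. It uses a hand-rolled round-robin assignment per label rather than scikit-learn's StratifiedKFold, so that labels with fewer images than folds are still handled; the helper names (`assign_folds`, `save_folds`, `load_folds`) and the file name are assumptions for illustration.

```python
import json
import os
import random
import tempfile
from collections import defaultdict


def assign_folds(labels, n_folds=5, seed=42):
    """Return one fold index per sample, spreading each label across folds."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds = [0] * len(labels)
    offset = 0  # rotate the start fold so small classes do not pile into fold 0
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[idx] = (offset + pos) % n_folds
        offset = (offset + len(members)) % n_folds
    return folds


def save_folds(folds, path):
    with open(path, "w") as f:
        json.dump({"folds": folds}, f)


def load_folds(path):
    with open(path) as f:
        return json.load(f)["folds"]


# Toy labels: one class with 5 images, one with 3, one singleton.
labels = [22408] * 5 + [2324] * 3 + [47514]
folds = assign_folds(labels)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "folds.json")
    save_folds(folds, path)
    reloaded = load_folds(path)
print(folds, reloaded == folds)
```

Reloading the saved file in later notebooks keeps OOF logits comparable across experiments.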

Next Step: Run environment + data sanity checks.

In [1]:
# Environment + Data Sanity Checks
import os, subprocess, time, glob
import pandas as pd

t0 = time.time()
print("=== GPU CHECK (nvidia-smi) ===", flush=True)
try:
    out = subprocess.run(['bash','-lc','nvidia-smi || true'], capture_output=True, text=True)
    print(out.stdout, flush=True)
except Exception as e:
    print("nvidia-smi failed:", e, flush=True)

print("=== List top-level files ===", flush=True)
for p in sorted(os.listdir('.')):
    try:
        print(p, os.path.getsize(p) if os.path.isfile(p) else '<dir>')
    except Exception:
        print(p)

print("=== CSV heads ===", flush=True)
for csv_path in ['train.csv','sample_submission.csv']:
    if os.path.exists(csv_path):
        try:
            df = pd.read_csv(csv_path)
            print(csv_path, df.shape, flush=True)
            print(df.head(3).to_string(index=False), flush=True)
        except Exception as e:
            print(f"Failed to read {csv_path}: {e}", flush=True)
    else:
        print(csv_path, "not found", flush=True)

print("=== Image counts ===", flush=True)
test_imgs = glob.glob('test_images/*')
print("test_images:", len(test_imgs), flush=True)

# train_images are in subdirectories; count per subdir and total
train_total = 0
train_dirs = sorted([d for d in glob.glob('train_images/*') if os.path.isdir(d)])
per_dir_counts = {}
for d in train_dirs:
    cnt = len(glob.glob(os.path.join(d, '*')))
    per_dir_counts[os.path.basename(d)] = cnt
    train_total += cnt
print("train subdirs:", len(train_dirs), "total train images:", train_total, flush=True)
print("First 5 dir counts:", list(per_dir_counts.items())[:5], flush=True)

print("Elapsed:", round(time.time()-t0,2), "s", flush=True)

=== GPU CHECK (nvidia-smi) ===


Sat Sep 27 02:22:26 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.06             Driver Version: 550.144.06     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A10-24Q                 On  |   00000002:00:00.0 Off |                    0 |
| N/A   N/A    P0             N/A /  N/A  |     182MiB /  24512MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

=== List top-level files ===


.00_eda_and_planning_kernel_state.json 183
00_eda_and_planning.ipynb 5282
agent_metadata <dir>
description.md 7362
docker_run.log 27034
requirements.txt 2021
sample_submission.csv 497571
submission.csv 497571
task.txt 940
test_images <dir>
train.csv 4326682
train_images <dir>
=== CSV heads ===


train.csv (87798, 4)


               image  chain  hotel_id           timestamp
d29287f52c2a871f.jpg      5     22408 2018-04-16 17:01:49
e9d067c249e4c2f9.jpg     70      2324 2016-07-08 22:26:21
cc9877a40a63ed93.jpg      4     47514 2017-04-14 02:28:56


sample_submission.csv (9756, 2)


               image                      hotel_id
f1608c9f17fb6920.jpg 36363 53586 18807 64314 60181
c6c63939c67931e1.jpg 36363 53586 18807 64314 60181
83c214f3e90717ed.jpg 36363 53586 18807 64314 60181


=== Image counts ===


test_images: 9756


train subdirs: 88 total train images: 87797


First 5 dir counts: [('0', 18213), ('1', 1118), ('10', 11), ('11', 221), ('12', 23)]


Elapsed: 0.2 s
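
The output above shows 87798 rows in train.csv but 87797 files under train_images. A minimal sketch for locating such a mismatch follows. It assumes images are stored as train_images/<chain>/<image>; the subdirectory names look like chain IDs, but that is not confirmed by the output. `list_image_paths` and `find_missing_images` are hypothetical helper names, and the rows below are toy data.

```python
import os


def list_image_paths(root):
    """Return relative paths such as '5/abc.jpg' for every file under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root).replace(os.sep, "/")
        for name in filenames:
            found.add(name if rel_dir == "." else f"{rel_dir}/{name}")
    return found


def find_missing_images(rows, on_disk):
    """rows: iterable of (image, chain) pairs; on_disk: set of relative paths.

    Assumes the layout <chain>/<image>, which is a guess about the dataset.
    """
    return [(image, chain) for image, chain in rows
            if f"{chain}/{image}" not in on_disk]


# Toy data (file names are made up).
rows = [("a.jpg", 5), ("b.jpg", 70), ("c.jpg", 4)]
on_disk = {"5/a.jpg", "70/b.jpg"}
missing = find_missing_images(rows, on_disk)
print(missing)
```

With the real data, `rows` could come from `zip(df["image"], df["chain"])` and `on_disk` from `list_image_paths("train_images")`; any rows returned should be dropped before building the folds.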
