Claude/iterative training plan tbjy5 #2

Merged
fanrado merged 11 commits into main from claude/iterative-training-plan-tbjy5 (Apr 17, 2026)

Conversation

@fanrado (Owner) commented Apr 17, 2026

No description provided.

claude and others added 11 commits April 10, 2026 20:02
Implements the iterative training plan from the README as a runnable shell
script with Pass 1 (full augmentations, shape learning, 200 epochs) and
Pass 2 (minimal augmentations, color learning, 100 epochs). Supports
--pass1-only and --pass2-only flags for independent execution. Includes
k-NN evaluation after both passes for side-by-side comparison.

https://claude.ai/code/session_01PxMmQiYRaGMb1ZXsULGVoN
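
The pass-selection behaviour described in this commit can be sketched as follows. This is a minimal illustration in Python rather than the shell used by run_dino_iterative.sh. The flag names and epoch counts come from the commit message; the function name `select_passes`, the dictionary layout, and the choice to make the two flags mutually exclusive are assumptions made for the sketch.

```python
import argparse


def select_passes(argv):
    """Return the passes to run, in order, for the given argument list."""
    parser = argparse.ArgumentParser(description="Two-pass training driver (sketch)")
    # Assumption: the two flags cannot be combined.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--pass1-only", action="store_true", help="run only Pass 1")
    group.add_argument("--pass2-only", action="store_true", help="run only Pass 2")
    args = parser.parse_args(argv)

    # Epoch counts and augmentation levels are taken from the commit message.
    pass1 = {"name": "pass1", "epochs": 200, "augmentations": "full"}
    pass2 = {"name": "pass2", "epochs": 100, "augmentations": "minimal"}

    if args.pass1_only:
        return [pass1]
    if args.pass2_only:
        return [pass2]
    # Default: run both passes back to back.
    return [pass1, pass2]
```

Calling `select_passes([])` yields both passes in order, while `select_passes(["--pass2-only"])` yields only the second.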

main_dino.py loads ImageFolder directly from the dataset root (the directory
structure has no train/val split). eval_knn.py handles the 80/20 split
internally via sklearn's train_test_split. Removed the /train suffix from
--data_path in both torchrun calls in run_dino_iterative.sh and corrected the
matching README commands and dataset layout description.

https://claude.ai/code/session_01PxMmQiYRaGMb1ZXsULGVoN
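
To illustrate what an internal 80/20 split over a single undivided dataset looks like, here is a dependency-free stand-in. It is not the code in eval_knn.py, which per the commit uses sklearn's train_test_split; the function name `split_indices`, the seed, and the rounding rule are assumptions for this sketch, and whether the real script stratifies by class is not stated in the PR.

```python
import random


def split_indices(n_samples, val_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (train, val) lists."""
    indices = list(range(n_samples))
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_val = round(n_samples * val_fraction)
    # The first n_val shuffled indices form the validation set.
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))
```

Because the split operates on indices, the images can stay in one folder tree loaded from the root, with no separate train and val directories on disk.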
…rado/dinoLearning into claude/iterative-training-plan-tbjy5
Added an option to load pre-trained weights. This is useful for the current approach, which trains in two stages.
- run_dino_iterative.sh: increase batch size from 32 to 64 for both
  passes; double learning rates accordingly (P1: 0.0000625 -> 0.000125,
  P2: 0.00000625 -> 0.0000125) to maintain linear LR scaling
- run_visualize_attention.sh: point checkpoint to epoch-40 snapshot
  (checkpoint0040.pth), switch input image to img.png, reduce image size
  from 1440x1440 to 960x960, set attention threshold to 0.3
- plot_training_metrics.ipynb: update notebook cell outputs/parameters
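
The linear LR scaling mentioned in the first bullet can be checked with a few lines of arithmetic. This is only a worked example of the rule (learning rate proportional to batch size) using the values from the commit message; the helper name `scale_lr` is hypothetical and does not appear in the repository.

```python
import math


def scale_lr(old_lr, old_batch_size, new_batch_size):
    """Scale a learning rate linearly with the batch size."""
    return old_lr * new_batch_size / old_batch_size


# Values from the commit message: batch size 32 -> 64 for both passes.
pass1_lr = scale_lr(0.0000625, 32, 64)
pass2_lr = scale_lr(0.00000625, 32, 64)

assert math.isclose(pass1_lr, 0.000125)
assert math.isclose(pass2_lr, 0.0000125)
```

Doubling the batch size doubles each learning rate, which matches the P1 and P2 values listed above.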
…tImages

Outlines 8 steps to improve k-NN/linear accuracy beyond the current 88.96
(20-NN) baseline, including dataset cleaning, dropping the broken Pass 2
self-sup in favour of a linear probe, hyperparameter fixes (LR, out_dim,
teacher_temp), patch size 8, augmentation calibration, self-distillation,
and evaluation/infra upgrades.
@fanrado fanrado merged commit 24e60c6 into main Apr 17, 2026