
feat: training improvements — W&B logging, Optuna HPO, discriminative LR #16

Merged
ziv-lazarov-nagish merged 1 commit into main from feat/training-improvements on Apr 14, 2026

Conversation

@ziv-lazarov-nagish
Contributor

Summary

Depends on #15 (refactor/data-layer).

  • Enable W&B logging by default with --wandb_entity/--wandb_project args and full hyperparameter tracking
  • Add --lr_scale_backbone for discriminative fine-tuning (lower LR for the CNN + transformer backbone, full LR for the classification heads); see the first sketch after this list
  • Add Optuna HPO via --optuna <yaml> with a dedicated hpo.py module; see the second sketch after this list
    • YAML search space split into architecture and training sections — architecture params auto-skipped when fine-tuning
    • monitor_metric configurable in the YAML
    • W&B callback (as_multirun) for per-trial logging, trials named <run_name>-t0, <run_name>-t1, etc. (--run_name required for Optuna)
    • Pruning callback to kill unpromising trials early
  • Refactor training loop into train(overrides, monitor_metric) function
  • Move wandb from dev to main deps, add optuna, optuna-integration, pyyaml
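
Roughly, the first two bullets could look like the sketch below. This is a minimal illustration, not the PR's actual code: the `build_logger`/`build_optimizer` helper names, the `model.backbone`/`model.heads` attributes, and the AdamW choice are assumptions.

```python
import argparse

import torch
from pytorch_lightning.loggers import WandbLogger


def build_logger(args: argparse.Namespace) -> WandbLogger:
    # W&B logging is enabled by default; entity and project come from the new CLI args.
    logger = WandbLogger(entity=args.wandb_entity, project=args.wandb_project)
    logger.log_hyperparams(vars(args))  # full hyperparameter tracking
    return logger


def build_optimizer(model, base_lr: float, lr_scale_backbone: float):
    # Discriminative fine-tuning: the CNN + transformer backbone trains at a
    # scaled-down LR, while the classification heads keep the full LR.
    param_groups = [
        {"params": model.backbone.parameters(), "lr": base_lr * lr_scale_backbone},
        {"params": model.heads.parameters(), "lr": base_lr},
    ]
    return torch.optim.AdamW(param_groups)
```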

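And a hedged sketch of the Optuna flow, in the spirit of the new hpo.py. The YAML key names and `{low, high, type}` spec format, the `from train import train` path, the maximize direction, and the MedianPruner are assumptions; only the architecture/training split, the configurable monitor_metric, the `<run_name>-t<N>` trial naming, the as_multirun W&B callback, and the `train(overrides, monitor_metric)` entry point come from this PR.

```python
import optuna
import yaml
from optuna.integration import WeightsAndBiasesCallback

from train import train  # assumed module path for the refactored train() entry point

# Example search space in the architecture/training split described above
# (key names and the {low, high, type} spec format are illustrative assumptions).
EXAMPLE_SPACE = """
monitor_metric: validation_hm_iou
architecture:            # skipped automatically when fine-tuning
  hidden_dim: {low: 128, high: 512, type: int}
training:
  lr: {low: 1.0e-5, high: 1.0e-3, type: float, log: true}
  lr_scale_backbone: {low: 0.01, high: 0.5, type: float}
"""


def run_hpo(space_yaml: str, run_name: str, fine_tuning: bool, n_trials: int = 20):
    space = yaml.safe_load(space_yaml)
    monitor_metric = space["monitor_metric"]
    # Architecture params are skipped when fine-tuning an existing model.
    sections = ["training"] if fine_tuning else ["architecture", "training"]

    def objective(trial: optuna.Trial) -> float:
        # Trials are named <run_name>-t0, <run_name>-t1, ... (--run_name is required).
        overrides = {"run_name": f"{run_name}-t{trial.number}"}
        for section in sections:
            for name, spec in space.get(section, {}).items():
                if spec["type"] == "int":
                    overrides[name] = trial.suggest_int(name, spec["low"], spec["high"])
                else:
                    overrides[name] = trial.suggest_float(
                        name, spec["low"], spec["high"], log=spec.get("log", False)
                    )
        # train() is assumed to return the value of the monitored metric; the PR's
        # pruning callback is assumed to report intermediate values from inside
        # train() so the study-level pruner can stop unpromising trials early.
        return train(overrides, monitor_metric)

    # One W&B run per trial via the multirun mode of the Optuna W&B callback.
    wandb_cb = WeightsAndBiasesCallback(metric_name=monitor_metric, as_multirun=True)
    study = optuna.create_study(direction="maximize", pruner=optuna.pruners.MedianPruner())
    study.optimize(objective, n_trials=n_trials, callbacks=[wandb_cb])
    return study.best_params
```
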
Test plan

  • ruff check . passes
  • 37 tests pass
  • Training runs successfully (verified 1 epoch, validation_hm_iou 0.617)

🤖 Generated with Claude Code

Comment thread on pyproject.toml:
"wandb",
"optuna",
"optuna-integration[pytorch_lightning]",
"pyyaml",
Contributor

Not needed for a normal install (inference only), right? Only for dev?

ziv-lazarov-nagish force-pushed the feat/training-improvements branch 2 times, most recently from 03c0394 to 7576ce0 on April 13, 2026 13:30
Enable W&B logging by default with --wandb_entity/--wandb_project.
Add --lr_scale_backbone for discriminative fine-tuning (lower LR for
backbone, full LR for classification heads). Add Optuna HPO via
--optuna <yaml> with dedicated hpo.py module — YAML search space
split into architecture/training sections, monitor_metric configurable,
W&B callback per trial, pruning for unpromising trials. Refactor
training loop into train(overrides, monitor_metric) function.
ziv-lazarov-nagish force-pushed the feat/training-improvements branch from 7576ce0 to cebfb68 on April 14, 2026 07:20
ziv-lazarov-nagish changed the base branch from refactor/data-layer to main on April 14, 2026 07:20
ziv-lazarov-nagish merged commit fee02c4 into main on Apr 14, 2026