refactor: dataset registry for multi-dataset training #18

Merged
ziv-lazarov-nagish merged 3 commits into main from refactor/dataset-registry
Apr 14, 2026

@ziv-lazarov-nagish
Contributor

Summary

  • Add DATASET_REGISTRY + register_dataset() + build_datasets() to common.py so new datasets can be added without modifying train.py or evaluate.py
  • Each dataset implements a from_args(split, args, **augment_kwargs) classmethod and registers itself on import
  • --datasets CLI arg accepts comma-separated names (e.g. dgs,platform), replacing the DatasetType enum and --dataset choices
  • train.py and evaluate.py now share one dataset construction path (removes the duplication flagged in "feat: annotation platform dataset and multi-dataset training" #17)
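
For readers who have not seen the diff, the registry pattern described above could look roughly like the sketch below. It is not the code from common.py: the function bodies, the duplicate-name check, the error messages, the toy dataset classes, and the `flip` augmentation keyword are all assumptions made for illustration. Only the names DATASET_REGISTRY, register_dataset, build_datasets, from_args, and --datasets come from this PR.

```python
import argparse

# Hypothetical stand-in for the registry in common.py; real signatures may differ.
DATASET_REGISTRY: dict[str, type] = {}


def register_dataset(name: str, cls: type) -> None:
    """Map a CLI-facing name to a dataset class, rejecting duplicates."""
    if name in DATASET_REGISTRY:
        raise ValueError(f"dataset {name!r} is already registered")
    DATASET_REGISTRY[name] = cls


def build_datasets(split: str, args: argparse.Namespace, **augment_kwargs) -> list:
    """Build one dataset per name in the comma-separated args.datasets."""
    names = [n.strip() for n in args.datasets.split(",") if n.strip()]
    unknown = [n for n in names if n not in DATASET_REGISTRY]
    if unknown:
        raise ValueError(
            f"unknown datasets {unknown}; registered: {sorted(DATASET_REGISTRY)}"
        )
    return [DATASET_REGISTRY[n].from_args(split, args, **augment_kwargs) for n in names]


class _ToyDataset:
    """Placeholder dataset used only to exercise the registry."""

    def __init__(self, split: str, **augment_kwargs):
        self.split = split
        self.augment_kwargs = augment_kwargs

    @classmethod
    def from_args(cls, split, args, **augment_kwargs):
        return cls(split, **augment_kwargs)


class ToyDgs(_ToyDataset):
    pass


class ToyPlatform(_ToyDataset):
    pass


# Toy classes registered under the names used in the PR's example.
register_dataset("dgs", ToyDgs)
register_dataset("platform", ToyPlatform)

parser = argparse.ArgumentParser()
parser.add_argument("--datasets", default="dgs")
args = parser.parse_args(["--datasets", "dgs,platform"])

# Both train.py and evaluate.py could call the same function with a different split.
train_sets = build_datasets("train", args, flip=True)
```

Because the caller only sees names and a `from_args` contract, neither entry point needs a per-dataset branch, which is the duplication this PR sets out to remove.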

Adding a new dataset

  1. Subclass BaseSegmentationDataset, implement from_args + get_split_manifest
  2. Call register_dataset("name", YourClass) in __init__.py
  3. Add the import to _ensure_datasets_registered() in common.py
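
The three steps might translate into something like the following sketch. The base class here is a simplified stand-in, not the project's BaseSegmentationDataset, and `MyCorpusDataset`, the `my_corpus_root` argument, the `flip` keyword, and the manifest's dict shape are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from argparse import Namespace

# Minimal stand-ins so the example runs on its own; the real versions live in common.py.
DATASET_REGISTRY = {}


def register_dataset(name, cls):
    DATASET_REGISTRY[name] = cls


class BaseSegmentationDataset(ABC):
    """Simplified stand-in for the project's base class."""

    @classmethod
    @abstractmethod
    def from_args(cls, split, args, **augment_kwargs):
        """Construct the dataset for a split from parsed CLI arguments."""

    @abstractmethod
    def get_split_manifest(self):
        """Describe which items belong to this split."""


# Step 1: subclass and implement from_args + get_split_manifest.
class MyCorpusDataset(BaseSegmentationDataset):
    def __init__(self, root, split, **augment_kwargs):
        self.root = root
        self.split = split
        self.augment_kwargs = augment_kwargs

    @classmethod
    def from_args(cls, split, args, **augment_kwargs):
        # my_corpus_root is a hypothetical CLI argument for this example.
        return cls(root=args.my_corpus_root, split=split, **augment_kwargs)

    def get_split_manifest(self):
        # The return shape is an assumption; the PR does not specify it.
        return {"dataset": "my_corpus", "split": self.split, "root": self.root}


# Step 2: register under the name users will pass to --datasets.
register_dataset("my_corpus", MyCorpusDataset)

# Step 3 (not shown): import the defining module from
# _ensure_datasets_registered() so the registration above runs.

dataset = DATASET_REGISTRY["my_corpus"].from_args(
    "train", Namespace(my_corpus_root="/data/my_corpus"), flip=True
)
```

Registration happens as a side effect of importing the module, which is why step 3 matters: if nothing imports the module, the name never reaches the registry.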

Test plan

  • ruff check . passes
  • pytest passes (38 tests)
  • Smoke test: 1-epoch training with --datasets dgs (validation_hm_iou: 0.617)

Addresses review feedback from #17.

Replace hardcoded dataset construction with a registry so new datasets
can be added by implementing from_args + calling register_dataset(),
without touching train.py or evaluate.py.

- DATASET_REGISTRY + register_dataset() + build_datasets() in common.py
- Abstract from_args classmethod on BaseSegmentationDataset
- --datasets arg (comma-separated) replaces single-dataset dispatch
- Shared build path between train.py and evaluate.py
- Remove DatasetType enum (registry replaces it)

ziv-lazarov-nagish merged commit 1d20f28 into main on Apr 14, 2026
2 checks passed