Eval recipe pipelines and config reorg #810
…/earth2studio into vnv-recipe-skeleton
/blossom-ci
Greptile Summary
This PR introduces a

| Filename | Overview |
|---|---|
| recipes/eval/src/pipeline.py | New file — well-structured ABC with two built-in implementations; shared run() correctly handles output filtering, ensemble injection, and threaded-write flushing before markers. |
| recipes/eval/src/output.py | Refactored — OutputManager.__init__ slimmed down; validate_output_store added for deferred store creation; flush() for resume sync; build_forecast_coords / build_diagnostic_coords extracted as standalone helpers. |
| recipes/eval/src/work.py | Added write_marker, filter_completed_items, clear_progress, and progress_dir; also fixes _parse_initial_times to treat start_times: null as absent so campaign configs can use the IC-block path. |
| recipes/eval/main.py | Cleanly delegates to build_pipeline / pipeline.setup / pipeline.run; resume early-exit is globally consistent because filter_completed_items reads from the same shared filesystem across all ranks. |
| recipes/eval/predownload.py | Split into _predownload_forecast and _predownload_diagnostic; custom pipelines (fully-qualified class names) will hit a ValueError with a message that doesn't hint users need to implement their own pre-download step. |
| recipes/eval/test/_multigpu_worker.py | Updated to use ForecastPipeline; manually assigns attributes instead of calling pipeline.setup(), bypassing the run_on_rank0_first barrier inside load_prognostic. |
| recipes/eval/test/test_resume.py | New — comprehensive coverage of progress tracking, resume/flush, and pipeline resume integration. |
| recipes/eval/test/test_pipeline.py | New — covers ABC enforcement, registry lookup, custom-class-by-FQN path, and the stub pipeline's build/run methods. |
| recipes/eval/test/test_diagnostic_inference.py | New — covers single-IC, multi-IC, ensemble, and empty-work-item paths for DiagnosticPipeline. |
| recipes/eval/cfg/default.yaml | Adds pipeline, resume, and predownload keys; moves predownload config into the shared default. |
| recipes/eval/cfg/campaign/fcn3_2024_full.yaml | New campaign config for FCN3 full-year ensemble; correctly sets start_times: null to activate the IC-block path and resume: true for multi-job splitting. |
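
The resume behaviour summarized in the `work.py` row can be illustrated with a small marker-file sketch. This is not the code in the PR: the helper names `mark_done` and `pending_items`, the `.done` suffix, and the directory layout are all assumptions chosen for illustration, loosely mirroring what `write_marker` and `filter_completed_items` are described as doing.

```python
import tempfile
from pathlib import Path


def mark_done(progress_dir: Path, item_id: str) -> None:
    """Record completion of one work item as an empty marker file (assumed layout)."""
    progress_dir.mkdir(parents=True, exist_ok=True)
    (progress_dir / f"{item_id}.done").touch()


def pending_items(progress_dir: Path, item_ids: list[str]) -> list[str]:
    """Return the work items that do not yet have a marker file."""
    return [i for i in item_ids if not (progress_dir / f"{i}.done").exists()]


with tempfile.TemporaryDirectory() as tmp:
    progress = Path(tmp) / "progress"
    items = ["ic_000", "ic_001", "ic_002"]  # hypothetical work-item IDs
    mark_done(progress, "ic_000")           # pretend the first item finished
    remaining = pending_items(progress, items)
```

Because every rank would read the same marker directory on a shared filesystem, each rank arrives at the same list of remaining items, which is the property the `main.py` row relies on for a consistent early exit.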
Reviews (1): Last reviewed commit: "Merge branch 'main' into vnv-recipe-exte..."
Earth2Studio Pull Request
Description
An incremental update to the eval recipe that makes it more flexible for different types of models by introducing the `Pipeline` interface. The description below is copied from the README. The PR also reorganizes the config system to reduce the bloat that comes from supporting many possible models and inference campaigns.
Pipeline interface
All inference logic is driven by a `Pipeline`, an abstract base class (`src/pipeline.py`) that separates per-work-item inference from the shared scaffolding (work iteration, output filtering, ensemble injection, zarr writes). Subclasses implement three methods:
- `setup(cfg, device)`
- `build_total_coords(times, ensemble_size)`
- `run_item(item, data_source, device)`: `(tensor, coords)` pairs for one work item

The base class `Pipeline.run()` handles everything else: iterating work items, building the output variable filter, injecting the ensemble dimension, and writing to the `OutputManager`.

Two built-in pipelines are provided:
- `ForecastPipeline` (`pipeline=forecast`): prognostic rollout with optional diagnostic models. Yields one output per lead-time step.
- `DiagnosticPipeline` (`pipeline=diagnostic`): diagnostic-only (no prognostic model). Yields a single output per work item.
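
The split between per-item logic and the shared loop can be sketched roughly as follows. This is a simplified stand-in, not the code in `src/pipeline.py`: the class names `SketchPipeline` and `TwoStepPipeline`, the `sink` list that replaces the `OutputManager`, the `run()` signature, and the docstrings for `setup` and `build_total_coords` are assumptions. Output filtering, ensemble injection, and zarr writes are left out, and plain Python lists stand in for tensors.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict
from typing import Any, Iterator


class SketchPipeline(ABC):
    """Illustrative stand-in for the Pipeline ABC (not the real implementation)."""

    @abstractmethod
    def setup(self, cfg: dict, device: str) -> None:
        """Prepare whatever state inference needs (assumed purpose)."""

    @abstractmethod
    def build_total_coords(self, times: list, ensemble_size: int) -> OrderedDict:
        """Describe the full output coordinate system (assumed purpose)."""

    @abstractmethod
    def run_item(self, item: Any, data_source: Any, device: str) -> Iterator[tuple]:
        """Yield (tensor, coords) pairs for one work item."""

    def run(self, work_items: list, data_source: Any, sink: list, device: str = "cpu") -> None:
        """Shared loop: iterate work items and pass every yielded pair to the sink."""
        for item in work_items:
            for tensor, coords in self.run_item(item, data_source, device):
                sink.append((tensor, coords))


class TwoStepPipeline(SketchPipeline):
    """Toy forecast-like subclass that yields one output per lead-time step."""

    def setup(self, cfg, device):
        self.n_steps = cfg.get("n_steps", 2)

    def build_total_coords(self, times, ensemble_size):
        return OrderedDict(ensemble=list(range(ensemble_size)), time=list(times))

    def run_item(self, item, data_source, device):
        for step in range(self.n_steps):
            # A one-element list stands in for a tensor.
            yield [float(step)], OrderedDict(time=[item], lead_time=[step])


pipeline = TwoStepPipeline()
pipeline.setup({"n_steps": 2}, device="cpu")
outputs: list = []
pipeline.run(["2024-01-01", "2024-01-02"], data_source=None, sink=outputs)
```

A subclass only decides what one work item produces; the loop in `run()` is written once in the base class, which is the reason both built-in pipelines can share the same write path.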
Custom pipelines
To add a custom inference loop, subclass `Pipeline` and set `pipeline` in your Hydra config to the fully-qualified class name:
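A minimal sketch of what this might look like, assuming a hypothetical class `MyPipeline` in a hypothetical module `my_package.pipelines` (neither name appears in the PR). The resolver only illustrates how a dotted name can be turned into a class with `importlib`; the recipe's own `build_pipeline` may do this differently.

```python
import importlib

# Hypothetical Hydra config entry (YAML), shown here as a comment:
#   pipeline: my_package.pipelines.MyPipeline


def resolve_class(fqn: str) -> type:
    """Import and return the class named by a dotted 'module.ClassName' path."""
    module_name, _, class_name = fqn.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {fqn!r}")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    if not isinstance(cls, type):
        raise TypeError(f"{fqn!r} does not name a class")
    return cls


# Demonstrated with a standard-library class so the snippet runs anywhere.
ordered_dict_cls = resolve_class("collections.OrderedDict")
```

A short name such as `forecast` has no module part, so a lookup like this would reject it. That is one plausible way for built-in names and custom class paths to be told apart.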
Custom pipelines inherit the full shared machinery — distributed output
management, ensemble dimension handling, threaded zarr writes — for free.
Checklist
Dependencies