
Implement opt hparams per stream #2029

Open
sophie-xhonneux wants to merge 3 commits into develop from sophiex/dev/per-embed-lr

Conversation

@sophie-xhonneux
Contributor

Description

See issue and commit messages

Issue Number

Closes #2028

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@github-actions Bot added the "model" label (Related to model training or definition (not generic infra)) on Mar 14, 2026
:return: List of param group dicts for torch.optim.AdamW.
"""
# unwrap DDP if necessary
raw_model = model.module if hasattr(model, "module") else model
Collaborator

Isn't there a better way to detect this? In the trainer we should also retain a handle to the original model (I don't think we do that now).
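A minimal sketch of the suggestion, with hypothetical names (`Trainer`, `raw_model`, and the `wrap` callable standing in for DDP wrapping are illustrative, not the repo's actual API): keep a handle to the unwrapped model at construction time so downstream code never needs `hasattr(model, "module")` duck typing.

```python
class Trainer:
    """Sketch only: `wrap` stands in for torch's DistributedDataParallel."""

    def __init__(self, model, wrap=None):
        # Retain the unwrapped model before any DDP-style wrapping,
        # so param-group construction can use self.raw_model directly.
        self.raw_model = model
        self.model = wrap(model) if wrap is not None else model
```

With this handle in place, `build_param_groups` can take `trainer.raw_model` and the `hasattr` check disappears entirely.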


default_wd = optimizer_cfg.weight_decay
stream_param_ids: set[int] = set()
groups: list[dict] = []
Collaborator

Is there a reason this is not a dict of dicts, with the outer one having the stream name as the key?
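One way the dict-of-dicts could look; all names here are hypothetical, and the `embeds.<stream>.` parameter-name prefix is an assumption about the model's naming scheme:

```python
def build_param_groups(named_params, stream_cfgs, default_lr, default_wd):
    """Sketch: group parameters per stream, keyed by stream name, then
    flatten to the list of dicts that torch.optim.AdamW expects."""
    groups = {
        name: {
            "name": name,
            "params": [],
            "lr": cfg.get("lr", default_lr),
            "weight_decay": cfg.get("weight_decay", default_wd),
        }
        for name, cfg in stream_cfgs.items()
    }
    # catch-all group for everything not assigned to a stream
    groups["shared"] = {
        "name": "shared", "params": [],
        "lr": default_lr, "weight_decay": default_wd,
    }
    for pname, p in named_params:
        stream = next(
            (s for s in stream_cfgs if pname.startswith(f"embeds.{s}.")),
            "shared",
        )
        groups[stream]["params"].append(p)
    return list(groups.values())
```

Keying by stream name keeps lookups and logging straightforward, and the final flatten is a single `list(groups.values())`.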

)

# shared group: everything not assigned to a stream
shared_params = [
Collaborator

It would be more natural to drop the stream parameters above rather than working with id().
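A sketch of that alternative: partition parameters by name prefix in a single pass, so the shared group is built by exclusion directly and no `id()` bookkeeping set is needed (the prefix convention and function name are assumptions for illustration):

```python
def split_params(named_params, stream_prefixes):
    """Sketch: partition (name, param) pairs into per-stream lists and a
    shared remainder, without tracking parameter ids."""
    stream_params = {prefix: [] for prefix in stream_prefixes}
    shared_params = []
    for name, param in named_params:
        for prefix in stream_prefixes:
            if name.startswith(prefix):
                stream_params[prefix].append(param)
                break
        else:
            # no stream prefix matched: parameter belongs to the shared group
            shared_params.append(param)
    return stream_params, shared_params
```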

if is_root():
for g in groups:
logger.info(
f"Param group '{g['name']}': {len(g['params'])} params, "
Collaborator

This should be prefaced with "Optimizer parameters" or something similar.

self.model, stream_optimizer_cfgs, self.training_cfg.optimizer
)
lr_start = self.training_cfg.learning_rate_scheduling.lr_start
for g in param_groups:
Collaborator

This should be done in build_param_groups().
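A sketch of folding the `lr_start` assignment into group construction rather than looping over the groups at the call site; the helper name `finalize_param_groups` and the per-group `lr_scale` field are hypothetical:

```python
def finalize_param_groups(groups, lr_start):
    """Sketch: set the scheduler's starting lr on each group at build time,
    so the caller receives groups ready to hand to the optimizer."""
    for g in groups:
        # lr_scale is an assumed optional per-stream field;
        # the default of 1.0 leaves the shared group at lr_start.
        g["lr"] = lr_start * g.get("lr_scale", 1.0)
    return groups
```

Calling this at the end of `build_param_groups()` would remove the post-processing loop from the trainer.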

@github-actions Bot added the "eval" label (anything related to the model evaluation pipeline) on Mar 16, 2026

import weathergen.common.config as config
from weathergen.train.utils import TRAIN
from weathergen.evaluate.plotting.plot_utils import create_filename
Collaborator

If we want to have the changes in this file, then let's please open a separate PR that also moves create_filename to packages/common/src/weathergen/common/paths.py.

and col_split[3] == channel
and int(col_split[4]) in forecast_steps
):
if col == stream_name.lower():
Collaborator

It seems this removes some branches from the current code; this doesn't look functionally equivalent.

nargs="+",
help="List of channels to plot",
)
parser.add_argument(
Collaborator

Where is this option?

nargs="+",
help="List of metrics (e.g. mse) to plot",
)
parser.add_argument(
Collaborator

Where is this option?

clean_plot_folder(out_dir)

# collect all physical streams from all run_ids if requested
if "all" in streams:
Collaborator

Where is this option?


Labels

  • eval: anything related to the model evaluation pipeline
  • model: Related to model training or definition (not generic infra)

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Learning rate per stream

2 participants