
Discussion: Load files for ensemble trial members from multiple directories #1939

@mo-tomosevans

Describe the bug

Ensemble trial data typically produces one directory per ensemble member for a given run, for example:
20240609T0600Z/enuk_um_003, 20240609T0600Z/enuk_um_004, 20240609T0600Z/enuk_um_005. The number in each directory name is the ensemble member number. Every member directory contains exactly the same filenames, i.e. enukaa_pd000, enukaa_pd003, ... , enukaa_pd123, where the number is the forecast lead time.
Passing a file glob that spans several members (/enuk_um_0*/enukaa_pd*) into the workflow fails. The workflow sets up a single directory and creates a symbolic link in it for each matched file, using only the final part of the path (i.e. the file name) as the link name. The pattern is expanded correctly and all the files are found, but only the first set of enukaa_pd* files is loaded. Files from subsequent ensemble members have the same names, so they are treated as duplicates and none of the other members are loaded.
As a result, the generated plots show only one realization, even though the glob points at multiple members.
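
To illustrate the collision described above, here is a minimal Python sketch. It is not the workflow's actual code: the helper `find_basename_collisions` is a hypothetical name, and the paths are the example ones from this issue. It only shows that keying files by their final path component makes the members indistinguishable.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_basename_collisions(paths):
    """Group paths by final component; return names shared by more than one path.

    Hypothetical helper for illustration only, not part of the workflow.
    """
    by_name = defaultdict(list)
    for path in paths:
        by_name[PurePosixPath(path).name].append(path)
    return {name: group for name, group in by_name.items() if len(group) > 1}


# Example paths taken from the directory layout described in this issue.
paths = [
    "20240609T0600Z/enuk_um_003/enukaa_pd000",
    "20240609T0600Z/enuk_um_004/enukaa_pd000",
    "20240609T0600Z/enuk_um_005/enukaa_pd000",
    "20240609T0600Z/enuk_um_003/enukaa_pd003",
]

collisions = find_basename_collisions(paths)
for name, group in collisions.items():
    print(f"{name} is claimed by {len(group)} different paths")
```

Any link name built from the file name alone can only refer to one of the colliding paths, which matches the single-realization plots reported here.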

How to reproduce

Steps to reproduce the behaviour:

  1. Choose one trial model and point the filepath in rose-suite.conf to a path ending /enuk_um*/enukaa_pd*
  2. Run the workflow
  3. Look at the job.err log for the fetch_m1 task. It should show that the first set of enukaa_pd* files is loaded, and that a duplicate error is printed whenever another path ends in a filename that has already been seen.

Expected behaviour

This probably requires discussion. Stephen Gallagher (@SGallagherMet) has suggested that this could be a common issue for other meteorological centres producing trial output, although the current behaviour may be intentional.
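
As one possible starting point for that discussion, the sketch below builds link names that keep the member directory. This is an assumption about what a fix could look like, not the workflow's existing behaviour: `member_link_name` and `plan_links` are hypothetical names, and the sketch only plans the names without creating any links. Whether the downstream loading step would then handle the members correctly is not addressed here.

```python
from pathlib import PurePosixPath


def member_link_name(source: str) -> str:
    """Hypothetical scheme: prefix the file name with its parent directory name."""
    path = PurePosixPath(source)
    return f"{path.parent.name}_{path.name}"


def plan_links(sources):
    """Map each planned link name to its source, failing loudly on a real duplicate."""
    plan = {}
    for source in sources:
        name = member_link_name(source)
        if name in plan:
            raise ValueError(
                f"duplicate link name {name!r} for {source!r} and {plan[name]!r}"
            )
        plan[name] = source
    return plan


# Example paths taken from the directory layout described in this issue.
sources = [
    "20240609T0600Z/enuk_um_003/enukaa_pd000",
    "20240609T0600Z/enuk_um_004/enukaa_pd000",
]

for name, source in plan_links(sources).items():
    print(f"{name} -> {source}")
```

With this naming, the two members above map to distinct link names, while a genuinely repeated path still raises an error.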

Labels: R2O Trials (Items of significance to the Met Office Research to Operations Team.), bug (Something isn't working)