Feat/cmip7 awiesm3 veg hr #266
Open
JanStreffing wants to merge 153 commits into prep-release from
Conversation
The entry_points() API changed between Python 3.9 and 3.10:
- Python 3.9: entry_points() returns a dict-like object
- Python 3.10+: entry_points(group='name') takes a keyword argument
Use try/except to detect the API version at runtime.
Fixes: TypeError: entry_points() got an unexpected keyword argument 'group'
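A minimal compatibility shim along the lines described above (the group name used below, `console_scripts`, is just a standard example; the actual pycmor entry-point group may differ):

```python
from importlib.metadata import entry_points


def get_group_entry_points(group):
    """Return entry points for *group* on both Python 3.9 and 3.10+."""
    try:
        # Python 3.10+: entry_points() accepts a `group` keyword argument
        return list(entry_points(group=group))
    except TypeError:
        # Python 3.9: entry_points() returns a dict-like mapping of groups
        return list(entry_points().get(group, []))
```

Detecting the API at runtime via TypeError avoids a version check against `sys.version_info`, so the same code also works on backports like `importlib_metadata`.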
…6_table-based approach

- Use user-specified CMIP7_DReq_metadata file for DataRequest loading
- Fix cmip6_cmor_table -> cmip6_table key mismatch in table.py
- Extract table IDs from cmip6_table values, not the compound-name prefix
- Add a warning when rules have no matching data_request_variables
- Add debug logging to find_matching_rule for troubleshooting

This partially addresses the architectural issue where CMIP7 is forced into CMIP6's table-based structure. Full compound-name matching still needs implementation (see CMIP7_ARCHITECTURE_ISSUE.md). Fixes a silent failure where rules were dropped with no user feedback.
Add a step-by-step failure scenario showing:
- Silent-failure symptoms
- The root-cause discovery process (3 layered bugs)
- Log output at each debugging stage
- Key symptoms and workarounds
The branch fixes immediate bugs (silent failure, config ignored) but architectural issues persist (cmip6_table dependency, partial matching).
- Index variables by full compound name instead of cmip6_table
- Implement exact compound-name matching for CMIP7 (find_matching_rule_cmip7)
- Generate synthetic table headers from variable metadata
- Remove the dependency on the cmip6_table field for CMIP7 data loading
- Add comprehensive unit tests for synthetic header generation
- Maintain full backward compatibility with CMIP6 and existing CMIP7 metadata

Resolves a critical AttributeError for table_header in CMIP7 processing. Addresses the architectural issues identified in CMIP7_ARCHITECTURE_ISSUE.md.

Tests: 15 passed, 1 skipped
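The indexing change can be sketched with toy data structures (the real DataRequest classes and metadata fields differ; the two compound names below are illustrative):

```python
# Before: variables were keyed by cmip6_table, so distinct CMIP7 compound
# names could collide. After: index by the full compound name for exact lookup.
variables = {
    "ocean.tos.tavg-u-hxy-sea.mon.GLB": {"name": "tos", "units": "degC"},
    "ocean.sos.tavg-u-hxy-sea.mon.GLB": {"name": "sos", "units": "0.001"},
}


def find_matching_variable_cmip7(compound_name, index):
    """Exact compound-name match; raise instead of silently skipping."""
    try:
        return index[compound_name]
    except KeyError:
        raise ValueError(f"No data request variable for {compound_name!r}")
```

Raising ValueError on a miss mirrors the "ValueError on zero DRV matches" behaviour this branch introduces, replacing the silent rule drop.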
Fixes trailing whitespace on blank lines in cmorizer.py and reformats several other files to be consistent with black when run from root.
Force-pushed from 1a42875 to 5617a18
Resolves merge conflicts in cmorizer.py and global_attributes.py, keeping CMIP7_DReq_metadata feature and integrating prep-release compound_name table_id derivation logic.
…ected The PycmorConfigManager applies a 'pycmor' namespace, so it looks for keys like 'pycmor_dask_cluster'. But the YAML 'pycmor:' section provides unprefixed keys like 'dask_cluster', which were silently ignored and fell back to defaults (e.g. dask_cluster defaulted to 'local' instead of 'slurm'). Fix by prefixing dict keys in _create_environments. Also adds custom_steps.py with vertical_integrate pipeline step and fixes grid_file path and max_jobs in the minimal example.
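The key-prefix mismatch can be illustrated with a toy version of the fix (the function name `prefix_keys` and the dict shapes are assumptions; the commit places this logic in `_create_environments`):

```python
def prefix_keys(config, namespace="pycmor"):
    """Prefix unqualified YAML keys so the namespaced lookup finds them."""
    return {
        key if key.startswith(namespace + "_") else f"{namespace}_{key}": value
        for key, value in config.items()
    }


# YAML 'pycmor:' section provides unprefixed keys...
yaml_section = {"dask_cluster": "slurm", "pycmor_max_jobs": 4}
env = prefix_keys(yaml_section)
# ...but the config manager looks up namespaced keys. Without the prefixing,
# this lookup silently fell back to the default "local".
cluster = env.get("pycmor_dask_cluster", "local")
```

Already-prefixed keys pass through unchanged, so configs that spell out the full key keep working.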
Fix two bugs where pipelines didn't get the Dask cluster assigned:
1. _post_init_create_pipelines appended a new Pipeline.from_dict(p) instead of the one that had the cluster assigned
2. The DefaultPipeline created at rule init time bypassed CMORizer cluster assignment — now handled in _match_pipelines_in_rules

Switch the example config from adaptive to fixed SLURM scaling to avoid a race condition where the adaptive scaler kills workers before .compute() submits the real Dask graph.
Contributor
Author
…ntested)

- Rules for 20 of 28 core ocean variables in cmip7_awiesm3-veg-hr_ocean.yaml
- New custom steps: load_gridfile (generic), compute_deptho, compute_sftof, compute_thkcello_fx, compute_masscello_fx (FESOM mesh-derived)
- 6 new pipeline definitions for Ofx variables (fx_extract, fx_deptho, etc.)
- namelist.io: vec_autorotate=.true., hnode output, daily sst/sss/ssh
- Todo tracking and missing.md for variables FESOM cannot output
- Research: FESOM uses potential temperature (no bigthetao), MLD3 for mlotst, velocities need rotation, u/v are on the elem grid -> use unod/vnod

Not yet tested — pipelines and rules need validation against actual data.
NOT TESTED — pipelines and custom steps need validation against real data.

- New steps: compute_density (gsw/TEOS-10), compute_mass_transport (Boussinesq rho_0*dz), compute_zostoga (global thermosteric sea level)
- mass_transport_pipeline for umo/vmo/wmo
- zostoga_pipeline using gsw for the EOS computation
- Rules for umo, vmo, wmo, zostoga in the ocean rules file
- gsw package installed in the pycmor_py312 environment

masscello (Omon) still needs a density x hnode pipeline.
…(untested)

- 8 sea ice rules (simass, siu, siv, sithick, snd, ts, siconc, sitimefrac)
- siconc_pipeline (fraction_to_percent) and sitimefrac_pipeline (binary ice presence)
- fraction_to_percent and compute_sitimefrac custom steps
- Runnable sea ice config (examples/awiesm3-cmip7-seaice.yaml)
- namelist.io: added h_ice, h_snow, ist (monthly) and a_ice (daily)
- Moved missing.md and namelist.io up one level per user request
- Removed the old awiesm3-cmip7-example.yaml (superseded by the ocean/seaice configs)
…th inherit

- Add 45 CAP7 sea ice variable rules (direct mapping, scale, multi-variable compute, melt ponds, hemisphere integrals, stress tensor)
- Add custom pipeline steps: scale_by_constant, integrate_over_hemisphere, compute_sispeed, compute_ice_mass_transport, compute_sistressave/max, compute_siflcondtop, compute_sihc, compute_sisnhc, compute_sitempbot, compute_sifb, compute_constant_field, compute_simpeffconc
- Restructure all rules YAMLs into full runnable configs with general, pycmor, jobqueue, pipelines, and inherit sections
- Move data_path into the inherit section with a YAML anchor for reuse in inputs.path across all rules
- Update namelist.io with new monthly/daily diagnostics for CAP7 variables
- Add CAP7 sea ice variables todo tracking (~89 variables, 45 done)
- Add 28 CAP7 ocean variable rules covering easy (pbo, volo, global means, squaring, wfo), medium (tob, sob, pso, phcint, scint, difvho/difvso, difmxylo, masso), decadal (7 variables), and hard (opottemptend) categories
- Add custom pipeline steps: compute_square, extract_bottom, compute_surface_pressure
- Full runnable config with an inherit section (data_path anchor)
- Comprehensive todo tracking of ~147 CAP7 ocean variables (28 done, ~20 skipped, rest blocked or needing a model re-run)
…llo_dec, opottempmint, somint)

Second pass over the CAP7 ocean variables to identify what can be computed purely in pycmor post-processing. Adds volcello_fx and volcello_time custom steps and pipelines, plus rules for virtual salt flux, static/decadal cell volume, decadal cell mass, and yearly depth-integrated temperature and salinity.
…vsfcorr, mlotst_day, uos, vos) Add evap and relaxsalt to monthly output in namelist.io, and MLD3, unod, vnod to daily output. Write corresponding pycmor rules with scale_pipeline, surface_extract_pipeline, and direct mappings. Add extract_surface custom step for daily surface velocity extraction. Note: daily 3D unod/vnod output is very storage-heavy.
Adds a shared home for CF cell-measure fx variables so every config that references ``cell_measures: area: areacello`` / ``areacella`` can produce the companion fx file with a one-line pipeline reference instead of copy-pasting a 7-step pipeline into each yaml.

std_lib/cell_measures.py
* load_gridfile: open ``rule.grid_file`` as the data source (fx variables read a grid/mesh, not time-series output).
* compute_areacello: read ``cell_area``/``cluster_area`` from the mesh Dataset produced by ``load_gridfile``. Works for any model whose mesh carries a per-cell surface-area field (FESOM, ICON, MPAS, ...).
* compute_areacella: spherical-Earth formula on a regular lat/lon grid (R^2 * dlon * |sin(lat+dlat/2) - sin(lat-dlat/2)|).

core/pipeline.py
* AreacelloFxPipeline (FrozenPipeline): load_gridfile -> compute_areacello -> set_global/variable/coordinates -> map_dimensions -> save_dataset. Reference as ``uses: pycmor.core.pipeline.AreacelloFxPipeline``.
* AreacellaFxPipeline: same shape for the atmosphere.

The same functions remain in examples/custom_steps.py unchanged for backward compatibility with existing configs; new configs should prefer the std_lib paths.

examples/_verify_sidmassth.yaml
* Switches the areacello pipeline from inline steps to ``uses: pycmor.core.pipeline.AreacelloFxPipeline``. Output is byte-identical apart from the timestamp; QC state is unchanged (1 polar-cell flag on both files, which is the known mesh quirk).
The atmospheric areacella is computed analytically from lat/lon on a regular grid -- it does not read a per-cell area field from a mesh the way areacello does. Existing per-config areacella pipelines use load_mfdataset + get_variable to pick up lat/lon from any model-output file; the FrozenPipeline now matches that pattern (instead of load_gridfile, which only made sense for the unstructured areacello case).
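The spherical-Earth cell-area formula quoted above can be written out as a standalone sketch (this is not the pycmor implementation; a uniform regular grid is assumed and `radius` is a mean Earth radius in metres):

```python
import numpy as np


def areacella_regular(lat, dlat, dlon, radius=6.371e6):
    """Cell areas for one longitude column of a regular lat/lon grid:
    R^2 * dlon * |sin(lat + dlat/2) - sin(lat - dlat/2)|, angles in radians.
    """
    lat = np.deg2rad(np.asarray(lat, dtype=float))
    dlat, dlon = np.deg2rad(dlat), np.deg2rad(dlon)
    return radius**2 * dlon * np.abs(np.sin(lat + dlat / 2) - np.sin(lat - dlat / 2))


# Sanity check: summing a full 1-degree grid recovers the sphere's area,
# because the |sin(...) - sin(...)| terms telescope to 2 over all latitudes.
lats = np.arange(-89.5, 90.0, 1.0)          # 180 cell centres
total = areacella_regular(lats, 1.0, 1.0).sum() * 360  # 360 longitude columns
# total is approximately 4 * pi * R^2
```

The telescoping sum is why this formula is exact on a sphere regardless of resolution, which makes it a convenient analytic check for any areacella pipeline output.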
…ines

Every CMIP7 run that produces variables with cell_measures: area: areacello (or areacella) must ship the referenced measure as a companion fx file. Now that pycmor.core.pipeline.AreacelloFxPipeline and AreacellaFxPipeline exist in std_lib, the copy-pasted 10-step pipelines across configs become single `uses:` lines. Fewer places to keep in sync and no absolute script:// paths to custom_steps.py.

Migrations (inline steps -> uses:):
* awi-esm3-veg-hr-variables/core_ocean -> AreacelloFxPipeline
* awi-esm3-veg-hr-variables/core_land -> AreacellaFxPipeline
* awi-esm3-veg-hr-variables/extra_land -> AreacellaFxPipeline (feeds areacellr)
* examples/cmip7_core_ocean_core2_test -> AreacelloFxPipeline
* examples/cmip7_core_land_tco95_test -> AreacellaFxPipeline
* examples/cmip7_extra_land_tco95_test -> AreacellaFxPipeline
* examples/awiesm3-cmip7-minimal -> AreacelloFxPipeline

Additions (configs that referenced but did not ship the measure):
* examples/awiesm3-cmip7-minimal: new areacello rule
* awi-esm3-veg-hr-variables/core_atm: new areacella pipeline + rule

Verified _verify_sidmassth.yaml still produces the same 1-finding (polar-cell) CF state on the areacello output, byte-parity otherwise.
Brings HR (awi-esm3-veg-hr-variables/*/cmip7_awiesm3-veg-hr_*.yaml) and
LR (examples/cmip7_*_test.yaml) rule sets into parity; 16 of 17 topics
are now byte-identical at the rule-structure level (same names, same
compound_name, same pipeline assignments). The one exception is
lrcs_ocean, which keeps 15 HR-only entries (msftm_density /
msftmmpa_density / msftmmpa_depth + *_dec variants) as commented-out
stubs in LR with a note explaining why (custom steps not yet
implemented, decadal averages need a 10y+ run).
Substantive changes:
- cap7_aerosol: add ghg_scalar_pipeline + cfc11/cfc12/ch4/n2o_mon
- cap7_atm: drop dead compute_hur_ml pipeline, read hur directly from
new XIOS ml output
- cap7_land: HR moved from 'pipeline:' (singular, non-schema) to the
correct 'pipelines:' list form for 48 LPJ-GUESS rules
- cap7_ocean: unify tauuo/tauvo on 3hr frequency
- core_atm: LR gains areacella + AreacellaFxPipeline frozen pipeline
- core_ocean/lrcs_ocean: attach scale_pipeline to mlotst{,_day}
- lrcs_ocean: HR gains hfbasin/msftmz/sltbasin + pipelines
- lrcs_seaice: LR gains 23 HR-only rules (rad_seaice, siconca(+day),
sidragtop, sifl*top, sisnmass_*_si, siarea/siextent/sivol _day,
regrid_atm_to_fesom_pipeline)
Repointing:
- HR yamls → /work/bb1469/a270092/runtime/awiesm3-develop/HR_test_01,
year_start=year_end=1586 (new HR test run)
- LR yamls → /work/bb1469/a270092/runtime/awiesm3-develop/LR_test_01,
year_start=1900, year_end=1901 (previous LR test run)
- All '*.fesom.<year>.nc' literal-year patterns regex-ified to
'*\.fesom\..*\.nc' so the yamls work across any sim year
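The regex-ified patterns go through Python's `re` module rather than shell globbing; a quick check of the new pattern against some hypothetical FESOM filenames:

```python
import re

# Replaces literal-year globs like '*.fesom.1586.nc'
pattern = r".*\.fesom\..*\.nc"

files = ["temp.fesom.1586.nc", "temp.fesom.1900.nc", "temp.fesom.1586.nc.bak"]
matches = [f for f in files if re.fullmatch(pattern, f)]
# re.fullmatch anchors at both ends, so the trailing .bak file is rejected
```

Note the escaped dots: an unescaped `.` would also match any character, which is usually harmless here but makes the glob-vs-regex distinction easy to miss when editing the yamls.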
Disabled outputs (matching XIOS file_def decisions):
- 6hr model-level rules in cap7_atm (5 rules)
- 3hr plev6 rules in veg_atm (5 rules)
Both blocks commented with a pointer to
doc/awi_cap7_volume_estimate.txt explaining the data-volume driver.
atmos gn switch fallout: the HR and LR runs now write atmos on the
native reduced Gaussian (cell=40320 at LR, cell=421120 at HR), lat/lon
as auxiliary coords with bounds_lat/lon(cell, nvertex=4). Yamls carry
grid_label=gn accordingly.
- examples/run_core_atm_hr.sh: sbatch wrapper that runs the HR
core_atm production yaml locally on a compute node (Prefect server
on compute node, HDF5 file-locking off, scratch-based TMPDIR), with
output_directory rewritten to ./cmorized_output/core_atm_hr so it
does not clash with a parallel LR run.
- doc/awi_cap7_volume_estimate.txt: final DKRZ planning estimate
derived from running estimate_data_volume_{lr,hr}.py against the
current yaml set, with corrected TCo95/TCo319 reduced-Gaussian grid
sizes (40320 / 421120 points, not the old 192x400 regular-grid
assumption). Scenario: native atmos/land/veg, ocean/seaice native
plus 1° (LR) / 0.25° (HR) regrid, 6hr_ml and 3hr_pl6 excluded,
empirical 1.62x compression factor measured on real pycmor output.
JanStreffing commented Apr 22, 2026
> return ds
> ...
> def _attach_bounds_from_mesh(ds, rule, coord_names):
Check if generic enough for backend
JanStreffing commented Apr 22, 2026
> return any(v.chunks is not None for v in ds.data_vars.values())
> ...
> def _encoding_from_dask_chunks(ds, rule):
Deferred for later review
JanStreffing commented Apr 22, 2026
> if table_id is None:
>     # Fallback to user-provided
>     table_id = self.rule_dict.get("table_id", None)
Should be checked. Why deleted?
JanStreffing commented Apr 22, 2026
> """
> return self.rule_dict.get("Conventions", "CF-1.11")
> ...
> # ========================================================================
Should be a class. Do we need it?
JanStreffing commented Apr 22, 2026
> return "hdl:21.14100/" + str(uuid.uuid4())
> """Generate a unique tracking ID (prefix overridable via rule_dict).
> ...
> The CMIP7 tracking_id CV requires the ``hdl:21.14107/<uuid>`` prefix
Does that mean cmip6 won't work anymore?
JanStreffing commented Apr 22, 2026
> approx_interval = drv.table_header.approx_interval
> frequency_str = _frequency_from_approx_interval(approx_interval)
> logger.debug(f"{approx_interval=} {frequency_str=}")
> # attach the frequency_str to rule, it is referenced when creating file name
> rule.frequency_str = frequency_str
> time_method = _get_time_method(drv.frequency)
> rule.time_method = time_method
> # FESOM yearly files and concat'd hemispheric selects can yield a
Hallucination? xarray can sort itself?
JanStreffing commented Apr 22, 2026
> ## Development Commands
> ...
> ### Environment Setup
Delete before merge
Driven by a full sweep of LR test-run failures.

pycmor std_lib:
- cell_measures.compute_areacella now handles native reduced-Gaussian / unstructured grids via per-cell bounds_lat/bounds_lon; returns 1D (cell,) instead of a degenerate (cell, cell). Bounds broadcast along time by open_mfdataset are squeezed out. Resolves 'Bad chunk sizes' on TCo95 and the 258 GB OOM on TCo319 (the old code asked for a 421120x421120 materialization).
- AreacellaFxPipeline drops get_variable so compute_areacella sees the Dataset with its bounds variables.
- chunking + variable_attributes skip _FillValue / missing_value on CF flag variables (flag_values/flag_meanings); a missing_value cast to an integer dtype now checks iinfo bounds and skips on overflow. Resolves the 'basin' OverflowError.

examples/custom_steps.py:
- _load_secondary_mf matches via re.fullmatch against os.listdir (consistent with pycmor's regex-based primary gather_inputs).
- compute_hur_plev recognises additional plev coord names (pressure_levels, plev39/plev7h/plev8).

Config fixes:
- snd_day: add second_input_path/_pattern/_variable for rsn.
- 4 WMGHG scalars (cfc11/cfc12/ch4/n2o): branding suffix tavg-u-hm-air -> tavg-u-hm-u (dreq v1.2.2.2).
- lrcs_seaice: add the missing oifs_data_path &odp anchor; sisnmass NH/SH hm-si -> hm-u.
- 6 atm yamls: normalise secondary-input patterns from glob (*.nc) to regex (.*\.nc), matching the primary pattern convention.
The default flox path for resample().first/mean() routes through flox's numbagg backend, which JIT-compiles each aggregator via numba on first use. On HR runs the compile takes ~30 s per (aggregator, dtype, worker) triple, and that cost is repaid for every fresh Dask worker process. tasmax_mon on TCo319 spent 612 s in trigger_compute almost entirely inside numba compile — save_dataset itself took 1 s.

Make "numpy" the default engine (vectorised, zero JIT cold-start), and add a "flox_engine" knob (rule attribute or pycmor-config) for rules that genuinely benefit from numba — we currently have none.

Measured on the minimal bench (examples/cmip7_slow_write_bench_hr.yaml, tasmax_mon, TCo319, year 1586, 4 workers):
  default (numbagg):  rule total ~615 s, trigger_compute 612 s
  flox_engine=numpy:  rule total ~7 s,   trigger_compute 3.5 s

Also drops the two sbatch wrappers used to run the comparison and the minimal yaml, so the bench is reproducible.
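The knob-resolution pattern described (rule attribute first, then pycmor-config fallback, then the new "numpy" default) might look roughly like this; the helper function itself is hypothetical, while the attribute/key name `flox_engine` and the engine values come from the commit message:

```python
def get_flox_engine(rule, pycmor_config):
    """Resolve the flox engine: rule attribute wins, then config, then "numpy".

    "numpy" avoids numba's ~30 s per-worker JIT cold start; "numbagg" could be
    re-enabled per rule if a reduction genuinely benefited from it.
    """
    engine = getattr(rule, "flox_engine", None)
    if engine is None:
        engine = pycmor_config.get("flox_engine", "numpy")
    return engine


class Rule:  # toy stand-in for a pycmor rule object
    pass


rule = Rule()
default_engine = get_flox_engine(rule, {})                      # "numpy"
config_engine = get_flox_engine(rule, {"flox_engine": "flox"})  # "flox"
rule.flox_engine = "numbagg"
rule_engine = get_flox_engine(rule, {"flox_engine": "flox"})    # "numbagg"
```

The precedence order (rule > config > default) matches the `getattr` + `_pycmor_cfg` fallback convention mentioned elsewhere in this PR.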
Force-pushed from 105e934 to 0700690
Write path was hardcoded to zlib-1 + shuffle and wrapped in
scheduler="synchronous". On the old PyPI-netCDF4 stack that was fine
because the bundled HDF5 is not thread-safe and libnetcdf had no
alternate codecs anyway. On a thread-safe HDF5 build with a modern
libnetcdf that has zstd/blosc filters, both restrictions are leaving
most of the write throughput on the table.
Add two rule-level knobs (with pycmor-config fallbacks):
- netcdf_compression_codec: one of zlib (default), zstd, blosc_lz,
blosc_lz4, blosc_lz4hc, blosc_zlib, blosc_zstd, bzip2, szip.
Sets the `compression=` encoding kwarg that netCDF4-python passes
through to libnetcdf.
- netcdf_write_scheduler: dask scheduler used around save_mfdataset
(default "synchronous" — safe; "threads" wins when HDF5 is built
threadsafe).
Wired through both chunk-encoding paths:
- _calculate_netcdf_chunks → get_encoding_with_chunks
- _encoding_from_dask_chunks (dask-aligned writes)
Measured on the wap_day bench (HR TCo319, 9.1 GB input, 1 year):
zlib-1 + shuffle, sync scheduler (old, bundled stack) ........ 22 MB/s
zlib-1 + shuffle, threaded (new env) ......................... 25 MB/s
blosc_zstd-3 + shuffle, threaded, dask=4 ..................... 56 MB/s
blosc_zstd-3 + shuffle, threaded, dask=1 + BLOSC_NTHREADS=16 . 106 MB/s
Also drops the tasmax_mon bench (it isolated flox's numba cold-start,
already fixed) and adds a wap_day bench pair (sync netCDF4 + system
netCDF4 variants) that exercises the new knobs end-to-end.
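As a config fragment, the two knobs described above might be set per rule like so (the key names come from the commit message; the rule name and placement within a rule are illustrative):

```yaml
rules:
  - name: wap_day
    # zstd inside blosc: best measured throughput on the thread-safe stack
    netcdf_compression_codec: blosc_zstd
    # "threads" is only safe when HDF5 is built thread-safe;
    # the default "synchronous" is always safe
    netcdf_write_scheduler: threads
```

Unset, both knobs fall back to the pycmor-config values and then to the defaults (zlib, synchronous), preserving the old behaviour.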
…ords attr)

Seen together on a 9 GB wap file produced for the HR core-atmosphere benchmark. Each bug is fixable in isolation but all three were biting the same file.

files.py :: _encoding_from_dask_chunks
Mirror the _FillValue logic already present in get_encoding_with_chunks into the dask-aligned encoding path. Without this, large dask-backed float32 data variables (e.g. wap(time, plev19, cell)) were written with the xarray default fill of NaN instead of the CMIP-required 1e20, producing a `_FillValue != missing_value` CF §2.5 finding on every dask-path output.

files.py :: _ensure_lat_lon_bounds_impl
Also accept XIOS-style ``bounds_<coord>`` bounds variables (as emitted by IFS output) in addition to the CF-standard ``<coord>_bnds`` form; rename to the canonical name and fix up the ``bounds`` attr. Without this, the bounds attr on lat/lon pointed at a variable that was never promoted through the pipeline.

generic.py :: get_variable
When a selected model variable's coord has a ``bounds`` attr, attach the bounds variable as a coord on the returned DataArray so it survives downstream steps and is emitted in the final save. XIOS stores bounds as data_vars with an extra ``nvertex`` dim; simple ``data[var_name]`` indexing drops them. Wrapped in try/except so exotic bounds (e.g. time_bounds with ``axis_nbounds``) that xarray refuses to attach as coords are silently skipped.

files.py :: _ensure_coordinates_attr (new save-time pass)
Rebuild the ``coordinates`` attribute of each data variable from the current dim/coord names at save time. ``set_coordinate_attributes`` runs early in the pipeline (before ``map_dimensions``); a rename afterwards (e.g. vertical coord ``pressure_levels`` -> ``plev19``) would otherwise leave the attribute pointing at a variable that no longer exists. Wired into _ensure_lat_lon_bounds_and_external_vars alongside the existing external_variables pass so every save path picks it up.
Verified on examples/_verify_sidmassth.yaml -- no regression (still CF 1 + wcrp_cmip7 8 findings, same as before).
The previous attempt attached bounds as coords of the selected DataArray, but xarray rejects coords whose dims are not a subset of the target (e.g. ``bounds_lat(cell, nvertex)`` on ``wap(time, plev19, cell)``) and returning a Dataset from ``get_variable`` broke every downstream step that relied on ``DataArray.name`` (e.g. ``scale_by_constant``). New strategy: leave ``get_variable`` alone and recover the referenced bounds variable at save time. When ``_ensure_lat_lon_bounds_impl`` sees a ``lat.bounds`` / ``lon.bounds`` attribute pointing at a variable that is not in the live dataset, open the first file named by ``rule.inputs``, locate the bounds variable (candidates: the declared name, ``bounds_<coord>``, ``<coord>_bnds``), verify the first-dim size matches the coord size, and re-attach as ``<coord>_bnds``. Fills the gap for CF §7.1 compliance on HR IFS atmospheric output without touching the pipeline flow.
CF 1.11 §7.1 is explicit that bounds variables must not carry their
own attributes -- they inherit from the parent coordinate. Both
_attach_bounds_from_mesh (FESOM path) and _recover_bounds_from_inputs
(XIOS path) were setting ``units='degrees'`` on the emitted bounds
DataArray, triggering §7.1 findings on every unstructured output:
'lat_bnds' has attr 'units' 'degrees' that does not agree with its
associated variable ('lat')'s attr value 'degrees_north'
...
The Boundary variables 'lat_bnds' should not have the attributes:
'['units']'
Pass an empty attrs dict in both emitters. Verified on sidmassth --
CF findings stay at 1 (polar-cell recommendation) and bounds vars
now carry only the ``coordinates`` attribute inherited via xarray.
libnetcdf >= 4.9 exposes ``quantize_mode`` + ``significant_digits`` encoding knobs for bit-level lossy quantization. Turn on BitGroom quantization with 5 significant digits by default for float data variables, which gives ~30-50% file-size reduction on top of zlib/BLOSC with no measurable impact on typical analyses.

Apply in all three encoding builders -- ``get_encoding_with_chunks`` (chunking.py), ``_encoding_from_dask_chunks`` (dask-aligned path), and ``_calculate_netcdf_chunks`` (simple path) -- so the behaviour is consistent regardless of which write path a rule goes through.

Skip cases that must remain bit-exact:
* ``*_bnds``, ``*_bounds`` and ``bounds_*`` variables (CF §7.1 requires bounds values to agree exactly with the parent coord). Also prevents libnetcdf stamping a ``_QuantizeBitGroom...`` attribute on bounds, which was tripping the CF §2.3 naming check.
* Integer flag / index variables (``dtype.kind != 'f'``).
* Coordinate variables (not in ``ds.data_vars``).

Opt out per rule via ``netcdf_quantize_mode: null``; customise sig digits via ``netcdf_significant_digits``. Defaults were chosen to be safe for CMIP-class output, where 5 significant digits is well above the precision of any model calculation.
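Per rule, the quantization knobs from this commit might be expressed like so (key names come from the commit message; the rule name, the default mode string, and placement are illustrative):

```yaml
rules:
  - name: wap_day
    # BitGroom lossy quantization, applied only to float data variables;
    # bounds, integer flag/index, and coordinate variables stay bit-exact
    netcdf_quantize_mode: BitGroom   # set to null to opt out entirely
    netcdf_significant_digits: 5
```

Since quantization happens at encoding time, it composes with whatever ``netcdf_compression_codec`` the rule selects: the zeroed mantissa bits are what make the subsequent zlib/BLOSC pass so much more effective.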
…es, as well as commenting out dmoc for now
…CMOR_HOME

- core/validate.py + core/utils.py: expand $VARS (and ~ on the loader side) before resolving script:// paths, so configs can reference custom-step scripts via portable env vars instead of hard-coded absolute paths
- migrate 40 example/ and awi-esm3-veg-hr-variables/ yamls from /work/ab0246/a270092/software/pycmor/... to $PYCMOR_HOME/...
- absolute paths and ~/... still work; unset env vars produce a clear "Must be a valid file path" validator error early
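The expansion order described above corresponds to two stdlib calls; `$PYCMOR_HOME` is the variable this commit introduces, and the helper name below is hypothetical:

```python
import os


def resolve_script_path(path):
    """Expand $VARS and ~ before validating a script:// path."""
    return os.path.expanduser(os.path.expandvars(path))


os.environ["PYCMOR_HOME"] = "/opt/pycmor"  # illustrative value
resolved = resolve_script_path("$PYCMOR_HOME/examples/custom_steps.py")
# resolved == "/opt/pycmor/examples/custom_steps.py"
```

Note that `os.path.expandvars` leaves unset variables untouched (the literal `$VAR` text remains), which is what lets the downstream "Must be a valid file path" validator catch the misconfiguration early.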
CMIP7 cmorization for AWI-ESM3-VEG-HR
Adds full CMIP7 support targeting AWI-ESM3-VEG-HR, including a native compound-name
architecture that replaces the legacy cmip6-table-based data request lookup.
Key changes
CMIP7 data request
- `DataRequest` from `CMIP7_DReq_metadata` JSON instead of cmip6 tables (compound names such as `ocean.tos.tavg-u-hxy-sea.mon.GLB`)
- `cmip6_table` → `cmip6_cmor_table` in vendored metadata
- `compound_name` matching against `cmip6_compound_name` and `cmip7_compound_name` attributes
- derive `table_id` from the compound name when not set explicitly
- `ValueError` on zero DRV matches (instead of a silent skip)

Pipeline
- `vertical_integrate` custom pipeline step
- `convert()` step removed from `DefaultPipeline`
- `State` objects not being unwrapped to actual results in parallel runs

Standard library
- time bounds (`src/pycmor/std_lib/time_bounds.py`)
- `getattr` + `_pycmor_cfg` fallback
- `global_attributes` to derive `table_id` from CMIP6/CMIP7 compound names

Xarray accessor API
- `StdLibAccessor` with `.process()`

Test infrastructure
- model-run fixtures (`pycmor.fixtures.model_runs`)
- `pycmor.tutorial` dataset system (`xarray.tutorial`-style API)

Misc fixes
- `entry_points()` compatibility
- `pyfesom2` imports guarded for environments without it

Test plan
- `pytest tests/unit/`
- `pycmor process examples/awiesm3-cmip7-minimal.yaml` runs successfully on Levante