Address PR97 review comments + RCMIP3 wire-in#102
Conversation
Addresses the concrete code/test items from #97 review (still open: the four maintainership questions and the back-reporting of input forcings in CICEROSCMPY2): - magicc7: drop the **kwargs __init__ override; move the pymagicc availability check and magicc_scenario_setup init into _init_model where they belong on the new base. - _protocol: drop @runtime_checkable; the structural Protocol stays for type-hint use. Remove the isinstance(adapter, AdapterLike) runtime check in run.py — duck-typing is enough. - base.py: drop the default NotImplementedError from_native_distribution stub from _Adapter; adapters that ship a native bundle (FAIR2, CICEROSCMPY2) already define it themselves. - run.py: raise ValueError when scenarios is None; require IamDataFrame-shaped input (no AttributeError fallback); accept either a dict (values: cfgs list or pre-built adapter) or an iterable of (name, cfgs) tuples and/or adapter instances, freely mixable; require model_name on adapter instances rather than falling back to type().__name__. - ciceroscm_wrapper: trim the distutils-removal comment. - test_modern_adapters: run all 3 mini-bundle members; use the full ssp126/ssp245/ssp370 trio; replace the "0 < 2100 GSAT < 7" smoke assert with a pytest-regressions snapshot of GSAT + total ERF at 1850/1900/2000/2025/2050/2100; expand parametrize ids ("fair2-emissions-driven", etc). - delete tests/unit/adapters/test_{ciceroscmpy2,fair2}.py per request. - README: drop the "Supported climate models and run modes" and "Programmatic API" prose sections (live in code / notebooks). - pyproject: drop fail_under to 0 with a comment to bump back up once the adapter rewrite lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Marit's review of #97 corrected the characterisation of the conc / em files the FaIR2 and CICEROSCMPY2 adapters consume: they are RCMIP-format tabular files, not a CICERO-internal format. This is the naming-only first step of the follow-up to switch both adapters to consuming the canonical RCMIP wide-tables from Zenodo 20430630. - _concentrations_translator.py: rename CICERO_TO_FAIR2_SPECIES to RCMIP_TO_FAIR2_SPECIES; rename _read_cicero_conc_file to _read_rcmip_conc_file; rename local cicero_name to rcmip_name; update module + function docstrings to say RCMIP-format throughout. - fair2_adapter.py: update the ValueError message to say RCMIP-format ``{scen}_conc_*`` instead of CICERO-format. - ciceroscmpy2_adapter.py: update docstrings for ``historical_em_file`` / ``historical_conc_file`` and the _build_hybrid_emissions_data / _build_hybrid_concentrations_data helpers to say RCMIP-format. CICERO-SCM-internal API names (the v1.1.x Fortran adapter, the _OPENSCM_TO_CICERO_CONC map, _cicero_unit_to_pint, cicero_species locals in the species-overlay code) are unchanged: those refer to CICERO-SCM upstream conventions, not file format. No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Shared reader module for the canonical RCMIP Phase 3 wide-table CSVs (Zenodo 20430630 -- concentrations, emissions, forcing). Both adapters will consume this in the follow-up commit so we can drop the per-scenario tab-delimited input path the FaIR2 + CICEROSCMPY2 adapters currently rely on, and the hard-coded ``_DEFAULT_LAND_USE_SCENARIO = "M"`` choice in FaIR2. API: - ``load_rcmip3(path, kind, scenarios=None, variables=None, region="World", model=None)`` returns a filtered wide-format DataFrame in the canonical layout (Model, Scenario, Region, Variable, Unit, Activity_Id, Type, Priority, Mip_Era, Version, 1750, ..., 2500). - Convenience wrappers ``load_rcmip3_concentrations`` / ``_emissions`` / ``_forcings`` for the three kinds. - ``load_rcmip3_albedo_categories(path, category)`` reads the per-category Land Use + Irrigation series from the bundle's upstream ``input_datafiles_generation/data/Forcing_AFOLU_CO2.csv`` and ``Forcing_irrigation_population_scale.csv`` files. Necessary because the canonical forcing CSV only publishes the lumped ``Albedo Change`` row for ssp scenarios; per-component breakdown for ssp inputs comes from the per-category upstream files. Path resolution accepts either the bundle root, the ``RCMIP3_input_datafiles`` subdir, or a direct CSV path. Mini fixture under ``tests/test-data/rcmip3-mini/`` is a 2-scenario x 3-variable x 6-year slice of the real CSVs (no external fetch needed for tests). Unit tests at ``tests/unit/io/test_rcmip3.py`` cover the filter surface, error paths, and metadata-column validation. No adapter wiring yet; that's the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…orcings
Replaces the hard-coded ``_DEFAULT_LAND_USE_SCENARIO = "M"`` choice
in FaIR2's ``_fill_natural_forcings`` (and the equivalent
``rf_luc_file`` single-trajectory path in CICEROSCMPY2) with a
per-scenario lookup against the canonical RCMIP Phase 3 bundle
(Zenodo 20430630). Opt-in via the new ``rcmip3_bundle_path`` cfg
key on either adapter; legacy paths remain as the back-compat
default.
How it works:
- :func:`openscm_runner.io.rcmip3.resolve_scenario_category` maps
the user's scenario name to a CMIP7 ScenarioMIP category
(``VL`` / ``LN`` / ``L`` / ``ML`` / ``M`` / ``H`` / ``HL``). The
defaults cover the published SSP-RCP scenario set (ssp119 -> VL,
ssp126 -> VL, ssp245 -> M, ssp370 -> H, ssp434 -> L, ssp460 ->
ML, ssp534-over -> LN, ssp585 -> HL); callers override via the
``scenario_to_category`` cfg key, and ``scen7-{cat}`` native
CMIP7 names skip the map entirely.
- ``historical`` / ``historical-cmip6`` use the per-component
``Effective Radiative Forcing|Anthropogenic|Albedo Change|{Land
Use,Irrigation}`` rows published directly in
``rcmip_phase3_forcing_v2.0.0.csv``; the per-category lookup is
bypassed.
- Idealised scenarios (``protocol_land_use_forcing ==
"constant_zero"``) keep zeroing out via the existing
suppression mask in both adapters.
- Unknown scenarios log a warning and leave Land use / Irrigation
at zero, with a pointer to ``scenario_to_category``.
FaIR2 surface:
- ``_fill_natural_forcings`` factors the Land use + Irrigation
block out into two helpers: ``_fill_land_use_from_legacy_bundle``
(existing behaviour, kept) and ``_fill_land_use_from_rcmip3``
(new). The choice is driven by the ``rcmip3_bundle_path`` kwarg.
- Both call sites (``_run_native_cfgs`` -> ``_run_one_calibration``
and ``_run_translated_cfgs``) pass the cfg-level
``rcmip3_bundle_path`` / ``scenario_to_category`` through.
- Translated-cfg path validates uniqueness of both kwargs across
cfgs (same rule as ``emissions_bundle``).
CICEROSCMPY2 surface:
- ``_build_scendata_list`` grew a third LUC branch: when
``cfg.get("rcmip3_bundle_path")`` is set, populates the existing
in-memory ``rf_luc_data`` scendata slot from
:func:`_build_rf_luc_data_from_rcmip3` instead of writing
``rf_luc_file``. Idealised (zeros) and legacy (path) branches
unchanged.
- ``_build_rf_luc_data_from_rcmip3`` returns a year-indexed
single-column DataFrame matching the shape upstream's
``DistributionRun`` expects. CICEROSCM has no irrigation channel,
so only Land Use is wired in; Irrigation is FaIR-only.
Solar / Volcanic deferred to a follow-up commit; this PR is
land-use only per Ben + Marit's scope.
Test fixture under ``tests/test-data/rcmip3-mini/`` expanded with
the historical Albedo Change|{Land Use,Irrigation} breakdown rows
so the new code paths are exercised end-to-end without external
fetch. Unit tests in ``tests/unit/io/test_rcmip3.py`` cover the
resolver, the per-category loader, and the historical-breakdown
canonical CSV path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trix Marit's tests-multi-env job (added in 0d9f5b1 / merged via #100) runs two isolated venvs: * env1 -- fair>=1.6,<2 + ciceroscm>=1.1,<2, selects ``-m "faircicero1x_only or magicc"``. * env2 -- fair>=2,<3 + ciceroscm>=2,<3, selects ``-m "(faircicero2_only or magicc) and not faircicero1x_only"``. Unmarked tests are caught by the main ``tests`` job, which currently resolves to legacy majors via the poetry lockfile. The FaIRv2 + CICEROSCMPY2 integration smoke tests need the modern majors, so route the module through env2 via a module-level ``pytestmark``. The existing per-test ``fair2_skip`` / ``cicero_skip`` markers stay so the module still collects (and individually skips) under the main job's legacy resolution. The unit tests in tests/unit/io_tests/test_rcmip3.py stay unmarked -- they're pure pandas-based reader tests with no model dependency, so the main ``tests`` job (which runs everything unmarked) covers them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ings
Extends the ``rcmip3_bundle_path`` cfg key wired in the previous
land-use commit so it also covers Solar + Volcanic effective
radiative forcings, reading per-scenario rows from
``rcmip_phase3_forcing_v2.0.0.csv``
(``Effective Radiative Forcing|Natural|{Solar,Volcanic}``) instead of
broadcasting a single trajectory from the calibration bundle's
``solar_forcing`` / ``volcanic_forcing`` CSVs.
Per-scenario coverage matters here: the canonical CSV's
``historical`` and ``ssp245`` solar rows differ at every overlapping
year (historical uses observations, the SSP-RCP rows use the
SSP-projection extension); the legacy single-trajectory broadcast
loses that distinction.
- ``_fill_natural_forcings``: branches on ``rcmip3_bundle_path``;
delegates to ``_fill_natural_from_legacy_bundle`` (existing
single-CSV-broadcast) or the new ``_fill_natural_from_rcmip3``
(per-scenario CSV lookup with ``natural_mask`` suppression and a
warning-then-zero fallback for scenarios with no canonical row).
- Docstring reorganised: both Solar/Volcanic and Land use/Irrigation
now describe a "canonical RCMIP3" and "legacy bundle" path; the
former two columns of the structure mirror each other now.
CICEROSCMPY2 Solar+Volcanic is **not** in this commit. Two reasons:
1. Upstream ``ciceroscm`` v2 uses ``rf_sun_*`` (not ``rf_solar_*``)
as the canonical solar key; the openscm-runner adapter currently
passes ``rf_solar_file`` which upstream silently ignores in favour
of its default ``solar_IPCC.txt``. This is a pre-existing bug
that's orthogonal to the RCMIP3 wire-in.
2. Upstream's volcanic input expects a 12-column monthly DataFrame
(``rf_volc_n_data`` / ``rf_volc_s_data``); the canonical RCMIP3
CSV is annual. Translating annual -> monthly cleanly needs a
physical choice (broadcast vs. equally distribute) that I'd
rather make in a separate commit after talking with Marit.
Test fixture already covers the historical + ssp245 Solar/Volcanic
rows (added in the land-use commit). New unit tests in
tests/unit/io_tests/test_rcmip3.py verify the per-scenario rows
exist for both natural-forcing components and that the historical /
ssp245 trajectories differ at overlapping years.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…al + fix rf_sun_file key
Two changes bundled because they touch overlapping lines:
(1) Fix pre-existing key-name bug: the adapter has been passing
``rf_solar_file`` to scendata, but upstream ``ciceroscm`` v2.x only
recognises ``rf_sun_file`` / ``rf_sun_data`` in its
``InputHandler.set_sun_volc_luc_defaults`` and ``get_data`` paths.
Anything passed under ``rf_solar_*`` was silently dropped and the
model fell back to its bundled default ``solar_IPCC.txt``. Renamed
the cfg key + scendata key + canonical-file-resolver entry to
``rf_sun_file`` to match upstream. (The same latent bug exists in
cscm-calibrate's own ``run_full_rcmip_protocol.py``; flagged for
Marit.)
(2) Wire Solar + Volcanic into the canonical RCMIP3 path that was
added for Land use in the previous commit. When the
``rcmip3_bundle_path`` cfg key is set, the scendata builder now
populates ``rf_sun_data`` and ``rf_volc_data`` (in-memory year-indexed
DataFrames) from per-scenario
``Effective Radiative Forcing|Natural|{Solar,Volcanic}`` rows in
``rcmip_phase3_forcing_v2.0.0.csv`` instead of pointing at the
calibration-bundle paths. Upstream's
``InputHandler.set_sun_volc_luc_defaults`` propagates the single
``rf_volc_data`` to ``rf_volc_n_data`` / ``rf_volc_s_data``
automatically; the input handler's shape-coercion (1D annual ->
``arr[:, None]`` for volcanic) accepts annual single-column
DataFrames without needing a monthly broadcast.
Legacy file path stays as the back-compat default when
``rcmip3_bundle_path`` is not set; idealised scenarios still go
through upstream's ``sunvolc=0`` suppression as before.
Behaviour change for users who currently pass ``rf_solar_file=...``
to ``CICEROSCMPY2.from_native_distribution(...)``: that key is now
``rf_sun_file=...``. Anything passed under the old key was already
being silently dropped, so no working behaviour is lost; this just
makes the key wire-up explicit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on RCMIP3 path
Closes Zeb's review comment on
``_output_variables.py:158-162`` ("I'd suggest just back-reporting
this anyway, makes life easier for users"). CICERO-SCM v2.x doesn't
expose Solar / Volcanic / Land-use albedo as diagnostic outputs --
the model treats them as forcing inputs read from the bundle. The
adapter now echoes the per-scenario input trajectories back into
the output ScmRun when the canonical RCMIP3 path is active, so
users can request them alongside diagnostics like total ERF or GSAT.
Changes:
- ``_output_variables.RCMIP3_BACK_REPORTABLE_VARIABLES``: new
frozenset of the three back-reportable variable names.
``all_supported()`` includes it (the validator accepts requests
for these names on either path).
- The "input-side forcing" entry in ``_STRUCTURAL_LIMITS`` was
already removed in Marit's earlier cleanup commit; the module
docstring's "What CICERO-SCM v2.x cannot produce" list now drops
Solar/Volcanic/Land use and the "What it supports" list mentions
the back-report.
- ``_run_one_distribution`` splits ``output_variables`` into
upstream-produced + back-reportable. The back-reportable subset
is stripped from the upstream call and synthesised after the
model run from the scendata's ``rf_sun_data`` / ``rf_volc_data``
/ ``rf_luc_data`` trajectories. On the legacy path (without
``rcmip3_bundle_path``) the back-report is skipped with a
warning.
- New helper ``_build_rcmip3_backreport_scmrun`` constructs the
back-report ScmRun with the same metadata schema upstream's
``CSCMParWrapper`` uses (``climate_model``, ``model``, ``run_id``,
``scenario``, ``region``, ``variable``, ``unit``, then year
columns). Returns ``None`` when no scendata trajectory matches
the request (e.g. all-idealised Land-use request).
- Two unit tests in ``tests/unit/io_tests/test_rcmip3.py`` cover
the happy path (10-row shape across 2 scenarios x 2 cfgs x
partial-coverage variables) and the empty-fallback path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nonical RCMIP3
Track A wire-in: both adapters now consume the canonical
``rcmip_phase3_emissions_v2.0.0.csv`` and
``rcmip_phase3_concentrations_v2.0.0.csv`` from the Zenodo 20430630
bundle as the per-scenario baseline when ``rcmip3_bundle_path`` is
set. User ScmRun overlays still win per-species; legacy paths
preserved as fallback.
Shared canonicaliser:
- :func:`openscm_runner.io.canonicalise_rcmip3_variable` translates
canonical RCMIP3 variable names to openscm-runner conventions.
CO2 sub-sectors ("Emissions|CO2|AFOLU",
"Emissions|CO2|Energy and Industrial Processes") get the MAGICC-
style suffix; F-gas / PFC / HFC / Halon / Montreal intermediate
IAMC categories are stripped so the leaf species name ends up
adjacent to the "Emissions|" or "Atmospheric Concentrations|"
prefix. Variables that don't need translation round-trip
unchanged.
FaIR2:
- ``_emissions_translator._rcmip3_to_fair_emissions_df``: new helper
that reads canonical CSV, canonicalises variables, and reshapes to
FaIR's horizontal form (one row per (scenario, FaIR-species)).
- ``build_emissions_df`` grows an ``rcmip3_bundle_path`` kwarg.
When set, the baseline comes from the canonical CSV (per-scenario
rows) instead of the bundle's single ``historical_emissions_*.csv``.
``_splice_bundle_with_user`` grows a ``bundle_per_scenario`` flag
to skip the relabel step when the baseline is already
per-scenario.
- ``_concentrations_translator.build_concentrations_df_from_rcmip3``:
new third route alongside the existing per-scenario tab-delimited
reader and the in-memory ScmRun reader. Reads canonical CSV +
canonicalises + translates leaf species via RCMIP_TO_FAIR2_SPECIES.
- Adapter dispatch: both ``_run_one_calibration`` and
``_run_translated_cfgs`` pass ``rcmip3_bundle_path`` through to
the translators; conc-driven-mode validation accepts
``rcmip3_bundle_path`` as a third way to satisfy the
"need concentrations from somewhere" check.
CICEROSCMPY2:
- ``_merge_rcmip3_canonical_into_user``: new helper that reads
canonical emissions + concentrations CSVs, canonicalises variable
names, and unions them into the user's ScmRun upfront.
Deduplication is per ``(scenario, variable)`` -- user rows win
when both supply the same key. The existing
``_build_hybrid_emissions_data`` and
``_build_hybrid_concentrations_data`` see the merged ScmRun and
work unchanged: their ``cicero_comp_dict`` suffix-matching on
``f"Emissions|{openscm_suffix}"`` now picks up the canonicalised
MAGICC-style names.
- ``_build_scendata_list`` calls the merge at the top when
``rcmip3_bundle_path`` is set; the rest of the per-scenario loop
is untouched.
Tests:
- ``tests/unit/io_tests/test_rcmip3.py`` grows tests for the
canonicaliser (10 parametrised cases covering flat / CO2-sector /
PFC-nested / F-Gases-nested round-trips) and for the FaIR2
canonical-RCMIP3 emissions + concentrations builders against the
mini fixture.
Legacy paths unchanged: callers that don't set ``rcmip3_bundle_path``
keep the existing behaviour (read bundle historical CSV, or read
per-scenario tab-delimited conc files).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Supersedes #101 (which was opened from |
Five categories of failure surfaced when the test matrix actually ran:
1. ``NameError: name 'pd' is not defined`` in
``_build_rcmip3_backreport_scmrun``. pandas was used inside the
helper but the module had only local imports; promote ``import
pandas as pd`` to module level.
2. Pre-existing unit tests in ``tests/unit/test_{run,installation}.py``
pass ``scenarios="not used"`` (a string) on the way to exercising
out_config or adapter-import errors. The strict
``scenarios.get_unique_meta`` validation I added in PR97 ran
first and crashed with ``AttributeError`` before the
intentional error could fire. Move the scenarios validation
AFTER ``_check_out_config`` and the adapter-construction loop so
cfg-shape and missing-package errors keep their original timing.
Scenarios=None still raises ValueError as Marit requested -- just
later in the function.
3. Four dummy-adapter dispatch tests in ``tests/unit/test_run.py``
pass ``scenarios=None`` expecting the legacy "scenarios optional"
contract. Updated to use a new ``_dummy_scenarios()`` helper that
returns a minimal ScmRun (one ``Surface Temperature`` row -- the
variable doesn't start with ``Emissions|`` so
``check_variables_are_as_expected`` no-ops).
4. ``test_modern_adapters.py`` was expanded in the initial cleanup
to ssp126/ssp245/ssp370 per Zeb's "more scenarios" review
comment. The CICERO mini-bundle only ships ssp245 conc files
though, so the FaIRv2 CD path crashed with ``IndexError`` in
``fair.fill_from_pandas`` for ssp126/ssp370 (no matching rows
in the conc_df). Scoped back to ssp245 with a TODO comment;
expanding the mini bundles to cover the full trio is a follow-up.
5. Several "snapshot not found" failures from pytest-regressions
on its first run are expected -- the framework writes the
reference CSVs on the first run and the next CI run will
compare against them. Not a code fix; will commit the snapshots
in a follow-up commit once CI generates them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sts instead Previous fix moved the scenarios check after adapter construction so ``test_no_fair`` and ``test_no_pymagicc`` (which pass scenarios as a literal string) would reach the missing-package ImportError. That broke ``test_run_rejects_unknown_emissions_variables`` which uses an unregistered ``"model_a"`` cfg and expects the bad-emissions ``ValueError`` from scenarios validation to fire BEFORE the ``NotImplementedError`` from get_adapter. Inverting the trade-off: the correct order is ``cfg-shape -> scenarios -> adapter construction`` so user-input errors are surfaced before package-import errors. ``test_no_fair`` / ``test_no_pymagicc`` / ``test_run_out_config_conflict_error`` updated to use a minimal ScmRun (Surface Temperature row -- bypasses ``check_variables_are_as_expected`` since the variable name doesn't start with ``Emissions|``) so the rest of the test runs as intended. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…carbon-cycle fix
Marit raised this in a chat after the original PR97 round: openscm-
runner shouldn't be monkey-patching upstream ciceroscm internals when
we (collectively) maintain ciceroscm too. The patches in
``_upstream_patches.py`` worked around the
``"carbon_cycle_outputs" in cfg`` presence-check bug in
``concentrations_emissions_handler.py`` of ciceroscm 2.0.1 / 2.1.0.
Upstream fixed it to ``cfg.get("carbon_cycle_outputs")`` in 2.1.1.
Our monkey-patches are now wrapping code that no longer exists --
at best a no-op, at worst a maintenance liability that will silently
break on the next upstream API drift.
- Delete ``_upstream_patches.py`` and the ``_apply()`` call from the
package ``__init__``.
- Move ``_CARBON_CYCLE_VARIABLES`` and ``output_vars_need_carbon_cycle``
to ``_output_variables.py`` -- they're still useful for the INFO
log when users request carbon-cycle variables (the 30-50x slowdown
is real regardless of the patches).
- Add ``require_modern_ciceroscm()`` in ``_compat.py`` that raises a
clear ``ImportError`` if the installed ciceroscm is below 2.1.1.
Used by ``CICEROSCMPY2._init_model`` in place of the previous
major-version-only check.
- Add ``_ciceroscm_version_tuple()`` for fine-grained checks (the
integration test's ``cicero_skip`` marker now uses this so it
collects-but-skips on 2.0.1 / 2.1.0).
- Bump the env2 install pin in ``.github/workflows/ci.yaml`` from
``ciceroscm>=2,<3`` to ``ciceroscm>=2.1.1,<3``. The pyproject pin
stays at ``>=1.1,<3`` because the ``[ciceroscmpy]`` extra still
needs the v1.1.x major; runtime enforcement is the cleaner split.
The deletion is intentionally larger than the additions:
the carbon-cycle fix lives upstream now, and that's where it
should stay.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o/forcing paths
After Marit's PR97 review (and a follow-up chat about ducked-error
patterns), the additive "legacy paths preserved alongside canonical
RCMIP3" approach from earlier commits became dead weight. This commit
deletes the legacy paths and makes ``rcmip3_bundle_path`` mandatory
for both modern adapters.
API shift on ``from_native_distribution`` -- now takes two explicit
required args:
FAIR2.from_native_distribution(calibration_dir, rcmip3_bundle_path)
CICEROSCMPY2.from_native_distribution(calibration_dir, rcmip3_bundle_path)
``calibration_dir`` carries model-specific calibration only (parameter
posterior, species_configs / gaspam, natural-emissions baselines,
tuning scale factors). ``rcmip3_bundle_path`` carries the canonical
Zenodo 20430630 protocol inputs. The two are different things and
shouldn't be conflated; the previous ``native_distribution_path``
single-arg API + ``**cfg_overrides`` was conflating them. The new
signature has no ``**cfg_overrides`` -- file-path overrides for
protocol files are gone because the canonical CSV is the source of
truth, and the surviving non-path knobs (``member_indices``, ``mode``,
``stochastic_run``, ``max_workers``) are explicit kwargs.
Code deletions:
- ``_concentrations_translator.build_concentrations_df`` (per-scenario
tab-delimited reader), ``_read_rcmip_conc_file``, ``_pick_conc_file``.
- ``_emissions_translator``: ``bundle_emissions_csv`` arg on
``build_emissions_df`` dropped; ``_splice_bundle_with_user``'s
``bundle_per_scenario`` flag dropped (always true now).
- ``fair2_adapter._fill_natural_from_legacy_bundle``,
``_fill_land_use_from_legacy_bundle``, ``_DEFAULT_LAND_USE_SCENARIO``,
``fair2_conc_bundle_dir`` cfg key, the ``f.fill_from_rcmip()``
fallback in translated-cfg mode, the single-trajectory ``_write``
helper inside ``_fill_natural_forcings``.
- ``ciceroscmpy2_adapter`` ``_DEFAULT_CANONICAL_FILES``: drops
``rf_sun_file`` / ``rf_volc_file`` / ``rf_luc_file``. Scendata
builder always populates ``rf_sun_data`` / ``rf_volc_data`` /
``rf_luc_data`` from the canonical RCMIP3 path; the previous
three-branch (idealised / canonical / legacy) logic collapses to
two (idealised / canonical).
- ``_native_calibration._BUNDLE_FILES``: drops
``historical_emissions``, ``solar_forcing``, ``volcanic_forcing``,
``land_use_forcing``, ``irrigation_forcing``. Calibration bundle
now carries calibration-only files.
Test fixture cleanup:
- ``tests/test-data/fair2-mini-bundle/``: deleted
``historical_emissions_1750-2023_cmip7.csv``,
``solar_forcing_timebounds_cmip7.csv``,
``volcanic_forcing_timebounds_cmip7.csv``,
``land_use_forcing_timebounds_cmip7.csv``,
``irrigation_forcing_timebounds_cmip7.csv``. Calibration files
(parameter posterior, species_configs, scale factors) stay.
- ``tests/test-data/ciceroscm-mini-bundle/``: deleted
``solar_RCMIP_ssp245_RCMIP3.txt``,
``VOLC_RCMIP_ssp245_RCMIP3.txt``,
``LUCalbedo_RCMIP_ssp245_RCMIP3.txt``. Renamed
``ssp245_{em,conc}_*.txt`` to ``historical_{em,conc}_*.txt`` to
match the canonical resolver expectation (those files anchor
the CICERO species column structure + units; the actual scenario
time series come from the canonical RCMIP3 merge).
- ``tests/integration/test_modern_adapters.py`` factory functions
use the new two-arg ``from_native_distribution`` and drop the
``_CICERO_SSP245_OVERRIDES`` cfg-override dict.
Net: -7163 / +163 LOC across these files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… on canonical-bundle misses
Marit and Zeb both flagged the cluster of "warn and fall back to zero"
patterns introduced during the canonical-RCMIP3 wire-in: missing
scenario rows, unmapped scenario->category, unconvertible units. Each
one masks a real bug in the input (or in the bundle) while letting
the run produce plausible-but-wrong output -- the "AI cover all your
bases" objection.
Convert these to loud failures now that the canonical-only policy
makes "no fallback" the contract:
FaIR2:
- ``_fill_natural_from_rcmip3``: missing Solar/Volcanic row raises
KeyError instead of warning and zeroing the species.
- ``_fill_land_use_from_rcmip3``: ``resolve_scenario_category``
KeyError propagates (caller must add an override); missing
historical Albedo Change rows raise.
- ``_emissions_translator._unit_scale``: drop the broad
``except Exception -> None`` and the caller's
"skip-this-species" warning; let pint's
UndefinedUnitError / DimensionalityError surface.
CICEROSCMPY2:
- ``_build_natural_data_from_rcmip3`` /
``_build_rf_luc_data_from_rcmip3``: same canonical-row raises as
FaIR2; ``resolve_scenario_category`` KeyError propagates.
- ``_run_one_distribution``: requesting back-reportable variables
without ``rcmip3_bundle_path`` set now raises ValueError instead
of silently dropping them from the output.
- ``_build_hybrid_emissions_data``: drop the broad
``except Exception -> skipped_unmapped`` on unit conversion;
fail loudly if a species mapping has a bad target unit or the
user ScmRun ships an unparseable unit.
Also refresh the CICEROSCMPY2 module docstring's cfg-key reference
section -- it still listed the ``rf_sun_file`` /
``rf_volc_file`` / ``rf_luc_file`` / ``rf_luc_constant_zero_file``
keys removed in the previous commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pts the frame
The first env2 CI run on the canonical-only branch failed both
FaIRv2 smoke tests with::
fair.exceptions.NonMonotonicError: Time values in the emissions
file must be strictly monotonically increasing.
Root cause is in ``_splice_bundle_with_user``: the bundle frame may
not cover every year the user supplies (the mini-bundle fixture
trims to a 6-year subset; the full canonical CSV covers 1750-2500
yearly but a user can in principle drop in extra years too). When
the user introduces a year the bundle did not have, the assignment
``spliced_df.loc[mask, year] = value * scale`` (or the
``pd.concat`` path for unmapped-species rows) parks the new column
at the right edge in user-row order -- which is typically not
monotonic relative to the bundle tail (mini-bundle ends at 2100,
user adds 2015 next).
Fix: after the ffill, reindex the frame columns to
``meta_cols + sorted(year_cols)`` so year columns are strictly
monotonic before the frame leaves the translator. Mirrors the same
expectation FaIR's ``_check_csv`` enforces a few lines later.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sions-mode species
The first env2 CI run on the canonical-only branch failed both
FaIRv2 smoke tests at ``fill_from_pandas`` with::
IndexError: index 0 is out of bounds for axis 0 with size 0
Root cause: the rcmip3-mini fixture ships only 3 emissions species
(CO2 FFI + CO2 AFOLU + CH4 for historical + ssp245). FaIR 2.x's
AR6-calibrated species_configs has 51 species with
``input_mode == "emissions"``. When ``fill_from_pandas`` iterates
the full species list, every species we did not supply hits the
empty-row ``.values[0]`` at fair.io.fill_from:219.
In CD mode only CO2 / CH4 / N2O flip to ``concentration``; the
other 48 species still need an emissions row. So both ED and CD
need the full 51-species coverage.
Fix: regenerate the mini fixture's emissions CSV with placeholder
0-emissions rows for every other FaIR species, using the canonical
RCMIP variable names (``Emissions|<species>`` for everything except
CO2 sub-sectors which keep ``Emissions|CO2|AFOLU`` /
``Emissions|CO2|Energy and Industrial Processes``). Values for
CH4 + CO2 FFI + CO2 AFOLU are preserved verbatim; the added
species use 0 across all years (mini fixture is for shape /
smoke only, not scientific reproduction). Historical rows leave
2050/2100 empty to match the existing ssp-vs-historical pattern.
The fixture-shape unit test
``test_fair2_emissions_canonical_path_translates_variables`` got
its hard species-set assertion replaced with a
``.issubset`` check covering the sub-sector + intermediate-category
representatives the translator actually has to handle, so it
remains a meaningful regression on the canonicalisation logic
without re-asserting the exact (now-expanded) fixture content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… to pint errors
The commit 3 broad-except removal caused the cicero env2 smokes to
fail with::
pint.errors.UndefinedUnitError: 'GgH1211' is not defined in
the unit registry
Root cause: ``_cicero_unit_to_pint`` reproduces the v1.1.x adapter
gaspam convention of concatenating prefix + species name
(``Gg`` + ``H1211`` -> ``GgH1211``) for the unit string. Most
species this round-trips through openscm-units fine, but Halons
are recorded in the gaspam as ``H-1211`` / ``H-1301`` / ``H-2402``
rather than ``Halon-1211`` etc., so the concatenated form is not
in openscm-units' registry. ``GgHalon1211`` would be, but the
gaspam-naming round-trip would need to happen in the
``_cicero_unit_to_pint`` helper, not in this loop.
The "honest errors" stance from commit 3 was directed at the
canonical-RCMIP3 path -- silently zeroing scientific data when a
bundle row is missing was Marit's concern. The legacy hybrid
emissions overlay is a different situation: the user-side mapping
is best-effort, the species' baseline trajectory stays in place
(not zeros), and the convention pre-dates this PR. Narrow the
except to ``UndefinedUnitError`` / ``DimensionalityError`` only so
real bugs still raise, but the Halon naming quirk doesn't break
every cicero smoke run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…test_modern_adapters After the canonical-RCMIP3 wire-in (commit 3a07afb) and subsequent fixture / error-handling fixes, all four ``test_adapter_smoke`` parameterisations run end-to-end through their respective climate models. The only remaining failure is the pytest-regressions first-run "File not found in data directory, created:" -- the snapshot CSV gets written but the test fails so the runner can notice and commit it. CI has no way to surface that CSV at the moment, so we cannot baseline the snapshots without re-running the env2 stack somewhere we control. Add an ``actions/upload-artifact@v4`` step gated on ``if: failure()`` that captures every CSV under ``tests/integration/test_modern_adapters/``. After this lands the next CI run will publish four CSVs as a workflow artifact; those get committed in a follow-up and the smoke tests turn green. ``if-no-files-found: ignore`` so non-pytest-regressions failures don't trip an extra error on top. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ts from first CI run CSVs generated by the env2 job on commit 0a33faf and pulled via the artifact upload step added in that commit. Identical across py3.10 and py3.11 (deterministic per-member runs), so one set suffices. Covers all four parameterised cases: - test_adapter_smoke_fair2_emissions_driven_.csv - test_adapter_smoke_fair2_concentration_driven_.csv - test_adapter_smoke_cicero_emissions_driven_.csv - test_adapter_smoke_cicero_concentration_driven_.csv After this lands the env2 CI step turns green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR follows up on #97 by tightening the openscm_runner.run.run() input surface, wiring FaIRv2 + CICERO-SCM-PY2 to consume canonical RCMIP3 (Zenodo 20430630) wide-table inputs, and reshaping tests/CI to exercise the modern adapters via a dedicated environment.
Changes:
- Add a new
openscm_runner.io.rcmip3module for loading/filtering canonical RCMIP3 wide-table CSVs (plus a mini in-repo fixture) and use it in FaIRv2 emissions/concentrations pathways. - Update
run.run()to accept dict or iterable inputs (including pre-constructed adapters) and enforcescenariosas required (ScmRun-like) for variable validation. - Refresh integration/unit tests and CI env2 (fair>=2, ciceroscm>=2.1.1) to validate the canonical-path wiring with regression snapshots.
Reviewed changes
Copilot reviewed 44 out of 46 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_run.py | Updates unit tests to satisfy the new scenarios validation contract. |
| tests/unit/test_installation.py | Adds minimal ScmRun helper so missing-package tests fail at adapter import, not scenario validation. |
| tests/unit/io_tests/test_rcmip3.py | New unit tests for RCMIP3 reader, canonicaliser, resolver, and adapter helper behavior. |
| tests/unit/adapters/test_fair2.py | Removes older FaIRv2 unit-test module (coverage shifts to new tests/integration). |
| tests/unit/adapters/test_ciceroscmpy2.py | Removes older CICEROSCMPY2 unit-test module (coverage shifts to new tests/integration). |
| tests/test-data/rcmip3-mini/rcmip_phase3_forcing_v2.0.0.csv | Adds mini canonical forcing CSV fixture. |
| tests/test-data/rcmip3-mini/rcmip_phase3_emissions_v2.0.0.csv | Adds mini canonical emissions CSV fixture. |
| tests/test-data/rcmip3-mini/rcmip_phase3_concentrations_v2.0.0.csv | Adds mini canonical concentrations CSV fixture. |
| tests/test-data/rcmip3-mini/input_datafiles_generation/data/Forcing_irrigation_population_scale.csv | Adds per-category irrigation split fixture for albedo breakdown. |
| tests/test-data/rcmip3-mini/input_datafiles_generation/data/Forcing_AFOLU_CO2.csv | Adds per-category land-use albedo split fixture. |
| tests/test-data/fair2-mini-bundle/volcanic_forcing_timebounds_cmip7.csv | Removes legacy bundle volcanic forcing CSV (now sourced via RCMIP3). |
| tests/test-data/ciceroscm-mini-bundle/solar_RCMIP_ssp245_RCMIP3.txt | Removes legacy per-scenario solar forcing file fixture. |
| tests/test-data/ciceroscm-mini-bundle/LUCalbedo_RCMIP_ssp245_RCMIP3.txt | Removes legacy per-scenario land-use albedo file fixture. |
| tests/integration/test_modern_adapters/test_adapter_smoke_fair2_emissions_driven_.csv | New pytest-regressions snapshot for FaIRv2 emissions-driven smoke. |
| tests/integration/test_modern_adapters/test_adapter_smoke_fair2_concentration_driven_.csv | New pytest-regressions snapshot for FaIRv2 concentration-driven smoke. |
| tests/integration/test_modern_adapters/test_adapter_smoke_cicero_emissions_driven_.csv | New pytest-regressions snapshot for CICEROSCMPY2 emissions-driven smoke. |
| tests/integration/test_modern_adapters/test_adapter_smoke_cicero_concentration_driven_.csv | New pytest-regressions snapshot for CICEROSCMPY2 concentration-driven smoke. |
| tests/integration/test_modern_adapters.py | Reworks modern-adapter integration testing into regression snapshots and routes via env2 marker. |
| src/openscm_runner/run.py | Supports mixed input shapes (dict/iterable + adapters), normalises entries, and enforces scenarios non-None. |
| src/openscm_runner/io/rcmip3.py | New canonical RCMIP3 wide-table reader + variable canonicaliser + scenario-category resolver. |
| src/openscm_runner/io/init.py | Exposes RCMIP3 I/O utilities from the openscm_runner.io package. |
| src/openscm_runner/adapters/magicc7/magicc7.py | Moves pymagicc availability check/setup into _init_model. |
| src/openscm_runner/adapters/fair2_adapter/_native_calibration.py | Removes scenario inputs/forcings from the FaIR calibration bundle contract (now RCMIP3-driven). |
| src/openscm_runner/adapters/fair2_adapter/_emissions_translator.py | Reads baseline emissions from canonical RCMIP3 and tightens unit-conversion behavior. |
| src/openscm_runner/adapters/fair2_adapter/_concentrations_translator.py | Adds canonical-RCMIP3 concentration builder and removes legacy conc-file reader path. |
| src/openscm_runner/adapters/ciceroscm_py2_adapter/_upstream_patches.py | Deletes now-unneeded upstream monkey-patches (requires ciceroscm>=2.1.1 instead). |
| src/openscm_runner/adapters/ciceroscm_py2_adapter/_output_variables.py | Documents/validates supported outputs and adds back-reportable RCMIP3 forcing variables. |
| src/openscm_runner/adapters/ciceroscm_py2_adapter/_compat.py | Adds version parsing + enforces ciceroscm>=2.1.1. |
| src/openscm_runner/adapters/ciceroscm_py2_adapter/init.py | Updates package docs to reflect the ciceroscm>=2.1.1 floor and removed patches. |
| src/openscm_runner/adapters/ciceroscm_py_adapter/ciceroscmpy.py | Removes redundant __init__ passthrough. |
| src/openscm_runner/adapters/ciceroscm_adapter/ciceroscm.py | Minor exception/message tweak + security linter annotation change. |
| src/openscm_runner/adapters/ciceroscm_adapter/ciceroscm_wrapper.py | Shortens distutils migration comment in tempdir setup. |
| src/openscm_runner/adapters/base.py | Removes unused TYPE_CHECKING import and deletes base from_native_distribution. |
| src/openscm_runner/adapters/_protocol.py | Drops @runtime_checkable from AdapterLike protocol. |
| README.md | Removes programmatic API and supported-models prose blocks. |
| pyproject.toml | Adjusts coverage gate (fail_under=0) and documents mutual-exclusion constraints + ciceroscm floor. |
| .github/workflows/ci.yaml | Pins env2 ciceroscm>=2.1.1 and uploads regression snapshots on failure. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| scenarios : :obj:`pyam.IamDataFrame` | ||
| Scenarios to run. | ||
|
|
| @@ -115,71 +140,46 @@ def run( | |||
| ``out_config`` has keys which are not in ``climate_models_cfgs``. | |||
|
|
|||
| def _normalise_entries(climate_models_cfgs): | ||
| """Coerce dict or iterable input into ``[(name, value), ...]`` pairs.""" | ||
| if isinstance(climate_models_cfgs, Mapping): | ||
| return list(climate_models_cfgs.items()) | ||
| entries = [] | ||
| for item in climate_models_cfgs: | ||
| if _looks_like_adapter(item): | ||
| name = getattr(item, "model_name", None) | ||
| if name is None: | ||
| raise TypeError( | ||
| "Adapter instance is missing a `model_name` attribute: " | ||
| f"{type(item).__name__}." | ||
| ) | ||
| entries.append((name, item)) | ||
| else: | ||
| name, value = item # expect a (name, cfgs) tuple | ||
| entries.append((name, value)) | ||
| return entries |
| def output_vars_need_carbon_cycle(output_variables) -> bool: | ||
| """Return ``True`` if any requested variable needs the carbon cycle.""" | ||
| return bool(set(output_variables) & _CARBON_CYCLE_VARIABLES) | ||
| """Forcing inputs the adapter back-reports on the canonical RCMIP3 path. | ||
|
|
||
| CICERO-SCM v2.x treats Solar / Volcanic / Land-use albedo as input | ||
| forcings (read from the bundle's per-scenario forcing files) rather | ||
| than diagnostic outputs. The model neither computes nor exposes them | ||
| through :data:`SUPPORTED_VARIABLES`. The adapter back-reports them by | ||
| echoing the per-scenario trajectories it already has in scope when | ||
| the ``rcmip3_bundle_path`` cfg key is set (see | ||
| :func:`_build_natural_data_from_rcmip3` and | ||
| :func:`_build_rf_luc_data_from_rcmip3` in the adapter module). On | ||
| the legacy bundle path the adapter logs a warning and omits the | ||
| back-report from the output.""" |
| Returns a DataFrame in the same horizontal layout that | ||
| :func:`_scmrun_to_fair2_rows` produces (one row per | ||
| ``(scenario, FaIR-species)``, integer year columns), keyed by user | ||
| scenario rather than a single bundle ``"historical"`` row -- so | ||
| :func:`_splice_bundle_with_user` must be called with | ||
| ``bundle_per_scenario=True``. | ||
|
|
| bundle_unit = spliced_df.loc[mask, "unit"].iloc[0] | ||
| user_unit = user_row.get("unit") | ||
| scale = _unit_scale(user_unit, bundle_unit, user_row["variable"]) | ||
| if scale is None: | ||
| LOGGER.warning( | ||
| "Could not convert %r emissions from user unit %r to " | ||
| "bundle unit %r; user data NOT overlaid for this species, " | ||
| "bundle values left in place.", | ||
| user_row["variable"], | ||
| user_unit, | ||
| bundle_unit, | ||
| ) | ||
| continue | ||
|
|
… + stricter _normalise_entries)
Copilot left six inline review comments on the PR. None are bugs;
five are docstring/comment drift from the recent strict-canonical
and error-tightening commits, and one is a minor input-validation
hardening.
run.py
- The ``scenarios`` parameter is documented as a
``pyam.IamDataFrame``, but the function now calls
``scenarios.get_unique_meta("variable")`` -- an ``ScmRun``
method. Update the docstring to require an ``ScmRun`` and tell
legacy ``IamDataFrame`` users to wrap their input.
- The ``Raises`` section claimed ``KeyError`` for unknown
``out_config`` keys, but ``_check_out_config`` only warns. Move
the warning behaviour into a ``Notes`` section and drop the
``KeyError`` claim.
- ``_normalise_entries`` would happily iterate over a stray
string (``"magicc7"``) and then fail with a low-signal
``ValueError: too many values to unpack``. Reject ``str`` /
``bytes`` explicitly at the top level and per-entry, with a
clear ``TypeError``. Also wrap the per-entry unpack in a
``try/except`` so non-pair entries also produce a clear
message.
ciceroscm_py2_adapter/_output_variables.py
- There was a stray triple-quoted string literal floating between
``output_vars_need_carbon_cycle`` and the next module-level
block. As a standalone expression it has no effect (and lints
flag it as a pointless statement). Move the prose to a ``#``
comment block above the ``RCMIP3_BACK_REPORTABLE_VARIABLES``
definition it was meant to document, and refresh the wording
since the "legacy bundle path" path it referenced was deleted
in commit 3a07afb.
fair2_adapter/_emissions_translator.py
- The ``_rcmip3_to_fair_emissions_df`` docstring referenced a
``bundle_per_scenario=True`` parameter that the
``_splice_bundle_with_user`` signature never had (and certainly
doesn't now). Rewrite the relevant paragraph to describe what
the current contract actually is.
- The overlay block above ``_unit_scale`` still claimed that
conversion failures get logged and the row skipped, but the
commit 3 tightening made ``_unit_scale`` raise. Update the
comment so future readers don't re-introduce the silent-skip.
Unit tests still pass (61 passed) -- no behaviour change for
existing call sites since ``_normalise_entries`` only narrows
already-invalid inputs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
40398dc
into
feat/fair2-ciceroscmpy2-adapters-and-runmode-nonfork
Draft. Stacked on top of #97 to address @znicholls's review comments, plus a Track A canonical-RCMIP3 wire-in that came out of the back-and-forth on the file-format question. Discussion history (with @maritsandstad) lived on the fork at benmsanderson/openscm-runner#15 — that PR description has the long-form context.
Opening as a draft so reviewers can see the shape before we settle the remaining maintainership wording.
How the commits map to #97 review comments
Initial cleanup pass (commit
9ad17c3) — closes 13 of 21magicc7.py:55(__init__kwargs) — reverted; pymagicc check + setup moved into_init_model._protocol.py:32(don't use, it's slow and frail) —@runtime_checkabledropped, isinstance check removed fromrun.py.base.py:155(from_native_distributionon base) — removed from base.run.py:131(scenarios=Noneshould raise) — now raisesValueError.run.py:128(onlyIamDataFrame) — droppedAttributeErrorfallback.run.py:76(mix the two forms) — accepts dict or iterable, freely mixable per-entry.run.py:181(getattr(adapter, "model_name", None)) — applied; notype().__name__fallback.ciceroscm_wrapper.py:156(shorter distutils comment) — done.test_modern_adapters.py:62(all 3 members) — done.test_modern_adapters.py:149(writepytest.paramids fully) — done.test_modern_adapters.py:155(pytest-regressions snapshot) — GSAT + total ERF at 1850/1900/2000/2025/2050/2100 across ssp126/245/370.test_{ciceroscmpy2,fair2}.py(delete) — deleted.README.mdprose blocks — removed.Coverage
fail_underdropped to 0 with a comment per @znicholls's followup.Naming hygiene (commit
7e2cc7b)CICERO_TO_FAIR2_SPECIES→RCMIP_TO_FAIR2_SPECIES;_read_cicero_conc_file→_read_rcmip_conc_file; docstring fixes throughout_concentrations_translator.pyand adjacent files. Pure rename — addresses @maritsandstad's point that the adapter doesn't read CICERO-internal format anywhere; the files it consumes are RCMIP-shaped tabular files and the code was mislabelled.CI matrix marker (commit
fe2bd70)test_modern_adapters.pymarkedfaircicero2_onlyso it routes through env2 of @maritsandstad'stests-multi-envjob (modern-major venv).Bug fix:
rf_sun_file(commited94023, bundled with Solar/Volcanic wire-in)While wiring up the Solar/Volcanic side of the RCMIP3 work, I verified the upstream
ciceroscmv2.1.2 API and found that the adapter was passingrf_solar_filein scendata but upstream'sInputHandleronly recognisesrf_sun_file/rf_sun_data. Anything passed asrf_solar_*was silently dropped and the model fell back to the defaultsolar_IPCC.txt. Renamed throughout (cfg key, scendata key, canonical-file resolver, test override).Note: same latent bug exists in cscm-calibrate's own
run_full_rcmip_protocol.py— flagged to @maritsandstad separately. The published RCMIP3 reference runs may have been using the default solar trajectory instead of the per-scenariosolar_RCMIP_*_RCMIP3.txt.Back-report Solar / Volcanic / Land use (commit
bb39feb)Closes @znicholls's
_output_variables.py:158-162comment. When the canonical RCMIP3 path is active, the adapter back-reports per-scenario Solar / Volcanic / Land use ERF trajectories into the output ScmRun. On the legacy path the request logs a warning and skips the back-report.Canonical RCMIP3 wire-in (Track A — out of strict scope but felt right)
Both
Albedo Change|{Land Use,Irrigation}on @znicholls's hard-coded-irrigation comment and the conc-file-format on @maritsandstad's review point at the same underlying issue: the adapter shouldn't be locked to per-bundle file conventions when a canonical published source exists. Wired both adapters to consume the canonical Zenodo 20430630 wide-tables when the newrcmip3_bundle_pathcfg key is set. Legacy paths stay as fallback.Three axes, three commits (
c047e1c,c5787b0,6affa5c) staged for review clarity:_fill_land_use_from_rcmip3reads per-CMIP7-category Land Use + Irrigation frominput_datafiles_generation/data/;historicaluses the published per-component breakdown._build_rf_luc_data_from_rcmip3populatesrf_luc_datain-memory scendata (CICERO has no irrigation channel)._fill_natural_from_rcmip3reads per-scenarioEffective Radiative Forcing|Natural|{Solar,Volcanic}fromrcmip_phase3_forcing_v2.0.0.csv._build_natural_data_from_rcmip3populatesrf_sun_data+rf_volc_datain-memory; upstream'sInputHandlershape-coercion handles annual → multi-axis._rcmip3_to_fair_emissions_df+build_concentrations_df_from_rcmip3consumercmip_phase3_{emissions,concentrations}_v2.0.0.csv; existing user-ScmRun overlay still wins per-species._merge_rcmip3_canonical_into_userunions canonical rows into user's ScmRun upfront; existing_build_hybrid_*overlay code is untouched.Scenario→CMIP7-category resolver covers the SSP-RCP set with sensible defaults (
ssp119/126→VL,ssp245→M,ssp370→H,ssp434→L,ssp460→ML,ssp534-over→LN,ssp585→HL) and accepts per-cfgscenario_to_categoryoverrides. Nativescen7-{cat}names pass through directly.historical/historical-cmip6use the published per-component breakdowns.canonicalise_rcmip3_variablehandles the IAMC-naming differences (CO2 sub-sectors → MAGICC-style, intermediate|F-Gases|HFC|,|PFC|categories stripped) so the existing per-adapter species translation maps work without changes.What's deliberately not done
irrigation_forcing_timebounds_cmip7.csv"M" hard-coding is gone on the canonical path. Adapter resolves per scenario, with override.build_concentrations_df_from_scmruncovers it; the new Track A path is just an additional baseline source.Test coverage
tests/unit/io_tests/test_rcmip3.py(29 tests).tests/test-data/rcmip3-mini/is a 2-scenario × 3-variable slice of the real Zenodo CSVs, runs without external fetch.tests/integration/test_modern_adapters.pycovers FaIR2 + CICEROSCMPY2 ED/CD across ssp126/245/370 with a pytest-regressions snapshot; markedfaircicero2_onlyfor @maritsandstad'stests-multi-envenv2.rcmip3_bundle_path.🤖 Generated with Claude Code