Address PR97 review comments + RCMIP3 wire-in by benmsanderson · Pull Request #101 · openscm/openscm-runner

benmsanderson · 2026-06-04T17:59:58Z

Draft. Stacked on top of #97 to address @znicholls's review comments, plus a Track A canonical-RCMIP3 wire-in that came out of the back-and-forth on the file-format question. Discussion history (with @maritsandstad) lived on the fork at benmsanderson/openscm-runner#15 — that PR description has the long-form context.

Opening as a draft so reviewers can see the shape before we settle the remaining maintainership wording.

How the commits map to #97 review comments

Initial cleanup pass (commit `9ad17c3`) — closes 13 of 21

magicc7.py:55 (__init__ kwargs) — reverted; pymagicc check + setup moved into _init_model.
_protocol.py:32 (don't use, it's slow and frail) — @runtime_checkable dropped, isinstance check removed from run.py.
base.py:155 (from_native_distribution on base) — removed from base.
run.py:131 (scenarios=None should raise) — now raises ValueError.
run.py:128 (only IamDataFrame) — dropped AttributeError fallback.
run.py:76 (mix the two forms) — accepts dict or iterable, freely mixable per-entry.
run.py:181 (getattr(adapter, "model_name", None)) — applied; no type().__name__ fallback.
ciceroscm_wrapper.py:156 (shorter distutils comment) — done.
test_modern_adapters.py:62 (all 3 members) — done.
test_modern_adapters.py:149 (write pytest.param ids fully) — done.
test_modern_adapters.py:155 (pytest-regressions snapshot) — GSAT + total ERF at 1850/1900/2000/2025/2050/2100 across ssp126/245/370.
test_{ciceroscmpy2,fair2}.py (delete) — deleted.
README.md prose blocks — removed.

Coverage fail_under dropped to 0 with a comment per @znicholls's followup.

Naming hygiene (commit `7e2cc7b`)

CICERO_TO_FAIR2_SPECIES → RCMIP_TO_FAIR2_SPECIES; _read_cicero_conc_file → _read_rcmip_conc_file; docstring fixes throughout _concentrations_translator.py and adjacent files. Pure rename — addresses @maritsandstad's point that the adapter doesn't read CICERO-internal format anywhere; the files it consumes are RCMIP-shaped tabular files and the code was mislabelled.

CI matrix marker (commit `fe2bd70`)

test_modern_adapters.py marked faircicero2_only so it routes through env2 of @maritsandstad's tests-multi-env job (modern-major venv).

Bug fix: `rf_sun_file` (commit `ed94023`, bundled with Solar/Volcanic wire-in)

While wiring up the Solar/Volcanic side of the RCMIP3 work, I verified the upstream ciceroscm v2.1.2 API and found that the adapter was passing rf_solar_file in scendata but upstream's InputHandler only recognises rf_sun_file / rf_sun_data. Anything passed as rf_solar_* was silently dropped and the model fell back to the default solar_IPCC.txt. Renamed throughout (cfg key, scendata key, canonical-file resolver, test override).

Note: same latent bug exists in cscm-calibrate's own run_full_rcmip_protocol.py — flagged to @maritsandstad separately. The published RCMIP3 reference runs may have been using the default solar trajectory instead of the per-scenario solar_RCMIP_*_RCMIP3.txt.

Back-report Solar / Volcanic / Land use (commit `bb39feb`)

Closes @znicholls's _output_variables.py:158-162 comment. When the canonical RCMIP3 path is active, the adapter back-reports per-scenario Solar / Volcanic / Land use ERF trajectories into the output ScmRun. On the legacy path the request logs a warning and skips the back-report.

Canonical RCMIP3 wire-in (Track A — out of strict scope but felt right)

Both Albedo Change|{Land Use,Irrigation} on @znicholls's hard-coded-irrigation comment and the conc-file-format on @maritsandstad's review point at the same underlying issue: the adapter shouldn't be locked to per-bundle file conventions when a canonical published source exists. Wired both adapters to consume the canonical Zenodo 20430630 wide-tables when the new rcmip3_bundle_path cfg key is set. Legacy paths stay as fallback.

Three axes, three commits (c047e1c, c5787b0, 6affa5c) staged for review clarity:

Axis	FaIR2	CICEROSCMPY2
Land use + Irrigation forcings	`_fill_land_use_from_rcmip3` reads per-CMIP7-category Land Use + Irrigation from `input_datafiles_generation/data/`; `historical` uses the published per-component breakdown.	`_build_rf_luc_data_from_rcmip3` populates `rf_luc_data` in-memory scendata (CICERO has no irrigation channel).
Solar / Volcanic forcings	`_fill_natural_from_rcmip3` reads per-scenario `Effective Radiative Forcing\|Natural\|{Solar,Volcanic}` from `rcmip_phase3_forcing_v2.0.0.csv`.	`_build_natural_data_from_rcmip3` populates `rf_sun_data` + `rf_volc_data` in-memory; upstream's `InputHandler` shape-coercion handles annual → multi-axis.
Emissions + Concentrations	`_rcmip3_to_fair_emissions_df` + `build_concentrations_df_from_rcmip3` consume `rcmip_phase3_{emissions,concentrations}_v2.0.0.csv`; existing user-ScmRun overlay still wins per-species.	`_merge_rcmip3_canonical_into_user` unions canonical rows into user's ScmRun upfront; existing `_build_hybrid_*` overlay code is untouched.

Scenario→CMIP7-category resolver covers the SSP-RCP set with sensible defaults (ssp119/126→VL, ssp245→M, ssp370→H, ssp434→L, ssp460→ML, ssp534-over→LN, ssp585→HL) and accepts per-cfg scenario_to_category overrides. Native scen7-{cat} names pass through directly. historical / historical-cmip6 use the published per-component breakdowns.

canonicalise_rcmip3_variable handles the IAMC-naming differences (CO2 sub-sectors → MAGICC-style, intermediate |F-Gases|HFC|, |PFC| categories stripped) so the existing per-adapter species translation maps work without changes.

What's deliberately not done

Item	Status	Why
Maintainership wording for CICEROSCMPY2 + FaIRv2 (#3, #6)	Deferred	@maritsandstad ok with co-maintaining CICEROSCMPY2 ("baseline ok to be (co-)maintainer"). @chrisroadmap not yet pinged for FaIRv2 — would appreciate a tag here if you want to push that thread forward.
Hard-coded irrigation question (#17)	Implicitly addressed by Track A	The bundled `irrigation_forcing_timebounds_cmip7.csv` "M" hard-coding is gone on the canonical path. Adapter resolves per scenario, with override.
Concentrations / emissions ScmRun-only path	Already exists	`build_concentrations_df_from_scmrun` covers it; the new Track A path is just an additional baseline source.

Test coverage

Unit tests for the RCMIP3 reader, scenario→category resolver, canonicaliser, FaIR2 canonical-path builders, and CICEROSCMPY2 back-report helper all live in tests/unit/io_tests/test_rcmip3.py (29 tests).
Mini fixture under tests/test-data/rcmip3-mini/ is a 2-scenario × 3-variable slice of the real Zenodo CSVs, runs without external fetch.
Integration tests/integration/test_modern_adapters.py covers FaIR2 + CICEROSCMPY2 ED/CD across ssp126/245/370 with a pytest-regressions snapshot; marked faircicero2_only for @maritsandstad's tests-multi-env env2.
Legacy paths (existing bundle CSVs, ScmRun-only conc) are untouched — same code, no behaviour change for callers that don't opt into rcmip3_bundle_path.

🤖 Generated with Claude Code

Addresses the concrete code/test items from openscm#97 review (still open: the four maintainership questions and the back-reporting of input forcings in CICEROSCMPY2): - magicc7: drop the **kwargs __init__ override; move the pymagicc availability check and magicc_scenario_setup init into _init_model where they belong on the new base. - _protocol: drop @runtime_checkable; the structural Protocol stays for type-hint use. Remove the isinstance(adapter, AdapterLike) runtime check in run.py — duck-typing is enough. - base.py: drop the default NotImplementedError from_native_distribution stub from _Adapter; adapters that ship a native bundle (FAIR2, CICEROSCMPY2) already define it themselves. - run.py: raise ValueError when scenarios is None; require IamDataFrame-shaped input (no AttributeError fallback); accept either a dict (values: cfgs list or pre-built adapter) or an iterable of (name, cfgs) tuples and/or adapter instances, freely mixable; require model_name on adapter instances rather than falling back to type().__name__. - ciceroscm_wrapper: trim the distutils-removal comment. - test_modern_adapters: run all 3 mini-bundle members; use the full ssp126/ssp245/ssp370 trio; replace the "0 < 2100 GSAT < 7" smoke assert with a pytest-regressions snapshot of GSAT + total ERF at 1850/1900/2000/2025/2050/2100; expand parametrize ids ("fair2-emissions-driven", etc). - delete tests/unit/adapters/test_{ciceroscmpy2,fair2}.py per request. - README: drop the "Supported climate models and run modes" and "Programmatic API" prose sections (live in code / notebooks). - pyproject: drop fail_under to 0 with a comment to bump back up once the adapter rewrite lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Marit's review of openscm#97 corrected the characterisation of the conc / em files the FaIR2 and CICEROSCMPY2 adapters consume: they are RCMIP-format tabular files, not a CICERO-internal format. This is the naming-only first step of the follow-up to switch both adapters to consuming the canonical RCMIP wide-tables from Zenodo 20430630. - _concentrations_translator.py: rename CICERO_TO_FAIR2_SPECIES to RCMIP_TO_FAIR2_SPECIES; rename _read_cicero_conc_file to _read_rcmip_conc_file; rename local cicero_name to rcmip_name; update module + function docstrings to say RCMIP-format throughout. - fair2_adapter.py: update the ValueError message to say RCMIP-format ``{scen}_conc_*`` instead of CICERO-format. - ciceroscmpy2_adapter.py: update docstrings for ``historical_em_file`` / ``historical_conc_file`` and the _build_hybrid_emissions_data / _build_hybrid_concentrations_data helpers to say RCMIP-format. CICERO-SCM-internal API names (the v1.1.x Fortran adapter, the _OPENSCM_TO_CICERO_CONC map, _cicero_unit_to_pint, cicero_species locals in the species-overlay code) are unchanged: those refer to CICERO-SCM upstream conventions, not file format. No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Shared reader module for the canonical RCMIP Phase 3 wide-table CSVs (Zenodo 20430630 -- concentrations, emissions, forcing). Both adapters will consume this in the follow-up commit so we can drop the per-scenario tab-delimited input path the FaIR2 + CICEROSCMPY2 adapters currently rely on, and the hard-coded ``_DEFAULT_LAND_USE_SCENARIO = "M"`` choice in FaIR2. API: - ``load_rcmip3(path, kind, scenarios=None, variables=None, region="World", model=None)`` returns a filtered wide-format DataFrame in the canonical layout (Model, Scenario, Region, Variable, Unit, Activity_Id, Type, Priority, Mip_Era, Version, 1750, ..., 2500). - Convenience wrappers ``load_rcmip3_concentrations`` / ``_emissions`` / ``_forcings`` for the three kinds. - ``load_rcmip3_albedo_categories(path, category)`` reads the per-category Land Use + Irrigation series from the bundle's upstream ``input_datafiles_generation/data/Forcing_AFOLU_CO2.csv`` and ``Forcing_irrigation_population_scale.csv`` files. Necessary because the canonical forcing CSV only publishes the lumped ``Albedo Change`` row for ssp scenarios; per-component breakdown for ssp inputs comes from the per-category upstream files. Path resolution accepts either the bundle root, the ``RCMIP3_input_datafiles`` subdir, or a direct CSV path. Mini fixture under ``tests/test-data/rcmip3-mini/`` is a 2-scenario x 3-variable x 6-year slice of the real CSVs (no external fetch needed for tests). Unit tests at ``tests/unit/io/test_rcmip3.py`` cover the filter surface, error paths, and metadata-column validation. No adapter wiring yet; that's the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…orcings Replaces the hard-coded ``_DEFAULT_LAND_USE_SCENARIO = "M"`` choice in FaIR2's ``_fill_natural_forcings`` (and the equivalent ``rf_luc_file`` single-trajectory path in CICEROSCMPY2) with a per-scenario lookup against the canonical RCMIP Phase 3 bundle (Zenodo 20430630). Opt-in via the new ``rcmip3_bundle_path`` cfg key on either adapter; legacy paths remain as the back-compat default. How it works: - :func:`openscm_runner.io.rcmip3.resolve_scenario_category` maps the user's scenario name to a CMIP7 ScenarioMIP category (``VL`` / ``LN`` / ``L`` / ``ML`` / ``M`` / ``H`` / ``HL``). The defaults cover the published SSP-RCP scenario set (ssp119 -> VL, ssp126 -> VL, ssp245 -> M, ssp370 -> H, ssp434 -> L, ssp460 -> ML, ssp534-over -> LN, ssp585 -> HL); callers override via the ``scenario_to_category`` cfg key, and ``scen7-{cat}`` native CMIP7 names skip the map entirely. - ``historical`` / ``historical-cmip6`` use the per-component ``Effective Radiative Forcing|Anthropogenic|Albedo Change|{Land Use,Irrigation}`` rows published directly in ``rcmip_phase3_forcing_v2.0.0.csv``; the per-category lookup is bypassed. - Idealised scenarios (``protocol_land_use_forcing == "constant_zero"``) keep zeroing out via the existing suppression mask in both adapters. - Unknown scenarios log a warning and leave Land use / Irrigation at zero, with a pointer to ``scenario_to_category``. FaIR2 surface: - ``_fill_natural_forcings`` factors the Land use + Irrigation block out into two helpers: ``_fill_land_use_from_legacy_bundle`` (existing behaviour, kept) and ``_fill_land_use_from_rcmip3`` (new). The choice is driven by the ``rcmip3_bundle_path`` kwarg. - Both call sites (``_run_native_cfgs`` -> ``_run_one_calibration`` and ``_run_translated_cfgs``) pass the cfg-level ``rcmip3_bundle_path`` / ``scenario_to_category`` through. - Translated-cfg path validates uniqueness of both kwargs across cfgs (same rule as ``emissions_bundle``). CICEROSCMPY2 surface: - ``_build_scendata_list`` grew a third LUC branch: when ``cfg.get("rcmip3_bundle_path")`` is set, populates the existing in-memory ``rf_luc_data`` scendata slot from :func:`_build_rf_luc_data_from_rcmip3` instead of writing ``rf_luc_file``. Idealised (zeros) and legacy (path) branches unchanged. - ``_build_rf_luc_data_from_rcmip3`` returns a year-indexed single-column DataFrame matching the shape upstream's ``DistributionRun`` expects. CICEROSCM has no irrigation channel, so only Land Use is wired in; Irrigation is FaIR-only. Solar / Volcanic deferred to a follow-up commit; this PR is land-use only per Ben + Marit's scope. Test fixture under ``tests/test-data/rcmip3-mini/`` expanded with the historical Albedo Change|{Land Use,Irrigation} breakdown rows so the new code paths are exercised end-to-end without external fetch. Unit tests in ``tests/unit/io/test_rcmip3.py`` cover the resolver, the per-category loader, and the historical-breakdown canonical CSV path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…trix Marit's tests-multi-env job (added in 0d9f5b1 / merged via openscm#100) runs two isolated venvs: * env1 -- fair>=1.6,<2 + ciceroscm>=1.1,<2, selects ``-m "faircicero1x_only or magicc"``. * env2 -- fair>=2,<3 + ciceroscm>=2,<3, selects ``-m "(faircicero2_only or magicc) and not faircicero1x_only"``. Unmarked tests are caught by the main ``tests`` job, which currently resolves to legacy majors via the poetry lockfile. The FaIRv2 + CICEROSCMPY2 integration smoke tests need the modern majors, so route the module through env2 via a module-level ``pytestmark``. The existing per-test ``fair2_skip`` / ``cicero_skip`` markers stay so the module still collects (and individually skips) under the main job's legacy resolution. The unit tests in tests/unit/io_tests/test_rcmip3.py stay unmarked -- they're pure pandas-based reader tests with no model dependency, so the main ``tests`` job (which runs everything unmarked) covers them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ings Extends the ``rcmip3_bundle_path`` cfg key wired in the previous land-use commit so it also covers Solar + Volcanic effective radiative forcings, reading per-scenario rows from ``rcmip_phase3_forcing_v2.0.0.csv`` (``Effective Radiative Forcing|Natural|{Solar,Volcanic}``) instead of broadcasting a single trajectory from the calibration bundle's ``solar_forcing`` / ``volcanic_forcing`` CSVs. Per-scenario coverage matters here: the canonical CSV's ``historical`` and ``ssp245`` solar rows differ at every overlapping year (historical uses observations, the SSP-RCP rows use the SSP-projection extension); the legacy single-trajectory broadcast loses that distinction. - ``_fill_natural_forcings``: branches on ``rcmip3_bundle_path``; delegates to ``_fill_natural_from_legacy_bundle`` (existing single-CSV-broadcast) or the new ``_fill_natural_from_rcmip3`` (per-scenario CSV lookup with ``natural_mask`` suppression and a warning-then-zero fallback for scenarios with no canonical row). - Docstring reorganised: both Solar/Volcanic and Land use/Irrigation now describe a "canonical RCMIP3" and "legacy bundle" path; the former two columns of the structure mirror each other now. CICEROSCMPY2 Solar+Volcanic is **not** in this commit. Two reasons: 1. Upstream ``ciceroscm`` v2 uses ``rf_sun_*`` (not ``rf_solar_*``) as the canonical solar key; the openscm-runner adapter currently passes ``rf_solar_file`` which upstream silently ignores in favour of its default ``solar_IPCC.txt``. This is a pre-existing bug that's orthogonal to the RCMIP3 wire-in. 2. Upstream's volcanic input expects a 12-column monthly DataFrame (``rf_volc_n_data`` / ``rf_volc_s_data``); the canonical RCMIP3 CSV is annual. Translating annual -> monthly cleanly needs a physical choice (broadcast vs. equally distribute) that I'd rather make in a separate commit after talking with Marit. Test fixture already covers the historical + ssp245 Solar/Volcanic rows (added in the land-use commit). New unit tests in tests/unit/io_tests/test_rcmip3.py verify the per-scenario rows exist for both natural-forcing components and that the historical / ssp245 trajectories differ at overlapping years. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…al + fix rf_sun_file key Two changes bundled because they touch overlapping lines: (1) Fix pre-existing key-name bug: the adapter has been passing ``rf_solar_file`` to scendata, but upstream ``ciceroscm`` v2.x only recognises ``rf_sun_file`` / ``rf_sun_data`` in its ``InputHandler.set_sun_volc_luc_defaults`` and ``get_data`` paths. Anything passed under ``rf_solar_*`` was silently dropped and the model fell back to its bundled default ``solar_IPCC.txt``. Renamed the cfg key + scendata key + canonical-file-resolver entry to ``rf_sun_file`` to match upstream. (The same latent bug exists in cscm-calibrate's own ``run_full_rcmip_protocol.py``; flagged for Marit.) (2) Wire Solar + Volcanic into the canonical RCMIP3 path that was added for Land use in the previous commit. When the ``rcmip3_bundle_path`` cfg key is set, the scendata builder now populates ``rf_sun_data`` and ``rf_volc_data`` (in-memory year-indexed DataFrames) from per-scenario ``Effective Radiative Forcing|Natural|{Solar,Volcanic}`` rows in ``rcmip_phase3_forcing_v2.0.0.csv`` instead of pointing at the calibration-bundle paths. Upstream's ``InputHandler.set_sun_volc_luc_defaults`` propagates the single ``rf_volc_data`` to ``rf_volc_n_data`` / ``rf_volc_s_data`` automatically; the input handler's shape-coercion (1D annual -> ``arr[:, None]`` for volcanic) accepts annual single-column DataFrames without needing a monthly broadcast. Legacy file path stays as the back-compat default when ``rcmip3_bundle_path`` is not set; idealised scenarios still go through upstream's ``sunvolc=0`` suppression as before. Behaviour change for users who currently pass ``rf_solar_file=...`` to ``CICEROSCMPY2.from_native_distribution(...)``: that key is now ``rf_sun_file=...``. Anything passed under the old key was already being silently dropped, so no working behaviour is lost; this just makes the key wire-up explicit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…on RCMIP3 path Closes Zeb's review comment on ``_output_variables.py:158-162`` ("I'd suggest just back-reporting this anyway, makes life easier for users"). CICERO-SCM v2.x doesn't expose Solar / Volcanic / Land-use albedo as diagnostic outputs -- the model treats them as forcing inputs read from the bundle. The adapter now echoes the per-scenario input trajectories back into the output ScmRun when the canonical RCMIP3 path is active, so users can request them alongside diagnostics like total ERF or GSAT. Changes: - ``_output_variables.RCMIP3_BACK_REPORTABLE_VARIABLES``: new frozenset of the three back-reportable variable names. ``all_supported()`` includes it (the validator accepts requests for these names on either path). - The "input-side forcing" entry in ``_STRUCTURAL_LIMITS`` was already removed in Marit's earlier cleanup commit; the module docstring's "What CICERO-SCM v2.x cannot produce" list now drops Solar/Volcanic/Land use and the "What it supports" list mentions the back-report. - ``_run_one_distribution`` splits ``output_variables`` into upstream-produced + back-reportable. The back-reportable subset is stripped from the upstream call and synthesised after the model run from the scendata's ``rf_sun_data`` / ``rf_volc_data`` / ``rf_luc_data`` trajectories. On the legacy path (without ``rcmip3_bundle_path``) the back-report is skipped with a warning. - New helper ``_build_rcmip3_backreport_scmrun`` constructs the back-report ScmRun with the same metadata schema upstream's ``CSCMParWrapper`` uses (``climate_model``, ``model``, ``run_id``, ``scenario``, ``region``, ``variable``, ``unit``, then year columns). Returns ``None`` when no scendata trajectory matches the request (e.g. all-idealised Land-use request). - Two unit tests in ``tests/unit/io_tests/test_rcmip3.py`` cover the happy path (10-row shape across 2 scenarios x 2 cfgs x partial-coverage variables) and the empty-fallback path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nonical RCMIP3 Track A wire-in: both adapters now consume the canonical ``rcmip_phase3_emissions_v2.0.0.csv`` and ``rcmip_phase3_concentrations_v2.0.0.csv`` from the Zenodo 20430630 bundle as the per-scenario baseline when ``rcmip3_bundle_path`` is set. User ScmRun overlays still win per-species; legacy paths preserved as fallback. Shared canonicaliser: - :func:`openscm_runner.io.canonicalise_rcmip3_variable` translates canonical RCMIP3 variable names to openscm-runner conventions. CO2 sub-sectors ("Emissions|CO2|AFOLU", "Emissions|CO2|Energy and Industrial Processes") get the MAGICC- style suffix; F-gas / PFC / HFC / Halon / Montreal intermediate IAMC categories are stripped so the leaf species name ends up adjacent to the "Emissions|" or "Atmospheric Concentrations|" prefix. Variables that don't need translation round-trip unchanged. FaIR2: - ``_emissions_translator._rcmip3_to_fair_emissions_df``: new helper that reads canonical CSV, canonicalises variables, and reshapes to FaIR's horizontal form (one row per (scenario, FaIR-species)). - ``build_emissions_df`` grows an ``rcmip3_bundle_path`` kwarg. When set, the baseline comes from the canonical CSV (per-scenario rows) instead of the bundle's single ``historical_emissions_*.csv``. ``_splice_bundle_with_user`` grows a ``bundle_per_scenario`` flag to skip the relabel step when the baseline is already per-scenario. - ``_concentrations_translator.build_concentrations_df_from_rcmip3``: new third route alongside the existing per-scenario tab-delimited reader and the in-memory ScmRun reader. Reads canonical CSV + canonicalises + translates leaf species via RCMIP_TO_FAIR2_SPECIES. - Adapter dispatch: both ``_run_one_calibration`` and ``_run_translated_cfgs`` pass ``rcmip3_bundle_path`` through to the translators; conc-driven-mode validation accepts ``rcmip3_bundle_path`` as a third way to satisfy the "need concentrations from somewhere" check. CICEROSCMPY2: - ``_merge_rcmip3_canonical_into_user``: new helper that reads canonical emissions + concentrations CSVs, canonicalises variable names, and unions them into the user's ScmRun upfront. Deduplication is per ``(scenario, variable)`` -- user rows win when both supply the same key. The existing ``_build_hybrid_emissions_data`` and ``_build_hybrid_concentrations_data`` see the merged ScmRun and work unchanged: their ``cicero_comp_dict`` suffix-matching on ``f"Emissions|{openscm_suffix}"`` now picks up the canonicalised MAGICC-style names. - ``_build_scendata_list`` calls the merge at the top when ``rcmip3_bundle_path`` is set; the rest of the per-scenario loop is untouched. Tests: - ``tests/unit/io_tests/test_rcmip3.py`` grows tests for the canonicaliser (10 parametrised cases covering flat / CO2-sector / PFC-nested / F-Gases-nested round-trips) and for the FaIR2 canonical-RCMIP3 emissions + concentrations builders against the mini fixture. Legacy paths unchanged: callers that don't set ``rcmip3_bundle_path`` keep the existing behaviour (read bundle historical CSV, or read per-scenario tab-delimited conc files). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

maritsandstad · 2026-06-04T19:22:06Z

@benmsanderson : I think we can merge this in now so we can get the CIs running and have everything going on in one place?

benmsanderson · 2026-06-04T20:58:38Z

Reopening from the same branch pushed directly to openscm/openscm-runner so the fork-PR secrets restriction stops blocking CI — see #102.

benmsanderson and others added 12 commits June 3, 2026 10:07

Merging in from the branch we are merging to

0d9f5b1

Moving io tests to avoid namespace overwriting

207f43d

Remove kwargs and superfluous super calls +

68f0d5a

benmsanderson closed this Jun 4, 2026

benmsanderson reopened this Jun 4, 2026

benmsanderson closed this Jun 4, 2026

benmsanderson mentioned this pull request Jun 4, 2026

Address PR97 review comments + RCMIP3 wire-in #102

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Address PR97 review comments + RCMIP3 wire-in#101

Address PR97 review comments + RCMIP3 wire-in#101
benmsanderson wants to merge 12 commits into
openscm:feat/fair2-ciceroscmpy2-adapters-and-runmode-nonforkfrom
benmsanderson:address/pr97-comments

benmsanderson commented Jun 4, 2026

Uh oh!

maritsandstad commented Jun 4, 2026

Uh oh!

benmsanderson commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

benmsanderson commented Jun 4, 2026

How the commits map to #97 review comments

Initial cleanup pass (commit 9ad17c3) — closes 13 of 21

Naming hygiene (commit 7e2cc7b)

CI matrix marker (commit fe2bd70)

Bug fix: rf_sun_file (commit ed94023, bundled with Solar/Volcanic wire-in)

Back-report Solar / Volcanic / Land use (commit bb39feb)

Canonical RCMIP3 wire-in (Track A — out of strict scope but felt right)

What's deliberately not done

Test coverage

Uh oh!

maritsandstad commented Jun 4, 2026

Uh oh!

benmsanderson commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Initial cleanup pass (commit `9ad17c3`) — closes 13 of 21

Naming hygiene (commit `7e2cc7b`)

CI matrix marker (commit `fe2bd70`)

Bug fix: `rf_sun_file` (commit `ed94023`, bundled with Solar/Volcanic wire-in)

Back-report Solar / Volcanic / Land use (commit `bb39feb`)