Address PR97 review comments + RCMIP3 wire-in by benmsanderson · Pull Request #102 · openscm/openscm-runner

benmsanderson · 2026-06-04T20:58:49Z

Draft. Stacked on top of #97 to address @znicholls's review comments, plus a Track A canonical-RCMIP3 wire-in that came out of the back-and-forth on the file-format question. Discussion history (with @maritsandstad) lived on the fork at benmsanderson/openscm-runner#15 — that PR description has the long-form context.

Opening as a draft so reviewers can see the shape before we settle the remaining maintainership wording.

How the commits map to #97 review comments

Initial cleanup pass (commit `9ad17c3`) — closes 13 of 21

magicc7.py:55 (__init__ kwargs) — reverted; pymagicc check + setup moved into _init_model.
_protocol.py:32 (don't use, it's slow and frail) — @runtime_checkable dropped, isinstance check removed from run.py.
base.py:155 (from_native_distribution on base) — removed from base.
run.py:131 (scenarios=None should raise) — now raises ValueError.
run.py:128 (only IamDataFrame) — dropped AttributeError fallback.
run.py:76 (mix the two forms) — accepts dict or iterable, freely mixable per-entry.
run.py:181 (getattr(adapter, "model_name", None)) — applied; no type().__name__ fallback.
ciceroscm_wrapper.py:156 (shorter distutils comment) — done.
test_modern_adapters.py:62 (all 3 members) — done.
test_modern_adapters.py:149 (write pytest.param ids fully) — done.
test_modern_adapters.py:155 (pytest-regressions snapshot) — GSAT + total ERF at 1850/1900/2000/2025/2050/2100 across ssp126/245/370.
test_{ciceroscmpy2,fair2}.py (delete) — deleted.
README.md prose blocks — removed.

Coverage fail_under dropped to 0 with a comment per @znicholls's followup.

Naming hygiene (commit `7e2cc7b`)

CICERO_TO_FAIR2_SPECIES → RCMIP_TO_FAIR2_SPECIES; _read_cicero_conc_file → _read_rcmip_conc_file; docstring fixes throughout _concentrations_translator.py and adjacent files. Pure rename — addresses @maritsandstad's point that the adapter doesn't read CICERO-internal format anywhere; the files it consumes are RCMIP-shaped tabular files and the code was mislabelled.

CI matrix marker (commit `fe2bd70`)

test_modern_adapters.py marked faircicero2_only so it routes through env2 of @maritsandstad's tests-multi-env job (modern-major venv).

Bug fix: `rf_sun_file` (commit `ed94023`, bundled with Solar/Volcanic wire-in)

While wiring up the Solar/Volcanic side of the RCMIP3 work, I verified the upstream ciceroscm v2.1.2 API and found that the adapter was passing rf_solar_file in scendata but upstream's InputHandler only recognises rf_sun_file / rf_sun_data. Anything passed as rf_solar_* was silently dropped and the model fell back to the default solar_IPCC.txt. Renamed throughout (cfg key, scendata key, canonical-file resolver, test override).

Note: same latent bug exists in cscm-calibrate's own run_full_rcmip_protocol.py — flagged to @maritsandstad separately. The published RCMIP3 reference runs may have been using the default solar trajectory instead of the per-scenario solar_RCMIP_*_RCMIP3.txt.

Back-report Solar / Volcanic / Land use (commit `bb39feb`)

Closes @znicholls's _output_variables.py:158-162 comment. When the canonical RCMIP3 path is active, the adapter back-reports per-scenario Solar / Volcanic / Land use ERF trajectories into the output ScmRun. On the legacy path the request logs a warning and skips the back-report.

Canonical RCMIP3 wire-in (Track A — out of strict scope but felt right)

Both Albedo Change|{Land Use,Irrigation} on @znicholls's hard-coded-irrigation comment and the conc-file-format on @maritsandstad's review point at the same underlying issue: the adapter shouldn't be locked to per-bundle file conventions when a canonical published source exists. Wired both adapters to consume the canonical Zenodo 20430630 wide-tables when the new rcmip3_bundle_path cfg key is set. Legacy paths stay as fallback.

Three axes, three commits (c047e1c, c5787b0, 6affa5c) staged for review clarity:

Axis	FaIR2	CICEROSCMPY2
Land use + Irrigation forcings	`_fill_land_use_from_rcmip3` reads per-CMIP7-category Land Use + Irrigation from `input_datafiles_generation/data/`; `historical` uses the published per-component breakdown.	`_build_rf_luc_data_from_rcmip3` populates `rf_luc_data` in-memory scendata (CICERO has no irrigation channel).
Solar / Volcanic forcings	`_fill_natural_from_rcmip3` reads per-scenario `Effective Radiative Forcing\|Natural\|{Solar,Volcanic}` from `rcmip_phase3_forcing_v2.0.0.csv`.	`_build_natural_data_from_rcmip3` populates `rf_sun_data` + `rf_volc_data` in-memory; upstream's `InputHandler` shape-coercion handles annual → multi-axis.
Emissions + Concentrations	`_rcmip3_to_fair_emissions_df` + `build_concentrations_df_from_rcmip3` consume `rcmip_phase3_{emissions,concentrations}_v2.0.0.csv`; existing user-ScmRun overlay still wins per-species.	`_merge_rcmip3_canonical_into_user` unions canonical rows into user's ScmRun upfront; existing `_build_hybrid_*` overlay code is untouched.

Scenario→CMIP7-category resolver covers the SSP-RCP set with sensible defaults (ssp119/126→VL, ssp245→M, ssp370→H, ssp434→L, ssp460→ML, ssp534-over→LN, ssp585→HL) and accepts per-cfg scenario_to_category overrides. Native scen7-{cat} names pass through directly. historical / historical-cmip6 use the published per-component breakdowns.

canonicalise_rcmip3_variable handles the IAMC-naming differences (CO2 sub-sectors → MAGICC-style, intermediate |F-Gases|HFC|, |PFC| categories stripped) so the existing per-adapter species translation maps work without changes.

What's deliberately not done

Item	Status	Why
Maintainership wording for CICEROSCMPY2 + FaIRv2 (#3, #6)	Deferred	@maritsandstad ok with co-maintaining CICEROSCMPY2 ("baseline ok to be (co-)maintainer"). @chrisroadmap not yet pinged for FaIRv2 — would appreciate a tag here if you want to push that thread forward.
Hard-coded irrigation question (#17)	Implicitly addressed by Track A	The bundled `irrigation_forcing_timebounds_cmip7.csv` "M" hard-coding is gone on the canonical path. Adapter resolves per scenario, with override.
Concentrations / emissions ScmRun-only path	Already exists	`build_concentrations_df_from_scmrun` covers it; the new Track A path is just an additional baseline source.

Test coverage

Unit tests for the RCMIP3 reader, scenario→category resolver, canonicaliser, FaIR2 canonical-path builders, and CICEROSCMPY2 back-report helper all live in tests/unit/io_tests/test_rcmip3.py (29 tests).
Mini fixture under tests/test-data/rcmip3-mini/ is a 2-scenario × 3-variable slice of the real Zenodo CSVs, runs without external fetch.
Integration tests/integration/test_modern_adapters.py covers FaIR2 + CICEROSCMPY2 ED/CD across ssp126/245/370 with a pytest-regressions snapshot; marked faircicero2_only for @maritsandstad's tests-multi-env env2.
Legacy paths (existing bundle CSVs, ScmRun-only conc) are untouched — same code, no behaviour change for callers that don't opt into rcmip3_bundle_path.

🤖 Generated with Claude Code

Addresses the concrete code/test items from #97 review (still open: the four maintainership questions and the back-reporting of input forcings in CICEROSCMPY2): - magicc7: drop the **kwargs __init__ override; move the pymagicc availability check and magicc_scenario_setup init into _init_model where they belong on the new base. - _protocol: drop @runtime_checkable; the structural Protocol stays for type-hint use. Remove the isinstance(adapter, AdapterLike) runtime check in run.py — duck-typing is enough. - base.py: drop the default NotImplementedError from_native_distribution stub from _Adapter; adapters that ship a native bundle (FAIR2, CICEROSCMPY2) already define it themselves. - run.py: raise ValueError when scenarios is None; require IamDataFrame-shaped input (no AttributeError fallback); accept either a dict (values: cfgs list or pre-built adapter) or an iterable of (name, cfgs) tuples and/or adapter instances, freely mixable; require model_name on adapter instances rather than falling back to type().__name__. - ciceroscm_wrapper: trim the distutils-removal comment. - test_modern_adapters: run all 3 mini-bundle members; use the full ssp126/ssp245/ssp370 trio; replace the "0 < 2100 GSAT < 7" smoke assert with a pytest-regressions snapshot of GSAT + total ERF at 1850/1900/2000/2025/2050/2100; expand parametrize ids ("fair2-emissions-driven", etc). - delete tests/unit/adapters/test_{ciceroscmpy2,fair2}.py per request. - README: drop the "Supported climate models and run modes" and "Programmatic API" prose sections (live in code / notebooks). - pyproject: drop fail_under to 0 with a comment to bump back up once the adapter rewrite lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Marit's review of #97 corrected the characterisation of the conc / em files the FaIR2 and CICEROSCMPY2 adapters consume: they are RCMIP-format tabular files, not a CICERO-internal format. This is the naming-only first step of the follow-up to switch both adapters to consuming the canonical RCMIP wide-tables from Zenodo 20430630. - _concentrations_translator.py: rename CICERO_TO_FAIR2_SPECIES to RCMIP_TO_FAIR2_SPECIES; rename _read_cicero_conc_file to _read_rcmip_conc_file; rename local cicero_name to rcmip_name; update module + function docstrings to say RCMIP-format throughout. - fair2_adapter.py: update the ValueError message to say RCMIP-format ``{scen}_conc_*`` instead of CICERO-format. - ciceroscmpy2_adapter.py: update docstrings for ``historical_em_file`` / ``historical_conc_file`` and the _build_hybrid_emissions_data / _build_hybrid_concentrations_data helpers to say RCMIP-format. CICERO-SCM-internal API names (the v1.1.x Fortran adapter, the _OPENSCM_TO_CICERO_CONC map, _cicero_unit_to_pint, cicero_species locals in the species-overlay code) are unchanged: those refer to CICERO-SCM upstream conventions, not file format. No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Shared reader module for the canonical RCMIP Phase 3 wide-table CSVs (Zenodo 20430630 -- concentrations, emissions, forcing). Both adapters will consume this in the follow-up commit so we can drop the per-scenario tab-delimited input path the FaIR2 + CICEROSCMPY2 adapters currently rely on, and the hard-coded ``_DEFAULT_LAND_USE_SCENARIO = "M"`` choice in FaIR2. API: - ``load_rcmip3(path, kind, scenarios=None, variables=None, region="World", model=None)`` returns a filtered wide-format DataFrame in the canonical layout (Model, Scenario, Region, Variable, Unit, Activity_Id, Type, Priority, Mip_Era, Version, 1750, ..., 2500). - Convenience wrappers ``load_rcmip3_concentrations`` / ``_emissions`` / ``_forcings`` for the three kinds. - ``load_rcmip3_albedo_categories(path, category)`` reads the per-category Land Use + Irrigation series from the bundle's upstream ``input_datafiles_generation/data/Forcing_AFOLU_CO2.csv`` and ``Forcing_irrigation_population_scale.csv`` files. Necessary because the canonical forcing CSV only publishes the lumped ``Albedo Change`` row for ssp scenarios; per-component breakdown for ssp inputs comes from the per-category upstream files. Path resolution accepts either the bundle root, the ``RCMIP3_input_datafiles`` subdir, or a direct CSV path. Mini fixture under ``tests/test-data/rcmip3-mini/`` is a 2-scenario x 3-variable x 6-year slice of the real CSVs (no external fetch needed for tests). Unit tests at ``tests/unit/io/test_rcmip3.py`` cover the filter surface, error paths, and metadata-column validation. No adapter wiring yet; that's the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…orcings Replaces the hard-coded ``_DEFAULT_LAND_USE_SCENARIO = "M"`` choice in FaIR2's ``_fill_natural_forcings`` (and the equivalent ``rf_luc_file`` single-trajectory path in CICEROSCMPY2) with a per-scenario lookup against the canonical RCMIP Phase 3 bundle (Zenodo 20430630). Opt-in via the new ``rcmip3_bundle_path`` cfg key on either adapter; legacy paths remain as the back-compat default. How it works: - :func:`openscm_runner.io.rcmip3.resolve_scenario_category` maps the user's scenario name to a CMIP7 ScenarioMIP category (``VL`` / ``LN`` / ``L`` / ``ML`` / ``M`` / ``H`` / ``HL``). The defaults cover the published SSP-RCP scenario set (ssp119 -> VL, ssp126 -> VL, ssp245 -> M, ssp370 -> H, ssp434 -> L, ssp460 -> ML, ssp534-over -> LN, ssp585 -> HL); callers override via the ``scenario_to_category`` cfg key, and ``scen7-{cat}`` native CMIP7 names skip the map entirely. - ``historical`` / ``historical-cmip6`` use the per-component ``Effective Radiative Forcing|Anthropogenic|Albedo Change|{Land Use,Irrigation}`` rows published directly in ``rcmip_phase3_forcing_v2.0.0.csv``; the per-category lookup is bypassed. - Idealised scenarios (``protocol_land_use_forcing == "constant_zero"``) keep zeroing out via the existing suppression mask in both adapters. - Unknown scenarios log a warning and leave Land use / Irrigation at zero, with a pointer to ``scenario_to_category``. FaIR2 surface: - ``_fill_natural_forcings`` factors the Land use + Irrigation block out into two helpers: ``_fill_land_use_from_legacy_bundle`` (existing behaviour, kept) and ``_fill_land_use_from_rcmip3`` (new). The choice is driven by the ``rcmip3_bundle_path`` kwarg. - Both call sites (``_run_native_cfgs`` -> ``_run_one_calibration`` and ``_run_translated_cfgs``) pass the cfg-level ``rcmip3_bundle_path`` / ``scenario_to_category`` through. - Translated-cfg path validates uniqueness of both kwargs across cfgs (same rule as ``emissions_bundle``). CICEROSCMPY2 surface: - ``_build_scendata_list`` grew a third LUC branch: when ``cfg.get("rcmip3_bundle_path")`` is set, populates the existing in-memory ``rf_luc_data`` scendata slot from :func:`_build_rf_luc_data_from_rcmip3` instead of writing ``rf_luc_file``. Idealised (zeros) and legacy (path) branches unchanged. - ``_build_rf_luc_data_from_rcmip3`` returns a year-indexed single-column DataFrame matching the shape upstream's ``DistributionRun`` expects. CICEROSCM has no irrigation channel, so only Land Use is wired in; Irrigation is FaIR-only. Solar / Volcanic deferred to a follow-up commit; this PR is land-use only per Ben + Marit's scope. Test fixture under ``tests/test-data/rcmip3-mini/`` expanded with the historical Albedo Change|{Land Use,Irrigation} breakdown rows so the new code paths are exercised end-to-end without external fetch. Unit tests in ``tests/unit/io/test_rcmip3.py`` cover the resolver, the per-category loader, and the historical-breakdown canonical CSV path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…trix Marit's tests-multi-env job (added in 0d9f5b1 / merged via #100) runs two isolated venvs: * env1 -- fair>=1.6,<2 + ciceroscm>=1.1,<2, selects ``-m "faircicero1x_only or magicc"``. * env2 -- fair>=2,<3 + ciceroscm>=2,<3, selects ``-m "(faircicero2_only or magicc) and not faircicero1x_only"``. Unmarked tests are caught by the main ``tests`` job, which currently resolves to legacy majors via the poetry lockfile. The FaIRv2 + CICEROSCMPY2 integration smoke tests need the modern majors, so route the module through env2 via a module-level ``pytestmark``. The existing per-test ``fair2_skip`` / ``cicero_skip`` markers stay so the module still collects (and individually skips) under the main job's legacy resolution. The unit tests in tests/unit/io_tests/test_rcmip3.py stay unmarked -- they're pure pandas-based reader tests with no model dependency, so the main ``tests`` job (which runs everything unmarked) covers them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ings Extends the ``rcmip3_bundle_path`` cfg key wired in the previous land-use commit so it also covers Solar + Volcanic effective radiative forcings, reading per-scenario rows from ``rcmip_phase3_forcing_v2.0.0.csv`` (``Effective Radiative Forcing|Natural|{Solar,Volcanic}``) instead of broadcasting a single trajectory from the calibration bundle's ``solar_forcing`` / ``volcanic_forcing`` CSVs. Per-scenario coverage matters here: the canonical CSV's ``historical`` and ``ssp245`` solar rows differ at every overlapping year (historical uses observations, the SSP-RCP rows use the SSP-projection extension); the legacy single-trajectory broadcast loses that distinction. - ``_fill_natural_forcings``: branches on ``rcmip3_bundle_path``; delegates to ``_fill_natural_from_legacy_bundle`` (existing single-CSV-broadcast) or the new ``_fill_natural_from_rcmip3`` (per-scenario CSV lookup with ``natural_mask`` suppression and a warning-then-zero fallback for scenarios with no canonical row). - Docstring reorganised: both Solar/Volcanic and Land use/Irrigation now describe a "canonical RCMIP3" and "legacy bundle" path; the former two columns of the structure mirror each other now. CICEROSCMPY2 Solar+Volcanic is **not** in this commit. Two reasons: 1. Upstream ``ciceroscm`` v2 uses ``rf_sun_*`` (not ``rf_solar_*``) as the canonical solar key; the openscm-runner adapter currently passes ``rf_solar_file`` which upstream silently ignores in favour of its default ``solar_IPCC.txt``. This is a pre-existing bug that's orthogonal to the RCMIP3 wire-in. 2. Upstream's volcanic input expects a 12-column monthly DataFrame (``rf_volc_n_data`` / ``rf_volc_s_data``); the canonical RCMIP3 CSV is annual. Translating annual -> monthly cleanly needs a physical choice (broadcast vs. equally distribute) that I'd rather make in a separate commit after talking with Marit. Test fixture already covers the historical + ssp245 Solar/Volcanic rows (added in the land-use commit). New unit tests in tests/unit/io_tests/test_rcmip3.py verify the per-scenario rows exist for both natural-forcing components and that the historical / ssp245 trajectories differ at overlapping years. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…al + fix rf_sun_file key Two changes bundled because they touch overlapping lines: (1) Fix pre-existing key-name bug: the adapter has been passing ``rf_solar_file`` to scendata, but upstream ``ciceroscm`` v2.x only recognises ``rf_sun_file`` / ``rf_sun_data`` in its ``InputHandler.set_sun_volc_luc_defaults`` and ``get_data`` paths. Anything passed under ``rf_solar_*`` was silently dropped and the model fell back to its bundled default ``solar_IPCC.txt``. Renamed the cfg key + scendata key + canonical-file-resolver entry to ``rf_sun_file`` to match upstream. (The same latent bug exists in cscm-calibrate's own ``run_full_rcmip_protocol.py``; flagged for Marit.) (2) Wire Solar + Volcanic into the canonical RCMIP3 path that was added for Land use in the previous commit. When the ``rcmip3_bundle_path`` cfg key is set, the scendata builder now populates ``rf_sun_data`` and ``rf_volc_data`` (in-memory year-indexed DataFrames) from per-scenario ``Effective Radiative Forcing|Natural|{Solar,Volcanic}`` rows in ``rcmip_phase3_forcing_v2.0.0.csv`` instead of pointing at the calibration-bundle paths. Upstream's ``InputHandler.set_sun_volc_luc_defaults`` propagates the single ``rf_volc_data`` to ``rf_volc_n_data`` / ``rf_volc_s_data`` automatically; the input handler's shape-coercion (1D annual -> ``arr[:, None]`` for volcanic) accepts annual single-column DataFrames without needing a monthly broadcast. Legacy file path stays as the back-compat default when ``rcmip3_bundle_path`` is not set; idealised scenarios still go through upstream's ``sunvolc=0`` suppression as before. Behaviour change for users who currently pass ``rf_solar_file=...`` to ``CICEROSCMPY2.from_native_distribution(...)``: that key is now ``rf_sun_file=...``. Anything passed under the old key was already being silently dropped, so no working behaviour is lost; this just makes the key wire-up explicit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…on RCMIP3 path Closes Zeb's review comment on ``_output_variables.py:158-162`` ("I'd suggest just back-reporting this anyway, makes life easier for users"). CICERO-SCM v2.x doesn't expose Solar / Volcanic / Land-use albedo as diagnostic outputs -- the model treats them as forcing inputs read from the bundle. The adapter now echoes the per-scenario input trajectories back into the output ScmRun when the canonical RCMIP3 path is active, so users can request them alongside diagnostics like total ERF or GSAT. Changes: - ``_output_variables.RCMIP3_BACK_REPORTABLE_VARIABLES``: new frozenset of the three back-reportable variable names. ``all_supported()`` includes it (the validator accepts requests for these names on either path). - The "input-side forcing" entry in ``_STRUCTURAL_LIMITS`` was already removed in Marit's earlier cleanup commit; the module docstring's "What CICERO-SCM v2.x cannot produce" list now drops Solar/Volcanic/Land use and the "What it supports" list mentions the back-report. - ``_run_one_distribution`` splits ``output_variables`` into upstream-produced + back-reportable. The back-reportable subset is stripped from the upstream call and synthesised after the model run from the scendata's ``rf_sun_data`` / ``rf_volc_data`` / ``rf_luc_data`` trajectories. On the legacy path (without ``rcmip3_bundle_path``) the back-report is skipped with a warning. - New helper ``_build_rcmip3_backreport_scmrun`` constructs the back-report ScmRun with the same metadata schema upstream's ``CSCMParWrapper`` uses (``climate_model``, ``model``, ``run_id``, ``scenario``, ``region``, ``variable``, ``unit``, then year columns). Returns ``None`` when no scendata trajectory matches the request (e.g. all-idealised Land-use request). - Two unit tests in ``tests/unit/io_tests/test_rcmip3.py`` cover the happy path (10-row shape across 2 scenarios x 2 cfgs x partial-coverage variables) and the empty-fallback path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nonical RCMIP3 Track A wire-in: both adapters now consume the canonical ``rcmip_phase3_emissions_v2.0.0.csv`` and ``rcmip_phase3_concentrations_v2.0.0.csv`` from the Zenodo 20430630 bundle as the per-scenario baseline when ``rcmip3_bundle_path`` is set. User ScmRun overlays still win per-species; legacy paths preserved as fallback. Shared canonicaliser: - :func:`openscm_runner.io.canonicalise_rcmip3_variable` translates canonical RCMIP3 variable names to openscm-runner conventions. CO2 sub-sectors ("Emissions|CO2|AFOLU", "Emissions|CO2|Energy and Industrial Processes") get the MAGICC- style suffix; F-gas / PFC / HFC / Halon / Montreal intermediate IAMC categories are stripped so the leaf species name ends up adjacent to the "Emissions|" or "Atmospheric Concentrations|" prefix. Variables that don't need translation round-trip unchanged. FaIR2: - ``_emissions_translator._rcmip3_to_fair_emissions_df``: new helper that reads canonical CSV, canonicalises variables, and reshapes to FaIR's horizontal form (one row per (scenario, FaIR-species)). - ``build_emissions_df`` grows an ``rcmip3_bundle_path`` kwarg. When set, the baseline comes from the canonical CSV (per-scenario rows) instead of the bundle's single ``historical_emissions_*.csv``. ``_splice_bundle_with_user`` grows a ``bundle_per_scenario`` flag to skip the relabel step when the baseline is already per-scenario. - ``_concentrations_translator.build_concentrations_df_from_rcmip3``: new third route alongside the existing per-scenario tab-delimited reader and the in-memory ScmRun reader. Reads canonical CSV + canonicalises + translates leaf species via RCMIP_TO_FAIR2_SPECIES. - Adapter dispatch: both ``_run_one_calibration`` and ``_run_translated_cfgs`` pass ``rcmip3_bundle_path`` through to the translators; conc-driven-mode validation accepts ``rcmip3_bundle_path`` as a third way to satisfy the "need concentrations from somewhere" check. CICEROSCMPY2: - ``_merge_rcmip3_canonical_into_user``: new helper that reads canonical emissions + concentrations CSVs, canonicalises variable names, and unions them into the user's ScmRun upfront. Deduplication is per ``(scenario, variable)`` -- user rows win when both supply the same key. The existing ``_build_hybrid_emissions_data`` and ``_build_hybrid_concentrations_data`` see the merged ScmRun and work unchanged: their ``cicero_comp_dict`` suffix-matching on ``f"Emissions|{openscm_suffix}"`` now picks up the canonicalised MAGICC-style names. - ``_build_scendata_list`` calls the merge at the top when ``rcmip3_bundle_path`` is set; the rest of the per-scenario loop is untouched. Tests: - ``tests/unit/io_tests/test_rcmip3.py`` grows tests for the canonicaliser (10 parametrised cases covering flat / CO2-sector / PFC-nested / F-Gases-nested round-trips) and for the FaIR2 canonical-RCMIP3 emissions + concentrations builders against the mini fixture. Legacy paths unchanged: callers that don't set ``rcmip3_bundle_path`` keep the existing behaviour (read bundle historical CSV, or read per-scenario tab-delimited conc files). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

benmsanderson · 2026-06-04T20:59:00Z

Supersedes #101 (which was opened from benmsanderson/openscm-runner:address/pr97-comments and couldn't access org secrets, so the MAGICC download step was failing every CI job). This PR's head is the same branch pushed directly to openscm/openscm-runner so the test matrix can actually run.

Five categories of failure surfaced when the test matrix actually ran: 1. ``NameError: name 'pd' is not defined`` in ``_build_rcmip3_backreport_scmrun``. pandas was used inside the helper but the module had only local imports; promote ``import pandas as pd`` to module level. 2. Pre-existing unit tests in ``tests/unit/test_{run,installation}.py`` pass ``scenarios="not used"`` (a string) on the way to exercising out_config or adapter-import errors. The strict ``scenarios.get_unique_meta`` validation I added in PR97 ran first and crashed with ``AttributeError`` before the intentional error could fire. Move the scenarios validation AFTER ``_check_out_config`` and the adapter-construction loop so cfg-shape and missing-package errors keep their original timing. Scenarios=None still raises ValueError as Marit requested -- just later in the function. 3. Four dummy-adapter dispatch tests in ``tests/unit/test_run.py`` pass ``scenarios=None`` expecting the legacy "scenarios optional" contract. Updated to use a new ``_dummy_scenarios()`` helper that returns a minimal ScmRun (one ``Surface Temperature`` row -- the variable doesn't start with ``Emissions|`` so ``check_variables_are_as_expected`` no-ops). 4. ``test_modern_adapters.py`` was expanded in the initial cleanup to ssp126/ssp245/ssp370 per Zeb's "more scenarios" review comment. The CICERO mini-bundle only ships ssp245 conc files though, so the FaIRv2 CD path crashed with ``IndexError`` in ``fair.fill_from_pandas`` for ssp126/ssp370 (no matching rows in the conc_df). Scoped back to ssp245 with a TODO comment; expanding the mini bundles to cover the full trio is a follow-up. 5. Several "snapshot not found" failures from pytest-regressions on its first run are expected -- the framework writes the reference CSVs on the first run and the next CI run will compare against them. Not a code fix; will commit the snapshots in a follow-up commit once CI generates them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…sts instead Previous fix moved the scenarios check after adapter construction so ``test_no_fair`` and ``test_no_pymagicc`` (which pass scenarios as a literal string) would reach the missing-package ImportError. That broke ``test_run_rejects_unknown_emissions_variables`` which uses an unregistered ``"model_a"`` cfg and expects the bad-emissions ``ValueError`` from scenarios validation to fire BEFORE the ``NotImplementedError`` from get_adapter. Inverting the trade-off: the correct order is ``cfg-shape -> scenarios -> adapter construction`` so user-input errors are surfaced before package-import errors. ``test_no_fair`` / ``test_no_pymagicc`` / ``test_run_out_config_conflict_error`` updated to use a minimal ScmRun (Surface Temperature row -- bypasses ``check_variables_are_as_expected`` since the variable name doesn't start with ``Emissions|``) so the rest of the test runs as intended. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…carbon-cycle fix Marit raised this in a chat after the original PR97 round: openscm- runner shouldn't be monkey-patching upstream ciceroscm internals when we (collectively) maintain ciceroscm too. The patches in ``_upstream_patches.py`` worked around the ``"carbon_cycle_outputs" in cfg`` presence-check bug in ``concentrations_emissions_handler.py`` of ciceroscm 2.0.1 / 2.1.0. Upstream fixed it to ``cfg.get("carbon_cycle_outputs")`` in 2.1.1. Our monkey-patches are now wrapping code that no longer exists -- at best a no-op, at worst a maintenance liability that will silently break on the next upstream API drift. - Delete ``_upstream_patches.py`` and the ``_apply()`` call from the package ``__init__``. - Move ``_CARBON_CYCLE_VARIABLES`` and ``output_vars_need_carbon_cycle`` to ``_output_variables.py`` -- they're still useful for the INFO log when users request carbon-cycle variables (the 30-50x slowdown is real regardless of the patches). - Add ``require_modern_ciceroscm()`` in ``_compat.py`` that raises a clear ``ImportError`` if the installed ciceroscm is below 2.1.1. Used by ``CICEROSCMPY2._init_model`` in place of the previous major-version-only check. - Add ``_ciceroscm_version_tuple()`` for fine-grained checks (the integration test's ``cicero_skip`` marker now uses this so it collects-but-skips on 2.0.1 / 2.1.0). - Bump the env2 install pin in ``.github/workflows/ci.yaml`` from ``ciceroscm>=2,<3`` to ``ciceroscm>=2.1.1,<3``. The pyproject pin stays at ``>=1.1,<3`` because the ``[ciceroscmpy]`` extra still needs the v1.1.x major; runtime enforcement is the cleaner split. The deletion is intentionally larger than the additions: the carbon-cycle fix lives upstream now, and that's where it should stay. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…o/forcing paths After Marit's PR97 review (and a follow-up chat about ducked-error patterns), the additive "legacy paths preserved alongside canonical RCMIP3" approach from earlier commits became dead weight. This commit deletes the legacy paths and makes ``rcmip3_bundle_path`` mandatory for both modern adapters. API shift on ``from_native_distribution`` -- now takes two explicit required args: FAIR2.from_native_distribution(calibration_dir, rcmip3_bundle_path) CICEROSCMPY2.from_native_distribution(calibration_dir, rcmip3_bundle_path) ``calibration_dir`` carries model-specific calibration only (parameter posterior, species_configs / gaspam, natural-emissions baselines, tuning scale factors). ``rcmip3_bundle_path`` carries the canonical Zenodo 20430630 protocol inputs. The two are different things and shouldn't be conflated; the previous ``native_distribution_path`` single-arg API + ``**cfg_overrides`` was conflating them. The new signature has no ``**cfg_overrides`` -- file-path overrides for protocol files are gone because the canonical CSV is the source of truth, and the surviving non-path knobs (``member_indices``, ``mode``, ``stochastic_run``, ``max_workers``) are explicit kwargs. Code deletions: - ``_concentrations_translator.build_concentrations_df`` (per-scenario tab-delimited reader), ``_read_rcmip_conc_file``, ``_pick_conc_file``. - ``_emissions_translator``: ``bundle_emissions_csv`` arg on ``build_emissions_df`` dropped; ``_splice_bundle_with_user``'s ``bundle_per_scenario`` flag dropped (always true now). - ``fair2_adapter._fill_natural_from_legacy_bundle``, ``_fill_land_use_from_legacy_bundle``, ``_DEFAULT_LAND_USE_SCENARIO``, ``fair2_conc_bundle_dir`` cfg key, the ``f.fill_from_rcmip()`` fallback in translated-cfg mode, the single-trajectory ``_write`` helper inside ``_fill_natural_forcings``. - ``ciceroscmpy2_adapter`` ``_DEFAULT_CANONICAL_FILES``: drops ``rf_sun_file`` / ``rf_volc_file`` / ``rf_luc_file``. Scendata builder always populates ``rf_sun_data`` / ``rf_volc_data`` / ``rf_luc_data`` from the canonical RCMIP3 path; the previous three-branch (idealised / canonical / legacy) logic collapses to two (idealised / canonical). - ``_native_calibration._BUNDLE_FILES``: drops ``historical_emissions``, ``solar_forcing``, ``volcanic_forcing``, ``land_use_forcing``, ``irrigation_forcing``. Calibration bundle now carries calibration-only files. Test fixture cleanup: - ``tests/test-data/fair2-mini-bundle/``: deleted ``historical_emissions_1750-2023_cmip7.csv``, ``solar_forcing_timebounds_cmip7.csv``, ``volcanic_forcing_timebounds_cmip7.csv``, ``land_use_forcing_timebounds_cmip7.csv``, ``irrigation_forcing_timebounds_cmip7.csv``. Calibration files (parameter posterior, species_configs, scale factors) stay. - ``tests/test-data/ciceroscm-mini-bundle/``: deleted ``solar_RCMIP_ssp245_RCMIP3.txt``, ``VOLC_RCMIP_ssp245_RCMIP3.txt``, ``LUCalbedo_RCMIP_ssp245_RCMIP3.txt``. Renamed ``ssp245_{em,conc}_*.txt`` to ``historical_{em,conc}_*.txt`` to match the canonical resolver expectation (those files anchor the CICERO species column structure + units; the actual scenario time series come from the canonical RCMIP3 merge). - ``tests/integration/test_modern_adapters.py`` factory functions use the new two-arg ``from_native_distribution`` and drop the ``_CICERO_SSP245_OVERRIDES`` cfg-override dict. Net: -7163 / +163 LOC across these files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… on canonical-bundle misses Marit and Zeb both flagged the cluster of "warn and fall back to zero" patterns introduced during the canonical-RCMIP3 wire-in: missing scenario rows, unmapped scenario->category, unconvertible units. Each one masks a real bug in the input (or in the bundle) while letting the run produce plausible-but-wrong output -- the "AI cover all your bases" objection. Convert these to loud failures now that the canonical-only policy makes "no fallback" the contract: FaIR2: - ``_fill_natural_from_rcmip3``: missing Solar/Volcanic row raises KeyError instead of warning and zeroing the species. - ``_fill_land_use_from_rcmip3``: ``resolve_scenario_category`` KeyError propagates (caller must add an override); missing historical Albedo Change rows raise. - ``_emissions_translator._unit_scale``: drop the broad ``except Exception -> None`` and the caller's "skip-this-species" warning; let pint's UndefinedUnitError / DimensionalityError surface. CICEROSCMPY2: - ``_build_natural_data_from_rcmip3`` / ``_build_rf_luc_data_from_rcmip3``: same canonical-row raises as FaIR2; ``resolve_scenario_category`` KeyError propagates. - ``_run_one_distribution``: requesting back-reportable variables without ``rcmip3_bundle_path`` set now raises ValueError instead of silently dropping them from the output. - ``_build_hybrid_emissions_data``: drop the broad ``except Exception -> skipped_unmapped`` on unit conversion; fail loudly if a species mapping has a bad target unit or the user ScmRun ships an unparseable unit. Also refresh the CICEROSCMPY2 module docstring's cfg-key reference section -- it still listed the ``rf_sun_file`` / ``rf_volc_file`` / ``rf_luc_file`` / ``rf_luc_constant_zero_file`` keys removed in the previous commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…pts the frame The first env2 CI run on the canonical-only branch failed both FaIRv2 smoke tests with:: fair.exceptions.NonMonotonicError: Time values in the emissions file must be strictly monotonically increasing. Root cause is in ``_splice_bundle_with_user``: the bundle frame may not cover every year the user supplies (the mini-bundle fixture trims to a 6-year subset; the full canonical CSV covers 1750-2500 yearly but a user can in principle drop in extra years too). When the user introduces a year the bundle did not have, the assignment ``spliced_df.loc[mask, year] = value * scale`` (or the ``pd.concat`` path for unmapped-species rows) parks the new column at the right edge in user-row order -- which is typically not monotonic relative to the bundle tail (mini-bundle ends at 2100, user adds 2015 next). Fix: after the ffill, reindex the frame columns to ``meta_cols + sorted(year_cols)`` so year columns are strictly monotonic before the frame leaves the translator. Mirrors the same expectation FaIR's ``_check_csv`` enforces a few lines later. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…sions-mode species The first env2 CI run on the canonical-only branch failed both FaIRv2 smoke tests at ``fill_from_pandas`` with:: IndexError: index 0 is out of bounds for axis 0 with size 0 Root cause: the rcmip3-mini fixture ships only 3 emissions species (CO2 FFI + CO2 AFOLU + CH4 for historical + ssp245). FaIR 2.x's AR6-calibrated species_configs has 51 species with ``input_mode == "emissions"``. When ``fill_from_pandas`` iterates the full species list, every species we did not supply hits the empty-row ``.values[0]`` at fair.io.fill_from:219. In CD mode only CO2 / CH4 / N2O flip to ``concentration``; the other 48 species still need an emissions row. So both ED and CD need the full 51-species coverage. Fix: regenerate the mini fixture's emissions CSV with placeholder 0-emissions rows for every other FaIR species, using the canonical RCMIP variable names (``Emissions|<species>`` for everything except CO2 sub-sectors which keep ``Emissions|CO2|AFOLU`` / ``Emissions|CO2|Energy and Industrial Processes``). Values for CH4 + CO2 FFI + CO2 AFOLU are preserved verbatim; the added species use 0 across all years (mini fixture is for shape / smoke only, not scientific reproduction). Historical rows leave 2050/2100 empty to match the existing ssp-vs-historical pattern. The fixture-shape unit test ``test_fair2_emissions_canonical_path_translates_variables`` got its hard species-set assertion replaced with a ``.issubset`` check covering the sub-sector + intermediate-category representatives the translator actually has to handle, so it remains a meaningful regression on the canonicalisation logic without re-asserting the exact (now-expanded) fixture content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… to pint errors The commit 3 broad-except removal caused the cicero env2 smokes to fail with:: pint.errors.UndefinedUnitError: 'GgH1211' is not defined in the unit registry Root cause: ``_cicero_unit_to_pint`` reproduces the v1.1.x adapter gaspam convention of concatenating prefix + species name (``Gg`` + ``H1211`` -> ``GgH1211``) for the unit string. Most species this round-trips through openscm-units fine, but Halons are recorded in the gaspam as ``H-1211`` / ``H-1301`` / ``H-2402`` rather than ``Halon-1211`` etc., so the concatenated form is not in openscm-units' registry. ``GgHalon1211`` would be, but the gaspam-naming round-trip would need to happen in the ``_cicero_unit_to_pint`` helper, not in this loop. The "honest errors" stance from commit 3 was directed at the canonical-RCMIP3 path -- silently zeroing scientific data when a bundle row is missing was Marit's concern. The legacy hybrid emissions overlay is a different situation: the user-side mapping is best-effort, the species' baseline trajectory stays in place (not zeros), and the convention pre-dates this PR. Narrow the except to ``UndefinedUnitError`` / ``DimensionalityError`` only so real bugs still raise, but the Halon naming quirk doesn't break every cicero smoke run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…test_modern_adapters After the canonical-RCMIP3 wire-in (commit 3a07afb) and subsequent fixture / error-handling fixes, all four ``test_adapter_smoke`` parameterisations run end-to-end through their respective climate models. The only remaining failure is the pytest-regressions first-run "File not found in data directory, created:" -- the snapshot CSV gets written but the test fails so the runner can notice and commit it. CI has no way to surface that CSV at the moment, so we cannot baseline the snapshots without re-running the env2 stack somewhere we control. Add an ``actions/upload-artifact@v4`` step gated on ``if: failure()`` that captures every CSV under ``tests/integration/test_modern_adapters/``. After this lands the next CI run will publish four CSVs as a workflow artifact; those get committed in a follow-up and the smoke tests turn green. ``if-no-files-found: ignore`` so non-pytest-regressions failures don't trip an extra error on top. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ts from first CI run CSVs generated by the env2 job on commit 0a33faf and pulled via the artifact upload step added in that commit. Identical across py3.10 and py3.11 (deterministic per-member runs), so one set suffices. Covers all four parameterised cases: - test_adapter_smoke_fair2_emissions_driven_.csv - test_adapter_smoke_fair2_concentration_driven_.csv - test_adapter_smoke_cicero_emissions_driven_.csv - test_adapter_smoke_cicero_concentration_driven_.csv After this lands the env2 CI step turns green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR follows up on #97 by tightening the openscm_runner.run.run() input surface, wiring FaIRv2 + CICERO-SCM-PY2 to consume canonical RCMIP3 (Zenodo 20430630) wide-table inputs, and reshaping tests/CI to exercise the modern adapters via a dedicated environment.

Changes:

Add a new openscm_runner.io.rcmip3 module for loading/filtering canonical RCMIP3 wide-table CSVs (plus a mini in-repo fixture) and use it in FaIRv2 emissions/concentrations pathways.
Update run.run() to accept dict or iterable inputs (including pre-constructed adapters) and enforce scenarios as required (ScmRun-like) for variable validation.
Refresh integration/unit tests and CI env2 (fair>=2, ciceroscm>=2.1.1) to validate the canonical-path wiring with regression snapshots.

Reviewed changes

Copilot reviewed 44 out of 46 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
tests/unit/test_run.py	Updates unit tests to satisfy the new `scenarios` validation contract.
tests/unit/test_installation.py	Adds minimal ScmRun helper so missing-package tests fail at adapter import, not scenario validation.
tests/unit/io_tests/test_rcmip3.py	New unit tests for RCMIP3 reader, canonicaliser, resolver, and adapter helper behavior.
tests/unit/adapters/test_fair2.py	Removes older FaIRv2 unit-test module (coverage shifts to new tests/integration).
tests/unit/adapters/test_ciceroscmpy2.py	Removes older CICEROSCMPY2 unit-test module (coverage shifts to new tests/integration).
tests/test-data/rcmip3-mini/rcmip_phase3_forcing_v2.0.0.csv	Adds mini canonical forcing CSV fixture.
tests/test-data/rcmip3-mini/rcmip_phase3_emissions_v2.0.0.csv	Adds mini canonical emissions CSV fixture.
tests/test-data/rcmip3-mini/rcmip_phase3_concentrations_v2.0.0.csv	Adds mini canonical concentrations CSV fixture.
tests/test-data/rcmip3-mini/input_datafiles_generation/data/Forcing_irrigation_population_scale.csv	Adds per-category irrigation split fixture for albedo breakdown.
tests/test-data/rcmip3-mini/input_datafiles_generation/data/Forcing_AFOLU_CO2.csv	Adds per-category land-use albedo split fixture.
tests/test-data/fair2-mini-bundle/volcanic_forcing_timebounds_cmip7.csv	Removes legacy bundle volcanic forcing CSV (now sourced via RCMIP3).
tests/test-data/ciceroscm-mini-bundle/solar_RCMIP_ssp245_RCMIP3.txt	Removes legacy per-scenario solar forcing file fixture.
tests/test-data/ciceroscm-mini-bundle/LUCalbedo_RCMIP_ssp245_RCMIP3.txt	Removes legacy per-scenario land-use albedo file fixture.
tests/integration/test_modern_adapters/test_adapter_smoke_fair2_emissions_driven_.csv	New pytest-regressions snapshot for FaIRv2 emissions-driven smoke.
tests/integration/test_modern_adapters/test_adapter_smoke_fair2_concentration_driven_.csv	New pytest-regressions snapshot for FaIRv2 concentration-driven smoke.
tests/integration/test_modern_adapters/test_adapter_smoke_cicero_emissions_driven_.csv	New pytest-regressions snapshot for CICEROSCMPY2 emissions-driven smoke.
tests/integration/test_modern_adapters/test_adapter_smoke_cicero_concentration_driven_.csv	New pytest-regressions snapshot for CICEROSCMPY2 concentration-driven smoke.
tests/integration/test_modern_adapters.py	Reworks modern-adapter integration testing into regression snapshots and routes via env2 marker.
src/openscm_runner/run.py	Supports mixed input shapes (dict/iterable + adapters), normalises entries, and enforces `scenarios` non-None.
src/openscm_runner/io/rcmip3.py	New canonical RCMIP3 wide-table reader + variable canonicaliser + scenario-category resolver.
src/openscm_runner/io/init.py	Exposes RCMIP3 I/O utilities from the `openscm_runner.io` package.
src/openscm_runner/adapters/magicc7/magicc7.py	Moves pymagicc availability check/setup into `_init_model`.
src/openscm_runner/adapters/fair2_adapter/_native_calibration.py	Removes scenario inputs/forcings from the FaIR calibration bundle contract (now RCMIP3-driven).
src/openscm_runner/adapters/fair2_adapter/_emissions_translator.py	Reads baseline emissions from canonical RCMIP3 and tightens unit-conversion behavior.
src/openscm_runner/adapters/fair2_adapter/_concentrations_translator.py	Adds canonical-RCMIP3 concentration builder and removes legacy conc-file reader path.
src/openscm_runner/adapters/ciceroscm_py2_adapter/_upstream_patches.py	Deletes now-unneeded upstream monkey-patches (requires ciceroscm>=2.1.1 instead).
src/openscm_runner/adapters/ciceroscm_py2_adapter/_output_variables.py	Documents/validates supported outputs and adds back-reportable RCMIP3 forcing variables.
src/openscm_runner/adapters/ciceroscm_py2_adapter/_compat.py	Adds version parsing + enforces ciceroscm>=2.1.1.
src/openscm_runner/adapters/ciceroscm_py2_adapter/init.py	Updates package docs to reflect the ciceroscm>=2.1.1 floor and removed patches.
src/openscm_runner/adapters/ciceroscm_py_adapter/ciceroscmpy.py	Removes redundant `__init__` passthrough.
src/openscm_runner/adapters/ciceroscm_adapter/ciceroscm.py	Minor exception/message tweak + security linter annotation change.
src/openscm_runner/adapters/ciceroscm_adapter/ciceroscm_wrapper.py	Shortens distutils migration comment in tempdir setup.
src/openscm_runner/adapters/base.py	Removes unused TYPE_CHECKING import and deletes base `from_native_distribution`.
src/openscm_runner/adapters/_protocol.py	Drops `@runtime_checkable` from AdapterLike protocol.
README.md	Removes programmatic API and supported-models prose blocks.
pyproject.toml	Adjusts coverage gate (`fail_under=0`) and documents mutual-exclusion constraints + ciceroscm floor.
.github/workflows/ci.yaml	Pins env2 ciceroscm>=2.1.1 and uploads regression snapshots on failure.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

    scenarios : :obj:`pyam.IamDataFrame`
        Scenarios to run.



@@ -115,71 +140,46 @@ def run(
        ``out_config`` has keys which are not in ``climate_models_cfgs``.



+def _normalise_entries(climate_models_cfgs):
+    """Coerce dict or iterable input into ``[(name, value), ...]`` pairs."""
+    if isinstance(climate_models_cfgs, Mapping):
+        return list(climate_models_cfgs.items())
+    entries = []
+    for item in climate_models_cfgs:
+        if _looks_like_adapter(item):
+            name = getattr(item, "model_name", None)
+            if name is None:
+                raise TypeError(
+                    "Adapter instance is missing a `model_name` attribute: "
+                    f"{type(item).__name__}."
+                )
+            entries.append((name, item))
+        else:
+            name, value = item  # expect a (name, cfgs) tuple
+            entries.append((name, value))
+    return entries


+def output_vars_need_carbon_cycle(output_variables) -> bool:
+    """Return ``True`` if any requested variable needs the carbon cycle."""
+    return bool(set(output_variables) & _CARBON_CYCLE_VARIABLES)
+"""Forcing inputs the adapter back-reports on the canonical RCMIP3 path.
+
+CICERO-SCM v2.x treats Solar / Volcanic / Land-use albedo as input
+forcings (read from the bundle's per-scenario forcing files) rather
+than diagnostic outputs. The model neither computes nor exposes them
+through :data:`SUPPORTED_VARIABLES`. The adapter back-reports them by
+echoing the per-scenario trajectories it already has in scope when
+the ``rcmip3_bundle_path`` cfg key is set (see
+:func:`_build_natural_data_from_rcmip3` and
+:func:`_build_rf_luc_data_from_rcmip3` in the adapter module). On
+the legacy bundle path the adapter logs a warning and omits the
+back-report from the output."""


+    Returns a DataFrame in the same horizontal layout that
+    :func:`_scmrun_to_fair2_rows` produces (one row per
+    ``(scenario, FaIR-species)``, integer year columns), keyed by user
+    scenario rather than a single bundle ``"historical"`` row -- so
+    :func:`_splice_bundle_with_user` must be called with
+    ``bundle_per_scenario=True``.
+


        bundle_unit = spliced_df.loc[mask, "unit"].iloc[0]
        user_unit = user_row.get("unit")
        scale = _unit_scale(user_unit, bundle_unit, user_row["variable"])
-        if scale is None:
-            LOGGER.warning(
-                "Could not convert %r emissions from user unit %r to "
-                "bundle unit %r; user data NOT overlaid for this species, "
-                "bundle values left in place.",
-                user_row["variable"],
-                user_unit,
-                bundle_unit,
-            )
-            continue



… + stricter _normalise_entries) Copilot left six inline review comments on the PR. None are bugs; five are docstring/comment drift from the recent strict-canonical and error-tightening commits, and one is a minor input-validation hardening. run.py - The ``scenarios`` parameter is documented as a ``pyam.IamDataFrame``, but the function now calls ``scenarios.get_unique_meta("variable")`` -- an ``ScmRun`` method. Update the docstring to require an ``ScmRun`` and tell legacy ``IamDataFrame`` users to wrap their input. - The ``Raises`` section claimed ``KeyError`` for unknown ``out_config`` keys, but ``_check_out_config`` only warns. Move the warning behaviour into a ``Notes`` section and drop the ``KeyError`` claim. - ``_normalise_entries`` would happily iterate over a stray string (``"magicc7"``) and then fail with a low-signal ``ValueError: too many values to unpack``. Reject ``str`` / ``bytes`` explicitly at the top level and per-entry, with a clear ``TypeError``. Also wrap the per-entry unpack in a ``try/except`` so non-pair entries also produce a clear message. ciceroscm_py2_adapter/_output_variables.py - There was a stray triple-quoted string literal floating between ``output_vars_need_carbon_cycle`` and the next module-level block. As a standalone expression it has no effect (and lints flag it as a pointless statement). Move the prose to a ``#`` comment block above the ``RCMIP3_BACK_REPORTABLE_VARIABLES`` definition it was meant to document, and refresh the wording since the "legacy bundle path" path it referenced was deleted in commit 3a07afb. fair2_adapter/_emissions_translator.py - The ``_rcmip3_to_fair_emissions_df`` docstring referenced a ``bundle_per_scenario=True`` parameter that the ``_splice_bundle_with_user`` signature never had (and certainly doesn't now). Rewrite the relevant paragraph to describe what the current contract actually is. - The overlay block above ``_unit_scale`` still claimed that conversion failures get logged and the row skipped, but the commit 3 tightening made ``_unit_scale`` raise. Update the comment so future readers don't re-introduce the silent-skip. Unit tests still pass (61 passed) -- no behaviour change for existing call sites since ``_normalise_entries`` only narrows already-invalid inputs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

benmsanderson and others added 12 commits June 3, 2026 10:07

Merging in from the branch we are merging to

0d9f5b1

Moving io tests to avoid namespace overwriting

207f43d

Remove kwargs and superfluous super calls +

68f0d5a

benmsanderson and others added 10 commits June 4, 2026 23:12

benmsanderson marked this pull request as ready for review June 5, 2026 17:15

benmsanderson requested a review from Copilot June 5, 2026 17:16

Copilot started reviewing on behalf of benmsanderson June 5, 2026 17:16 View session

Copilot AI reviewed Jun 5, 2026

View reviewed changes

benmsanderson merged commit 40398dc into feat/fair2-ciceroscmpy2-adapters-and-runmode-nonfork Jun 5, 2026
15 checks passed

benmsanderson deleted the address/pr97-comments branch June 5, 2026 17:43

benmsanderson mentioned this pull request Jun 5, 2026

Feat/fair2 ciceroscmpy2 adapters and runmode nonfork #97

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Address PR97 review comments + RCMIP3 wire-in#102

Address PR97 review comments + RCMIP3 wire-in#102
benmsanderson merged 23 commits into
feat/fair2-ciceroscmpy2-adapters-and-runmode-nonforkfrom
address/pr97-comments

benmsanderson commented Jun 4, 2026

Uh oh!

benmsanderson commented Jun 4, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		@@ -115,71 +140,46 @@ def run(
		``out_config`` has keys which are not in ``climate_models_cfgs``.

Conversation

benmsanderson commented Jun 4, 2026

How the commits map to #97 review comments

Initial cleanup pass (commit 9ad17c3) — closes 13 of 21

Naming hygiene (commit 7e2cc7b)

CI matrix marker (commit fe2bd70)

Bug fix: rf_sun_file (commit ed94023, bundled with Solar/Volcanic wire-in)

Back-report Solar / Volcanic / Land use (commit bb39feb)

Canonical RCMIP3 wire-in (Track A — out of strict scope but felt right)

What's deliberately not done

Test coverage

Uh oh!

benmsanderson commented Jun 4, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Initial cleanup pass (commit `9ad17c3`) — closes 13 of 21

Naming hygiene (commit `7e2cc7b`)

CI matrix marker (commit `fe2bd70`)

Bug fix: `rf_sun_file` (commit `ed94023`, bundled with Solar/Volcanic wire-in)

Back-report Solar / Volcanic / Land use (commit `bb39feb`)