
feat: add CMIP7 data requirements support for Example provider#510

Merged
lewisjared merged 9 commits into main from cmip7-data-requirements on Feb 6, 2026


@lewisjared (Contributor) commented Feb 4, 2026

Description

Add CMIP7 data requirements support, enabling the Example provider to fetch CMIP6 data from ESGF and translate it to CMIP7 format.

Key Changes

  • CMIP7Request class (climate_ref_core/esgf/cmip7.py): Fetches CMIP6 data from ESGF and converts it to CMIP7 format using the CMIP7 CV converter
  • Example provider updates: Support both CMIP6 and CMIP7 source types with flexible data requirements
  • Test cases CLI: Add support for CMIP7 datasets in ref test-cases command
  • Error handling improvements: Better messages for missing providers, graceful handling of permission errors and OR logic edge cases
  • Caching: Use platformdirs cache for converted CMIP7 files
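
To make the caching bullet concrete, here is a minimal sketch of a convert-once cache, not the code in `cmip7.py`. The names `cached_conversion` and `fake_convert` and the hashed directory layout are hypothetical. In the PR the cache root comes from platformdirs; here a temporary directory stands in so the example needs only the standard library.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_conversion(
    source: Path,
    cache_root: Path,
    convert: Callable[[Path, Path], None],
) -> Path:
    """Return the converted copy of `source`, converting only on a cache miss."""
    # Key the entry on the absolute source path so files that share a name
    # but live in different directories do not collide (assumed layout).
    digest = hashlib.sha256(str(source.resolve()).encode()).hexdigest()[:16]
    target = cache_root / digest / source.name
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write under a temporary name first so an interrupted conversion never
    # leaves a half-written file that looks like a valid cache entry.
    partial = target.parent / (target.name + ".partial")
    convert(source, partial)
    partial.replace(target)
    return target


def demo_cache() -> tuple[int, bool]:
    """Convert the same file twice and report how often the converter ran."""
    calls = []

    def fake_convert(src: Path, dst: Path) -> None:
        # Hypothetical stand-in for the CMIP6 -> CMIP7 converter.
        calls.append(src)
        dst.write_bytes(src.read_bytes())

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "tas_Amon.nc"
        src.write_bytes(b"cmip6")
        first = cached_conversion(src, root / "cache", fake_convert)
        second = cached_conversion(src, root / "cache", fake_convert)
        return len(calls), first == second
```

Calling `demo_cache()` runs the converter once for two requests of the same file; the second request is served from the cache.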

Checklist

Please confirm that this pull request has done the following:

  • Tests added
  • Documentation added (where applicable)
  • Changelog item added to changelog/

Add CMIP7Request class for fetching CMIP6 data from ESGF and
translating it to CMIP7 format using the CMIP7 CV converter.

Changes:
- Add CMIP7Request in climate_ref_core/esgf/cmip7.py for fetching
  and converting CMIP6 data to CMIP7 format
- Update Example provider to support both CMIP6 and CMIP7 source types
- Add test cases CLI support for CMIP7 datasets
- Use platformdirs cache for converted CMIP7 files
- Handle OR logic gracefully when source type is missing
- Handle permission errors gracefully during conversion
- Improve error message when provider is not configured
- Add comprehensive tests for CMIP7Request class
- Add tests for facet conversion and metadata mapping
- Add tests for file caching and conversion
- Add tests for AddSupplementaryDataset with CMIP7 source type
- Add tests for solver OR logic with missing source types

codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ages/climate-ref/src/climate_ref/cli/test_cases.py | 80.95% | 2 Missing and 2 partials ⚠️ |
| ...limate-ref-core/src/climate_ref_core/esgf/cmip7.py | 96.73% | 2 Missing and 1 partial ⚠️ |
| ...ate-ref-example/src/climate_ref_example/example.py | 77.77% | 1 Missing and 1 partial ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| ...imate-ref-core/src/climate_ref_core/constraints.py | 96.27% <100.00%> (ø) |
| ...-ref-core/src/climate_ref_core/dataset_registry.py | 100.00% <100.00%> (ø) |
| ...ate-ref-core/src/climate_ref_core/esgf/__init__.py | 100.00% <100.00%> (ø) |
| ...limate-ref-ilamb/src/climate_ref_ilamb/standard.py | 94.93% <100.00%> (+0.03%) ⬆️ |
| packages/climate-ref/src/climate_ref/solver.py | 98.92% <100.00%> (+2.37%) ⬆️ |
| ...ate-ref-example/src/climate_ref_example/example.py | 93.61% <77.77%> (-3.95%) ⬇️ |
| ...limate-ref-core/src/climate_ref_core/esgf/cmip7.py | 96.73% <96.73%> (ø) |
| ...ages/climate-ref/src/climate_ref/cli/test_cases.py | 91.92% <80.95%> (+5.83%) ⬆️ |


Copilot AI left a comment


Pull request overview

Adds CMIP7 “virtual” dataset support by fetching CMIP6 data from ESGF, converting it to CMIP7 conventions, and enabling diagnostics/providers/CLI to work with CMIP7 source types.

Changes:

  • Introduces CMIP7Request in climate_ref_core to fetch CMIP6 via ESGF and convert/cache results as CMIP7.
  • Extends solver + Example provider to support OR-ed data requirements (CMIP6 or CMIP7) and adds related tests.
  • Updates ref test-cases CLI to build catalogs for CMIP7 datasets and improves provider-missing messaging.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| packages/climate-ref/tests/unit/test_solver.py | Adds unit tests for OR-logic behavior, including missing source-type handling. |
| packages/climate-ref/src/climate_ref/solver.py | Implements OR-logic handling when multiple requirement sets are provided. |
| packages/climate-ref/src/climate_ref/config.py | Docstring cleanup for default_providers(). |
| packages/climate-ref/src/climate_ref/cli/test_cases.py | Adds CMIP7 adapter support and improves errors for unknown providers. |
| packages/climate-ref-ilamb/src/climate_ref_ilamb/standard.py | Minor tweak to metric name splitting logic. |
| packages/climate-ref-example/tests/integration/test_diagnostics.py | Uses validator API to detect regression data presence before running validation. |
| packages/climate-ref-example/src/climate_ref_example/example.py | Adds CMIP7 alternative data requirements and CMIP7 test case requests. |
| packages/climate-ref-core/tests/unit/test_esgf_cmip7.py | Adds unit tests for CMIP7 request/conversion/cache behavior. |
| packages/climate-ref-core/tests/unit/test_constraints.py | Adds tests ensuring constraints defaults work correctly for CMIP7. |
| packages/climate-ref-core/src/climate_ref_core/esgf/cmip7.py | New CMIP7 request implementation with CMIP6→CMIP7 conversion + caching. |
| packages/climate-ref-core/src/climate_ref_core/esgf/__init__.py | Exports CMIP7Request. |
| packages/climate-ref-core/src/climate_ref_core/dataset_registry.py | Uses rsplit for safer hash parsing. |
| packages/climate-ref-core/src/climate_ref_core/constraints.py | Extends AddSupplementaryDataset.from_defaults to support CMIP7 facets. |
| docs/gen_doc_stubs.py | Uses rsplit for safer module name splitting. |
| changelog/510.feature.md | Changelog entry for CMIP7 data requirements support. |
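
The two `rsplit` entries above rely on the same idea: split from the right with a limit of one, so separators earlier in the string are left alone. The sketch below illustrates this under assumptions. The `<path> <hash>` line format and both function names are hypothetical and may not match `dataset_registry.py` or `gen_doc_stubs.py`.

```python
def split_registry_line(line: str) -> tuple[str, str]:
    """Split an assumed '<path> <hash>' line on its last whitespace run only."""
    # rsplit with maxsplit=1 keeps any spaces inside the path intact,
    # whereas a plain split() would break the path into several pieces.
    path, file_hash = line.strip().rsplit(maxsplit=1)
    return path, file_hash


def split_module_name(qualified: str) -> tuple[str, str]:
    """Split 'pkg.sub.module' into ('pkg.sub', 'module')."""
    # Only the final dot separates the parent package from the leaf module.
    parent, leaf = qualified.rsplit(".", 1)
    return parent, leaf
```

With a left-to-right `split`, a path containing spaces or a module nested more than one level deep would produce extra fields and fail to unpack.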


- Use context manager for xr.open_dataset to prevent file handle leaks
- Fix type annotation for cmip7_facets from dict[str, str] to dict[str, Any]
- Optimize metric/region split to avoid redundant string splitting
- Buffer executions in OR-logic so that any_matched is only set when executions are produced
- Catch InvalidDiagnosticException in ExecutionSolver.solve to skip unmatched diagnostics
Base automatically changed from cmip7-database-v2 to main February 5, 2026 13:21
* origin/main:
  Add changelog
  Add codecov.yml with relaxed coverage thresholds
  feat(cmip7): add license_id and external_variables to CMIP7 model
  Add changelog
  Faster listing of regression data
  Bump version: 0.9.0 → 0.9.1
  docs: fix admonition syntax and typo in getting-started guides
  fix(tests): update patch paths for lazy imports
  chore: add changelog and fix lint after merge
  test: add unit tests for providers, datasets, config, and models
  ci: only run helm jobs when helm/ changes or on main
  perf: move SourceDatasetType to lightweight module and add more lazy imports
  perf(cli): optimize startup time with lazy imports
  chore: add changelog for PR #511
  feat(cli): skip database backup for read-only commands
  fix: Use the right path
  docs: add changelog entries for PR #508
  feat(providers): add ingest_data hook for provider-level dataset ingestion
  docs: update configuration and dataset download instructions; remove empty tutorials section
… tests

The source code uses `xr.open_dataset` as a context manager (`with`
statement), so the mock needs `__enter__`/`__exit__` configured
rather than just setting `return_value`.
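
A minimal sketch of the mock setup described above, using only `unittest.mock`. The function `read_variable_id` and the injected `open_dataset` parameter are hypothetical; the real tests presumably patch `xr.open_dataset` instead, but the `__enter__` wiring is the same.

```python
from unittest.mock import MagicMock


def read_variable_id(path, open_dataset):
    """Read one global attribute, closing the dataset when the block exits."""
    with open_dataset(path) as ds:
        return ds.attrs["variable_id"]


def demo_mock():
    """Show that the `with` target comes from __enter__, not return_value."""
    fake_ds = MagicMock()
    fake_ds.attrs = {"variable_id": "tas"}

    open_dataset = MagicMock()
    # `with open_dataset(path) as ds` binds ds to the value returned by
    # __enter__, so that is where the fake dataset has to be attached.
    open_dataset.return_value.__enter__.return_value = fake_ds

    result = read_variable_id("tas.nc", open_dataset)
    exit_called = open_dataset.return_value.__exit__.called
    return result, exit_called
```

Setting only `open_dataset.return_value = fake_ds` would leave `ds` bound to an auto-generated mock from `__enter__`, so assertions on the dataset contents would not see the configured attributes.
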
* origin/main:
  Also update output collection of regional historical trend diagnostic
  Add another variable to test case
  Do not try to push to container registry from forks
  Update regression test output
  Add changelog
  Split trends recipe and fix the other two regional recipes
  Update recipes
  Update regional historical annual cycle and timeseries
…cException removal

Main branch removed the InvalidDiagnosticException import since
_solve_from_data_requirements now returns empty instead of raising.
Simplify the OR-logic branch to match: remove the redundant
raise+catch pattern and let _solve_from_data_requirements handle
missing source types silently. Re-add import for the final
"no matches" raise.
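
The simplified control flow can be sketched roughly as follows. This is an illustration under assumptions, not the solver code: `Requirement`, `solve_one`, `solve_any` and `NoMatchError` are hypothetical stand-ins for the data requirement, `_solve_from_data_requirements`, the OR-logic branch and `InvalidDiagnosticException`.

```python
from dataclasses import dataclass


class NoMatchError(Exception):
    """Stand-in for the solver's final 'no matches' exception."""


@dataclass(frozen=True)
class Requirement:
    source_type: str
    variable: str


def solve_one(requirement, catalogs):
    """Return matching dataset names, or [] when the source type is absent."""
    catalog = catalogs.get(requirement.source_type)
    if catalog is None:
        # A missing source type is not an error; this alternative
        # simply contributes nothing.
        return []
    return [name for name, var in catalog if var == requirement.variable]


def solve_any(alternatives, catalogs):
    """Collect executions from every alternative; raise only if all are empty."""
    executions = []
    for requirement in alternatives:
        executions.extend(solve_one(requirement, catalogs))
    if not executions:
        raise NoMatchError("no alternative data requirement matched")
    return executions
```

For example, with only a CMIP6 catalog available, `solve_any([Requirement("cmip7", "tas"), Requirement("cmip6", "tas")], {"cmip6": [("ds-a", "tas")]})` skips the CMIP7 alternative silently and returns the CMIP6 match.
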
Add 46 new tests across three test files:

- test_esgf_cmip7: PermissionError re-raise, default facets, conversion
  failures, multiple files per row, metadata edge cases, passthrough params
- test_solver: matches_filter direct tests, execution_slug, build_definition,
  selectors, dataset_key with multiple source types, empty DataFrame,
  execute=False, extract_covered_datasets edge cases
- test_test_cases: _build_catalog success/multi-dir, _solve_test_case,
  CMIP7/PMPClimatology/mixed source types, force flag, --only-missing,
  --if-changed, non-existent provider errors, list with paths
@lewisjared lewisjared merged commit 4ba94e7 into main Feb 6, 2026
17 checks passed
@lewisjared lewisjared deleted the cmip7-data-requirements branch February 6, 2026 04:55