Add CMIP7 Support: Global Attributes #230

Closed
siligam wants to merge 17 commits into main from feat/CMIP7globalattrs


@siligam siligam commented Oct 28, 2025

Summary

This PR adds comprehensive CMIP7 support to pycmor, including global attributes generation, YAML configuration validation, and extensive user documentation. All changes maintain backward compatibility with CMIP6.

Changes

🎯 Core Functionality

Global Attributes (src/pycmor/std_lib/global_attributes.py)

  • Added CMIP7GlobalAttributes class extending GlobalAttributes
  • Fixed get_frequency(): Falls back to rule_dict when drv is None
  • Updated get_Conventions(): Returns "CF-1.10 CMIP-7.0" for CMIP7
  • Fixed variant label indices: Convert to strings for netCDF compliance
  • Added methods: get_product(), get_data_specs_version(), get_sub_experiment_id()
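
The behaviour listed above can be illustrated with a small sketch. This is not the code in src/pycmor/std_lib/global_attributes.py: the class name `SketchCMIP7Attrs`, its constructor arguments, and the helper `variant_indices` are hypothetical, and the data request variable `drv` is assumed to expose a `frequency` attribute.

```python
import re
from types import SimpleNamespace


class SketchCMIP7Attrs:
    """Hypothetical stand-in, not pycmor's CMIP7GlobalAttributes."""

    def __init__(self, drv, rule_dict):
        self.drv = drv              # data request variable, may be None
        self.rule_dict = rule_dict  # values taken from the YAML rule

    def get_frequency(self):
        # Fall back to the rule when no data request variable is available.
        if self.drv is not None:
            return self.drv.frequency
        return self.rule_dict["frequency"]

    def get_Conventions(self):
        return "CF-1.10 CMIP-7.0"

    def variant_indices(self):
        # Split e.g. "r1i1p1f1" and keep every index as a string.
        label = self.rule_dict["variant_label"]
        match = re.fullmatch(r"r(\d+)i(\d+)p(\d+)f(\d+)", label)
        if match is None:
            raise ValueError(f"unexpected variant_label: {label!r}")
        keys = (
            "realization_index",
            "initialization_index",
            "physics_index",
            "forcing_index",
        )
        return dict(zip(keys, match.groups()))


rule = {"frequency": "mon", "variant_label": "r1i1p1f1"}
without_drv = SketchCMIP7Attrs(None, rule)
with_drv = SketchCMIP7Attrs(SimpleNamespace(frequency="day"), rule)
print(without_drv.get_frequency(), with_drv.get_frequency())
```

Keeping the indices as strings means every value handed to the netCDF writer has the same type, which is the property the unit tests in this PR check for.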

YAML Validation (src/pycmor/core/validate.py)

  • Added CMIP7 fields to general schema:
    • CMIP7_DReq_metadata (optional, for Data Request metadata file)
    • Made CMIP_Tables_Dir optional (not needed for CMIP7)
  • Added CMIP7 fields to rules schema:
    • compound_name (CMIP7 compound names like atmos.tas.tavg-h2m-hxy-u.mon.GLB)
    • realm, frequency, table_id (can come from compound_name)
    • grid, nominal_resolution (recommended metadata)
    • institution_id (required for CMIP7)
  • Fixed typo: instition_id → institution_id (kept the old spelling for backward compatibility)
  • Made flexible: cmor_variable optional when compound_name provided
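
As a rough illustration of the two rule-level relaxations above (either `compound_name` or `cmor_variable` identifies the variable, and the misspelled key is still accepted), here is a plain-Python sketch. The function `check_cmip7_rule` is hypothetical; the actual schema in src/pycmor/core/validate.py may be expressed quite differently.

```python
def check_cmip7_rule(rule):
    """Return a list of problems found in one rule mapping (hypothetical helper)."""
    problems = []

    # Either a compound name or the classic cmor_variable must identify the variable.
    if "compound_name" not in rule and "cmor_variable" not in rule:
        problems.append("need compound_name or cmor_variable")

    # Accept the old misspelled key as an alias for backward compatibility.
    institution = rule.get("institution_id", rule.get("instition_id"))
    if not institution:
        problems.append("institution_id is required for CMIP7")

    return problems


ok_rule = {
    "compound_name": "atmos.tas.tavg-h2m-hxy-u.mon.GLB",
    "instition_id": "AWI",  # old spelling, still accepted in this sketch
}
bad_rule = {"model_variable": "temp2"}
print(check_cmip7_rule(ok_rule))
print(check_cmip7_rule(bad_rule))
```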

📚 Documentation

New Documentation

  • doc/cmip7_configuration.rst (573 lines)

    • Complete YAML configuration guide for CMIP7
    • Minimum requirements and optional fields explained
    • Multiple complete examples (atmospheric, ocean, unstructured grids)
    • CMIP6 to CMIP7 migration guide
    • Troubleshooting section
    • Validation and running instructions
  • examples/cmip7-example.yaml (98 lines)

    • Working CMIP7 configuration examples
    • Examples with and without compound names
    • Atmospheric and ocean variables
    • Unstructured grid configuration

Updated Documentation

  • doc/quickstart.rst: Added note directing CMIP7 users to new guide
  • doc/index.rst: Added cmip7_configuration to table of contents
  • README.rst: Updated to mention CMIP6 and CMIP7 support

✅ Testing

Unit Tests (tests/unit/test_cmip7_global_attributes.py)

  • 11 tests covering CMIP7 global attributes:
    • Global attributes structure and presence
    • CMIP7-specific attributes (mip_era, Conventions, product)
    • Format validation (creation_date, tracking_id, variant_label)
    • Type validation (all attributes are strings)
    • Source type derivation from CVs
    • License text correctness
    • Further info URL generation
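
The format and type checks named above might look roughly like the following sketch. The regular expressions are assumptions: the `creation_date` and `tracking_id` shapes follow the CMIP6 conventions, the handle prefix in the example is only a placeholder, and `failing_checks` is a hypothetical helper rather than anything from the test suite in this PR.

```python
import re
import uuid
from datetime import datetime, timezone

# Assumed formats, modelled on CMIP6 conventions.
PATTERNS = {
    "variant_label": re.compile(r"r\d+i\d+p\d+f\d+"),
    "creation_date": re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"),
    "tracking_id": re.compile(
        r"hdl:[\d.]+/[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}"
    ),
}


def failing_checks(attrs):
    """Return the sorted names of attributes that fail a type or format check."""
    # Type check: every global attribute must be a string.
    failures = {name for name, value in attrs.items() if not isinstance(value, str)}
    # Format check: selected attributes must match their pattern exactly.
    for name, pattern in PATTERNS.items():
        value = attrs.get(name)
        if not isinstance(value, str) or not pattern.fullmatch(value):
            failures.add(name)
    return sorted(failures)


example = {
    "mip_era": "CMIP7",
    "variant_label": "r1i1p1f1",
    "creation_date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "tracking_id": f"hdl:21.14100/{uuid.uuid4()}",  # placeholder handle prefix
}
print(failing_checks(example))
```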

Integration Tests (tests/integration/test_cmip7_yaml_validation.py)

  • 10 tests covering YAML validation:
    • Minimal CMIP7 configuration
    • Full CMIP7 configuration with optional fields
    • CMIP7 without compound names
    • CMIP6 backward compatibility
    • Field validation (compound_name, institution_id, variant_label)
    • YAML file parsing

All 21 tests passing

Key Features

CMIP7 Compound Names

# Instead of:
cmor_variable: tas
frequency: mon
realm: atmos
table_id: Amon

# Use:
compound_name: atmos.tas.tavg-h2m-hxy-u.mon.GLB
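
To make the structure of a compound name concrete, the sketch below splits one into its dot-separated parts. The PR states that realm, frequency, and table_id can come from the compound name; the field names `branding` and `region` are labels chosen here for illustration, and `parse_compound_name` is a hypothetical helper, not part of pycmor.

```python
from typing import NamedTuple


class CompoundName(NamedTuple):
    realm: str
    variable: str
    branding: str   # assumed label for the third component
    frequency: str
    region: str     # assumed label for the fifth component


def parse_compound_name(name):
    """Split a compound name into its five dot-separated components."""
    parts = name.split(".")
    if len(parts) != 5:
        raise ValueError(
            f"expected 5 dot-separated parts, got {len(parts)}: {name!r}"
        )
    return CompoundName(*parts)


parsed = parse_compound_name("atmos.tas.tavg-h2m-hxy-u.mon.GLB")
print(parsed.realm, parsed.variable, parsed.frequency)
```

With this split, the separate `realm` and `frequency` entries of a CMIP6-style rule become redundant, which is why the single `compound_name` line can replace them.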

Minimum YAML Configuration

general:
  cmor_version: "CMIP7"
  CV_Dir: "/path/to/CMIP7-CVs"
  CMIP7_DReq_metadata: "/path/to/dreq_metadata.json"

rules:
  - name: tas
    compound_name: atmos.tas.tavg-h2m-hxy-u.mon.GLB
    model_variable: temp2
    inputs:
      - path: /path/to/data
        pattern: "*.nc"
    
    # 5 Required identifiers
    source_id: AWI-CM-1-1-HR
    institution_id: AWI              # ← NEW in CMIP7!
    experiment_id: historical
    variant_label: r1i1p1f1
    grid_label: gn
    
    output_directory: /path/to/output
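
A quick way to see what "required" means for the five identifiers above is a presence check over a rule mapping. This is only a sketch: `missing_identifiers` is a hypothetical helper, and the rule is written as a Python dict instead of being loaded from YAML.

```python
# The five identifiers marked as required in the configuration above.
REQUIRED_IDENTIFIERS = (
    "source_id",
    "institution_id",
    "experiment_id",
    "variant_label",
    "grid_label",
)


def missing_identifiers(rule):
    """Return the required identifiers that are absent or empty (hypothetical helper)."""
    return [key for key in REQUIRED_IDENTIFIERS if not rule.get(key)]


rule = {
    "name": "tas",
    "compound_name": "atmos.tas.tavg-h2m-hxy-u.mon.GLB",
    "model_variable": "temp2",
    "source_id": "AWI-CM-1-1-HR",
    "experiment_id": "historical",
    "variant_label": "r1i1p1f1",
    "grid_label": "gn",
}
# institution_id was left out on purpose.
print(missing_identifiers(rule))  # prints ['institution_id']
```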

CMIP6 vs CMIP7 Comparison

| Field | CMIP6 | CMIP7 |
|---|---|---|
| Variable | cmor_variable: tas | compound_name: atmos.tas... |
| Metadata source | CMOR Tables | Data Request API |
| Frequency | From table | From compound_name or explicit |
| Realm | model_component: atmos | realm: atmos (or from compound_name) |
| Institution | Optional | institution_id required |
| Variant label | r1i1p1 (3 parts) | r1i1p1f1 (4 parts) |
| Tables directory | Required | Optional |

Backward Compatibility

All CMIP6 configurations continue to work

  • Existing CMIP6 YAML files validate and run without changes
  • CMIP6 tests still passing
  • No breaking changes to existing functionality

Migration Path

Users can migrate from CMIP6 to CMIP7 in two steps:

  1. Update the general section:

    cmor_version: "CMIP7"
    CV_Dir: "/path/to/CMIP7-CVs"
    CMIP7_DReq_metadata: "/path/to/dreq_metadata.json"
  2. Update each rule:

    • Add compound_name (recommended) or keep cmor_variable + frequency + realm + table_id
    • Add institution_id (required)
    • Add grid and nominal_resolution (recommended)

See doc/cmip7_configuration.rst for the complete migration guide.
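
The two migration steps can also be sketched as dictionary transformations. The helpers `migrate_general` and `migrate_rule` are hypothetical and not part of pycmor; the rename of `model_component` to `realm` follows the comparison table in this description, and the paths are the same placeholders used above.

```python
def migrate_general(general, cv_dir, dreq_metadata):
    """Return a copy of the general section updated for CMIP7 (hypothetical helper)."""
    updated = dict(general)
    updated["cmor_version"] = "CMIP7"
    updated["CV_Dir"] = cv_dir
    updated["CMIP7_DReq_metadata"] = dreq_metadata
    return updated


def migrate_rule(rule, institution_id, compound_name=None):
    """Return a copy of one rule updated for CMIP7 (hypothetical helper)."""
    updated = dict(rule)
    # CMIP6 rules used model_component where CMIP7 rules use realm.
    if "model_component" in updated:
        updated["realm"] = updated.pop("model_component")
    updated["institution_id"] = institution_id
    # The compound name is recommended but optional.
    if compound_name is not None:
        updated["compound_name"] = compound_name
    return updated


cmip6_rule = {"cmor_variable": "tas", "model_component": "atmos", "frequency": "mon"}
cmip7_rule = migrate_rule(
    cmip6_rule, "AWI", compound_name="atmos.tas.tavg-h2m-hxy-u.mon.GLB"
)
cmip7_general = migrate_general(
    {"cmor_version": "CMIP6"}, "/path/to/CMIP7-CVs", "/path/to/dreq_metadata.json"
)
print(cmip7_rule)
print(cmip7_general)
```

Both helpers copy their input, so the original CMIP6 mappings stay usable alongside the migrated ones.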

Testing

# Run unit tests
pytest tests/unit/test_cmip7_global_attributes.py -v

# Run integration tests
pytest tests/integration/test_cmip7_yaml_validation.py -v

# Validate YAML configuration
pycmor validate config examples/cmip7-example.yaml

Documentation

  • Configuration Guide: doc/cmip7_configuration.rst
  • Data Request API: doc/cmip7_interface.rst (existing)
  • Controlled Vocabularies: doc/cmip7_controlled_vocabularies.rst (existing)
  • Example YAML: examples/cmip7-example.yaml

Checklist

  • Code follows project style guidelines
  • Tests added and passing (21 tests)
  • Documentation updated
  • Backward compatibility maintained
  • CMIP6 tests still passing
  • Example configurations provided
  • Migration guide included

Additional Notes

This PR focuses on the YAML configuration workflow, which is how most users interact with pycmor. The implementation:

  • Validates CMIP7-specific fields
  • Generates all 29 required CMIP7 global attributes
  • Provides comprehensive documentation for users
  • Maintains full backward compatibility with CMIP6

Future work could include:

  • Integration with CMIP7 Data Request API for runtime variable queries
  • DRS (Data Reference Syntax) updates for CMIP7
  • Additional CMIP7-specific pipelines or processing steps

siligam added 17 commits October 7, 2025 09:08
- Add cmip7 extra to lint_and_format job installation
- Add cmip7 extra to test job installation
- Ensures CMIP7 interface is tested in CI pipeline
- Exclude prototype/ directory from test collection (experimental code with missing deps)
- Skip test_cmip7_from_vendored_json (vendored JSON has limited data)
- Remove test_version_compatibility.py (tests removed wrapper)
- All CMIP7 interface tests pass (44 tests in test_cmip7_interface.py)
- Add CMIP7-CVs as git submodule (src-data branch)
- Implement CMIP7ControlledVocabularies class with support for:
  - Loading from vendored submodule
  - Directory-based CV structure (experiment/, project/)
  - JSON-LD format handling
- Add comprehensive test suite (18 tests)
- Add documentation in Sphinx format
- Fix import order (isort)
- Apply black formatting (whitespace, line breaks)
- Remove unused 'ref' variable in load_from_git method
- All linting checks now pass
- Add comprehensive CMIP7 global attributes implementation
  - Fix get_frequency() to fall back to rule_dict when drv is None
  - Update get_Conventions() to include CMIP-7.0
  - Convert variant label indices to strings for netCDF compliance
  - Add 11 unit tests for CMIP7 global attributes (all passing)

- Update validation schema for CMIP7 YAML configurations
  - Add CMIP7_DReq_metadata field to general schema
  - Make CMIP_Tables_Dir optional for CMIP7
  - Add compound_name field for CMIP7 compound names
  - Make cmor_variable optional when compound_name provided
  - Add CMIP7-specific fields: realm, frequency, table_id, grid, nominal_resolution
  - Fix institution_id typo (was instition_id)
  - Add 10 integration tests for YAML validation (all passing)

- Add comprehensive CMIP7 documentation
  - New doc/cmip7_configuration.rst: Complete YAML configuration guide
  - Update doc/quickstart.rst: Add note directing CMIP7 users to new guide
  - Update doc/index.rst: Add cmip7_configuration to TOC
  - Update README.rst: Mention CMIP6 and CMIP7 support
  - New examples/cmip7-example.yaml: Working configuration examples

Key features:
- CMIP7 compound names support (e.g., atmos.tas.tavg-h2m-hxy-u.mon.GLB)
- Backward compatibility with CMIP6 maintained
- All 21 tests passing (11 unit + 10 integration)
- User-focused documentation with migration guide
Resolved conflicts in src/pycmor/core/validate.py by keeping both:
- CMIP7 fields (frequency, table_id, grid, nominal_resolution)
- Time coordinate fields from main (time_units, time_calendar)

All tests passing after merge.
@siligam siligam requested review from mandresm and pgierz October 29, 2025 11:03
@pgierz pgierz added this to the CMIP7 milestone Nov 3, 2025
pgierz added a commit that referenced this pull request Nov 12, 2025
This merge adds comprehensive CMIP7 global attributes support to pycmor,
enabling CMIP7 workflows and unlocking previously failing integration tests.

Key Features:
- Complete CMIP7GlobalAttributes implementation (29 required attributes)
- YAML validation schema for CMIP7 configurations
- Comprehensive documentation (doc/cmip7_configuration.rst)
- Example configurations (examples/cmip7-example.yaml)
- 21 passing tests (11 unit + 10 integration)

Changes:
- src/pycmor/std_lib/global_attributes.py: Full CMIP7 implementation
- src/pycmor/core/validate.py: CMIP7 validation schema
- doc/cmip7_configuration.rst: User guide (573 lines)
- examples/cmip7-example.yaml: Working examples
- tests/unit/test_cmip7_global_attributes.py: 11 unit tests
- tests/integration/test_cmip7_yaml_validation.py: 10 integration tests

Impact:
- Unlocks 3 xfail integration tests in prep-release
- Enables CMIP7-compliant output file generation
- Maintains full CMIP6 backward compatibility

Co-authored-by: PavanSiligam <pavan.siligam@gmail.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# Conflicts:
#	.github/workflows/CI-test.yaml
#	pytest.ini
#	setup.py
#	src/pycmor/data_request/table.py
#	tests/unit/data_request/test_variable.py
pgierz added a commit that referenced this pull request Nov 12, 2025
Now that PR #230 has been merged with full CMIP7GlobalAttributes
implementation, the 3 integration tests that were marked as expected
failures can now run successfully.

Changes:
- tests/integration/test_basic_pipeline.py:
  - Remove xfail marker from test_init[CMIP7]
  - Remove xfail marker from test_process[CMIP7]
- tests/integration/test_uxarray_pi.py:
  - Remove xfail marker from test_process_cmip7

These tests were failing due to NotImplementedError in
CMIP7GlobalAttributes.global_attributes(), which is now fully implemented.
pgierz added a commit that referenced this pull request Nov 12, 2025
The merge of PR #230 left conflict markers in the CI workflow file
at lines 226-290 in the meta test section. This commit resolves the
conflict by keeping the Docker-based approach from prep-release HEAD.

The Docker-based approach is preferred because:
- Uses consistent containerized test environment
- Properly sets environment variables for HDF5/NetCDF debugging
- Matches the pattern used for other test jobs
- Ensures reproducible test execution across CI runs
pgierz added a commit that referenced this pull request Nov 12, 2025
The files merged from PR #230 (feat/CMIP7globalattrs) need formatting
to comply with the project's style guidelines.

Changes:
- Apply black formatting to 8 files
- Apply isort import sorting to 5 files
- No functional changes, only style fixes

Files reformatted:
- src/pycmor/core/cmorizer.py
- src/pycmor/std_lib/global_attributes.py
- src/pycmor/data_request/table.py
- src/pycmor/data_request/cmip7_interface.py
- tests/unit/test_cmip7_global_attributes.py
- tests/unit/data_request/test_variable.py
- tests/unit/data_request/test_cmip7_interface.py
- tests/integration/test_cmip7_yaml_validation.py
mandresm commented

Closes #210 #198


pgierz commented Nov 23, 2025

Closing, all features are in prep-release

@pgierz pgierz closed this Nov 23, 2025