V1.1.0 - Added Shapefile mixin and fixed BinaryOPQuery bug
What's New
Shapefile Masking Support
The headline feature of v1.1.0 is comprehensive shapefile masking for NetCDF files. Users can now clip climate data to geographic regions defined by shapefiles in a single, chainable operation.
Key Highlights:
# One-liner regional masking
masked = cdo.query("global_data.nc").mask_by_shapefile("region.shp").compute()
# Chain with other CDO operators
result = (
    cdo.query("daily_precip.nc")
    .mask_by_shapefile("sahel.shp")
    .year_mean()
    .field_mean()
    .compute()
)
Features:
- Complete automated pipeline (load shapefile → create mask → apply → cleanup)
- Support for both 1D (regular) and 2D (curvilinear) grids
- Automatic CRS reprojection to WGS84
- Multi-polygon shapefile support
- Custom coordinate naming (lat_name, lon_name parameters)
- Automatic temporary file cleanup
- Secure temp file handling (no race conditions)
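To make the "create mask" step of the pipeline concrete, here is a minimal pure-Python sketch of a point-in-polygon mask on a regular 1D latitude/longitude grid. It is an illustration only, not the library's implementation: the function names, the grid, and the rectangular region are all hypothetical, and a real shapefile workflow would normally rely on geopandas and shapely rather than hand-written ray casting.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def build_mask(lats, lons, polygon):
    """Return mask[i][j] = 1 where grid point (lats[i], lons[j]) is inside."""
    return [
        [1 if point_in_polygon(lon, lat, polygon) else 0 for lon in lons]
        for lat in lats
    ]


# Hypothetical 4x4 regular grid and a rectangular "region" polygon
lats = [0.0, 1.0, 2.0, 3.0]
lons = [10.0, 11.0, 12.0, 13.0]
region = [(10.5, 0.5), (12.5, 0.5), (12.5, 2.5), (10.5, 2.5)]

mask = build_mask(lats, lons, region)
for row in mask:
    print(row)
```

The nested loop over every grid point is also why mask creation cost grows with grid size.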
Installation:
pip install python-cdo-wrapper[shapefiles]
This optional dependency group includes geopandas>=0.10.0 and shapely>=2.0.0.
Technical Improvements
Enhanced Binary Operations
Fixed: Binary operators (add, sub, mul, div, min, max) no longer generate unnecessary brackets in CDO commands.
- Binary operators take exactly two inputs, and CDO assigns them unambiguously right-to-left
- Commands are now cleaner: cdo -sub -yearmean data.nc clim.nc
- Previously generated: cdo -sub [ -yearmean data.nc ] clim.nc
- Follows CDO best practices and may improve performance
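The difference between the two command forms can be sketched with a small string builder. This is a hypothetical illustration, not the wrapper's internal code: `render_operand`, `build_binary_command`, and the `use_brackets` flag are names invented here to contrast the old and new output.

```python
def render_operand(tokens, use_brackets):
    """Return the operand tokens, optionally wrapped in CDO brackets."""
    if use_brackets and len(tokens) > 1:
        return ["[", *tokens, "]"]
    return list(tokens)


def build_binary_command(operator, left, right, use_brackets=False):
    """Assemble a CDO command line for a two-input operator."""
    parts = ["cdo", f"-{operator}"]
    parts += render_operand(left, use_brackets)
    parts += render_operand(right, use_brackets)
    return " ".join(parts)


left_chain = ["-yearmean", "data.nc"]   # operator chain for the first input
right_chain = ["clim.nc"]               # plain file for the second input

new_style = build_binary_command("sub", left_chain, right_chain)
old_style = build_binary_command("sub", left_chain, right_chain, use_brackets=True)

print(new_style)
print(old_style)
```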
Security Enhancements
- Replaced unsafe tempfile.mktemp() with secure tempfile.mkstemp() in shapefile masking
- Prevents race condition vulnerabilities when creating temporary mask files
- Follows Python security best practices
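As a rough sketch of the pattern described above: `tempfile.mktemp()` only returns a name, so another process can create a file at that path before it is used, whereas `tempfile.mkstemp()` creates the file atomically and returns an open descriptor. The helper name `make_temp_mask_path` and the `mask_` prefix are assumptions made for this example, not taken from the library.

```python
import os
import tempfile


def make_temp_mask_path(suffix=".nc"):
    """Atomically create an empty temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix, prefix="mask_")
    os.close(fd)  # close the descriptor; later writers reopen the file by path
    return path


path = make_temp_mask_path()
try:
    existed = os.path.exists(path)  # the file already exists, owned by this process
    # ... write the mask to `path` and pass it to CDO here ...
finally:
    os.remove(path)  # cleanup runs even if the processing step raises

removed = not os.path.exists(path)
print(existed, removed)
```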
Complete Feature List
New Operators
Shapefile Masking
mask_by_shapefile(shapefile_path, lat_name="lat", lon_name="lon") - Clip NetCDF to shapefile polygon extent
- Chainable with all other CDO operators
- Returns masked xarray.Dataset
Advanced Utilities
create_mask_from_shapefile(shapefile_path, reference_nc, lat_name, lon_name) - Create reusable binary mask files
- Useful for processing multiple files with same region
- Returns xarray.Dataset with mask variable
Information Operators (Query API)
All CDO information commands are now available as terminating query methods (from v1.0.1, included in this release):
Variable Information:
- .showname() - List variable names
- .showcode() - List variable codes
- .showunit() - List variable units
- .showlevel() - List vertical levels
Time Information:
- .showdate() - List timestamps
- .showtime() - List time values
- .ntime() - Count timesteps
Dataset Information:
- .sinfo() - Structured file info (returns SinfoResult)
- .info() - General info (returns InfoResult)
- .vlist() - Variable list (returns VlistResult)
- .partab() - Parameter table (returns PartabResult)
- .nvar() - Count variables
- .nlevel() - Count levels
Grid Information:
- .griddes() - Grid description (returns GriddesResult)
- .zaxisdes() - Z-axis description (returns ZaxisdesResult)
These methods execute immediately and return structured results, making them convenient for pipeline inspection.
Migration Guide
From v1.0.x to v1.1.0
No breaking changes - v1.1.0 is fully backward compatible with v1.0.x.
Optional: Adopt Shapefile Masking
If you currently mask data manually with separate mask files:
Before (v1.0.x):
# Manual masking with pre-created mask
result = cdo.query("data.nc").select_mask("region_mask.nc").compute()
After (v1.1.0):
# Direct shapefile masking
result = cdo.query("data.nc").mask_by_shapefile("region.shp").compute()
Optional: Install Shapefile Support
# If you need shapefile masking
pip install --upgrade python-cdo-wrapper[shapefiles]
# Or just core package (no shapefile support)
pip install --upgrade python-cdo-wrapper
Installation & Upgrade
New Installation
# Core package
pip install python-cdo-wrapper
# With shapefile support
pip install python-cdo-wrapper[shapefiles]
# All optional features
pip install python-cdo-wrapper[shapefiles,dev,test]
Upgrading from v1.0.x
pip install --upgrade python-cdo-wrapper
# Or with shapefiles:
pip install --upgrade python-cdo-wrapper[shapefiles]
Prerequisites
- Python >= 3.9
- CDO >= 1.9.8 (recommended: >= 2.0.0)
- For shapefile masking: geopandas >= 0.10.0, shapely >= 2.0.0
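Because geopandas and shapely are optional, it can be useful to check for them before attempting shapefile masking. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the check covers importability only, not the minimum versions listed above.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


missing = missing_modules(["geopandas", "shapely"])
if missing:
    print("Shapefile masking unavailable; missing:", ", ".join(missing))
else:
    print("Optional shapefile dependencies are importable.")
```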
Documentation
Updated Documentation
- README.md - Added comprehensive shapefile masking section
- CHANGELOG.md - Detailed changelog for v1.1.0
- examples/shapefile_masking_example.py - 7 usage scenarios
Key Examples
Example 1: Simple Regional Analysis
from python_cdo_wrapper import CDO
cdo = CDO()
# Analyze temperature over Amazon basin
regional_temp = (
    cdo.query("global_temperature.nc")
    .mask_by_shapefile("amazon_basin.shp")
    .year_mean()
    .field_mean()
    .compute()
)
Example 2: Multi-step Climate Analysis
from python_cdo_wrapper import F  # assumption: F is exported alongside CDO
# Calculate regional precipitation anomaly
anomaly = (
    cdo.query("monthly_precip.nc")
    .mask_by_shapefile("sahel.shp")
    .sub(F("climatology.nc"))
    .year_mean()
    .compute()
)
Example 3: Reusable Masks for Multiple Files
from python_cdo_wrapper import create_mask_from_shapefile
# Create mask once
mask_ds = create_mask_from_shapefile(
    shapefile_path="europe.shp",
    reference_nc="template.nc"
)
mask_ds.to_netcdf("europe_mask.nc")
# Reuse for multiple files
file1 = cdo.query("data1.nc").select_mask("europe_mask.nc").compute()
file2 = cdo.query("data2.nc").select_mask("europe_mask.nc").compute()
Performance Notes
Shapefile Masking Performance
For small to medium grids (< 100k points): Masking is fast, typically < 1 second.
For large grids (e.g., global 0.5° = ~260k points): The point-in-polygon test loop can take several seconds.
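The point counts quoted here follow directly from the grid resolution. As a small worked example, assuming a regular global grid with the same spacing in latitude and longitude (the function name is hypothetical):

```python
def global_grid_points(resolution_deg):
    """Number of cells in a regular global lat/lon grid at this spacing."""
    n_lon = round(360 / resolution_deg)  # cells around a full circle of longitude
    n_lat = round(180 / resolution_deg)  # cells from pole to pole
    return n_lon * n_lat


half_degree = global_grid_points(0.5)  # 720 * 360 = 259,200, i.e. roughly 260k
one_degree = global_grid_points(1.0)   # 360 * 180 = 64,800, under the 100k mark

print(half_degree, one_degree)
```

Halving the grid spacing quadruples the number of points that a per-point polygon test has to visit.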
Optimization strategies:
- Pre-create and reuse masks for multiple files with the same region
- Use coarser grids when high resolution isn't needed
- Consider CDO's native region selection (.select_region()) for simple rectangular boxes
Example of mask reuse:
# Create mask once (may be slow for large grids)
mask_ds = create_mask_from_shapefile("region.shp", "data.nc")
mask_ds.to_netcdf("region_mask.nc")
# Reuse for many files (fast)
for file in file_list:
    result = cdo.query(file).select_mask("region_mask.nc").year_mean().compute()
Testing
Test Coverage
- 9 unit tests for shapefile masking (all passing)
- 4 integration tests (require CDO installation)
- Security test verifying secure temp file creation
- Cleanup test verifying automatic temp file deletion
Running Tests
# Install with test dependencies
pip install python-cdo-wrapper[test,shapefiles]
# Run shapefile masking tests
pytest tests/test_shapefile_masking.py -v
# Run all tests
pytest
Known Issues & Limitations
Shapefile Masking
- Large grids: Point-in-polygon tests can be slow (see Performance Notes above)
- Complex polygons: Very complex multi-polygon shapefiles may take longer to process
- CRS requirements: Shapefile must be in geographic coordinates or reprojectable to WGS84
Workarounds
- For repeated analyses on the same region, create and reuse mask files
- For simple rectangular regions, use .select_region() instead
- For very large grids, consider downsampling before masking
Resources
- GitHub Repository: https://github.com/NarenKarthikBM/python-cdo-wrapper
- PyPI Package: https://pypi.org/project/python-cdo-wrapper/
- CDO Documentation: https://code.mpimet.mpg.de/projects/cdo/
- GeoPandas Documentation: https://geopandas.org/
Acknowledgments
Special thanks to:
- The CDO team for the excellent climate data operators
- The geopandas and shapely communities for robust geospatial tools
Full Changelog
For complete details, see CHANGELOG.md.
v1.1.0 Summary
- Added shapefile masking with mask_by_shapefile() operator
- Fixed binary operation bracket generation
- Enhanced security with secure temp file creation
- Improved performance documentation
- Enhanced test coverage for temp file cleanup
- Updated documentation and examples
Questions or Issues? Please open an issue on GitHub: https://github.com/NarenKarthikBM/python-cdo-wrapper/issues