# python-cdo-wrapper v1.0.0 Demo Notebook

**Complete demonstration of the new Django ORM-inspired query API**

---

## Overview

This notebook demonstrates all major features introduced in **python-cdo-wrapper v1.0.0**, a complete architectural overhaul that brings a Django-inspired QuerySet pattern to climate data processing with CDO (Climate Data Operators).

### Key Features Demonstrated

- 🔗 **Lazy Query API** - Build pipelines with chainable methods, execute when ready
- 🔍 **Query Introspection** - Inspect CDO commands before execution
- 🌳 **Query Branching** - Clone queries for multiple analyses
- ✨ **F() Function** - One-liner anomaly calculations (Django F-expression pattern)
- 📊 **150+ Operators** - Selection, statistics, arithmetic, interpolation
- 📦 **Structured Results** - Type-safe dataclasses for info commands
- 🔄 **Backward Compatible** - v0.x API still works!

### Requirements

- **Python**: >= 3.9
- **CDO**: >= 1.9.8
- **python-cdo-wrapper**: >= 1.0.0
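
The CDO requirement can be checked programmatically. The following is a minimal sketch, assuming the installed version is available as a string that contains a dotted version number; the banner text and the helper names `parse_version` and `meets_minimum` are illustrative and not part of python-cdo-wrapper.

```python
import re

MIN_CDO = (1, 9, 8)  # minimum CDO version listed in the requirements


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version number (e.g. '2.4.0') from a string."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text: str, minimum: tuple[int, ...] = MIN_CDO) -> bool:
    """Compare versions as integer tuples, so that 1.10.0 counts as newer than 1.9.8."""
    return parse_version(text) >= minimum


# Example banner text (an assumption; the string your CDO build reports may differ).
banner = "Climate Data Operators version 2.4.0 (https://mpimet.mpg.de/cdo)"
print(parse_version(banner), meets_minimum(banner))
```

Comparing integer tuples rather than raw strings avoids the classic pitfall where `"1.10.0" < "1.9.8"` in a plain string comparison.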

### Installation

```bash
# Install python-cdo-wrapper (the quotes stop the shell from treating ">" as a redirect)
pip install "python-cdo-wrapper>=1.0.0"

# Install CDO if needed
# macOS: brew install cdo
# Linux: sudo apt install cdo
# Conda: conda install -c conda-forge cdo
```

---

In [None]:
"""
Import required libraries and check environment.
"""

# Third-party imports

import numpy as np
import xarray as xr

# Import python-cdo-wrapper v1.0.0 API
from python_cdo_wrapper import CDO, F
from python_cdo_wrapper.types import GridSpec

In [None]:
"""
Check the CDO installation and print its version.
The F() function needs CDO >= 1.9.8, so compare the printed version against that.
"""

from python_cdo_wrapper.utils import get_cdo_version

try:
    version = get_cdo_version()
    print(f"✅ CDO installed: {version}")
except Exception as e:
    print(f"❌ CDO not found: {e}")

### Create Sample Climate Data

We'll generate synthetic NetCDF files with realistic climate data for demonstration purposes:

- **sample_data.nc** - 3 years of daily data (2020-2022)
  - `tas`: Temperature on 4 pressure levels (1000, 850, 500, 250 hPa)
  - `pr`: Precipitation (surface)
  - Grid: 2° × 2° global coverage
  
- **climatology.nc** - Time mean for anomaly calculations

In [None]:
"""
Create synthetic climate data for demonstration.
Generates realistic temperature and precipitation fields.
"""


def create_sample_data(filename="sample_data.nc"):
    """
    Create synthetic climate data with spatial and temporal dimensions.

    Returns:
        xr.Dataset: The created dataset
    """

    # Define dimensions
    lon = np.arange(0, 360, 2)  # 180 longitude points (2° resolution)
    lat = np.arange(-90, 91, 2)  # 91 latitude points (2° resolution)
    time = xr.cftime_range(
        start="2020-01-01", end="2022-12-31", freq="D"
    )  # 3 years daily (1096 days, because 2020 is a leap year)
    level = [1000, 850, 500, 250]  # Pressure levels in hPa

    # Create temperature field with realistic patterns
    lon_grid, lat_grid = np.meshgrid(lon, lat)

    # Base temperature: warm at equator, cold at poles
    base_temp = 288 - 50 * np.abs(lat_grid) / 90

    # Add seasonal cycle (10K amplitude)
    seasonal_cycle = 10 * np.sin(2 * np.pi * np.arange(len(time)) / 365)

    # Create 4D temperature array (time, level, lat, lon)
    tas_data = np.zeros((len(time), len(level), len(lat), len(lon)))

    for t in range(len(time)):
        for lev_idx, lev in enumerate(level):
            # Illustrative cooling toward lower pressure (0.0065 K per hPa; not a physical lapse rate)
            temp_at_level = base_temp - (1000 - lev) * 0.0065
            # Add seasonal cycle
            temp_at_level = temp_at_level + seasonal_cycle[t]
            # Add random noise
            temp_at_level = temp_at_level + np.random.randn(len(lat), len(lon)) * 2
            tas_data[t, lev_idx, :, :] = temp_at_level

    # Create xarray Dataset with CF conventions
    ds = xr.Dataset(
        {
            "tas": (
                ["time", "level", "lat", "lon"],
                tas_data,
                {
                    "long_name": "Near-Surface Air Temperature",
                    "units": "K",
                    "standard_name": "air_temperature",
                },
            ),
            "pr": (
                ["time", "lat", "lon"],
                np.random.exponential(3, (len(time), len(lat), len(lon))),
                {
                    "long_name": "Precipitation",
                    "units": "mm/day",
                    "standard_name": "precipitation_flux",
                },
            ),
        },
        coords={
            "time": time,
            "level": level,
            "lat": lat,
            "lon": lon,
        },
        attrs={
            "title": "Sample Climate Data",
            "institution": "Python CDO Wrapper Demo",
            "source": "Synthetic data for demonstration",
            "Conventions": "CF-1.6",
        },
    )

    # Save to NetCDF
    ds.to_netcdf(filename)

    print(f"✅ Created {filename}")
    print("   📐 Dimensions:")
    print(
        f"      - Time:  {len(time)} timesteps "
        f"({time[0].strftime('%Y-%m-%d')} to {time[-1].strftime('%Y-%m-%d')})"
    )
    print(f"      - Level: {len(level)} pressure levels")
    print(f"      - Lat:   {len(lat)} points (-90° to 90°)")
    print(f"      - Lon:   {len(lon)} points (0° to 358°)")
    print("   📊 Variables:")
    print("      - tas: 4D temperature field (K)")
    print("      - pr:  3D precipitation field (mm/day)")

    return ds


# Create the sample dataset
print("🔄 Generating sample climate data...\n")
sample_ds = create_sample_data("sample_data.nc")

---

## 1. Django ORM-Style Query API

The core innovation of v1.0.0 is the **Django QuerySet-inspired API**. Just like Django's ORM makes database queries intuitive and chainable, our query API makes CDO operations readable and composable.

### Key Concepts

- **Lazy Evaluation**: Operations are queued, not executed immediately
- **Chainable Methods**: Build complex pipelines step by step
- **Immutable Queries**: Each method returns a new query instance
- **Terminal Operations**: Call `.compute()` to execute the pipeline

This approach is inspired by Django's QuerySet pattern, which allows you to build complex database queries programmatically.
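
To make these concepts concrete, here is a minimal sketch of how a lazy, immutable query object could collect operations and render a CDO command only on request. `MiniQuery` is a hypothetical stand-in written for this illustration, not the library's implementation, and the rendered command omits the output file that a real CDO call needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniQuery:
    """Hypothetical stand-in for a lazy query; not the library's class."""

    input_file: str
    operators: tuple[str, ...] = ()

    def _with(self, fragment: str) -> "MiniQuery":
        # Immutable: return a new object instead of modifying this one.
        return MiniQuery(self.input_file, self.operators + (fragment,))

    def select_var(self, *names: str) -> "MiniQuery":
        return self._with("-selname," + ",".join(names))

    def select_level(self, *levels: int) -> "MiniQuery":
        return self._with("-sellevel," + ",".join(str(lev) for lev in levels))

    def year_mean(self) -> "MiniQuery":
        return self._with("-yearmean")

    def get_command(self) -> str:
        # In a CDO operator chain the rightmost operator runs first,
        # so the operation added last appears first on the command line.
        return " ".join(["cdo", *reversed(self.operators), self.input_file])


# Nothing is executed here; the chain only records what should happen.
mini = MiniQuery("sample_data.nc").select_var("tas").select_level(850).year_mean()
print(mini.get_command())
# cdo -yearmean -sellevel,850 -selname,tas sample_data.nc
```

Because every method returns a new object, a partially built query can be reused safely as the starting point for several pipelines.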

### 1.1 Basic Query Building (Lazy Evaluation)

Build a query **without executing it** - just like Django QuerySets!

In [None]:
"""
Demonstrate lazy query building.
Operations are added to the query but not executed until .compute() is called.
"""

# Initialize CDO instance
cdo = CDO()

# Build a query pipeline - NO EXECUTION YET!
query = (
    cdo.query("sample_data.nc")  # Start with input file
    .select_var("tas")  # Select temperature variable
    .select_level(850)  # Select 850 hPa level
    .select_year(2020, 2021)  # Select years 2020-2021
    .year_mean()  # Compute annual means
)

print("✅ Query built successfully!")
print(f"\n📋 Query type: {type(query).__name__}")
print(f"🔧 Query object: {query}")
print("\n💡 Note: No CDO command has been executed yet.")
print("   The query is lazy - it will only run when you call .compute()")

### 1.2 Query Execution

Call `.compute()` to execute the query and get results.

In [None]:
"""
Execute the query and get results as xarray Dataset.
"""

# NOW execute the query
result = query.compute()

print("✅ Query executed successfully!")
print(f"\n📦 Result type: {type(result)}")
print(f"\n📐 Dimensions: {dict(result.sizes)}")
print(f"📊 Variables: {list(result.data_vars)}")
print(f"📍 Coordinates: {list(result.coords)}")
print(f"\n🌡️  Mean temperature: {float(result.tas.mean().values):.2f} K")

### 1.3 Complex Query Chaining

Chain multiple operations to build sophisticated analysis pipelines.

In [None]:
"""
Build a complex multi-step analysis pipeline.
This example extracts European winter temperature climatology.
"""

# Complex pipeline with multiple selections and aggregations
complex_query = (
    cdo.query("sample_data.nc")
    .select_var("tas")  # Select temperature
    .select_level(1000)  # Surface pressure level
    .select_region(lon1=-10, lon2=40, lat1=35, lat2=70)  # Europe bounding box
    .select_season("DJF")  # Winter (Dec-Jan-Feb)
    .year_mean()  # Calendar-year means of the DJF days (Jan, Feb and Dec of the same year)
    .field_mean()  # Spatial average
)

# Execute the pipeline
europe_winter = complex_query.compute()

print("✅ European winter temperature (DJF) computed!")
print(f"\n📐 Result shape: {dict(europe_winter.sizes)}")
print("🌡️  Mean winter temperatures:")
for i, temp in enumerate(europe_winter.tas.values.ravel()):
    year = 2020 + i
    print(f"   {year}: {float(temp):.2f} K ({float(temp) - 273.15:.2f} °C)")

---

## 2. Query Introspection

**Inspect queries before execution** - see exactly what CDO command will run!

This is one of the most powerful features of v1.0.0. You can:
- Preview the generated CDO command
- Get a human-readable explanation
- List all operations in the pipeline

Perfect for debugging and understanding complex queries.

In [None]:
"""
Inspect a query to see what will be executed.
Use this to debug and understand complex pipelines.
"""

# Build a query
query = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .select_level(850)
    .select_year(2020, 2021, 2022)
    .year_mean()
    .field_mean()
)

# 1. Get the raw CDO command
print("🔍 Generated CDO Command:")
print("=" * 70)
print(query.get_command())

# 2. Get human-readable explanation
print("\n📋 Human-Readable Explanation:")
print("=" * 70)
print(query.explain())

# 3. List all operations in the pipeline
print("\n🔧 Pipeline Operations:")
print("=" * 70)
for i, op in enumerate(query.get_operations(), 1):
    print(f"  {i}. {op.name}: {op.to_cdo_fragment()}")

print("\n💡 This transparency helps you understand and debug complex queries!")

---

## 3. Query Branching

**Create variations from a base query** using `.clone()`!

This is powerful for comparative analyses - define the common processing once, then branch for different aggregations or regions.
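
Branching is only safe when a branch cannot modify its base. The sketch below models pipelines as plain Python sequences of CDO operator fragments (an illustration only, not the library's internals) and contrasts independent branches with the aliasing pitfall that copying avoids.

```python
# Pipelines modelled as tuples of CDO operator fragments (illustrative only).
base_ops = ("-selname,tas", "-sellevel,1000")

# Each branch extends the shared base without touching it.
annual = base_ops + ("-yearmean",)
seasonal = base_ops + ("-seasmean",)
print(base_ops)  # unchanged by either branch

# Pitfall: two names for one mutable list share every later change.
shared = ["-selname,tas"]
branch_a = shared  # same list object, not a copy
branch_a.append("-yearmean")
print(shared)  # the "base" now contains -yearmean as well

# Copying before extending keeps the base intact.
safe = ["-selname,tas"]
branch_b = list(safe)
branch_b.append("-seasmean")
print(safe)
```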

In [None]:
"""
Demonstrate query branching for comparative analyses.
Create a base query, then clone it for different temporal aggregations.
"""

# Create a base query with common operations
base = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .select_level(1000)  # Surface level
    .select_region(lon1=-10, lon2=40, lat1=35, lat2=70)  # Europe
)

print("🌳 Base Query Created")
print("=" * 70)
print("Description: European surface temperature")
print(f"Command: {base.get_command()}")

# Branch 1: Annual means
print("\n📊 Branch 1: Annual Means")
print("-" * 70)
annual_mean = base.clone().year_mean().field_mean().compute()
print(f"✅ Computed annual means for {len(annual_mean.time)} years")
print(f"   Values: {[f'{float(v):.2f}' for v in annual_mean.tas.values.ravel()]} K")

# Branch 2: Monthly means
print("\n📊 Branch 2: Monthly Means")
print("-" * 70)
monthly_mean = base.clone().month_mean().field_mean().compute()
print("✅ Computed monthly means")
print(f"   Shape: {dict(monthly_mean.sizes)}")
print(f"   Coldest month: {float(monthly_mean.tas.min().values):.2f} K")
print(f"   Warmest month: {float(monthly_mean.tas.max().values):.2f} K")

# Branch 3: Seasonal means
print("\n📊 Branch 3: Seasonal Means")
print("-" * 70)
seasonal_mean = base.clone().season_mean().field_mean().compute()
print("✅ Computed seasonal means")
print(f"   Shape: {dict(seasonal_mean.sizes)}")

print("\n" + "=" * 70)
print("🎯 Three different analyses from one base query!")
print("💡 This avoids code duplication and ensures consistency.")

---

## 4. F() Function: One-Liner Anomaly Calculations

**The game-changer**: Calculate anomalies in a single line using the `F()` function!

Inspired by Django's F-expressions, the `F()` function enables binary operations between datasets. This is particularly powerful for anomaly calculations.

**Requirements**: CDO >= 1.9.8 (uses bracket notation internally)

### How it Works

```python
# Traditional approach (multiple steps):
data = cdo.select_var("sample.nc", "tas")
clim = cdo.time_mean(data)
anomaly = cdo.sub(data, clim)

# v1.0.0 approach (one line!):
anomaly = cdo.query("sample.nc").sub(F("climatology.nc")).compute()
```

### 4.1 Simple Anomaly Calculation

Calculate anomalies from climatology in one line!

In [None]:
"""
ONE-LINER ANOMALY CALCULATION!
Subtract climatology from data using F() function.
"""

# Create the climatology file (time mean of tas) that F() will reference
climatology = cdo.query("sample_data.nc").select_var("tas").time_mean().compute()
climatology.to_netcdf("climatology.nc")

# Calculate anomaly: data - climatology
anomaly_query = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .sub(F("climatology.nc"))  # F() references another file
)
anomaly = anomaly_query.compute()

print("✅ Anomaly calculated in ONE LINE!")
print("\n📊 Anomaly Statistics:")
print(f"   Mean: {float(anomaly.tas.mean().values):>8.3f} K  (should be ~0)")
print(f"   Std:  {float(anomaly.tas.std().values):>8.3f} K")
print(f"   Min:  {float(anomaly.tas.min().values):>8.3f} K")
print(f"   Max:  {float(anomaly.tas.max().values):>8.3f} K")

print("\n🔍 What happened under the hood:")
print(f"   CDO command: {anomaly_query.get_command()}")
print("\n💡 With F(), complex operations are readable and concise!")

### 4.2 Processing Both Sides Before Subtraction

F() supports chaining! Process both datasets before the binary operation.

In [None]:
"""
Calculate temperature difference between pressure levels.
Both sides are processed before subtraction.
"""

# Calculate: (1000 hPa mean) - (500 hPa mean)
temp_diff_query = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .select_level(1000)  # Left side: surface level
    .time_mean()
    .sub(
        F("sample_data.nc")  # Right side: upper level
        .select_var("tas")
        .select_level(500)
        .time_mean()
    )
)
temp_diff = temp_diff_query.compute()

print("✅ Temperature Difference: 1000 hPa - 500 hPa")
print("\n📊 Vertical Temperature Difference:")
print(f"   Mean difference: {float(temp_diff.tas.mean().values):>7.1f} K")
print(f"   Min difference:  {float(temp_diff.tas.min().values):>7.1f} K")
print(f"   Max difference:  {float(temp_diff.tas.max().values):>7.1f} K")

print("\n🔍 Generated CDO command with operator chaining:")
print(f"   {temp_diff_query.get_command()}")
print("\n💡 F() handles complex nested operations in a single CDO command!")

### 4.3 Standardized Anomaly Calculation

Chain multiple F() operations: `(data - mean) / std`

**How it works**: Binary operations use CDO's operator chaining - all operations execute in a single command without temporary files.
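
The arithmetic behind `(data - mean) / std` can be checked on a tiny example. This is a plain-Python sketch for a single grid point with made-up values; it uses the population standard deviation (division by N), and which variant the wrapper's `time_std()` uses is not stated in this notebook.

```python
from statistics import fmean, pstdev

# Hypothetical temperature time series at one grid point (K).
series = [280.0, 284.0, 290.0, 286.0, 280.0]

mean = fmean(series)  # plays the role of climatology.nc
std = pstdev(series)  # plays the role of std_dev.nc (population std, divides by N)

anomaly = [value - mean for value in series]
standardized = [a / std for a in anomaly]

print(anomaly)
print(f"mean of standardized anomaly: {fmean(standardized):.6f}")
print(f"std of standardized anomaly:  {pstdev(standardized):.6f}")
```

By construction the standardized series has mean 0 and standard deviation 1 at every grid point, which is what makes values comparable across variables with different units.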

In [None]:
"""
Calculate standardized anomaly: (data - mean) / std
This is common in climate analysis for comparing different variables.
"""

# First, create standard deviation file
print("🔄 Creating standard deviation file...\n")
std_dev = cdo.query("sample_data.nc").select_var("tas").time_std().compute()
std_dev.to_netcdf("std_dev.nc")
print("✅ Created std_dev.nc\n")

# Calculate standardized anomaly: (data - mean) / std
std_anomaly = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .sub(F("climatology.nc"))  # Subtract mean
    .div(F("std_dev.nc"))  # Divide by standard deviation
    .compute()
)

print("✅ Standardized Anomaly Calculated!")
print("\n📊 Standardized Anomaly Statistics:")
print(f"   Mean: {float(std_anomaly.tas.mean().values):>8.3f}  (should be ~0)")
print(f"   Std:  {float(std_anomaly.tas.std().values):>8.3f}  (should be ~1)")
print(f"   Min:  {float(std_anomaly.tas.min().values):>8.3f}")
print(f"   Max:  {float(std_anomaly.tas.max().values):>8.3f}")

print("\n💡 Standardized anomalies allow fair comparison across variables!")

---

## 5. Selection Operators

**18 selection operators** for filtering data by:
- Variables, levels, time dimensions
- Spatial regions
- Masks and conditions

Multi-argument selections such as `select_region()` use **named parameters** for clarity and type safety.

In [None]:
# Variable selection
print("📊 Selection Operators Demo\n")

# 1. Select variables
print("1️⃣ Select variables:")
ds = cdo.query("sample_data.nc").select_var("tas", "pr").compute()
print(f"   Variables: {list(ds.data_vars)}")

# 2. Select vertical levels
print("\n2️⃣ Select vertical levels:")
ds = cdo.query("sample_data.nc").select_var("tas").select_level(850, 500).compute()
print(f"   Levels: {list(ds.level.values)}")

# 3. Select years
print("\n3️⃣ Select years:")
ds = cdo.query("sample_data.nc").select_year(2020, 2021).compute()
print(f"   Time range: {ds.time[0].values} to {ds.time[-1].values}")

# 4. Select months
print("\n4️⃣ Select specific months (JJA):")
ds = cdo.query("sample_data.nc").select_month(6, 7, 8).compute()
print(f"   Timesteps: {len(ds.time)} (summer months only)")

# 5. Select seasons
print("\n5️⃣ Select seasons:")
ds = cdo.query("sample_data.nc").select_season("DJF", "JJA").compute()
print(f"   Timesteps: {len(ds.time)} (winter and summer)")

# 6. Select spatial region
print("\n6️⃣ Select spatial region (Europe):")
ds = (
    cdo.query("sample_data.nc")
    .select_region(lon1=-10, lon2=40, lat1=35, lat2=70)
    .compute()
)
print(
    f"   Lat range: {float(ds.lat.min().values):.1f} to {float(ds.lat.max().values):.1f}"
)
print(
    f"   Lon range: {float(ds.lon.min().values):.1f} to {float(ds.lon.max().values):.1f}"
)

print("\n✅ All selection operators working!")

---

## 6. Statistical Operators

**50+ statistical operators** across multiple dimensions:

- **Temporal**: time, year, month, day, hour, season statistics
- **Spatial**: field, zonal, meridional statistics  
- **Vertical**: vertical integration, averaging
- **Running**: moving window statistics
- **Percentiles**: temporal and spatial percentiles

All statistics support: mean, sum, min, max, std, var, range
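
It helps to know which axis each family of statistics reduces. The sketch below shows a zonal mean (over longitude) and an area-weighted field mean (over latitude and longitude) in plain Python. The cos(latitude) weighting is a common approximation of grid-cell area on a regular lon-lat grid; CDO's field mean is area-weighted, but the helper functions and numbers here are illustrative, not the wrapper's code.

```python
import math


def field_mean(field, lats_deg):
    """Area-weighted mean of field[lat][lon] on a regular lon-lat grid.

    Each latitude row is weighted by cos(latitude), which is proportional
    to the area of the grid cells in that row.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats_deg):
        weight = math.cos(math.radians(lat))
        weighted_sum += weight * sum(row)
        weight_total += weight * len(row)
    return weighted_sum / weight_total


def zonal_mean(field):
    """Mean over longitude: one value per latitude row."""
    return [sum(row) / len(row) for row in field]


lats = [-60.0, 0.0, 60.0]
field = [
    [250.0, 250.0],  # 60S
    [300.0, 300.0],  # equator
    [250.0, 250.0],  # 60N
]

unweighted = sum(sum(row) for row in field) / 6
print(zonal_mean(field))
print(f"area-weighted: {field_mean(field, lats):.2f}, unweighted: {unweighted:.2f}")
```

The weighted mean (275 K) is warmer than the unweighted one (about 266.67 K) because the equatorial row represents more surface area than the rows at 60° latitude.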

In [None]:
print("📊 Statistical Operators Demo\n")

# Base query
base = cdo.query("sample_data.nc").select_var("tas").select_level(1000)

# 1. Time statistics
print("1️⃣ Time statistics:")
time_mean = base.clone().time_mean().compute()
print(f"   Time mean shape: {time_mean.dims}")
time_std = base.clone().time_std().compute()
print(f"   Time std shape: {time_std.dims}")

# 2. Year statistics
print("\n2️⃣ Year statistics:")
year_mean = base.clone().year_mean().compute()
print(f"   Annual means: {len(year_mean.time)} years")

# 3. Month statistics
print("\n3️⃣ Monthly means:")
month_mean = base.clone().month_mean().compute()
print(f"   Monthly means: {len(month_mean.time)} months")

# 4. Season statistics
print("\n4️⃣ Seasonal statistics:")
season_mean = base.clone().season_mean().compute()
print(f"   Seasonal shape: {season_mean.dims}")

# 5. Field (spatial) statistics
print("\n5️⃣ Field statistics:")
field_mean = base.clone().field_mean().compute()
print(f"   Field mean shape: {field_mean.dims} (spatial dims removed)")
print(f"   Global mean temp: {float(field_mean.tas.mean().values):.2f} K")

# 6. Zonal mean
print("\n6️⃣ Zonal mean:")
zonal = base.clone().zonal_mean().compute()
print(f"   Zonal mean shape: {zonal.dims}")

# 7. Vertical statistics
print("\n7️⃣ Vertical integration:")
vert_int = cdo.query("sample_data.nc").select_var("tas").vert_int().compute()
print(f"   Vertical integral shape: {vert_int.dims}")

print("\n✅ All statistical operators working!")

---

## 7. Arithmetic Operators

**30+ arithmetic operators** for mathematical operations:

- **Constants**: Add, subtract, multiply, divide by constants
- **Binary operations**: Between datasets using F()
- **Math functions**: abs, sqrt, exp, ln, log10, trigonometric
- **Masking**: Conditional operations with ifthen/ifthenelse

In [None]:
print("🔢 Arithmetic Operators Demo\n")

base = cdo.query("sample_data.nc").select_var("tas")

# 1. Constant arithmetic
print("1️⃣ Convert Kelvin to Celsius:")
celsius = base.clone().sub_constant(273.15).compute()
print(f"   Original: {float(base.clone().compute().tas.mean().values):.2f} K")
print(f"   Celsius:  {float(celsius.tas.mean().values):.2f} °C")

# 2. Multiply by constant
print("\n2️⃣ Scale by factor:")
scaled = base.clone().mul_constant(1.1).compute()
print("   Scaled by 10%")

# 3. Math functions
print("\n3️⃣ Math functions:")

# Absolute value
anomaly_abs = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .sub(F("climatology.nc"))
    .abs()
    .compute()
)
print(f"   Absolute anomaly mean: {float(anomaly_abs.tas.mean().values):.2f} K")

# Square
squared = cdo.query("sample_data.nc").select_var("pr").sqr().compute()
print(f"   Squared precipitation shape: {squared.dims}")

print("\n✅ Arithmetic operators working!")

---

## 8. Interpolation and Regridding

**9 interpolation operators** for spatial transformations:

- **Horizontal**: bilinear, bicubic, nearest-neighbor, distance-weighted, conservative
- **Vertical**: level interpolation, model-to-pressure level conversion
- **Grid specifications**: Use `GridSpec` class for target grids
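
A target grid for remapping is described to CDO by a small text file of `key = value` lines. The sketch below builds such a description for a global regular lon-lat grid using the same parameters as the `GridSpec` example in this section; the helper `lonlat_grid_description` is hypothetical, and the exact text produced by `GridSpec.to_cdo_string()` may differ.

```python
def lonlat_grid_description(resolution_deg: float) -> str:
    """Build a CDO-style description of a global regular lon-lat grid.

    Latitudes refer to cell centres, so a 5 degree grid starts at -87.5.
    """
    fields = {
        "gridtype": "lonlat",
        "xsize": round(360 / resolution_deg),
        "ysize": round(180 / resolution_deg),
        "xfirst": 0,
        "xinc": resolution_deg,
        "yfirst": -90 + resolution_deg / 2,
        "yinc": resolution_deg,
    }
    return "\n".join(f"{key} = {value}" for key, value in fields.items()) + "\n"


print(lonlat_grid_description(5))
```

Deriving `xsize`, `ysize` and `yfirst` from the resolution keeps the three values consistent, which is easy to get wrong when they are typed by hand.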

In [None]:
print("🗺️  Interpolation and Regridding Demo\n")

# Original grid
original = cdo.query("sample_data.nc").select_var("tas").compute()
print(f"Original grid: {len(original.lon)} × {len(original.lat)}")

# 1. Bilinear interpolation to coarser grid
print("\n1️⃣ Bilinear interpolation to 5° × 5°:")
grid_5deg = GridSpec(
    gridtype="lonlat",
    xsize=72,  # 360/5
    ysize=36,  # 180/5
    xfirst=0,
    xinc=5,
    yfirst=-87.5,
    yinc=5,
)

# Save grid spec to file
with open("grid_5deg.txt", "w") as f:
    f.write(grid_5deg.to_cdo_string())

regridded = (
    cdo.query("sample_data.nc").select_var("tas").remap_bil("grid_5deg.txt").compute()
)
print(f"   Regridded: {len(regridded.lon)} × {len(regridded.lat)}")

# 2. Conservative remapping (better for flux variables)
print("\n2️⃣ Conservative remapping for precipitation:")
pr_regrid = (
    cdo.query("sample_data.nc").select_var("pr").remap_con("grid_5deg.txt").compute()
)
print(f"   Precipitation regridded: {pr_regrid.dims}")
# A plain sum of grid values is not an area integral; compare area-weighted
# means before and after remapping to check conservation.
print(f"   Sum of regridded values: {float(pr_regrid.pr.sum().values):.1f}")

print("\n✅ Interpolation operators working!")

---

## 9. Structured Info Commands

**Type-safe result objects** instead of strings!

Get file information as **Python dataclasses** with:
- Full type hints and IDE autocompletion
- Helper methods and properties
- No manual string parsing

Available info commands:
- `sinfo()` - Complete file information
- `griddes()` - Grid descriptions
- `zaxisdes()` - Vertical coordinate info
- `vlist()` - Variable list
- `partab()` - Parameter table
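
The benefit of structured results is easiest to see next to the parsing they replace. The sketch below turns `key = value` text into a typed dataclass; `GridInfo`, `parse_grid` and the abbreviated sample text are hypothetical illustrations, and the library's own result classes may have different names and fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridInfo:
    """Hypothetical typed container; the library's own classes may differ."""

    gridtype: str
    xsize: int
    ysize: int
    xinc: float
    yinc: float

    @property
    def npoints(self) -> int:
        """Helper property derived from the parsed fields."""
        return self.xsize * self.ysize


def parse_grid(text: str) -> GridInfo:
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    raw = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        raw[key] = value
    return GridInfo(
        gridtype=raw["gridtype"],
        xsize=int(raw["xsize"]),
        ysize=int(raw["ysize"]),
        xinc=float(raw["xinc"]),
        yinc=float(raw["yinc"]),
    )


# Abbreviated, illustrative grid description for the 2 degree sample grid.
sample = """
# gridID 1
gridtype = lonlat
xsize    = 180
ysize    = 91
xinc     = 2
yinc     = 2
"""
grid_info = parse_grid(sample)
print(grid_info.gridtype, grid_info.xsize, grid_info.ysize, grid_info.npoints)
```

Once the values live in typed fields, downstream code can use `grid_info.xsize` directly instead of re-parsing strings, and editors can autocomplete the attribute names.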

In [None]:
print("ℹ️  Structured Info Commands Demo\n")

# 1. sinfo - Complete file information
print("1️⃣ File Information (sinfo):")
info = cdo.sinfo("sample_data.nc")
print(f"   Type: {type(info).__name__}")
print(f"   File format: {info.file_format}")
print(f"   Number of variables: {info.nvar}")
print(f"   Variable names: {info.var_names}")
print(f"   Time range: {info.time_range}")

print("\n   Variables details:")
for var in info.variables:
    if var.name:
        print(f"     - {var.name}: {var.longname} [{var.units}]")

# 2. griddes - Grid information
print("\n2️⃣ Grid Information (griddes):")
grid = cdo.griddes("sample_data.nc")
print(f"   Type: {type(grid).__name__}")
print(f"   Number of grids: {len(grid.grids)}")
if grid.grids:
    g = grid.grids[0]
    print(f"   Grid type: {g.gridtype}")
    print(f"   Size: {g.xsize} × {g.ysize}")
    if hasattr(g, "xinc") and g.xinc:
        print(f"   Resolution: {g.xinc}° × {g.yinc}°")

# 3. vlist - Variable list
print("\n3️⃣ Variable List (vlist):")
vlist = cdo.vlist("sample_data.nc")
print(f"   Type: {type(vlist).__name__}")
print("   Variables:")
for var in vlist.variables:
    print(f"     - {var.name}: Code {var.code}")

# 4. partab - Parameter table
print("\n4️⃣ Parameter Table (partab):")
partab = cdo.partab("sample_data.nc")
print(f"   Type: {type(partab).__name__}")
print(f"   Number of parameters: {len(partab.parameters)}")
for param in partab.parameters[:3]:  # Show first 3
    print(f"     - Code {param.code}: {param.name} [{param.units}]")

print("\n✅ All structured info commands working!")
print("\n💡 Benefits:")
print("   - Type-safe access to all fields")
print("   - IDE autocompletion")
print("   - No string parsing required")
print("   - Helper methods (e.g., info.var_names, info.time_range)")

---

## 10. Advanced Query Methods

**Django-inspired shortcuts** for common query operations:

- `.first()` - Get first timestep
- `.last()` - Get last timestep  
- `.count()` - Count timesteps (returns int)
- `.exists()` - Check if data exists (returns bool)
- `.values(*vars)` - Alias for select_var()

In [None]:
print("🚀 Advanced Query Methods Demo\n")

base_query = cdo.query("sample_data.nc").select_var("tas")

# 1. first() - Get first timestep
print("1️⃣ Get first timestep:")
first = base_query.clone().first()
print(f"   Shape: {first.dims}")
print(f"   Time: {first.time.values}")

# 2. last() - Get last timestep
print("\n2️⃣ Get last timestep:")
last = base_query.clone().last()
print(f"   Shape: {last.dims}")
print(f"   Time: {last.time.values}")

# 3. count() - Get number of timesteps
print("\n3️⃣ Count timesteps:")
n_timesteps = base_query.clone().count()
print(f"   Number of timesteps: {n_timesteps}")
print(f"   Type: {type(n_timesteps)}")

# 4. exists() - Check if data exists
print("\n4️⃣ Check if data exists:")
exists = base_query.clone().exists()
print(f"   Data exists: {exists}")
print(f"   Type: {type(exists)}")

# 5. values() - Alias for select_var()
print("\n5️⃣ Select variables with .values():")
ds = cdo.query("sample_data.nc").values("tas", "pr").compute()
print(f"   Variables: {list(ds.data_vars)}")

print("\n✅ Advanced query methods working!")

---

## 11. File Operations

**Merge, split, and manage** NetCDF files:

- **Merge**: Combine files by time or variables
- **Split**: Separate by year, month, day, hour, variable, level
- **Concatenate**: Join multiple files
- **Copy**: Duplicate with optional format conversion

Merge operations return xarray Datasets for immediate use; split operations write their output files to disk.

In [None]:
print("📁 File Operations Demo\n")

# 1. Split by year
print("1️⃣ Split by year:")
cdo.splityear("sample_data.nc", prefix="year_")
print("   Created: year_2020.nc, year_2021.nc, year_2022.nc")

# 2. Merge time series
print("\n2️⃣ Merge time series:")
merged = cdo.mergetime("year_2020.nc", "year_2021.nc", "year_2022.nc")
print(f"   Merged shape: {merged.dims}")
print(f"   Time range: {merged.time[0].values} to {merged.time[-1].values}")

# 3. Split by variable
print("\n3️⃣ Split by variable:")
cdo.splitname("sample_data.nc", prefix="var_")
print("   Created: var_tas.nc, var_pr.nc")

# 4. Merge variables
print("\n4️⃣ Merge variables:")
merged_vars = cdo.merge("var_tas.nc", "var_pr.nc")
print(f"   Variables: {list(merged_vars.data_vars)}")

print("\n✅ File operations working!")

---

## 12. Real-World Example: Complete Analysis Workflow

**Putting it all together**: A complete climate analysis workflow.

This example demonstrates a realistic analysis combining:
- Regional selection (Europe)
- Temporal filtering (winter season)
- Multiple aggregations (annual, spatial, climatology)
- Anomaly calculations with F()
- Variability analysis

**Goal**: Analyze European winter temperature trends and variability
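
One subtlety of combining a DJF selection with a calendar-year mean is worth spelling out: a year mean groups timesteps by calendar year, so January and February are averaged with the December that follows them rather than the one that precedes them. The plain-Python sketch below counts the DJF days of 2020-2022 under both groupings; it illustrates the grouping logic only and does not call CDO.

```python
from collections import Counter
from datetime import date, timedelta


def daily_dates(start: date, end: date):
    """Yield every day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


djf_days = [
    d for d in daily_dates(date(2020, 1, 1), date(2022, 12, 31))
    if d.month in (12, 1, 2)
]

# Calendar-year grouping: Jan-Feb are combined with the December of the same year.
by_calendar_year = Counter(d.year for d in djf_days)

# Winter grouping: December is assigned to the year of the following Jan-Feb.
by_winter = Counter(d.year + 1 if d.month == 12 else d.year for d in djf_days)

print(dict(by_calendar_year))  # {2020: 91, 2021: 90, 2022: 90}
print(dict(by_winter))         # {2020: 60, 2021: 90, 2022: 90, 2023: 31}
```

With only three years of data the first and last winters are incomplete under the winter grouping, which is one reason short records such as this demo are not suited to trend estimates.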

In [None]:
print("🌍 Real-World Analysis: European Winter Temperature Trends\n")
print("=" * 60)

# Step 1: Extract European winter (DJF) surface temperature
print("\n📍 Step 1: Extract European winter temperature")
europe_winter = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .select_level(1000)
    .select_region(lon1=-10, lon2=40, lat1=35, lat2=70)
    .select_season("DJF")
)

print(f"   Command: {europe_winter.get_command()[:80]}...")

# Step 2: Calculate annual winter means
print("\n📊 Step 2: Calculate annual winter means")
annual_winter = europe_winter.clone().year_mean().field_mean().compute()
print(f"   Years: {len(annual_winter.time)}")
print("   Mean temperatures:")
for i, temp in enumerate(annual_winter.tas.values.ravel()):
    year = 2020 + i
    print(f"     {year}: {float(temp):.2f} K ({float(temp) - 273.15:.2f} °C)")

# Step 3: Calculate spatial pattern
print("\n🗺️  Step 3: Calculate winter climatology (spatial pattern)")
winter_clim = europe_winter.clone().time_mean().compute()
print(f"   Climatology shape: {winter_clim.dims}")
print(f"   Spatial mean: {float(winter_clim.tas.mean().values):.2f} K")
print(f"   Spatial std:  {float(winter_clim.tas.std().values):.2f} K")

# Step 4: Calculate anomalies from climatology
print("\n📉 Step 4: Calculate winter anomalies")
winter_clim.to_netcdf("europe_winter_clim.nc")

anomalies = (
    europe_winter.clone()
    .sub(F("europe_winter_clim.nc"))
    .year_mean()
    .field_mean()
    .compute()
)

print("   Anomalies (relative to 3-year mean):")
for i, anom in enumerate(anomalies.tas.values.ravel()):
    year = 2020 + i
    sign = "+" if float(anom) >= 0 else ""
    print(f"     {year}: {sign}{float(anom):.3f} K")

# Step 5: Calculate variability
print("\n📊 Step 5: Analyze temporal variability at each grid point")
winter_std = europe_winter.clone().time_std().compute()
print(f"   Temporal std (spatial map): {winter_std.dims}")
print(f"   Mean variability: {float(winter_std.tas.mean().values):.2f} K")
print(f"   Max variability:  {float(winter_std.tas.max().values):.2f} K")

print("\n" + "=" * 60)
print("✅ Complete analysis workflow executed!")
print("\n💡 This entire analysis used:")
print("   - Query chaining for readable pipelines")
print("   - Query branching for multiple analyses")
print("   - F() function for anomaly calculation")
print("   - Lazy evaluation with .compute()")
print("\n🎯 Every step is a short chained expression built from one shared base query.")

---

## 13. Comparison: v0.x vs v1.0.0

**Side-by-side comparison** showing the improvements in v1.0.0.

### What Changed?

**v0.x**: String-based CDO commands
- Hard to read and maintain
- No type checking
- Order-dependent parameters
- Can't inspect before execution

**v1.0.0**: Django ORM-style query API  
- Self-documenting code
- Full type safety
- Named parameters (order-independent)
- Query introspection
- Query branching and composition

In [None]:
print("⚖️  v0.x vs v1.0.0 Comparison\n")
print("=" * 70)

print("\n📝 Task: Calculate European winter mean temperature\n")

print("❌ v0.x approach (string-based):")
print("-" * 70)
print("from python_cdo_wrapper import cdo")
print("")
print(
    'ds, log = cdo("-fldmean -yearmean -selseason,DJF -sellonlatbox,-10,40,35,70 -sellevel,1000 -selname,tas input.nc")'
)
print("")
print("Issues:")
print("  - Hard to read (one long string)")
print("  - Parameter order matters")
print("  - No type checking")
print("  - No IDE autocompletion")
print("  - Can't inspect before execution")

print("\n✅ v1.0.0 approach (query API):")
print("-" * 70)
print("from python_cdo_wrapper import CDO")
print("")
print("cdo = CDO()")
print("")
print("query = (")
print("    cdo.query('input.nc')")
print("    .select_var('tas')")
print("    .select_level(1000)")
print("    .select_region(lon1=-10, lon2=40, lat1=35, lat2=70)")
print("    .select_season('DJF')")
print("    .year_mean()")
print("    .field_mean()")
print(")")
print("")
print("print(query.get_command())  # Inspect first!")
print("ds = query.compute()")
print("")
print("Benefits:")
print("  ✅ Self-documenting code")
print("  ✅ Named parameters (order-independent)")
print("  ✅ Full type checking")
print("  ✅ IDE autocompletion")
print("  ✅ Inspect before execution")
print("  ✅ Clone and branch for variations")

print("\n" + "=" * 70)

# Actual execution to prove it works
result = (
    cdo.query("sample_data.nc")
    .select_var("tas")
    .select_level(1000)
    .select_region(lon1=-10, lon2=40, lat1=35, lat2=70)
    .select_season("DJF")
    .year_mean()
    .field_mean()
    .compute()
)

print("\n✅ v1.0.0 query executed successfully!")
print(f"   Result shape: {result.dims}")
print(f"   Mean temperature: {float(result.tas.mean().values):.2f} K")

---

## 14. Cleanup

Clean up temporary files created during the demo.

In [None]:
import os

# Clean up temporary files
temp_files = [
    "sample_data.nc",
    "climatology.nc",
    "std_dev.nc",
    "europe_winter_clim.nc",
    "grid_5deg.txt",
    "year_2020.nc",
    "year_2021.nc",
    "year_2022.nc",
    "var_tas.nc",
    "var_pr.nc",
]

print("🧹 Cleaning up temporary files...\n")
for f in temp_files:
    if os.path.exists(f):
        os.remove(f)
        print(f"   ✅ Removed {f}")

print("\n✅ Cleanup complete!")

---

## Summary

This notebook demonstrated **all major v1.0.0 features** of python-cdo-wrapper!

### ✅ Core Features Covered

1. **Django ORM-style Query API** - Lazy, chainable operations
2. **Query Introspection** - `.get_command()`, `.explain()`, `.get_operations()`
3. **Query Branching** - `.clone()` for multiple analyses from one base
4. **F() Function** - One-liner anomaly calculations (Django F-expression pattern)
5. **Structured Results** - Typed dataclasses for info commands

### ✅ Operator Categories (150+ total)

- **Selection** (18 operators) - Variables, levels, time, space, regions
- **Statistics** (50+ operators) - Time, field, vertical, running, percentiles
- **Arithmetic** (30+ operators) - Constants, binary ops, math functions
- **Interpolation** (9 operators) - Horizontal and vertical regridding
- **Advanced Methods** - `.first()`, `.last()`, `.count()`, `.exists()`, `.values()`
- **File Operations** - Merge, split, concatenate, copy

### 🎯 Key Benefits

- **Readable**: Self-documenting, chainable code
- **Type-safe**: Full IDE autocompletion and type checking
- **Flexible**: Query branching, cloning, and templates
- **Powerful**: One-liner anomaly calculations with F()
- **Inspectable**: See commands before execution
- **Backward compatible**: v0.x string-based API still works!

### 📚 Next Steps

- 📖 Read the [Migration Guide](../MIGRATION_GUIDE.md) for upgrading from v0.x
- 📖 Check the [README](../README.md) for complete API reference
- 🔍 Explore [Real-World Examples](../README.md#real-world-climate-science-examples)
- 🐛 Report issues on [GitHub](https://github.com/NarenKarthikBM/python-cdo-wrapper)

### 🙏 Acknowledgments

python-cdo-wrapper v1.0.0 is inspired by:
- **Django ORM** - QuerySet pattern and lazy evaluation
- **Django F-expressions** - Binary operations with F()
- **xarray** - NetCDF data structures
- **CDO** - Climate Data Operators (the engine underneath)

---

**python-cdo-wrapper v1.0.0** - Making climate data processing more Pythonic! 🎉

Built with ❤️ for the climate science community.