# Example 3: Basic Dataset Schema

This example demonstrates how to validate xarray Datasets with multiple data variables, shared coordinates, and global attributes.

The `schema.yaml` file defines:

- **Data variables**: temperature, precipitation, and pressure, all sharing the same data type, dimensions, and shape
  - All are float32 type
  - All have dimensions [time, lat, lon]
  - All have shape [12, 180, 360]
- **Coordinates**:
  - time: integer array with 12 elements (months)
  - lat: float array with 180 elements (latitude)
  - lon: float array with 360 elements (longitude)
- **Global attributes**:
  - title: dataset title
  - institution: data source institution
  - source: model or source information
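
To make these constraints concrete, the sketch below spells out roughly what they amount to as hand-written checks in plain xarray and NumPy. It is only an illustration of the rules listed above, not how `xarray_validate` implements them; the function name `check_dataset` and the `EXPECTED_*` constants are hypothetical and exist only for this example.

```python
import numpy as np
import xarray as xr

# Constraints transcribed from the schema description above (names are illustrative).
EXPECTED_VARS = ("temperature", "precipitation", "pressure")
EXPECTED_DIMS = ("time", "lat", "lon")
EXPECTED_SHAPE = (12, 180, 360)
# Coordinate name -> (expected NumPy dtype family, expected number of elements)
EXPECTED_COORDS = {
    "time": (np.integer, 12),
    "lat": (np.floating, 180),
    "lon": (np.floating, 360),
}
EXPECTED_ATTRS = ("title", "institution", "source")


def check_dataset(ds: xr.Dataset) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []

    # Data variables: presence, dtype, dimensions, shape
    for name in EXPECTED_VARS:
        if name not in ds.data_vars:
            problems.append(f"missing data variable: {name}")
            continue
        var = ds[name]
        if var.dtype != np.float32:
            problems.append(f"{name}: dtype is {var.dtype}, expected float32")
        if var.dims != EXPECTED_DIMS:
            problems.append(f"{name}: dims are {var.dims}, expected {EXPECTED_DIMS}")
        if var.shape != EXPECTED_SHAPE:
            problems.append(f"{name}: shape is {var.shape}, expected {EXPECTED_SHAPE}")

    # Coordinates: presence, dtype family, length
    for name, (kind, size) in EXPECTED_COORDS.items():
        if name not in ds.coords:
            problems.append(f"missing coordinate: {name}")
            continue
        coord = ds.coords[name]
        if not np.issubdtype(coord.dtype, kind):
            problems.append(f"{name}: dtype {coord.dtype} is not {kind.__name__}")
        if coord.size != size:
            problems.append(f"{name}: has {coord.size} elements, expected {size}")

    # Global attributes: presence only
    for name in EXPECTED_ATTRS:
        if name not in ds.attrs:
            problems.append(f"missing global attribute: {name}")

    return problems
```

Unlike a schema file, this version hard-codes every rule and has to be edited whenever the expected structure changes, which is the maintenance burden the declarative `schema.yaml` approach avoids.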

In [None]:
# Basic imports
import numpy as np
import xarray as xr

from xarray_validate import DatasetSchema, SchemaError

# Load schema from YAML file
schema = DatasetSchema.from_yaml("schema.yaml")

In [None]:
# Create coordinates
time = np.arange(12, dtype=np.int64)  # 12 months
lat = np.linspace(-89.5, 89.5, 180)
lon = np.linspace(-179.5, 179.5, 360)

# Create a Dataset that matches the schema
ds = xr.Dataset(
    data_vars={
        "temperature": (
            ["time", "lat", "lon"],
            np.random.randn(12, 180, 360).astype(np.float32),
        ),
        "precipitation": (
            ["time", "lat", "lon"],
            np.random.randn(12, 180, 360).astype(np.float32),
        ),
        "pressure": (
            ["time", "lat", "lon"],
            np.random.randn(12, 180, 360).astype(np.float32),
        ),
    },
    coords={
        "time": time,
        "lat": lat,
        "lon": lon,
    },
    attrs={
        "title": "Monthly Climate Data",
        "institution": "Example Climate Center",
        "source": "Climate Model v1.0",
    },
)

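# Validate the Dataset
# validate() raises on failure (as the cells below show), so reaching
# the final print means the Dataset conforms to the schema
print("Validating Dataset against schema...")
result = schema.validate(ds)
print("Validation passed!")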

In [None]:
# Try with missing data variable: remove required data variable 'pressure'
try:
    schema.validate(ds.drop_vars("pressure"))
except Exception as e:
    print(f"Validation failed as expected: {e}")

In [None]:
# Try with wrong data type: change temperature dtype to float64
ds_wrong_dtype = ds.copy()
ds_wrong_dtype["temperature"] = ds_wrong_dtype["temperature"].astype(np.float64)
try:
    schema.validate(ds_wrong_dtype)
except SchemaError as e:
    print(f"Validation failed as expected: {e}")

In [None]:
# Try with missing attribute: remove required attribute 'title'
ds_missing_attr = ds.copy()
del ds_missing_attr.attrs["title"]
try:
    schema.validate(ds_missing_attr)
except SchemaError as e:
    print(f"Validation failed as expected: {e}")
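
The three failure cases above repeat the same try/except pattern. As a possible convenience, the sketch below gathers them into one helper that runs a validation callable over several labelled candidates and collects the error messages. The helper `collect_failures` is hypothetical and not part of `xarray_validate`; it assumes only that the callable raises an exception when validation fails, as `schema.validate` does in the cells above.

```python
from typing import Callable, Iterable


def collect_failures(
    validate: Callable[[object], object],
    cases: Iterable[tuple[str, object]],
    error_type: type[BaseException] = Exception,
) -> dict[str, str]:
    """Run `validate` on each (label, candidate) pair and map label -> error message.

    Candidates that validate without raising are left out of the result.
    """
    failures = {}
    for label, candidate in cases:
        try:
            validate(candidate)
        except error_type as exc:
            failures[label] = str(exc)
    return failures


# Possible usage with the objects defined in the cells above:
#
# failures = collect_failures(
#     schema.validate,
#     [
#         ("valid dataset", ds),
#         ("missing variable", ds.drop_vars("pressure")),
#         ("wrong dtype", ds_wrong_dtype),
#         ("missing attribute", ds_missing_attr),
#     ],
#     error_type=SchemaError,
# )
# for label, message in failures.items():
#     print(f"{label}: {message}")
```

Passing the exception type as a parameter keeps the helper independent of any particular validation library, and any exception of a different type still propagates instead of being silently swallowed.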