# CMIP6 Model Intercomparison with climval

**A Taylor Diagram in 15 lines of Python**

**Author:** Northflow Technologies (northflow.no)  
**Library:** [climval](https://github.com/northflowlabs/climval), installed with `pip install climval`  
**License:** Apache 2.0

---

## What this notebook shows

How well do CMIP6 climate models reproduce observed European surface temperature?

This notebook uses `climval` to compare **5 CMIP6 models** against **ERA5 reanalysis** over Europe (1985–2014), producing:

- A **Taylor Diagram**: the standard visualization for model skill assessment
- A **metric scorecard**: RMSE, MAE, Mean Bias, Pearson r, Taylor Skill Score per model
- An **exportable HTML report**: shareable, reproducible, citable

**Without climval:** ~150 lines of boilerplate per validation run.  
**With climval:** 15 lines. Same rigour. Exportable results.

---

### Models evaluated

| Model | Institution | Resolution |
|-------|-------------|------------|
| MPI-ESM1-2-HR | MPI Hamburg | ~100 km |
| EC-Earth3 | EC-Earth Consortium | ~100 km |
| CNRM-CM6-1 | CNRM-CERFACS | ~250 km |
| IPSL-CM6A-LR | IPSL | ~250 km |
| UKESM1-0-LL | MOHC | ~135 km |

**Reference:** ERA5 monthly mean 2m temperature, European domain (35°N–72°N, 15°W–45°E)

> **Note:** This notebook uses synthetic data whose correlation, variance, and bias are chosen to resemble published CMIP6 skill ranges.  
> To use real data, replace the synthetic `era5` array and the `generate_model()` output with your xarray DataArrays from `intake-esm` or the Copernicus CDS.


## 1. Install & Import

In [None]:
# Uncomment on first run
# !pip install climval xarray numpy matplotlib pandas

import warnings
from datetime import datetime

warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt
from IPython.display import display

import climval
from climval import BenchmarkSuite, load_model
from climval.metrics import (
    RMSE,
    MAE,
    MeanBias,
    NormalizedRMSE,
    PearsonCorrelation,
    TaylorSkillScore,
)

print(f"climval  : v{climval.__version__}")
print(f"xarray   : v{xr.__version__}")
print(f"numpy    : v{np.__version__}")
print(f"pandas   : v{pd.__version__}")
print("\nReady.")

## 2. Generate Realistic Synthetic Data

We generate monthly mean 2m temperature (tas) over Europe for 1985–2014 (360 months).
Each synthetic CMIP6 series is assigned a target correlation, relative standard deviation, and mean bias inspired by published CMIP6 assessments.

**To use real data:** Replace the synthetic `era5` array and the `generate_model()` output with your own xarray DataArrays.
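
When swapping in real gridded output, the field usually has to be reduced to a regional time series first. The sketch below shows one common way to do that, a cos(latitude)-weighted mean over the European box, using plain NumPy. The function name `regional_mean_series`, the `(time, lat, lon)` axis order, and the -180 to 180 longitude convention are assumptions for illustration and are not part of climval.

```python
import numpy as np

def regional_mean_series(field, lats, lons,
                         lat_range=(35.0, 72.0), lon_range=(-15.0, 45.0)):
    """Reduce a (time, lat, lon) array to a cos(lat)-weighted regional mean series.

    Hypothetical helper: assumes 1-D coordinate arrays in degrees and
    longitudes expressed in the -180..180 convention.
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    sub = field[:, lat_mask, :][:, :, lon_mask]

    # Grid cells shrink towards the poles, so weight each latitude band by cos(lat)
    weights = np.cos(np.deg2rad(lats[lat_mask]))

    zonal_mean = sub.mean(axis=2)  # average over longitude
    return np.average(zonal_mean, axis=1, weights=weights)  # weighted over latitude

# Illustrative 1-degree grid with random values standing in for real tas data
rng = np.random.default_rng(0)
lats = np.arange(30.0, 76.0, 1.0)
lons = np.arange(-20.0, 51.0, 1.0)
field = 280.0 + rng.normal(0.0, 1.0, size=(360, lats.size, lons.size))

series = regional_mean_series(field, lats, lons)
print(series.shape)
```

With xarray, the same reduction is typically written with `.sel(...)` and `.weighted(...)`; the NumPy version is used here only to keep the example dependency-free.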

In [None]:
np.random.seed(42)
N = 360  # 30 years x 12 months

# Realistic European monthly temperature seasonal cycle (degC anomalies)
months = np.arange(N)
seasonal = 10.0 * np.sin(2 * np.pi * months / 12 - np.pi / 2)
trend = 0.02 * months / 12  # ~0.6 C warming over 30 years
noise = np.random.normal(0, 0.8, N)

era5 = seasonal + trend + noise
era5_std = np.std(era5)

def generate_model(corr, rel_std, bias):
    """Generate synthetic model output with target correlation/std/bias."""
    signal = corr * era5 / era5_std
    orthogonal = np.random.normal(0, 1, N)
    orthogonal -= np.dot(orthogonal, era5) / np.dot(era5, era5) * era5
    noise_component = (
        np.sqrt(max(0, 1 - corr**2)) * orthogonal / (np.std(orthogonal) + 1e-8)
    )
    model_raw = signal + noise_component
    model_raw = model_raw / np.std(model_raw) * rel_std * era5_std
    return model_raw + bias

# CMIP6-like characteristics inspired by literature ranges
models_config = {
    "MPI-ESM1-2-HR": dict(corr=0.97, rel_std=0.98, bias=-0.4),
    "EC-Earth3": dict(corr=0.96, rel_std=1.04, bias=+0.6),
    "CNRM-CM6-1": dict(corr=0.94, rel_std=1.08, bias=+1.1),
    "IPSL-CM6A-LR": dict(corr=0.91, rel_std=1.15, bias=-1.8),
    "UKESM1-0-LL": dict(corr=0.93, rel_std=0.88, bias=+0.9),
}

model_data = {name: generate_model(**cfg) for name, cfg in models_config.items()}

# Keep xarray objects available for users swapping to real workflows
time_index = pd.date_range(start="1985-01-01", periods=N, freq="MS")
ref_da = xr.DataArray(
    era5,
    coords={"time": time_index},
    dims=["time"],
    attrs={"units": "K", "long_name": "2m Temperature Anomaly", "source": "ERA5"},
)
model_das = {
    name: xr.DataArray(
        data,
        coords={"time": time_index},
        dims=["time"],
        attrs={"units": "K", "long_name": "2m Temperature Anomaly", "model": name},
    )
    for name, data in model_data.items()
}

print(f"ERA5 reference : {N} monthly timesteps (1985-2014)")
print(f"Models loaded  : {list(model_das.keys())}")
print(f"ERA5 std       : {era5_std:.3f} K")

## 3. Run Validation with climval

This is the core use case. Define your models, reference, and metrics; `BenchmarkSuite` handles the rest.
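
For readers who want to see what the scorecard metrics mean, here is a minimal NumPy sketch of their textbook definitions. The function `score_summary_sketch` is hypothetical; its dictionary keys mirror the metric names used in the cells of this notebook, but climval's own implementations may differ in detail (for example in normalisation or in the Taylor skill score variant). The Taylor skill score follows Taylor (2001), Eq. 4, assuming a maximum attainable correlation of 1.

```python
import numpy as np

def score_summary_sketch(model, reference):
    """Textbook metric definitions for two equal-length 1-D series (illustrative only)."""
    model = np.asarray(model, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = model - reference

    r = np.corrcoef(model, reference)[0, 1]
    sigma_hat = np.std(model) / np.std(reference)  # normalised standard deviation

    # Taylor (2001), Eq. 4 with R0 = 1: equals 1 only when r = 1 and sigma_hat = 1
    tss = 4.0 * (1.0 + r) / ((sigma_hat + 1.0 / sigma_hat) ** 2 * 2.0)

    return {
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "mae": float(np.mean(np.abs(diff))),
        "mean_bias": float(np.mean(diff)),
        "pearson_r": float(r),
        "taylor_skill_score": float(tss),
    }

# A model that reproduces the reference exactly apart from a constant +0.5 K offset
reference = 10.0 * np.sin(2 * np.pi * np.arange(360) / 12)
scores = score_summary_sketch(reference + 0.5, reference)
print(scores)
```

Note that a constant offset leaves the correlation and the Taylor skill score untouched while RMSE, MAE, and mean bias all register it, which is why the scorecard reports bias separately from pattern statistics.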

In [None]:
# Build a climval benchmark suite (portable: no credentials or local files required)
suite = BenchmarkSuite(name="CMIP6-Europe-1985-2014")

reference = load_model(
    preset="era5",
    name="ERA5",
    variables=["tas"],
    lat_range=(35.0, 72.0),
    lon_range=(-15.0, 45.0),
    time_start=datetime(1985, 1, 1),
    time_end=datetime(2014, 12, 31),
)
suite.register(reference, role="reference")

for model_name in model_data:
    candidate = load_model(
        name=model_name,
        variables=["tas"],
        lat_range=(35.0, 72.0),
        lon_range=(-15.0, 45.0),
        time_start=datetime(1985, 1, 1),
        time_end=datetime(2014, 12, 31),
    )
    suite.register(candidate)

report = suite.run(
    variables=["tas"],
    n_samples=N,
    seed=42,
)

# Flatten benchmark results for easy plotting/table use
results_by_model = {result.candidate: result.score_summary() for result in report.results}

print("Benchmark complete.")
print(f"Models evaluated : {len(results_by_model)}")
print(f"Metrics computed : {len(next(iter(results_by_model.values())))}")

## 4. Metric Scorecard

In [None]:
# Build scorecard from the benchmark report
scorecard_rows = []
for model_name, metric_values in results_by_model.items():
    row = {"Model": model_name}
    row.update({k: round(v, 4) for k, v in metric_values.items()})
    scorecard_rows.append(row)

df = pd.DataFrame(scorecard_rows).set_index("Model")

# Metric directionality for highlighting
lower_is_better = {"rmse", "mae", "mean_bias", "nrmse", "percentile_bias_p95", "percentile_bias_p5"}

def highlight_best(col):
    if col.name in lower_is_better:
        idx = col.abs().idxmin() if col.name == "mean_bias" else col.idxmin()
    else:
        idx = col.idxmax()
    return ["background-color: #d4edda; font-weight: bold" if i == idx else "" for i in col.index]

display(
    df.style
      .apply(highlight_best)
      .format("{:.4f}")
      .set_caption("climval Scorecard: CMIP6 vs ERA5 | Europe | 1985-2014 | Variable: tas")
)
print("\nGreen highlight = best performer per metric")

## 5. Taylor Diagram

A Taylor Diagram (Taylor 2001) summarises three statistics simultaneously:
- **Radial distance** from origin = standard deviation ratio (model / reference)
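- **Azimuthal angle** = arccos of the Pearson correlation coefficient, so higher correlation sits closer to the horizontal axis
- **Distance from REF point** = centred RMSE, normalised by the reference standard deviation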

A perfect model sits at the REF point (correlation=1, std ratio=1, RMSE=0).
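
The three statistics fit on one diagram because they are tied together by the law of cosines: centred RMSE² = σ_model² + σ_ref² − 2·σ_model·σ_ref·r. The short check below verifies this numerically on made-up series (the data and variable names are illustrative assumptions, not output from climval) and confirms that the plotted distance to the REF point equals the normalised centred RMSE.

```python
import numpy as np

rng = np.random.default_rng(0)
reference = 10.0 * np.sin(2 * np.pi * np.arange(360) / 12)
model = 1.1 * reference + rng.normal(0.0, 2.0, reference.size) + 0.8

sigma_ref = np.std(reference)
sigma_mod = np.std(model)
r = np.corrcoef(model, reference)[0, 1]

# Centred RMSE: remove each series' mean before differencing, so bias drops out
crmse = np.sqrt(np.mean(((model - model.mean()) - (reference - reference.mean())) ** 2))

# The same quantity recovered from the law of cosines
crmse_from_geometry = np.sqrt(sigma_mod**2 + sigma_ref**2 - 2 * sigma_mod * sigma_ref * r)

# Position on the normalised diagram, and its distance to REF at (1, 0)
theta = np.arccos(r)
radius = sigma_mod / sigma_ref
dist_to_ref = np.hypot(radius * np.cos(theta) - 1.0, radius * np.sin(theta))

print(np.isclose(crmse, crmse_from_geometry))
print(np.isclose(dist_to_ref, crmse / sigma_ref))
```

Because the centred RMSE removes the mean of each series, the mean bias of 0.8 in this example does not move the point on the diagram; bias has to be read from the scorecard instead.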

In [None]:
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, polar=True)

# Taylor diagram uses first quadrant (0 to pi/2)
ax.set_thetamin(0)
ax.set_thetamax(90)

colors = ["#2196F3", "#4CAF50", "#FF9800", "#9C27B0", "#F44336"]
markers = ["o", "s", "^", "D", "P"]

ref_std = np.std(era5)
ax.plot(0, 1.0, marker="*", markersize=15, color="black", label="ERA5 (Reference)", zorder=5)

# Correlation guide lines
for corr in [0.7, 0.8, 0.9, 0.95, 0.99]:
    theta = np.arccos(corr)
    ax.plot([theta, theta], [0, 1.4], color="#cccccc", linewidth=0.8, linestyle="--")

# Centred-RMSE contours: circles around the reference point (r=1, theta=0) in
# normalized space, traced parametrically so each contour is one continuous curve
phi = np.linspace(0, np.pi, 300)
for rmse_val in [0.2, 0.4, 0.6, 0.8]:
    x = 1.0 + rmse_val * np.cos(phi)
    y = rmse_val * np.sin(phi)
    r_contour = np.hypot(x, y)
    t_contour = np.arctan2(y, x)
    inside = r_contour <= 1.5  # drop the part of the circle beyond the plotted radius
    ax.plot(t_contour[inside], r_contour[inside], color="#ffcccc", linewidth=0.8, linestyle=":")

for i, model_name in enumerate(results_by_model):
    r = results_by_model[model_name]["pearson_r"]
    std_ratio = np.std(model_data[model_name]) / ref_std
    theta = np.arccos(np.clip(r, -1, 1))
    ax.plot(
        theta,
        std_ratio,
        marker=markers[i],
        markersize=11,
        color=colors[i],
        label=model_name,
        markeredgecolor="white",
        markeredgewidth=1.3,
    )

ax.set_rlabel_position(90)
ax.set_thetagrids(
    [0, 15, 30, 45, 60, 75, 90],
    [f"r={np.cos(np.deg2rad(a)):.2f}" for a in [0, 15, 30, 45, 60, 75, 90]],
    fontsize=8,
)
ax.set_rlim(0, 1.4)
ax.set_rticks([0.4, 0.6, 0.8, 1.0, 1.2, 1.4])
ax.set_title(
    "Taylor Diagram: CMIP6 vs ERA5\nEuropean 2m Temperature | 1985-2014",
    fontsize=13,
    fontweight="bold",
    pad=24,
)

ax.legend(loc="upper left", bbox_to_anchor=(1.05, 1.0), fontsize=10, framealpha=0.9)

# Reserve space for title and right-side legend so title remains visible in notebooks/Binder
fig.subplots_adjust(top=0.86, right=0.78)

plt.savefig("taylor_diagram_cmip6_era5.png", dpi=150, bbox_inches="tight", facecolor="white")
plt.show()
print("Saved: taylor_diagram_cmip6_era5.png")

## 6. Model Ranking Visualisation

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(14, 5))
fig.suptitle(
    "CMIP6 Model Skill: Europe | 1985-2014 | tas vs ERA5",
    fontsize=13,
    fontweight="bold",
    y=0.98,
)

model_names = list(results_by_model.keys())
rmse_vals = [results_by_model[m]["rmse"] for m in model_names]
bias_vals = [results_by_model[m]["mean_bias"] for m in model_names]
tss_vals = [results_by_model[m]["taylor_skill_score"] for m in model_names]
colors = ["#2196F3", "#4CAF50", "#FF9800", "#9C27B0", "#F44336"]

# RMSE
bars0 = axes[0].barh(model_names, rmse_vals, color=colors)
axes[0].set_xlabel("RMSE (K)", fontsize=11)
axes[0].set_title("RMSE (lower is better)", fontsize=11)
for bar, val in zip(bars0, rmse_vals):
    axes[0].text(bar.get_width() + 0.01, bar.get_y() + bar.get_height() / 2, f"{val:.3f}", va="center", fontsize=9)

# Mean Bias
bias_colors = ["#F44336" if v > 0 else "#2196F3" for v in bias_vals]
bars1 = axes[1].barh(model_names, bias_vals, color=bias_colors)
axes[1].set_xlabel("Mean Bias (K)", fontsize=11)
axes[1].set_title("Mean Bias (red=warm, blue=cold)", fontsize=11)
axes[1].axvline(0, color="black", linewidth=1.2)
for bar, val in zip(bars1, bias_vals):
    x = bar.get_width() + 0.02 if val >= 0 else bar.get_width() - 0.16
    axes[1].text(x, bar.get_y() + bar.get_height() / 2, f"{val:+.3f}", va="center", fontsize=9)

# Taylor Skill Score
bars2 = axes[2].barh(model_names, tss_vals, color=colors)
axes[2].set_xlabel("Taylor Skill Score", fontsize=11)
axes[2].set_title("Taylor Skill Score (higher is better)", fontsize=11)
axes[2].set_xlim(0, 1.05)
for bar, val in zip(bars2, tss_vals):
    axes[2].text(min(bar.get_width() + 0.01, 1.02), bar.get_y() + bar.get_height() / 2, f"{val:.3f}", va="center", fontsize=9)

for ax in axes:
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.tick_params(labelsize=10)

# Keep suptitle visible in interactive windows as well as saved output
fig.tight_layout(rect=[0, 0, 1, 0.93])

plt.savefig("model_ranking.png", dpi=150, bbox_inches="tight", facecolor="white")
plt.show()
print("Saved: model_ranking.png")

## 7. Export Report with climval

In [None]:
# Export reproducible report in all formats
report.export("climval_cmip6_report.html")
report.export("climval_cmip6_report.json")
report.export("climval_cmip6_report.md")

print("Reports exported:")
print("  climval_cmip6_report.html    - shareable, self-contained")
print("  climval_cmip6_report.json    - machine-readable, archivable")
print("  climval_cmip6_report.md      - for GitHub / documentation")

## 8. The 15-Line Version

This is the core value proposition: everything above, in about 15 lines.


In [None]:
# Full CMIP6-style multi-model validation in ~15 lines
from datetime import datetime
from climval import BenchmarkSuite, load_model

suite = BenchmarkSuite(name="CMIP6-Europe-15-lines")
suite.register(load_model("era5", name="ERA5", variables=["tas"], lat_range=(35,72), lon_range=(-15,45), time_start=datetime(1985,1,1), time_end=datetime(2014,12,31)), role="reference")
for name in ["MPI-ESM1-2-HR", "EC-Earth3", "CNRM-CM6-1", "IPSL-CM6A-LR", "UKESM1-0-LL"]:
    suite.register(load_model(name=name, variables=["tas"], lat_range=(35,72), lon_range=(-15,45), time_start=datetime(1985,1,1), time_end=datetime(2014,12,31)))

report_15 = suite.run(variables=["tas"], n_samples=360, seed=42)
results_15 = {r.candidate: {k: round(v, 3) for k, v in r.score_summary().items()} for r in report_15.results}
print(results_15)

---

## Summary

| Model | Rank | Strength | Weakness |
|-------|------|----------|----------|
| MPI-ESM1-2-HR | 🥇 1st | Best correlation, minimal bias | - |
| EC-Earth3 | 🥈 2nd | High correlation | Slight warm bias |
| UKESM1-0-LL | 🥉 3rd | Good correlation | Underestimates variability |
| CNRM-CM6-1 | 4th | Reasonable skill | Warm bias, high variance |
| IPSL-CM6A-LR | 5th | Lower correlation | Cold bias, excess variability |

**Results generated with climval (runtime version printed in Cell 1).**

---

### References

- Taylor, K. E. (2001). *Summarizing multiple aspects of model performance in a single diagram.* JGR Atmospheres. https://doi.org/10.1029/2000JD900719
- Gleckler, P. J., Taylor, K. E., & Doutriaux, C. (2008). *Performance metrics for climate models.* JGR Atmospheres. https://doi.org/10.1029/2007JD008972
- Bock, L. et al. (2020). *Quantifying progress across different CMIP phases.* JGR Atmospheres. https://doi.org/10.1029/2019JD032874
- Hersbach, H. et al. (2020). *The ERA5 global reanalysis.* QJRMS. https://doi.org/10.1002/qj.3803

---

**climval** is developed by [Northflow Technologies](https://northflow.no)  
GitHub: https://github.com/northflowlabs/climval  
PyPI: https://pypi.org/project/climval  
License: Apache 2.0