# Xarray-spatial
### User Guide: Zonal Crosstab
-----

Xarray-spatial's `zonal_crosstab` function calculates cross-tabulated statistics between two raster datasets, helping identify patterns and relationships in spatial data.

In this notebook, we'll analyze **how terrain steepness varies across different elevation zones** using synthetic terrain data. This type of analysis is useful for:
- **Hikers and outdoor planners**: Understanding which elevation ranges have the steepest terrain
- **Urban developers**: Identifying buildable land based on slope and elevation
- **Agriculture**: Finding suitable terrain for different farming practices

**Contents:**
- [Generate Terrain Data](#Generate-Terrain-Data)
- [Calculate Slope](#Calculate-Slope)
- [2D Zonal Crosstab](#2D-Zonal-Crosstab): Slope distribution by elevation zone
- [Zonal Statistics](#Zonal-Statistics): Mean slope by elevation zone

-----------

## Generate Terrain Data

We'll generate synthetic terrain using xarray-spatial's `generate_terrain` function, which creates realistic elevation data using fractal noise algorithms.

In [None]:
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

from datashader.transfer_functions import shade, stack, Images
from datashader.colors import Elevation

from xrspatial import generate_terrain, hillshade, slope
from xrspatial.classify import quantile
from xrspatial import zonal_crosstab
from xrspatial.zonal import stats as zonal_stats

In [None]:
# Generate synthetic terrain
W, H = 800, 600
x_range = (-20e6, 20e6)
y_range = (-20e6, 20e6)

template = xr.DataArray(np.zeros((H, W)))
terrain = generate_terrain(template, x_range=x_range, y_range=y_range, seed=42, zfactor=4000)
terrain.name = "Elevation"

print(f"Terrain dimensions: {terrain.shape}")
print(f"Elevation range: {float(terrain.min()):.0f} - {float(terrain.max()):.0f} meters")

### Visualize the Terrain

Let's visualize our terrain using hillshading for a 3D effect.

In [None]:
# Create hillshade for visualization
illuminated = hillshade(terrain)

# Combine elevation colors with hillshade
stack(
    shade(illuminated, cmap=['gray', 'white'], alpha=255, how='linear'),
    shade(terrain, cmap=Elevation, alpha=128, how='linear')
)

## Calculate Slope

Slope measures the steepness of terrain at each location, calculated as the rate of change of elevation. The result is in degrees, where:
- **0°**: Flat terrain
- **< 15°**: Gentle slopes (easy walking, suitable for most construction)
- **15-30°**: Moderate slopes (hiking trails, some agricultural use)
- **> 30°**: Steep terrain (challenging hiking, limited development)
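To make "rate of change of elevation, in degrees" concrete, here is a minimal NumPy sketch of the definition. It uses simple finite differences via `np.gradient`; the helper name `slope_degrees` and its `cellsize_x`/`cellsize_y` parameters are hypothetical, and xarray-spatial's `slope` may use a different neighborhood stencil and treat border cells differently, so its values need not match this sketch exactly.

```python
import numpy as np

def slope_degrees(elevation, cellsize_x=1.0, cellsize_y=1.0):
    """Approximate slope in degrees from a 2D elevation array (illustrative only)."""
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x)
    dz_dy, dz_dx = np.gradient(elevation, cellsize_y, cellsize_x)
    # Slope is the angle whose tangent is the magnitude of the elevation gradient
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# A plane that rises 1 unit per cell in x has a 45 degree slope everywhere
plane = np.tile(np.arange(5, dtype=float), (4, 1))
demo_slope = slope_degrees(plane)
```

Note that the angle depends on elevation and cell size being in the same units: the same elevation change over larger cells gives a gentler slope.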

In [None]:
# Calculate slope in degrees
slope_agg = slope(terrain)
slope_agg.name = "Slope"

print(f"Slope range: {float(slope_agg.min()):.1f}° - {float(slope_agg.max()):.1f}°")

In [None]:
# Visualize both elevation and slope
elevation_img = shade(terrain, cmap=Elevation, how='linear')
elevation_img.name = "Elevation"

slope_img = shade(slope_agg, cmap=plt.get_cmap("YlOrRd"), how='linear')
slope_img.name = "Slope (degrees)"

imgs = Images(elevation_img, slope_img)
imgs.num_cols = 2
imgs

## 2D Zonal Crosstab

The `zonal_crosstab` function calculates cross-tabulated statistics between a **zones** raster and a **values** raster.

We'll:
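1. Classify elevation into five quantile-based zones (valleys, lowlands, foothills, mountains, peaks)
2. Classify slope into five quantile-based steepness classes (the class boundaries come from the data, not from the fixed thresholds listed earlier)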
3. Cross-tabulate to see how slope is distributed across elevation zones
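The classification steps above rely on quantile binning, where each class receives roughly the same number of cells. The sketch below shows that idea in plain NumPy; `quantile_classify` is a hypothetical helper, and its 0-based labels and edge handling are assumptions that may differ from what `xrspatial.classify.quantile` produces.

```python
import numpy as np

def quantile_classify(values, k=5):
    """Assign each finite cell to one of k roughly equal-count classes (0..k-1)."""
    # Interior cut points at the 1/k, 2/k, ... quantiles, ignoring NaN
    cuts = np.nanquantile(values, np.linspace(0, 1, k + 1)[1:-1])
    classes = np.digitize(values, cuts, right=True).astype(float)
    # Keep missing cells missing rather than placing them in a class
    classes[np.isnan(values)] = np.nan
    return classes

rng = np.random.default_rng(0)
demo = rng.random((50, 40))
demo_classes = quantile_classify(demo, k=5)
# Number of cells that landed in each class
counts = np.unique(demo_classes, return_counts=True)[1]
```

Because the cut points come from the data, the resulting class boundaries change with every dataset, which is why the notebook later computes the value range of each class for labeling.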

In [None]:
# Create elevation zones
n_elev_classes = 5
elevation_zones = quantile(terrain, k=n_elev_classes, name='Elevation Zones')

# Create slope categories
n_slope_classes = 5
slope_classes = quantile(slope_agg, k=n_slope_classes, name='Slope Classes')

print(f"Created {n_elev_classes} elevation zones and {n_slope_classes} slope classes")

### Visualize the Classified Data

In [None]:
shaded_elev_zones = shade(elevation_zones, cmap=Elevation, how='linear')
shaded_elev_zones.name = "Elevation Zones"

shaded_slope_classes = shade(slope_classes, cmap=plt.get_cmap("YlOrRd"), how='linear')
shaded_slope_classes.name = "Slope Classes"

imgs = Images(shaded_elev_zones, shaded_slope_classes)
imgs.num_cols = 2
imgs

### Helper Function for Zone Labels

This utility function extracts the value range for each classified zone.

In [None]:
def bin_ranges(classified_data, original_data, unit="", decimals=0):
    """Calculate the value range for each bin/class."""
    bins = np.unique(classified_data.data[~np.isnan(classified_data.data)])
    ranges = []
    for b in bins:
        bin_data = original_data.data[classified_data.data == b]
        min_val = np.nanmin(bin_data)
        max_val = np.nanmax(bin_data)
        ranges.append(f'{min_val:.{decimals}f}{unit} - {max_val:.{decimals}f}{unit}')
    return ranges

In [None]:
# Get human-readable labels for zones
elev_labels = bin_ranges(elevation_zones, terrain, unit='m', decimals=0)
slope_labels = bin_ranges(slope_classes, slope_agg, unit='°', decimals=1)

print("Elevation zones:")
zone_names = ["Valley", "Lowlands", "Foothills", "Mountains", "Peaks"]
for i, (name, label) in enumerate(zip(zone_names, elev_labels), 1):
    print(f"  Zone {i} ({name}): {label}")

In [None]:
print("Slope classes:")
slope_names = ["Flat", "Gentle", "Moderate", "Steep", "Very Steep"]
for i, (name, label) in enumerate(zip(slope_names, slope_labels), 1):
    print(f"  Class {i} ({name}): {label}")

### Run Zonal Crosstab

Now we calculate the cross-tabulation to see how slope classes are distributed across elevation zones.

In [None]:
# Calculate cross-tabulation with percentage aggregation
crosstab_result = zonal_crosstab(elevation_zones, slope_classes, agg='percentage')

# Add readable labels
crosstab_result['zone'] = [f"{name}\n({label})" for name, label in zip(zone_names, elev_labels)]
crosstab_result.columns = ['Elevation Zone', *[f"{name}\n({label})" for name, label in zip(slope_names, slope_labels)]]
crosstab_result.set_index('Elevation Zone', inplace=True)

crosstab_result

### Interpretation

Each cell shows the **percentage** of pixels in an elevation zone that fall into each slope class.

- **Valley zones**: Typically have higher percentages of flat and gentle slopes
- **Mountain/Peak zones**: Typically have higher percentages of steep and very steep terrain

This pattern reflects the natural relationship between elevation and terrain ruggedness: higher elevations tend to have more dramatic relief.
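To make the per-zone percentage concrete, the following sketch cross-tabulates two tiny hand-made arrays with pandas. It illustrates the idea only and is not xarray-spatial's implementation; the arrays `zones` and `classes` and the names `demo_counts` and `demo_pct` are made up for this example.

```python
import numpy as np
import pandas as pd

# Tiny hand-made rasters: two zones (table rows) and three classes (table columns)
zones = np.array([[1, 1, 1, 1],
                  [2, 2, 2, 2]])
classes = np.array([[0, 0, 0, 1],
                    [1, 2, 2, 2]])

# Count co-occurrences per (zone, class) pair, then normalise each row to 100
demo_counts = pd.crosstab(zones.ravel(), classes.ravel(),
                          rownames=["zone"], colnames=["class"])
demo_pct = demo_counts.div(demo_counts.sum(axis=1), axis=0) * 100
print(demo_pct)
```

In this sketch, zone 1 is 75% class 0 and 25% class 1, while zone 2 is 25% class 1 and 75% class 2, and each row sums to 100.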

In [None]:
# Visualize as stacked bar chart
ax = crosstab_result.plot(kind="bar", stacked=True, figsize=(12, 6), colormap="YlOrRd")
ax.set_xlabel("Elevation Zone")
ax.set_ylabel("Percentage")
ax.set_title("Slope Distribution by Elevation Zone")
ax.legend(title="Slope Class", bbox_to_anchor=(1.02, 1), loc='upper left')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()

## Zonal Statistics

For calculating summary statistics like mean, min, max, std within each zone, use `xrspatial.zonal.stats` instead of `zonal_crosstab`.

Let's calculate the mean slope within each elevation zone.
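As a rough illustration of what a per-zone summary involves, the sketch below groups cell values by zone id with plain NumPy. The helper `zonal_mean_std` is hypothetical; it uses the population standard deviation (`ddof=0`) and skips NaN cells, and whether `xrspatial.zonal.stats` makes the same choices is an assumption to verify against its documentation.

```python
import numpy as np

def zonal_mean_std(zones, values):
    """Return {zone_id: (mean, std)} over the cells of each zone, ignoring NaN."""
    result = {}
    for zone_id in np.unique(zones[~np.isnan(zones)]):
        cells = values[zones == zone_id]
        # np.nanstd defaults to the population standard deviation (ddof=0)
        result[float(zone_id)] = (float(np.nanmean(cells)), float(np.nanstd(cells)))
    return result

zones = np.array([[0.0, 0.0], [1.0, 1.0]])
values = np.array([[2.0, 4.0], [10.0, np.nan]])
demo_stats = zonal_mean_std(zones, values)
```

Unlike the cross-tabulation, this works on the continuous slope values directly, so no information is lost to binning on the values side.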

In [None]:
# Calculate zonal statistics for slope within each elevation zone
mean_slope_by_elev = zonal_stats(elevation_zones, slope_agg, stats_funcs=['mean', 'std'])
mean_slope_by_elev['Elevation Zone'] = zone_names
mean_slope_by_elev = mean_slope_by_elev[['Elevation Zone', 'mean', 'std']]
mean_slope_by_elev.columns = ['Elevation Zone', 'Mean Slope (°)', 'Std Dev (°)']
mean_slope_by_elev.set_index('Elevation Zone', inplace=True)

mean_slope_by_elev

In [None]:
# Visualize mean slope by elevation zone
ax = mean_slope_by_elev['Mean Slope (°)'].plot(kind="bar", figsize=(10, 5), color="orangered",
                                                yerr=mean_slope_by_elev['Std Dev (°)'], capsize=5)
ax.set_xlabel("Elevation Zone")
ax.set_ylabel("Mean Slope (degrees)")
ax.set_title("Mean Terrain Steepness by Elevation Zone")
plt.xticks(rotation=0)
plt.tight_layout()

## Summary

In this notebook, we used `zonal_crosstab` to analyze the relationship between elevation and terrain steepness in synthetic terrain data, and `xrspatial.zonal.stats` to summarize mean slope per elevation zone.

**Key findings:**
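- Higher elevation zones tend to have steeper terrain
- Valley and lowland areas tend to have more flat and gentle slopes
- Because the terrain here is synthetic, these patterns illustrate the method rather than any real landscape; real mountainous terrain often shows a similar relationship between elevation and relief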

**Use cases for `zonal_crosstab`:**
- Terrain analysis and characterization
- Land suitability assessment for development or agriculture
- Environmental and ecological studies
- Trail and recreation planning