# Variogram Analysis for Spatial Data

Semi-variograms show how the dissimilarity between pairs of values grows with the distance separating them. They are a core tool for spatial analysis.

## What You'll Learn

1. Computing experimental variograms
2. Fitting variogram models (spherical, exponential, Gaussian)
3. Understanding nugget, sill, and range
4. Checking for anisotropy
5. Model validation

## Key Concepts

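- **Nugget**: Apparent variance at near-zero separation, caused by measurement error or variability at scales finer than the sampling
- **Sill**: Plateau the variogram reaches once added distance no longer adds variance
- **Range**: Distance at which the variogram reaches the sill and correlation becomes negligible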

Clear models, clean data, and simple checks produce better spatial estimates.
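
For reference, these three terms can be tied to the standard definition of the semi-variogram. The notation below ($Z$ for the variable, $C$ for its covariance function, $h$ for the separation distance) is introduced here for illustration and is not used elsewhere in this notebook. The second identity assumes a second-order stationary process.

$$
\gamma(h) = \tfrac{1}{2}\,\mathrm{E}\!\left[\big(Z(x+h) - Z(x)\big)^2\right],
\qquad
\gamma(h) = C(0) - C(h)
$$

Under that assumption the sill equals $C(0)$, the variance of the process, and the range is the distance beyond which $C(h)$ is effectively zero. A nugget shows up as a jump in $\gamma$ just above $h = 0$.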

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from pygeomodeling import load_spe9_data
from pygeomodeling.variogram import (
    compute_experimental_variogram,
    fit_variogram_model,
    predict_variogram,
    directional_variogram,
)
from pygeomodeling.variogram_plot import (
    plot_variogram,
    plot_variogram_comparison,
    plot_directional_variograms,
    plot_variogram_cloud,
)

print("✓ Imports successful!")

## 1. Load and Prepare Data

Start with the data. Remove outliers that come from errors.
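
One possible way to screen for suspect values before any removal is an interquartile-range fence on log-permeability. This is a sketch on synthetic numbers, not part of `pygeomodeling`; the helper name `flag_outliers_iqr` and the factor `k=1.5` are assumptions chosen for illustration. A flagged value is only a candidate: confirm it is an error before dropping it.

```python
import numpy as np

def flag_outliers_iqr(values, k=1.5, log_transform=True):
    """Return a boolean mask that is True where a value lies outside the IQR fences."""
    v = np.asarray(values, dtype=float)
    # Permeability is usually right-skewed, so the fences are computed in log space
    # (this assumes strictly positive values when log_transform is True).
    t = np.log10(v) if log_transform else v
    q1, q3 = np.percentile(t, [25, 75])
    iqr = q3 - q1
    return (t < q1 - k * iqr) | (t > q3 + k * iqr)

# Synthetic stand-in for one layer of PERMX, with one implausible spike injected.
rng = np.random.default_rng(0)
demo_perm = rng.lognormal(mean=4.0, sigma=0.5, size=200)
demo_perm[10] = 1.0e6

mask = flag_outliers_iqr(demo_perm)
print(f"Flagged {mask.sum()} of {mask.size} values for inspection")
```

With the real grid, the same mask could be applied to `values` and `coordinates` together so the two arrays stay aligned.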

In [None]:
# Load sample data
data = load_spe9_data('../../data/sample_small.grdecl')

# Extract permeability data
permx = data['properties']['PERMX']
nx, ny, nz = data['dimensions']

print(f"Grid dimensions: {nx} x {ny} x {nz}")
print(f"PERMX range: [{permx.min():.2f}, {permx.max():.2f}] mD")

In [None]:
# Create coordinate arrays and flatten for one layer
layer = 0  # Top layer
x = np.arange(nx)
y = np.arange(ny)
X, Y = np.meshgrid(x, y, indexing='ij')

coordinates = np.column_stack([X.ravel(), Y.ravel()])
values = permx[:, :, layer].ravel()

print(f"Number of points: {len(values)}")
print(f"Mean permeability: {values.mean():.2f} mD")
print(f"Std permeability: {values.std():.2f} mD")

## 2. Visualize the Data

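Check for trends across space before computing the variogram. A strong trend inflates semi-variance at long lags and can keep the variogram from levelling off at a sill.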
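
A plot shows a trend visually; a quick numeric check is to fit a plane to the values and look at how much variance it explains. The sketch below uses synthetic data, and `linear_trend_r2` is a hypothetical helper, not a `pygeomodeling` function. What counts as "too much" trend is a judgment call.

```python
import numpy as np

def linear_trend_r2(coords, values):
    """Fit z = a + b*x + c*y by least squares; return the coefficients and R^2."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    design = np.column_stack([np.ones(len(coords)), coords])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    residuals = values - design @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((values - values.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

# Synthetic example: a steady drift along X plus noise.
rng = np.random.default_rng(1)
gx, gy = np.meshgrid(np.arange(10), np.arange(10), indexing="ij")
demo_coords = np.column_stack([gx.ravel(), gy.ravel()])
demo_values = 50.0 + 5.0 * demo_coords[:, 0] + rng.normal(0.0, 2.0, size=100)

coef, r2 = linear_trend_r2(demo_coords, demo_values)
print(f"slope in X: {coef[1]:.2f}, slope in Y: {coef[2]:.2f}, R^2: {r2:.2f}")
```

If the plane explains a large share of the variance, one option is to compute the variogram on the residuals rather than on the raw values.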

In [None]:
# Plot the spatial distribution
fig, ax = plt.subplots(figsize=(8, 6))
scatter = ax.scatter(coordinates[:, 0], coordinates[:, 1], 
                    c=values, s=200, cmap='viridis', 
                    edgecolors='black', linewidth=1)
plt.colorbar(scatter, ax=ax, label='Permeability (mD)')
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_title('Permeability Distribution (Layer 0)')
ax.set_aspect('equal')
plt.tight_layout()
plt.show()

## 3. Compute Experimental Variogram

Bin the distances. Compute average semi-variance by bin.
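
To make those two steps concrete, here is a minimal NumPy sketch of the classical (Matheron) estimator: half the mean squared difference of all pairs that fall in each distance bin. It is an illustration only. `experimental_variogram_sketch` is a hypothetical name, and the binning details inside `compute_experimental_variogram` may differ.

```python
import numpy as np

def experimental_variogram_sketch(coords, values, n_lags=10, max_lag=None):
    """Half the mean squared difference of value pairs, averaged per distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)            # each pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq_diff = 0.5 * (values[i] - values[j]) ** 2
    if max_lag is None:
        max_lag = dist.max() / 2.0                      # assumed default: half the max distance
    edges = np.linspace(0.0, max_lag, n_lags + 1)
    bin_idx = np.digitize(dist, edges) - 1              # bin k covers [edges[k], edges[k+1])
    lag_centers, gamma, counts = [], [], []
    for k in range(n_lags):
        in_bin = bin_idx == k
        if in_bin.any():                                # skip empty bins
            lag_centers.append(dist[in_bin].mean())
            gamma.append(half_sq_diff[in_bin].mean())
            counts.append(int(in_bin.sum()))
    return np.array(lag_centers), np.array(gamma), np.array(counts)

# Synthetic scattered data with smooth structure along X plus a little noise.
rng = np.random.default_rng(2)
demo_coords = rng.uniform(0.0, 10.0, size=(150, 2))
demo_values = np.sin(demo_coords[:, 0] / 3.0) + rng.normal(0.0, 0.1, size=150)

lags_demo, gamma_demo, counts_demo = experimental_variogram_sketch(
    demo_coords, demo_values, n_lags=8
)
for h, g, n in zip(lags_demo, gamma_demo, counts_demo):
    print(f"lag {h:5.2f}: gamma = {g:.3f} ({n} pairs)")
```

Bins with few pairs give noisy estimates, which is why the pair counts are returned alongside the semi-variances.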

In [None]:
# Compute experimental variogram
lags, semi_variance, n_pairs = compute_experimental_variogram(
    coordinates=coordinates,
    values=values,
    n_lags=10,
    max_lag=None  # Use default (half of max distance)
)

print("Experimental Variogram:")
print(f"  Number of lags: {len(lags)}")
print(f"  Lag range: [{lags.min():.2f}, {lags.max():.2f}]")
print(f"  Semi-variance range: [{semi_variance.min():.2f}, {semi_variance.max():.2f}]")
print("\nPairs per lag:")
for lag, sv, pair_count in zip(lags, semi_variance, n_pairs):
    print(f"  Lag {lag:.2f}: {sv:.2f} ({pair_count} pairs)")

## 4. Visualize Variogram Cloud

Check for outliers before fitting.
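
The cloud holds one point per pair of samples, so an extreme cloud value does not say which of the two samples is suspect. One way to trace it back, sketched here on synthetic data, is to count how often each sample appears in the most extreme short-range pairs. The helper `cloud_outlier_counts` and its `quantile` threshold are assumptions for illustration, not part of `plot_variogram_cloud`.

```python
import numpy as np

def cloud_outlier_counts(coords, values, max_dist, quantile=0.99):
    """Count, per sample, its appearances in the most extreme short-range cloud pairs."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    cloud = 0.5 * (values[i] - values[j]) ** 2          # one cloud point per pair
    near = dist <= max_dist                             # keep short separations only
    threshold = np.quantile(cloud[near], quantile)
    extreme = near & (cloud > threshold)
    # A sample that shows up in many extreme pairs is a candidate outlier.
    members = np.concatenate([i[extreme], j[extreme]])
    return np.bincount(members, minlength=len(values))

# Synthetic 10 x 10 grid with one injected bad value at flat index 42.
rng = np.random.default_rng(3)
gx, gy = np.meshgrid(np.arange(10), np.arange(10), indexing="ij")
demo_coords = np.column_stack([gx.ravel(), gy.ravel()])
demo_values = rng.normal(100.0, 5.0, size=100)
demo_values[42] = 400.0

counts = cloud_outlier_counts(demo_coords, demo_values, max_dist=1.5)
print("Most suspicious sample index:", int(counts.argmax()))
```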

In [None]:
# Plot variogram cloud
fig, ax = plot_variogram_cloud(coordinates, values, max_pairs=1000)
plt.show()

## 5. Fit Variogram Models

Choose a model that fits the process. Try different models and compare.
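
To show what a weighted fit involves, the sketch below scans a list of trial ranges and, for each one, solves a weighted least-squares problem for the nugget and partial sill. It is not the algorithm inside `fit_variogram_model`, which is not documented here. The names `unit_model` and `fit_by_range_scan`, the factor 3 in the exponential and Gaussian shapes (a practical-range convention), and the choice to apply `weights` to squared residuals are all assumptions.

```python
import numpy as np

def unit_model(h, a, kind):
    """Unit-sill variogram shape: 0 at h = 0, approaching 1 at large h."""
    h = np.asarray(h, dtype=float)
    if kind == "spherical":
        r = np.minimum(h / a, 1.0)
        return 1.5 * r - 0.5 * r ** 3
    if kind == "exponential":
        return 1.0 - np.exp(-3.0 * h / a)
    if kind == "gaussian":
        return 1.0 - np.exp(-3.0 * (h / a) ** 2)
    raise ValueError(f"unknown model kind: {kind}")

def fit_by_range_scan(lags, gamma, kind, trial_ranges, weights=None):
    """For each trial range, solve weighted least squares for nugget and partial sill."""
    lags = np.asarray(lags, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    w = np.ones_like(gamma) if weights is None else np.asarray(weights, dtype=float)
    sqrt_w = np.sqrt(w)
    best = None
    for a in trial_ranges:
        shape = unit_model(lags, a, kind)
        design = np.column_stack([np.ones_like(lags), shape])
        coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], gamma * sqrt_w, rcond=None)
        # Crude non-negativity: clip after solving (not a true constrained fit).
        nugget, partial_sill = np.clip(coef, 0.0, None)
        sse = float(np.sum(w * (gamma - nugget - partial_sill * shape) ** 2))
        if best is None or sse < best["sse"]:
            best = {"nugget": float(nugget), "sill": float(nugget + partial_sill),
                    "range": float(a), "sse": sse}
    return best

# Synthetic experimental variogram generated from a known spherical model.
demo_lags = np.linspace(1.0, 12.0, 12)
demo_gamma = 2.0 + 8.0 * unit_model(demo_lags, 6.0, "spherical")
demo_pairs = np.full(12, 100)
trial = np.arange(1.0, 20.5, 0.5)

fits = {kind: fit_by_range_scan(demo_lags, demo_gamma, kind, trial, weights=demo_pairs)
        for kind in ("spherical", "exponential", "gaussian")}
for kind, fit in fits.items():
    print(kind, fit)
```

Because the model is linear in nugget and partial sill once the range is fixed, the scan avoids a nonlinear optimizer, at the cost of a range resolution limited by the trial grid.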

In [None]:
# Fit spherical model
model_spherical = fit_variogram_model(
    lags, semi_variance, 
    model_type='spherical',
    weights=np.sqrt(n_pairs)  # Weight by the square root of the pair count
)

print("Spherical Model:")
print(model_spherical)
print()

In [None]:
# Fit exponential model
model_exponential = fit_variogram_model(
    lags, semi_variance,
    model_type='exponential',
    weights=np.sqrt(n_pairs)
)

print("Exponential Model:")
print(model_exponential)
print()

In [None]:
# Fit Gaussian model
model_gaussian = fit_variogram_model(
    lags, semi_variance,
    model_type='gaussian',
    weights=np.sqrt(n_pairs)
)

print("Gaussian Model:")
print(model_gaussian)

## 6. Visualize Fitted Models

Use a plot to check the fit. Avoid overfitting to noise.

In [None]:
# Plot spherical model
fig, ax = plot_variogram(
    lags, semi_variance,
    model=model_spherical,
    n_pairs=n_pairs,
    title="Spherical Variogram Model"
)
plt.show()

In [None]:
# Compare all models
fig, ax = plot_variogram_comparison(
    lags, semi_variance,
    models=[model_spherical, model_exponential, model_gaussian],
    n_pairs=n_pairs,
    title="Variogram Model Comparison"
)
plt.show()

## 7. Check for Anisotropy

Compute directional variograms to detect anisotropy.
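
A directional variogram uses only the pairs whose separation vector points roughly along a chosen direction. The sketch below shows that pair selection on a synthetic grid. `directional_pair_mask` is a hypothetical helper, and measuring the angle counter-clockwise from the +X axis is an assumption; `pygeomodeling` may use a different convention.

```python
import numpy as np

def directional_pair_mask(coords, azimuth_deg, tolerance_deg):
    """Select pairs whose separation vector lies within a tolerance of an azimuth."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)
    d = coords[j] - coords[i]
    # Angle counter-clockwise from the +X axis (assumed convention).
    angle = np.degrees(np.arctan2(d[:, 1], d[:, 0]))
    # A pair has no orientation sign, so fold angular differences into [0, 90].
    diff = np.abs((angle - azimuth_deg + 90.0) % 180.0 - 90.0)
    return i, j, diff <= tolerance_deg

# Synthetic anisotropic field: values change along X only.
gx, gy = np.meshgrid(np.arange(6), np.arange(6), indexing="ij")
demo_coords = np.column_stack([gx.ravel(), gy.ravel()])
demo_values = demo_coords[:, 0].astype(float)

gamma_by_dir, pairs_by_dir = {}, {}
for azimuth in (0, 45, 90, 135):
    i, j, keep = directional_pair_mask(demo_coords, azimuth, 22.5)
    # Semi-variance pooled over all distances in this direction (no lag binning here).
    gamma_by_dir[azimuth] = 0.5 * np.mean((demo_values[i[keep]] - demo_values[j[keep]]) ** 2)
    pairs_by_dir[azimuth] = int(keep.sum())
    print(f"azimuth {azimuth:3d}: gamma = {gamma_by_dir[azimuth]:.2f} ({pairs_by_dir[azimuth]} pairs)")
```

Semi-variance that differs clearly between directions, as in this constructed example, is the signature of anisotropy. With four directions and a 22.5 degree tolerance, every pair is assigned to one direction.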

In [None]:
# Plot directional variograms
fig, ax = plot_directional_variograms(
    coordinates, values,
    directions=[0, 45, 90, 135],
    tolerance=22.5,
    n_lags=8
)
plt.show()

## 8. Model Interpretation

### Understanding the Parameters

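**Nugget Effect**:
- Represents measurement error or micro-scale variability
- A lower nugget means less unexplained short-range noise
- A high nugget relative to the sill can signal data quality issues, or real variability at scales finer than the sample spacing

**Sill**:
- Total variance of the process (nugget plus partial sill)
- Where the variogram levels off
- Should be close to the sample variance when the data show no trend

**Range**:
- Distance of spatial influence
- Beyond this, points are effectively uncorrelated (exactly so for the spherical model; the exponential and Gaussian models only approach the sill)
- Critical for choosing the kriging neighborhood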

### Model Selection

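- **Spherical**: Most common, good for many natural processes; reaches the sill exactly at the range
- **Exponential**: Rises steeply near the origin and approaches the sill asymptotically without ever reaching it
- **Gaussian**: Parabolic (very smooth) near the origin, suited to smoothly varying fields; also approaches the sill asymptotically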
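
For reference, one common way to write the three models is given below, with nugget $c_0$, partial sill $c$ (so the sill is $c_0 + c$), and range $a$. These symbols are introduced here for illustration. The factor 3 in the exponential and Gaussian forms is a convention that makes $a$ the practical range, where the model reaches about 95% of the sill; whether `pygeomodeling` uses this parameterization is not confirmed here.

$$
\gamma_{\text{sph}}(h) =
\begin{cases}
c_0 + c\left[\dfrac{3h}{2a} - \dfrac{h^3}{2a^3}\right], & 0 < h \le a \\[1ex]
c_0 + c, & h > a
\end{cases}
$$

$$
\gamma_{\text{exp}}(h) = c_0 + c\left[1 - \exp\!\left(-\dfrac{3h}{a}\right)\right],
\qquad
\gamma_{\text{gau}}(h) = c_0 + c\left[1 - \exp\!\left(-\dfrac{3h^2}{a^2}\right)\right],
\qquad h > 0
$$

In all three cases $\gamma(0) = 0$. Near the origin the spherical and exponential models grow linearly in $h$, while the Gaussian model grows like $h^2$.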

Choose based on:
1. R² value (goodness of fit)
2. Physical plausibility
3. Cross-validation performance
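
As a small illustration of the first criterion, R² can be computed directly from the experimental semi-variances and the values a fitted model predicts at the same lags. `weighted_r2` is a hypothetical helper, and the optional weighting (for example by pair count) is an assumption rather than something the library is known to do. With only a handful of lag bins, R² says little about predictive skill, which is why the other two criteria matter.

```python
import numpy as np

def weighted_r2(observed, predicted, weights=None):
    """Coefficient of determination, optionally weighting each lag bin."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    w = np.ones_like(observed) if weights is None else np.asarray(weights, dtype=float)
    mean_obs = np.average(observed, weights=w)
    ss_res = np.sum(w * (observed - predicted) ** 2)
    ss_tot = np.sum(w * (observed - mean_obs) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy numbers: a perfect prediction versus predicting the mean everywhere.
obs = np.array([1.0, 2.0, 3.0, 4.0])
r2_perfect = weighted_r2(obs, obs)
r2_mean_only = weighted_r2(obs, np.full(4, obs.mean()))
print(f"perfect fit: {r2_perfect:.2f}, mean-only fit: {r2_mean_only:.2f}")
```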

## Summary

You've learned:

✓ How to compute experimental variograms  
✓ How to fit different variogram models  
✓ How to interpret nugget, sill, and range  
✓ How to check for anisotropy  
✓ How to visualize and validate models  

## Next Steps

- Use variogram models for kriging interpolation
- Apply to full 3D reservoir grids
- Integrate with uncertainty quantification
- Combine with machine learning models

**Key Takeaway**: Clear models, clean data, and simple checks produce better spatial estimates. The result is not just a pretty map. It is a stronger base for decisions that depend on place.