# **Module 2: Spatial Interpolation in Python**

### **Exercises**
#### Data
For the exercises, data are created and saved to the directory `./data-module-2/`.
- `mn-dem-points.csv` - a dataset of sampled DEM values for Minnesota, based on the USGS GMTED2010 dataset.
- `mn-grid.shp` - a regular grid covering the area of interest for interpolating the Minnesota DEM observations.

In [None]:
# general use packages
import os
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score
import math
import matplotlib.pyplot as plt

# geospatial packages
import geopandas as gpd
from shapely.geometry import MultiPoint
from shapely.ops import voronoi_diagram, triangulate
from pyinterpolate import inverse_distance_weighting
from pyinterpolate import kriging, build_experimental_variogram, TheoreticalVariogram

os.environ['PROJ_LIB'] = '/opt/conda/envs/user_default/share/proj'

**Question 1. Load the datasets `mn-dem-points.csv` and `mn-grid.shp` as `GeoDataFrames`. Set the CRS of your point dataset to match the CRS of the grid shapefile.**

In [None]:
grid = gpd.read_file("./data-module-2/mn-grid.shp")
print("CRS of the unknown points is {}".format(grid.crs))

In [None]:
samples_df = pd.read_csv("./data-module-2/mn-dem-points.csv")
samples = gpd.GeoDataFrame(samples_df, geometry=gpd.points_from_xy(samples_df.X, samples_df.Y), crs=grid.crs)
print("CRS of sampled points is {}".format(samples.crs))

**Question 2. Plot the two datasets on the same `figure` object. How does elevation vary spatially across the study area?**

In [None]:
fig, ax = plt.subplots(figsize=(10,6))

grid.plot(ax=ax, markersize=2, facecolor="grey", edgecolor="none")
samples.plot(ax=ax, column="DEM", cmap="terrain", legend=True, scheme="JenksCaspall", markersize=8)

**Question 3. Prepare `unknown_points` and `known_points` arrays from the datasets.**

In [None]:
known_points = samples[["X", "Y", "DEM"]].to_numpy()
unknown_points = grid[["x", "y"]].to_numpy()

**Question 4. Create IDW surface predictions with two different `power` parameters (`2` and `8`). Use `8` neighbours in both cases. Visualize and compare the outputs (use a continuous color scheme). What differences do you notice?**

In [None]:
NUMBER_OF_NEIGHBOURS = 8
IDW_POWER = 2

idw_predictions = []
for pt in unknown_points:
    idw_result = inverse_distance_weighting(known_points, pt, NUMBER_OF_NEIGHBOURS, IDW_POWER)
    idw_predictions.append(idw_result)

grid["dem-pred-{}".format(IDW_POWER)] = idw_predictions

In [None]:
NUMBER_OF_NEIGHBOURS = 8
IDW_POWER = 8

idw_predictions = []
for pt in unknown_points:
    idw_result = inverse_distance_weighting(known_points, pt, NUMBER_OF_NEIGHBOURS, IDW_POWER)
    idw_predictions.append(idw_result)

grid["dem-pred-{}".format(IDW_POWER)] = idw_predictions

In [None]:
fig, axs = plt.subplots(1,2, figsize=(12,6), tight_layout=True)

grid.plot(ax=axs[0], column="dem-pred-2", cmap="terrain", legend=True, markersize=3)
axs[0].set_title("Power=2", weight="bold")

grid.plot(ax=axs[1], column="dem-pred-8", cmap="terrain", legend=True, markersize=3)
axs[1].set_title("Power=8", weight="bold")
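
To see why the two maps differ, it can help to compute IDW by hand on a toy example. The sketch below is a plain NumPy approximation of the idea, not the `pyinterpolate` implementation; the helper `idw_sketch` and the three toy points are hypothetical and exist only for illustration.

```python
import numpy as np

def idw_sketch(known, target, n_neighbours, power):
    """Hypothetical helper: IDW estimate at `target` from an (n, 3) array of x, y, value."""
    dists = np.hypot(known[:, 0] - target[0], known[:, 1] - target[1])
    nearest = np.argsort(dists)[:n_neighbours]      # indices of the closest samples
    d, values = dists[nearest], known[nearest, 2]
    if d[0] == 0:                                   # target coincides with a sample
        return float(values[0])
    weights = 1.0 / d ** power                      # closer samples get larger weights
    return float(np.sum(weights * values) / np.sum(weights))

# toy samples: one close point (value 100) and two distant points (200 and 300)
toy = np.array([[0.0, 0.0, 100.0], [10.0, 0.0, 200.0], [0.0, 10.0, 300.0]])
low_power = idw_sketch(toy, (1.0, 1.0), 3, 2)
high_power = idw_sketch(toy, (1.0, 1.0), 3, 8)
print(low_power, high_power)
```

In this toy case the estimate with power 8 sits much closer to the nearest sample's value than the estimate with power 2, because raising the power shrinks the influence of distant samples.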

**Question 5. Define a maximum range of spatial dependency for your variogram. This parameter should be at most half of the maximum distance between the known points.
Hint: use the function from: https://pyinterpolate.readthedocs.io/en/latest/api/distance/distance.html**

In [None]:
from pyinterpolate import calc_point_to_point_distance

# use only the X and Y columns so that DEM values do not inflate the distances
distances = calc_point_to_point_distance(known_points[:, :2])
print(np.max(distances) / 2)
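
The half-of-maximum-distance rule from Question 5 can also be checked with plain NumPy. This is a minimal sketch on toy coordinates; `half_max_distance` is a hypothetical helper, and it assumes the input holds only the X and Y columns.

```python
import numpy as np

def half_max_distance(xy):
    """Hypothetical helper: half of the largest pairwise Euclidean distance between 2D points."""
    diffs = xy[:, None, :] - xy[None, :, :]      # pairwise coordinate differences, shape (n, n, 2)
    dists = np.sqrt((diffs ** 2).sum(axis=2))    # pairwise distance matrix, shape (n, n)
    return dists.max() / 2

# toy coordinates; the farthest pair is (0, 0) and (6, 8)
toy_xy = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
toy_limit = half_max_distance(toy_xy)
print(toy_limit)
```

The full distance matrix needs memory proportional to the square of the number of points, so this approach suits sample sets of modest size.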

**Question 6. Create an experimental variogram and then use `autofit()` to produce a theoretical model.**

In [None]:
STEP_SIZE = 10000
MAX_RANGE = 350000

exp_semivar = build_experimental_variogram(known_points, step_size=STEP_SIZE, max_range=MAX_RANGE)
print(exp_semivar)
exp_semivar.plot(plot_covariance=True)
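
For intuition about what the experimental variogram contains, the sketch below computes semivariances by hand: for each lag bin, half the mean squared difference between the values of point pairs whose separation falls in that bin. The function `experimental_semivariance` and its binning convention (bins of width `step_size`, open on the left and closed on the right) are assumptions made for this example and may differ from how `pyinterpolate` bins pairs.

```python
import numpy as np

def experimental_semivariance(points, step_size, max_range):
    """Hypothetical helper: rows of (lag upper edge, semivariance, pair count)."""
    xy, z = points[:, :2], points[:, 2]
    i, j = np.triu_indices(len(points), k=1)          # each unordered pair once
    dists = np.linalg.norm(xy[i] - xy[j], axis=1)
    sq_diffs = (z[i] - z[j]) ** 2
    edges = np.arange(0.0, max_range + step_size, step_size)
    rows = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = (dists > lower) & (dists <= upper)
        if in_bin.any():                              # skip empty bins
            rows.append((upper, 0.5 * sq_diffs[in_bin].mean(), int(in_bin.sum())))
    return np.array(rows)

# toy transect: values rise by 1 for every unit of distance
toy_points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [2.0, 0.0, 2.0], [3.0, 0.0, 3.0]])
toy_variogram = experimental_semivariance(toy_points, step_size=1.0, max_range=3.0)
print(toy_variogram)
```

On this toy transect the semivariance grows with the lag, which is the pattern expected when nearby points are more alike than distant ones.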

In [None]:
theor_semivar = TheoreticalVariogram()
theor_semivar.autofit(experimental_variogram=exp_semivar)
print(theor_semivar)
theor_semivar.plot()
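
Conceptually, an automatic fit compares candidate theoretical models against the experimental semivariances and keeps the candidate with the lowest error. The sketch below shows that idea in a reduced form: a grid search over the sill and range of a spherical model, scored by RMSE. The functions `spherical_model` and `grid_search_fit`, the candidate grids, and the toy data are all hypothetical; this is not a description of how `autofit()` is implemented.

```python
import numpy as np

def spherical_model(lags, sill, rng):
    """Spherical semivariogram without a nugget (assumed for this sketch)."""
    lags = np.asarray(lags, dtype=float)
    rising = sill * (1.5 * lags / rng - 0.5 * (lags / rng) ** 3)
    return np.where(lags < rng, rising, sill)

def grid_search_fit(lags, semivariances, sills, ranges):
    """Hypothetical helper: return (rmse, sill, range) of the best candidate."""
    best = None
    for sill in sills:
        for rng in ranges:
            errors = spherical_model(lags, sill, rng) - semivariances
            rmse = np.sqrt(np.mean(errors ** 2))
            if best is None or rmse < best[0]:
                best = (rmse, sill, rng)
    return best

# toy "experimental" values generated from a known model (sill 2, range 30)
toy_lags = np.arange(5.0, 55.0, 5.0)
toy_semivariances = spherical_model(toy_lags, sill=2.0, rng=30.0)
best_rmse, best_sill, best_range = grid_search_fit(
    toy_lags, toy_semivariances,
    sills=np.linspace(1.0, 3.0, 5), ranges=np.linspace(10.0, 50.0, 9),
)
print(best_rmse, best_sill, best_range)
```

Because the toy data were generated from a model that lies on the candidate grid, the search recovers the generating sill and range; with real data the best candidate only approximates the experimental curve.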

**Question 7. Produce a kriging output with the Ordinary Kriging method. Plot the output along with the variance errors.**

In [None]:
predictions = kriging(observations=known_points, theoretical_model=theor_semivar, points=unknown_points, how="ok")

In [None]:
grid["dem-pred-ok-krigging"] = predictions[:, 0]
grid["varience-error-ok-krigging"] = predictions[:, 1]
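
The two columns stored above, a prediction and a variance error, both come out of one linear system per unknown point. The sketch below solves that ordinary kriging system with NumPy for a single location. It assumes a spherical model with no nugget and uses all known points as neighbours; `spherical` and `ordinary_kriging_point` are hypothetical helpers, not the `pyinterpolate` implementation.

```python
import numpy as np

def spherical(h, sill, rng):
    """Spherical semivariogram without a nugget (assumed for this sketch)."""
    h = np.asarray(h, dtype=float)
    rising = sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    return np.where(h < rng, rising, sill)

def ordinary_kriging_point(known, target, sill, rng):
    """Hypothetical helper: (prediction, variance error) at one target location."""
    xy, z = known[:, :2], known[:, 2]
    n = len(known)
    pair_dists = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # kriging matrix: semivariances between samples, bordered by the unbiasedness constraint
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = spherical(pair_dists, sill, rng)
    system[n, n] = 0.0
    # right-hand side: semivariances between each sample and the target, plus the constraint
    rhs = np.ones(n + 1)
    rhs[:n] = spherical(np.linalg.norm(xy - np.asarray(target), axis=1), sill, rng)
    solution = np.linalg.solve(system, rhs)
    weights, lagrange = solution[:n], solution[n]
    return float(weights @ z), float(weights @ rhs[:n] + lagrange)

# toy samples on the corners of a unit square
toy_known = np.array([[0.0, 0.0, 10.0], [1.0, 0.0, 20.0], [0.0, 1.0, 30.0], [1.0, 1.0, 40.0]])
centre_prediction, centre_variance = ordinary_kriging_point(toy_known, (0.5, 0.5), sill=1.0, rng=5.0)
corner_prediction, corner_variance = ordinary_kriging_point(toy_known, (0.0, 0.0), sill=1.0, rng=5.0)
print(centre_prediction, centre_variance, corner_prediction, corner_variance)
```

At the centre of the square all four samples receive equal weight, so the prediction is their mean. At a sample location the prediction reproduces the observed value and the variance error drops to zero, which is why variance maps tend to be lowest around the sampled points.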

In [None]:
fig, axs = plt.subplots(1,2, figsize=(12,6), tight_layout=True)

grid.plot(ax=axs[0], column="dem-pred-ok-krigging", cmap="terrain", legend=True, markersize=2)
samples.plot(ax=axs[0], edgecolor="grey", facecolor="none")
axs[0].set_title("Ordinary Kriging Predictions (DEM)", weight="bold")

grid.plot(ax=axs[1], column="varience-error-ok-krigging", cmap="coolwarm", legend=True, markersize=2, alpha=0.5)
samples.plot(ax=axs[1], edgecolor="grey", facecolor="none")
axs[1].set_title("Ordinary Kriging Variance Error", weight="bold")