# Spatial Mapping

## Table of Contents

1. [Introduction](#1.-Introduction)
2. [Create Suitability Maps](#2.-Create-Suitability-Maps)

## 1. Introduction

The final stage of species distribution modelling involves translating model outputs into spatially explicit predictions for visualization and analysis in Geographic Information Systems (GIS). This step is crucial for identifying suitable habitats, informing conservation planning, and integrating results into decision-making frameworks.

Species distribution models (SDMs) generate probability estimates of species presence based on environmental predictors. These probabilities must be converted into spatially explicit rasters or vector layers that align with the study area’s geographic coordinates. The transition from tabular predictions to spatial datasets ensures that results can be integrated with other environmental data, such as land cover, habitat connectivity, and conservation priorities.

The goal is to predict habitat suitability for each species across a continuous landscape. This is typically achieved by applying the trained models to predictor rasters that cover the whole study area, which yields a continuous suitability map.

### Interpolation vs. Model Application

1. **Model Application**: After the models (e.g., Random Forest, XGBoost, MaxEnt) have been trained on occurrence data and environmental predictors, they can be applied to the same set of predictors across the entire study area. This generates a continuous suitability value for each location, resulting in a comprehensive suitability map.

2. **Interpolation**: Interpolation methods, such as kriging, estimate values at unsampled locations from the spatial configuration of sampled points. While interpolation can be useful in certain contexts, it does not leverage the relationship between species occurrences and environmental predictors as effectively as direct model application.
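
The contrast between the two approaches can be illustrated with a toy example. This is only a sketch: inverse-distance weighting is used as a simple stand-in for kriging, and `apply_model` is a hypothetical fitted response, not one of the models trained in this project. The interpolated value depends only on where the query point lies relative to the samples, whereas the modelled value depends on the predictor value at the query point.

```python
import numpy as np

def idw_interpolate(sample_xy, sample_values, query_xy, power=2.0):
    """Inverse-distance-weighted estimate at each query point (stand-in for kriging)."""
    # Pairwise distances between query points and samples: (n_query, n_samples)
    diff = query_xy[:, None, :] - sample_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    dist = np.maximum(dist, 1e-12)  # avoid division by zero at sampled locations
    weights = 1.0 / dist ** power
    return (weights * sample_values).sum(axis=1) / weights.sum(axis=1)

def apply_model(predictors):
    """Hypothetical fitted response: suitability rises with the first predictor."""
    return 1.0 / (1.0 + np.exp(-predictors[:, 0]))

# Two sampled locations with known suitability (made-up values)
sample_xy = np.array([[0.0, 0.0], [10.0, 0.0]])
sample_suitability = np.array([0.2, 0.8])

# One unsampled location midway between them, with a high predictor value
query_xy = np.array([[5.0, 0.0]])
query_predictors = np.array([[3.0]])

interpolated = idw_interpolate(sample_xy, sample_suitability, query_xy)
modelled = apply_model(query_predictors)
print(interpolated[0], modelled[0])
```

The interpolated estimate is the average of the two neighbouring samples, regardless of local conditions, while the modelled estimate responds to the predictor value at the query location.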

## 2. Create Suitability Maps

To create continuous suitability maps, we will:

1. **Prepare Environmental Predictors**: Ensure all environmental predictor rasters are aligned (same resolution, extent, and coordinate system).
2. **Apply Trained Models**: Use the trained ensemble model to predict suitability values across the entire study area by applying it to the predictor rasters.
3. **Generate Suitability Maps**: The output will be continuous suitability maps indicating the predicted probability of species presence for each location.
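
The three steps above can be sketched on a tiny array. This is a minimal illustration, assuming the predictors are already stacked as a `(bands, height, width)` NumPy array; `toy_model` is a hypothetical stand-in for a trained model and simply averages the predictors.

```python
import numpy as np

def predict_suitability(stack, predict_fn):
    """Apply predict_fn to every pixel of a (bands, height, width) stack."""
    bands, height, width = stack.shape
    table = stack.reshape(bands, -1).T       # one row per pixel: (pixels, bands)
    scores = np.asarray(predict_fn(table))   # one score per pixel: (pixels,)
    return scores.reshape(height, width)     # back onto the raster grid

def toy_model(table):
    """Hypothetical stand-in for a trained model: mean of the predictors."""
    return table.mean(axis=1)

# Two 3 x 4 predictor bands filled with consecutive numbers
stack = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
suitability = predict_suitability(stack, toy_model)
print(suitability.shape)
```

Because the flattening and the final reshape both use row-major order, each score ends up at the pixel its predictors came from.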

### **Load Final Suitability Predictions**

In [1]:
import os
import pandas as pd
import geopandas as gpd

# Define paths
prediction_dir = r"C:\GIS_Course\MScThesis-MaviSantarelli\results\Models\Final_Binary"
species_list = ["Bufo bufo", "Rana temporaria", "Lissotriton helveticus"]

# Load prediction files
predictions = {}
for species in species_list:
    file_path = os.path.join(prediction_dir, f"{species}_Final_Binary_Predictions.csv")
    if os.path.exists(file_path):
        predictions[species] = pd.read_csv(file_path)
        print(f"✅ Loaded predictions for {species}")
    else:
        print(f"⚠️ Missing predictions for {species}")


✅ Loaded predictions for Bufo bufo
✅ Loaded predictions for Rana temporaria
✅ Loaded predictions for Lissotriton helveticus


### **Stack Predictors into a Multi-Band Raster**

### Check Raster Shapes Before Stacking

In [3]:
import rasterio

# Define predictor raster file paths
predictor_files = [
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Building_Density_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/DistWater_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/NOx_Stand_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/RGS_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Runoff_Coefficient_Standardised_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Slope_Proj_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/SoilMoisture_32bit_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Traffic_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Wood_Resample_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Grass_Stand.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/NDVI_median.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/NDVI_StDev.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/VegHeight.tif"
]

# Check raster shapes
for file in predictor_files:
    with rasterio.open(file) as src:
        print(f"{file}: {src.shape}, Resolution: {src.res}, CRS: {src.crs}")


C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Building_Density_Reversed.tif: (5971, 6369), Resolution: (30.0, 30.0), CRS: EPSG:27700
C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/DistWater_Reversed.tif: (5971, 6369), Resolution: (30.0, 30.0), CRS: EPSG:27700
C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/NOx_Stand_Reversed.tif: (5971, 6369), Resolution: (30.0, 30.0), CRS: EPSG:27700
C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/RGS_Reversed.tif: (5971, 6369), Resolution: (30.0, 30.0), CRS: EPSG:27700
C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Runoff_Coefficient_Standardised_Reversed.tif: (5970, 6369), Resolution: (30.0, 30.0), CRS: EPSG:27700
C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Slope_Proj_Reversed.tif: (5971, 6369), Resolution: (30.0, 30.0), CRS: EPSG:27700
C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/SoilMo
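
Scanning long printed paths for a one-row difference is error-prone, so the comparison can be automated. The sketch below is one possible approach: `find_mismatched` is a hypothetical helper, and the dictionary reproduces three of the shape, resolution, and CRS values printed above (file names shortened to their base names).

```python
def find_mismatched(grids, reference_name):
    """Return the names whose (shape, resolution, crs) differ from the reference."""
    reference = grids[reference_name]
    return [name for name, grid in grids.items() if grid != reference]

# (shape, resolution, CRS) as reported in the output above
grids = {
    "Building_Density_Reversed.tif": ((5971, 6369), (30.0, 30.0), "EPSG:27700"),
    "DistWater_Reversed.tif": ((5971, 6369), (30.0, 30.0), "EPSG:27700"),
    "Runoff_Coefficient_Standardised_Reversed.tif": ((5970, 6369), (30.0, 30.0), "EPSG:27700"),
}

mismatched = find_mismatched(grids, "Building_Density_Reversed.tif")
print(mismatched)
```

In the notebook, the dictionary could be filled inside the `rasterio.open` loop from `src.shape`, `src.res`, and `src.crs`.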

### Reproject and Resample to Match a Reference Raster

In [4]:
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling
import os

# Choose a reference raster (first predictor)
reference_raster = predictor_files[0]

# Define output directory for resampled rasters
output_dir = "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Resampled"
os.makedirs(output_dir, exist_ok=True)

# Open reference raster
with rasterio.open(reference_raster) as ref_src:
    ref_transform = ref_src.transform
    ref_width = ref_src.width
    ref_height = ref_src.height
    ref_crs = ref_src.crs

# Resample and reproject other rasters to match reference
for file in predictor_files:
    output_file = os.path.join(output_dir, os.path.basename(file))

    with rasterio.open(file) as src:
        if (src.shape != (ref_height, ref_width)) or (src.crs != ref_crs):
            print(f"🔄 Resampling {file}...")

            # Write onto the reference grid: reuse the reference CRS, transform,
            # and dimensions so that every output pixel lines up with the reference
            new_meta = src.meta.copy()
            new_meta.update({
                "crs": ref_crs,
                "transform": ref_transform,
                "width": ref_width,
                "height": ref_height
            })

            # Create resampled raster
            with rasterio.open(output_file, "w", **new_meta) as dst:
                for i in range(1, src.count + 1):
                    reproject(
                        source=rasterio.band(src, i),
                        destination=rasterio.band(dst, i),
                        src_transform=src.transform,
                        src_crs=src.crs,
                        dst_transform=ref_transform,  # Align to the reference grid
                        dst_crs=ref_crs,
                        resampling=Resampling.bilinear  # Use bilinear resampling
                    )

            print(f"✅ Resampled and saved: {output_file}")
        else:
            print(f"✅ {file} already matches reference raster.")

print("\n🚀 Resampling complete! Use these resampled rasters for stacking.")


✅ C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Building_Density_Reversed.tif already matches reference raster.
✅ C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/DistWater_Reversed.tif already matches reference raster.
✅ C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/NOx_Stand_Reversed.tif already matches reference raster.
✅ C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/RGS_Reversed.tif already matches reference raster.
🔄 Resampling C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Runoff_Coefficient_Standardised_Reversed.tif...
✅ Resampled and saved: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Resampled\Runoff_Coefficient_Standardised_Reversed.tif
✅ C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Slope_Proj_Reversed.tif already matches reference raster.
🔄 Resampling C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/SoilMo
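
Two rasters only line up pixel for pixel when they share the same origin, pixel size, and dimensions, not just the same CRS. The arithmetic below sketches why a one-row difference matters. The shapes and the 30 m resolution come from the output above; the origin coordinates are hypothetical, and the sketch assumes both rasters share the same top-left corner.

```python
def grid_bounds(x_min, y_max, pixel_size, height, width):
    """Bounds (left, bottom, right, top) of a north-up grid."""
    return (
        x_min,
        y_max - height * pixel_size,
        x_min + width * pixel_size,
        y_max,
    )

# Hypothetical top-left corner in map units (metres)
ORIGIN_X, ORIGIN_Y = 500000.0, 200000.0

reference = grid_bounds(ORIGIN_X, ORIGIN_Y, 30.0, height=5971, width=6369)
runoff = grid_bounds(ORIGIN_X, ORIGIN_Y, 30.0, height=5970, width=6369)

# Difference between the two bottom edges, in metres
row_gap_m = runoff[1] - reference[1]
print(row_gap_m)
```

Under these assumptions the shorter raster stops one pixel row (30 m) above the reference, which is the gap that resampling onto the reference grid fills.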

### Create a List of the Correct File Paths

In [6]:
import os

# Define original predictor file paths
original_predictor_files = [
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Building_Density_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/DistWater_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/NOx_Stand_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/RGS_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Runoff_Coefficient_Standardised_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Slope_Proj_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/SoilMoisture_32bit_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Traffic_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Wood_Resample_Reversed.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Grass_Stand.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/NDVI_median.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/NDVI_StDev.tif",
    "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/VegHeight.tif"
]

# Define resampled directory
resampled_dir = "C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Resampled"

# Generate final paths: use resampled files if they exist, otherwise use original
final_predictor_files = []
for file in original_predictor_files:
    resampled_path = os.path.join(resampled_dir, os.path.basename(file))
    
    if os.path.exists(resampled_path):  # Use resampled version if available
        final_predictor_files.append(resampled_path)
    else:  # Use original file
        final_predictor_files.append(file)

# Print final list to check
for file in final_predictor_files:
    print(f"✅ Using: {file}")


✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Building_Density_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/DistWater_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/NOx_Stand_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/RGS_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Resampled\Runoff_Coefficient_Standardised_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Slope_Proj_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Resampled\SoilMoisture_32bit_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Traffic_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Input/Reversed/Wood_Resample_Reversed.tif
✅ Using: C:/GIS_Course/MScThesis-MaviSantarelli/data/Pred

### Stack Resampled Rasters

In [7]:
import rasterio
import numpy as np

# Read and stack all predictors
stacked_data = []
meta = None

for i, file in enumerate(final_predictor_files):
    with rasterio.open(file) as src:
        if meta is None:
            meta = src.meta.copy()  # Save metadata from first raster
        data = src.read(1)  # Read the first band
        stacked_data.append(data)

# Convert to numpy array
stacked_array = np.stack(stacked_data, axis=0)

# Check stacked raster shape
print(f"✅ Stacked raster shape: {stacked_array.shape}")


✅ Stacked raster shape: (13, 5971, 6369)
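
A stack of this size is large enough that memory use is worth estimating before converting it to a table. The sketch below computes the footprint for the shape reported above and shows one way to split the pixels into chunks for prediction. The chunk size of 5,000,000 rows is an arbitrary assumption, and both helper functions are hypothetical.

```python
def stack_bytes(bands, height, width, bytes_per_value):
    """Memory needed to hold the full stack, in bytes."""
    return bands * height * width * bytes_per_value

def chunk_ranges(n_rows, chunk_size):
    """Yield (start, stop) index pairs that together cover range(n_rows)."""
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)

n_pixels = 5971 * 6369
float32_gib = stack_bytes(13, 5971, 6369, 4) / 2**30
float64_gib = stack_bytes(13, 5971, 6369, 8) / 2**30

chunks = list(chunk_ranges(n_pixels, 5_000_000))
print(f"{n_pixels} pixels, {float32_gib:.2f} GiB as float32, {len(chunks)} chunks")
```

Each `(start, stop)` pair can be used to slice the flattened predictor table, so that the model only ever sees one chunk at a time.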


### **Convert Raster to Model Input Format**

In [8]:
import pandas as pd

# Reshape raster stack into a 2D table (num_pixels, num_predictors)
num_bands, height, width = stacked_array.shape
reshaped_array = stacked_array.reshape(num_bands, -1).T  # Transpose to (num_pixels, num_predictors)

# Convert to DataFrame
predictor_df = pd.DataFrame(reshaped_array, columns=[f"Predictor_{i+1}" for i in range(num_bands)])

# Check for missing values
print(f"✅ Predictor DataFrame Shape: {predictor_df.shape}")
print(f"🔍 Missing Values:\n{predictor_df.isnull().sum()}")

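# Drop rows with NaN values (if any).
# Note: dropping rows breaks the one-to-one mapping between rows and pixels,
# so predictions made on the cleaned table can no longer be reshaped
# to (height, width) directly.
predictor_df = predictor_df.dropna().reset_index(drop=True)
print(f"✅ Cleaned Predictor DataFrame Shape: {predictor_df.shape}")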


✅ Predictor DataFrame Shape: (38029299, 13)
🔍 Missing Values:
Predictor_1     0
Predictor_2     0
Predictor_3     0
Predictor_4     0
Predictor_5     0
Predictor_6     0
Predictor_7     0
Predictor_8     0
Predictor_9     0
Predictor_10    0
Predictor_11    0
Predictor_12    0
Predictor_13    0
dtype: int64
✅ Cleaned Predictor DataFrame Shape: (38029299, 13)
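
A count of zero missing values may simply mean that nodata pixels are stored as a sentinel number rather than as NaN, in which case `isnull()` cannot see them. The sketch below shows one way to handle this while keeping track of pixel positions: build a boolean mask of valid rows, predict on those rows only, and scatter the scores back onto the grid. The sentinel `-9999.0` is a hypothetical value (with rasterio, the actual value is available as `src.nodata`), and both helper functions are assumptions for illustration.

```python
import numpy as np

def valid_pixel_mask(table, nodata_values):
    """True for rows where no column is NaN or equal to its nodata sentinel."""
    invalid = np.isnan(table)
    for col, nodata in enumerate(nodata_values):
        if nodata is not None:
            invalid[:, col] |= table[:, col] == nodata
    return ~invalid.any(axis=1)

def scatter_scores(scores, mask, height, width, fill=np.nan):
    """Place scores for the valid pixels back on the (height, width) grid."""
    out = np.full(mask.shape[0], fill, dtype=np.float32)
    out[mask] = scores
    return out.reshape(height, width)

# Four pixels (a 2 x 2 raster) with two predictors each
table = np.array([
    [0.1, 5.0],
    [-9999.0, 4.0],   # nodata sentinel in the first predictor
    [0.3, np.nan],    # NaN in the second predictor
    [0.4, 2.0],
])

mask = valid_pixel_mask(table, nodata_values=[-9999.0, None])
scores = table[mask].sum(axis=1)  # stand-in for model predictions
grid = scatter_scores(scores, mask, height=2, width=2)
print(grid)
```

Unlike `dropna()`, the mask preserves the position of every pixel, so invalid pixels stay in place as NaN in the output grid.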


### **Apply Ensemble Model to Generate Suitability Scores**

In [10]:
import rasterio
import numpy as np
import joblib
import os
import pandas as pd

# Define file paths
stacked_raster_path = r"C:\GIS_Course\MScThesis-MaviSantarelli\data\Predictors\Stacked_Predictors.tif"
model_dir = r"C:\GIS_Course\MScThesis-MaviSantarelli\results\Models\Final_Ensemble_Model"
output_dir = r"C:\GIS_Course\MScThesis-MaviSantarelli\results\Suitability_Maps"
os.makedirs(output_dir, exist_ok=True)

# Load stacked raster
with rasterio.open(stacked_raster_path) as src:
    stacked_data = src.read()  # Shape: (num_predictors, height, width)
    meta = src.meta.copy()

# Iterate over species
for species in species_list:
    print(f"🔍 Generating suitability map for {species}...")

    # Load trained ensemble model
    model_path = os.path.join(model_dir, f"{species}_Final_Ensemble_Model.pkl")
    if not os.path.exists(model_path):
        print(f"⚠️ Missing model for {species}. Skipping.")
        continue

    ensemble_prediction = joblib.load(model_path)  # Load saved function

    # Reshape raster data to (num_pixels, num_predictors)
    num_predictors, height, width = stacked_data.shape
    flat_data = stacked_data.reshape(num_predictors, -1).T  # Shape: (num_pixels, num_predictors)

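    # Convert raster to DataFrame for prediction.
    # Column names follow the 1-based scheme used when the predictor table was built earlier;
    # they should match the feature names the model was trained on.
    predictor_df = pd.DataFrame(flat_data, columns=[f"Predictor_{i+1}" for i in range(num_predictors)])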

    # Apply ensemble model
    suitability_scores = ensemble_prediction(predictor_df)

    # Reshape back to raster dimensions
    suitability_map = suitability_scores.reshape(height, width)

    # Save suitability map
    meta.update(dtype=rasterio.float32, count=1)  # Update metadata
    output_file = os.path.join(output_dir, f"{species}_Suitability_Map.tif")

    with rasterio.open(output_file, "w", **meta) as dst:
        dst.write(suitability_map.astype(np.float32), 1)

    print(f"✅ Suitability map saved: {output_file}")

print("\n🚀 Suitability maps generated! Ready for visualization and connectivity analysis.")


RasterioIOError: C:/GIS_Course/MScThesis-MaviSantarelli/data/Predictors/Stacked_Predictors.tif: No such file or directory
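
The error arises because `Stacked_Predictors.tif` was never written to disk: the stacking cell builds `stacked_array` in memory only. A possible fix, sketched below, is to save the stack as a multi-band GeoTIFF before running the prediction cell. The helper names `stack_profile` and `write_stack` are hypothetical, and the example metadata dictionary is a made-up subset of what `src.meta` would contain.

```python
def stack_profile(reference_meta, band_count, dtype="float32"):
    """Copy single-band raster metadata and adapt it to a multi-band stack."""
    profile = dict(reference_meta)
    profile.update(count=band_count, dtype=dtype)
    return profile

def write_stack(path, stacked_array, reference_meta):
    """Write a (bands, height, width) array as a multi-band GeoTIFF."""
    # Imported here so that stack_profile can be used without rasterio installed
    import rasterio

    profile = stack_profile(
        reference_meta, stacked_array.shape[0], str(stacked_array.dtype)
    )
    with rasterio.open(path, "w", **profile) as dst:
        dst.write(stacked_array)  # a 3D array writes all bands at once

# Illustrative metadata only; in the notebook this would be the `meta`
# dictionary saved from the first predictor raster
example_meta = {"driver": "GTiff", "count": 1, "dtype": "float64",
                "height": 5971, "width": 6369}
profile = stack_profile(example_meta, band_count=13)
print(profile["count"], profile["dtype"])
```

In the notebook, calling `write_stack(stacked_raster_path, stacked_array, meta)` after the stacking cell would create the file that the prediction cell tries to open.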