*****
## WCS Layer Extract
*****
Author: Mackenzie Rock

Date: June 3, 2025

Goal: This Jupyter notebook establishes a foundation for extracting the relevant WCS layers and storing them in a format suitable for upload to the PostgreSQL database. In this section I will test:
- Which layers I can extract in GeoTIFF format (this will inform what I decide to extract)
- The extraction code
- A visualization of the data, to ensure I have captured it correctly
- Several samples over time
- Transformation into a suitable format for loading

### Extraction Code

To be extracted from WCS:
- Daily Severity Rating: public:dsr
- Drought Code: public:dc
- Fire Type: public:ft
- Precipitation: public:precip
- Initial Spread Index: public:isi
- Rate of Spread: public:ros
- Surface Fuel Consumption: public:sfc
- Wind Direction: public:wdir
- Wind Speed: public:ws

In [8]:
import requests
from bs4 import BeautifulSoup

url = "https://cwfis.cfs.nrcan.gc.ca/geoserver/public/wcs?service=WCS&request=GetCapabilities&version=1.0.0"
response = requests.get(url)
soup = BeautifulSoup(response.content, "lxml-xml")

entries = []
for cov in soup.find_all("wcs:CoverageOfferingBrief"):
    description = cov.find("wcs:description")
    name = cov.find("wcs:name")
    label = cov.find("wcs:label")

    if description and "geotiff" in description.text.lower():
        entries.append({
            "Name": name.text if name else "",
            "Label": label.text if label else "",
            "Description": description.text.strip()
        })

import pandas as pd
df = pd.DataFrame(entries)
display(df)

| | Name | Label | Description |
|---|---|---|---|
| 0 | public:cffdrs_fbp_fuel_types_100m | 100m CFFDRS Fire Behaviour Prediction (FBP) Fu... | Generated from GeoTIFF |
| 1 | public:bui_current | Buildup Index / Indice de combustible disponib... | Generated from GeoTIFF |
| 2 | public:cffdrs_fbp_fuel_types | CFFDRS Fire Behaviour Prediction (FBP) Fuel Ty... | Generated from GeoTIFF |
| 3 | public:NFDB_MRB | CNFDB - most recent burn 1980-2021 | Generated from GeoTIFF |
| 4 | public:FBP_FuelLayer_wBurnScars | Canadian Forest FBP Fuel Types (CanFG) 2019 | Generated from GeoTIFF |
| 5 | public:cfb_current | Crown Fraction Burned / Fraction consumée des ... | Generated from GeoTIFF |
| 6 | public:gu_current | Current green-up | Generated from GeoTIFF |
| 7 | public:pc_current | Current percentage of grass curing | Generated from GeoTIFF |
| 8 | public:dsr_current | Daily Severity Rating / Indice journalier de s... | Generated from GeoTIFF |
| 9 | public:dc_current | Drought Code / Indice de sécheresse - 2025-06-06 | Generated from GeoTIFF |


#### Layer Selection

I will utilize the following layers in my project (reason):
- Precipitation (rainfall, or the lack of it, and drought are main determinants of whether wildfires start and how they spread)
- Surface Fuel Consumption (the amount of surface fuel consumed gives a clear indication of the intensity of the fire)
    - Dropped following testing: the GetCoverage request returned a ServiceException ("Could not find coverage") instead of a GeoTIFF. Note that the requested ID, `public:sfc_ccurrent`, appears to contain a typo, so the layer itself may still be available.
- Current Wind Direction (shows the direction the fire is likely to take)
- Current Wind Speed (an indicator of the intensity with which the fire will spread)
- Temperature (higher temperatures are usually indicative of drier conditions and thus a higher likelihood of wildfire)
- Fire Type (the type of fire expected: surface fire, intermittent crown fire, or continuous crown fire)
- Initial Spread Index
- Current Daily Severity Rating (a measure of the danger posed by a wildfire)
- Current Drought Code (Another measure of how likely a fire is to spread or start)
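
Before requesting the selected layers, it may help to compare the intended coverage IDs against the names returned by GetCapabilities, so that a mistyped ID is caught early rather than surfacing as a ServiceException. The following is a minimal sketch; `check_coverage_ids`, the sample name list, and the deliberately misspelled ID are all hypothetical and only illustrate the idea. In the notebook, the available names would come from the `Name` column of `df`.

```python
from difflib import get_close_matches


def check_coverage_ids(requested, available):
    """Split requested coverage IDs into found ones and missing ones.

    For each missing ID, the closest available name (if any) is suggested.
    """
    available_set = set(available)
    found = [cid for cid in requested if cid in available_set]
    missing = {
        cid: get_close_matches(cid, available, n=1, cutoff=0.8)
        for cid in requested
        if cid not in available_set
    }
    return found, missing


# Hypothetical sample; replace with df["Name"].tolist() in the notebook
available_names = ["public:dsr_current", "public:dc_current", "public:bui_current"]
# The second ID is misspelled on purpose to show the suggestion
requested_ids = ["public:dsr_current", "public:dc_curent"]

found, missing = check_coverage_ids(requested_ids, available_names)
print("Found:", found)
print("Missing (with closest match):", missing)
```

An empty suggestion list would mean that no similar name is offered by the service, which is a different situation from a simple typo.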

In [2]:
import os
import requests
import rasterio
import matplotlib.pyplot as plt
import numpy as np

WCS_OUTPUT_DIR = "./wcs_layers"
WCS_MAP_OUTPUT_DIR = "./wcs_maps"
os.makedirs(WCS_OUTPUT_DIR, exist_ok=True)
os.makedirs(WCS_MAP_OUTPUT_DIR, exist_ok=True)

def fetch_and_visualize_wcs_layer(coverage_id: str, label: str):
    bbox = "-2378164,-707617,3039835,3854382"
    crs = "EPSG:3978"
    width, height = 768, 646
    format_ = 'geotiff'

    url = (
        "https://cwfis.cfs.nrcan.gc.ca/geoserver/public/wcs?"
        f"service=WCS&version=1.0.0&request=GetCoverage&"
        f"coverage={coverage_id}&"
        f"BBOX={bbox}&CRS={crs}&WIDTH={width}&HEIGHT={height}&FORMAT={format_}"
    )

    print(f"Requesting: {url}")
    response = requests.get(url, timeout=60)  # timeout avoids hanging indefinitely on a slow server
    if "image/tiff" not in response.headers.get("Content-Type", ""):
        print("Invalid response. Not a GeoTIFF:")
        print(response.content[:300].decode(errors="ignore"))
        return

    label_safe = label.replace(" ", "_").lower()
    tif_path = os.path.join(WCS_OUTPUT_DIR, f"{label_safe}.tif")
    with open(tif_path, "wb") as f:
        f.write(response.content)

    with rasterio.open(tif_path) as src:
        data = src.read(1)
        nodata = src.nodata if src.nodata is not None else -3.4028235e+38
        data = np.where(data == nodata, np.nan, data)

    plt.figure(figsize=(10, 6))
    plt.imshow(data, cmap="viridis")
    plt.colorbar(label="Value")
    plt.title(label)
    plt.axis("off")
    img_path = os.path.join(WCS_MAP_OUTPUT_DIR, f"{label_safe}.png")
    plt.savefig(img_path, bbox_inches="tight")
    plt.close()

    print(f"Saved GeoTIFF to {tif_path}")
    print(f"Saved raster visualization to {img_path}")

    return tif_path, img_path


layer_table = {
    'Precipitation': 'public:precip_current',
    'Surface Fuel Consumption': 'public:sfc_ccurrent',  # NOTE: 'sfc_ccurrent' looks like a typo; kept as run (see the error in the output)
    'Current Wind Direction': 'public:wdir_current',
    'Current Wind Speed': 'public:ws_current',
    'Temperature': 'public:temp_current',
    'Fire Type': 'public:ft_current',
    'Initial Spread Index': 'public:isi_current',
    'Current Daily Severity Rating': 'public:dsr_current',
    'Current Drought Code': 'public:dc_current'
}

# `label` is used instead of `type` to avoid shadowing the Python built-in
for label, code in layer_table.items():
    print(f'Processing GeoTiff for {label}')
    fetch_and_visualize_wcs_layer(code, label)


Processing GeoTiff for Precipitation
Requesting: https://cwfis.cfs.nrcan.gc.ca/geoserver/public/wcs?service=WCS&version=1.0.0&request=GetCoverage&coverage=public:precip_current&BBOX=-2378164,-707617,3039835,3854382&CRS=EPSG:3978&WIDTH=768&HEIGHT=646&FORMAT=geotiff
Saved GeoTIFF to ./wcs_layers/precipitation.tif
Saved raster visualization to ./wcs_maps/precipitation.png
Processing GeoTiff for Surface Fuel Consumption
Requesting: https://cwfis.cfs.nrcan.gc.ca/geoserver/public/wcs?service=WCS&version=1.0.0&request=GetCoverage&coverage=public:sfc_ccurrent&BBOX=-2378164,-707617,3039835,3854382&CRS=EPSG:3978&WIDTH=768&HEIGHT=646&FORMAT=geotiff
Invalid response. Not a GeoTIFF:
<?xml version="1.0" encoding="UTF-8"?><ServiceExceptionReport version="1.2.0" >   <ServiceException code="InvalidParameterValue" locator="coverage">
      Could not find coverage &apos;public:sfc_ccurrent&apos;
</ServiceException></ServiceExceptionReport>
Processing GeoTiff for Current Wind Direction
Requesting: https:/
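
The request URL above is assembled by hand with f-strings. As an alternative sketch, the same query string can be built from a dictionary with the standard library, which keeps each parameter on its own line and handles escaping. `build_getcoverage_url` is a hypothetical helper name; the default parameter values are copied from the function above.

```python
from urllib.parse import urlencode

BASE_URL = "https://cwfis.cfs.nrcan.gc.ca/geoserver/public/wcs"


def build_getcoverage_url(coverage_id,
                          bbox=(-2378164, -707617, 3039835, 3854382),
                          crs="EPSG:3978", width=768, height=646, fmt="geotiff"):
    """Build a WCS 1.0.0 GetCoverage URL from individual parameters."""
    params = {
        "service": "WCS",
        "version": "1.0.0",
        "request": "GetCoverage",
        "coverage": coverage_id,
        "BBOX": ",".join(str(v) for v in bbox),
        "CRS": crs,
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    # Keep ':' and ',' unescaped so the URL matches the hand-built form
    query = urlencode(params, safe=":,")
    return f"{BASE_URL}?{query}"


url = build_getcoverage_url("public:precip_current")
print(url)
```

With the defaults shown, this produces the same URL as the precipitation request logged above.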

In [13]:
import rasterio
import os

def inspect_all_tifs(folder_path="./wcs_layers"):
    # Collect the names of all .tif files in the folder
    tif_files = [f for f in os.listdir(folder_path) if f.endswith(".tif")]

    if not tif_files:
        print("No .tif files found in the directory.")
        return

    for filename in tif_files:
        # Join the folder path and file name to get the full file path
        file_path = os.path.join(folder_path, filename)
        print(f"\nFile: {filename}")

        #check the file attributes for each of the .tif files

        print(filename)

        try:
            with rasterio.open(file_path) as src:
                print(f"  - CRS: {src.crs}")
                print(f"  - Width x Height: {src.width} x {src.height}")
                print(f"  - Number of Bands: {src.count}")
                print(f"  - Data Type(s): {src.dtypes}")
                print(f"  - Bounds: {src.bounds}")
                print(f"  - NoData Value: {src.nodata}")
                print(f"  - Transform: {src.transform}")
        except Exception as e:
            print(f"Failed to read {filename}: {e}")


inspect_all_tifs()



File: current_wind_speed.tif
current_wind_speed.tif
  - CRS: EPSG:3978
  - Width x Height: 768 x 646
  - Number of Bands: 1
  - Data Type(s): ('float32',)
  - Bounds: BoundingBox(left=-2378164.0, bottom=-707617.0, right=3039835.0, top=3854382.0)
  - NoData Value: -3.4028234663852886e+38
  - Transform: | 7054.69, 0.00,-2378164.00|
| 0.00,-7061.92, 3854382.00|
| 0.00, 0.00, 1.00|

File: temperature.tif
temperature.tif
  - CRS: EPSG:3978
  - Width x Height: 768 x 646
  - Number of Bands: 1
  - Data Type(s): ('float32',)
  - Bounds: BoundingBox(left=-2378164.0, bottom=-707617.0, right=3039835.0, top=3854382.0)
  - NoData Value: -3.4028234663852886e+38
  - Transform: | 7054.69, 0.00,-2378164.00|
| 0.00,-7061.92, 3854382.00|
| 0.00, 0.00, 1.00|

File: current_drought_code.tif
current_drought_code.tif
  - CRS: EPSG:3978
  - Width x Height: 768 x 646
  - Number of Bands: 1
  - Data Type(s): ('float32',)
  - Bounds: BoundingBox(left=-2378164.0, bottom=-707617.0, right=3039835.0, top=3854382.0)
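
The pixel sizes in the Transform output follow directly from the requested bounding box and the WIDTH/HEIGHT parameters. The short sketch below recomputes them as a sanity check; `pixel_size` is a hypothetical helper, and the numbers are the Bounds and dimensions reported above.

```python
def pixel_size(left, bottom, right, top, width, height):
    """Return (x_res, y_res): the ground size of one pixel in CRS units."""
    x_res = (right - left) / width
    y_res = (top - bottom) / height
    return x_res, y_res


# Bounds and dimensions as reported by inspect_all_tifs()
x_res, y_res = pixel_size(-2378164.0, -707617.0, 3039835.0, 3854382.0, 768, 646)
print(f"{x_res:.2f} x {y_res:.2f}")
```

The two values differ slightly, so the pixels are not perfectly square. Choosing WIDTH and HEIGHT in the same ratio as the bounding box would be one way to avoid that.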

In [45]:
import os
import rasterio
import numpy as np
from numpy import float32
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
from datetime import date
import pandera as pa
from pandera.typing import DataFrame, Series
from pandera import Field, check_types
from pandera.typing.geopandas import GeoDataFrame
from typing import Dict, Type
import warnings

warnings.filterwarnings('ignore', module='pandera')

# File mappings from logical table name to .tif filename
TABLE_FILE_MAP = {
    "daily_severity_rating": "current_daily_severity_rating.tif",
    "drought_code": "current_drought_code.tif",
    "wind_direction": "current_wind_direction.tif",
    "wind_speed": "current_wind_speed.tif",
    "fire_type": "fire_type.tif",
    "initial_spread_index": "initial_spread_index.tif",
    "precipitation": "precipitation.tif",
    "temperature": "temperature.tif"
}

# === Schema and Validator === #
class RasterPointSchema(pa.DataFrameModel):
    lon: Series[float32]
    lat: Series[float32]
    value: Series[float32]
    acquisition_date: Series[date]

class RasterPointValidator:
    @staticmethod
    @check_types
    def validate(df: GeoDataFrame[RasterPointSchema]) -> GeoDataFrame[RasterPointSchema]:
        assert isinstance(df, gpd.GeoDataFrame), "Not a GeoDataFrame"
        return df

VALIDATOR_REGISTRY: Dict[str, Type[RasterPointValidator]] = {
    table: RasterPointValidator for table in TABLE_FILE_MAP.keys()
}

def validate_table(df, table_name):
    validator = VALIDATOR_REGISTRY.get(table_name)
    if validator is None:
        raise ValueError(f"No validator found for table {table_name}")
    return validator.validate(df)

# === Directory containing the .tif files === #
TIF_DIR = "./wcs_layers"


# === Function: Convert .tif to GeoDataFrame === #
def raster_to_geodf(file_path: str, acquisition_date: date) -> gpd.GeoDataFrame:
    with rasterio.open(file_path) as src:
        band = src.read(1)
        transform = src.transform
        nodata = src.nodata

        # Get all valid pixel indices
        rows, cols = np.where(band != nodata)
        if len(rows) == 0:
            raise ValueError("No valid data in raster.")

        # Get spatial coordinates for those pixels
        coords = rasterio.transform.xy(transform, rows, cols, offset='center')
        xs, ys = coords

        values = band[rows, cols]

        # Build GeoDataFrame
        gdf = gpd.GeoDataFrame({
            "value": values.astype(np.float32),
            "geometry": [Point(x, y) for x, y in zip(xs, ys)],
            "acquisition_date": acquisition_date
        }, geometry="geometry", crs=src.crs)


        # Reproject to EPSG:4326 for correct lat/lon
        gdf = gdf.to_crs(epsg=4326)

        # Extract lat/lon from reprojected geometry
        gdf["lon"] = gdf.geometry.x.astype(np.float32)
        gdf["lat"] = gdf.geometry.y.astype(np.float32)

        return gdf
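
To make the coordinate step above more concrete, the sketch below computes pixel-centre coordinates by hand for a north-up affine transform (no rotation terms, as in the Transform output shown earlier). It is meant only to illustrate what the `offset='center'` lookup amounts to, not to replace the rasterio call. `pixel_centers` is a hypothetical helper, and the coefficients are derived from the bounds and dimensions reported for the downloaded files.

```python
import numpy as np


def pixel_centers(rows, cols, a, c, e, f):
    """Map row/col indices to pixel-centre coordinates.

    Assumes a north-up transform:
        x = c + a * (col + 0.5)
        y = f + e * (row + 0.5)
    where a is the pixel width, e the (negative) pixel height,
    and (c, f) the top-left corner of the raster.
    """
    rows = np.asarray(rows, dtype=float)
    cols = np.asarray(cols, dtype=float)
    xs = c + a * (cols + 0.5)
    ys = f + e * (rows + 0.5)
    return xs, ys


# Coefficients derived from the reported bounds (768 x 646 pixels)
a = (3039835.0 - (-2378164.0)) / 768
e = -(3854382.0 - (-707617.0)) / 646

# Top-left pixel and bottom-right pixel
xs, ys = pixel_centers([0, 645], [0, 767], a=a, c=-2378164.0, e=e, f=3854382.0)
print(xs, ys)
```

Each centre sits half a pixel inside the raster bounds, which is why the first point is not exactly on the bounding-box corner.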




In [46]:
# === Main Preprocessing Loop === #
prepared_dfs = {}

for table_name, filename in TABLE_FILE_MAP.items():
    file_path = os.path.join(TIF_DIR, filename)
    if not os.path.exists(file_path):
        print(f"File not found: {file_path}")
        continue

    try:
        gdf = raster_to_geodf(file_path, acquisition_date=date.today())
        validated_gdf = validate_table(gdf, table_name)
        prepared_dfs[table_name] = validated_gdf
        print(f"Processed and validated: {table_name}")
    except Exception as e:
        # Report the failure and continue with the remaining layers
        print(f"Error processing {table_name}: {e}")

# All validated GeoDataFrames are now in `prepared_dfs` keyed by table name

Processed and validated: daily_severity_rating
Processed and validated: drought_code
Processed and validated: wind_direction
Processed and validated: wind_speed
Processed and validated: fire_type
Processed and validated: initial_spread_index
Processed and validated: precipitation
Processed and validated: temperature


In [47]:
display(prepared_dfs['daily_severity_rating'].head())

| | value | geometry | acquisition_date | lon | lat |
|---|---|---|---|---|---|
| 0 | 1.5e-05 | POINT (-140.84591 68.0185) | 2025-06-08 | -140.845901 | 68.018501 |
| 1 | 4e-06 | POINT (-140.8616 67.92657) | 2025-06-08 | -140.861603 | 67.926575 |
| 2 | 6e-06 | POINT (-140.73136 67.9695) | 2025-06-08 | -140.731354 | 67.969505 |
| 3 | 9e-06 | POINT (-140.60065 68.01233) | 2025-06-08 | -140.600647 | 68.012329 |
| 4 | 1.5e-05 | POINT (-140.46947 68.05505) | 2025-06-08 | -140.469467 | 68.055046 |


In [50]:
from sqlalchemy import create_engine
from geoalchemy2 import Geometry

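import os

# Read the connection string from the environment instead of hard-coding credentials.
# DATABASE_URL is an assumed variable name, e.g. "postgresql://user:password@host:5432/dbname"
engine = create_engine(os.environ["DATABASE_URL"])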

# --- Loop through and write each table ---
for table_name, gdf in prepared_dfs.items():
    try:
        gdf.to_postgis(
            name=table_name,
            con=engine,
            if_exists='replace',       # Use 'replace' to overwrite
            index=False,
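            dtype={
                # Key should match the geometry column name, which is "geometry" here (there is no "geom" column)
                "geometry": Geometry(geometry_type="POINT", srid=4326)
            }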
        )
        print(f"Inserted: {table_name} ({len(gdf)} rows)")
    except Exception as e:
        print(f"Failed to insert {table_name}: {e}")

Inserted: daily_severity_rating (136883 rows)
Inserted: drought_code (136883 rows)
Inserted: wind_direction (190629 rows)
Inserted: wind_speed (190629 rows)
Inserted: fire_type (136865 rows)
Inserted: initial_spread_index (136883 rows)
Inserted: precipitation (190629 rows)
Inserted: temperature (190564 rows)
