North Ross created this notebook to measure how much each input layer overlaps the HV1 protection and development zones, as part of the BRFN analysis work.

Some of the data was pre-processed in QGIS and saved to a temp folder for use in this analysis, while most records came from the SharePoint site (accessed through a local OneDrive link to SharePoint).

In summary, the script:

1. Lists all the input layers in a list of dictionaries, along with relevant information such as definition queries and the fields to summarize by.
2. Loops through the list and for each:
    - Reads the file
    - Intersects with the HV1 zones
    - Creates a pivot table reporting the area of overlap (or the sum of another specified field). It can also group the input by a certain field. For example, the recruitment forest is split by recruitment class (1 - 5).
    - Transposes this table and appends it to the previous one

3. Saves the final table as an Excel file
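
The pivot, transpose, and append steps can be sketched on a toy table with pandas alone. The area names, class values, and layer labels below are invented for illustration; in the notebook itself the `intersected` table comes from a spatial overlay rather than being typed in by hand.

```python
import pandas as pd

# Hypothetical stand-in for the result of intersecting a layer with the HV1 zones
intersected = pd.DataFrame({
    "Name": ["Area A", "Area A", "Area B", "Area B"],
    "Zone": ["Development", "Protection", "Development", "Protection"],
    "Rec_Cat": [1, 2, 1, 2],
    "AreaHa": [10.0, 20.0, 30.0, 40.0],
})

# Ungrouped: one total per (Name, Zone), renamed to the layer name
total = pd.pivot_table(intersected, values="AreaHa", index=["Name", "Zone"], aggfunc="sum")
total = total.rename(columns={"AreaHa": "Toy layer"}).transpose()

# Grouped: one column per class, prefixed with the layer name
by_class = pd.pivot_table(
    intersected, values="AreaHa", index=["Name", "Zone"],
    columns="Rec_Cat", aggfunc="sum", fill_value=0,
)
by_class = by_class.add_prefix("Toy class ").transpose()

# Append the per-layer tables into one summary (one row per layer or class)
summary = pd.concat([total, by_class])
print(summary)
```

After the transpose, each layer (or class) is a row and each zone is a column, which is the layout of the final table.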

In [None]:
import geopandas as gpd
import pandas as pd

In [None]:
# hv1 zones:
hv1_zones = r'\\<path>\DraftDevelopmentZones_Merge_2025-03-27.shp'

# list data sources - shapefile, geopackage or geodatabase
# for .gpkg and .gdb you need to specify a layer keyword as in examples
input_list = [
    {
        'name': 'Overlap with top 20% WV map',
        'path': r'\\<path>\2025-25-01_WMBWeightedValues_Top20_40.gdb',
        # place keyword arguments for the gpd.read_file() function in a nested dictionary to easily pass them to the function
        'kw': {'layer': 'WeightedValuesMap_Top20'}
        },
    {
        'name': 'Overlap with top 40% WV map',
        'path': r'\\<path>\2025-25-01_WMBWeightedValues_Top20_40.gdb',
        'kw': {'layer': 'WeightedValuesMap_Top40'}
        },
    {
        'name': 'Recruitment Class',
        'path': r'\\<path>\Recruitment_Class_Dissolved.gpkg', 
        'kw': {'layer': 'RecruitmentDissolve'},
        'groupField': 'Rec_Cat'
        },
    {
        'name': 'Broad Contiguous Habitat Top',
        'path': r'\\<path>\ContiguousHabitat_2024-12-17_NR.gdb',
        'kw': {'layer': 'Contiguous_Broad_Threshold_byWMB_noPrivate', 
                'where': "threshold is not null"},
        'groupField': 'threshold'
        },
    {
        'name': 'Refined Contiguous Habitat Top',
        'path': r'\\<path>\ContiguousHabitat_2024-12-17_NR.gdb', 
        'kw': {'layer':'Contiguous_Refined_Threshold_NoPrivate', 
                'where':"threshold is not null"},
        'groupField': 'Threshold'
        },
    {
        'name': "Connectivity",
        'path': r'\\<path>\Connectivity_60m.shp'
        },
    {
        'name': "Private Land",
        'path': r'\\<path>\iflb_own_studyArea.shp', 
        'kw': {'where':"OWNERSHIP_ = 'Private'"}
        },
    {
        'name': "Moose Class",
        'path': r'\\<path>\MooseClassesPolygonDissolve.gpkg',
        'groupField': 'MooseClass'
        },
    {
        'name': 'Caribou Class 1',
        'path': r'\\<path>\Wildlife_MtnNorthCaribou_Hex.gdb', 
        'kw': {'layer': 'Binary_CaribouMtnNorth_Class1_2024_12_06', 
                'where': "gridcode > 0"}
        },
    {
        'name': 'Caribou Class 2',
        'path': r'\\<path>\Wildlife_MtnNorthCaribou_Hex.gdb', 
        'kw': {'layer': 'Binary_CaribouMtnNorth_Class2_2024_12_06', 
                'where': "gridcode > 0"}
        },
    {
        'name': "Riparian (not erased)",
        'path': r"\\<path>\RiparianLayer_Ranked_2024-12-06.shp"
        },
    {
        'name': "Microrefugia",
        'path': r'\\<path>\MicrorefugiaHex_2024_12_17.gdb', 
        'kw': {'layer': 'PU1ha_Microrefugia_2024_12_17', 
                'where': "microrefugProp > 0"},
        'sumField': 'microrefugProp'
        },
    {
        'name': "Headwaters Tier 1",
        'path': r'\\<path>\Headwater_Tier1_2024_12_17_1.shp'
        }
]
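
The optional `kw` entry works because Python unpacks a dictionary into keyword arguments with `**`. A minimal sketch of that mechanism, using a hypothetical `fake_read_file` stand-in (not `gpd.read_file`) and made-up paths and layer names:

```python
def fake_read_file(path, **kwargs):
    """Hypothetical stand-in for gpd.read_file: just reports what it received."""
    return {"path": path, **kwargs}

# Made-up entries mirroring the structure of input_list
entries = [
    {"name": "With layer and filter", "path": "example.gdb",
     "kw": {"layer": "SomeLayer", "where": "value > 0"}},
    {"name": "Plain shapefile", "path": "example.shp"},
]

# .get("kw", {}) falls back to an empty dict, so both cases share one call
calls = [fake_read_file(e["path"], **e.get("kw", {})) for e in entries]
for call in calls:
    print(call)
```

Entries without a `kw` key are read with the path alone, so simple shapefiles need no extra configuration.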

In [None]:
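# read HV1 Zones for relevant areas
# read_file does not reproject, so convert to BC Albers (EPSG:3005) explicitly; the area maths below assumes metres
hv1 = gpd.read_file(hv1_zones, where='Label_Num IN (3, 1)').to_crs(3005)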

# summarize vector data
def summarizeVectorArea(data):
    
    # read data, passing any keyword arguments ('layer', 'where', etc.) straight to gpd.read_file()
    gdf = gpd.read_file(data['path'], **data.get('kw', {}))
    # reproject to the HV1 CRS (when both are defined) so the overlay compares like with like
    if gdf.crs is not None and hv1.crs is not None:
        gdf = gdf.to_crs(hv1.crs)
    
    # intersect with HV1 zones and get area
    intersected = gpd.overlay(hv1, gdf, how="intersection", keep_geom_type=True)
    intersected['AreaHa'] = intersected.geometry.area/10000
    
    # if a sum field is indicated, sum this field instead of the AreaHa field (for ones based on hexagons)
    if data.get('sumField'):
        sumField = data.get('sumField')
        
        output = pd.pivot_table(
            data = intersected,
            values = sumField,
            index=['Name', 'Zone'],
            dropna=False, fill_value=0,
            aggfunc='sum'
            )
    
    # if a group field is indicated, separate columns into the group field rather than reporting just one value
    # for example the Recruitment Forest data should be reported by class (1-5)
    # this adds five records to the output instead of one
    elif data.get('groupField'):
        group = data.get('groupField')
        
        output = pd.pivot_table(
            data = intersected,
            values = 'AreaHa',
            index=['Name', 'Zone'],
            dropna=False, fill_value=0,
            columns=group,
            aggfunc='sum'
            )
        # add a prefix to the group titles (the name of the layer) for legibility
        output = output.add_prefix(f"{data['name']} ")
        
    else: # else just report one AreaHa for the overlap with the entire polygon.
        output = pd.pivot_table(
            data = intersected,
            values = 'AreaHa',
            index=['Name', 'Zone'],
            dropna=False, fill_value=0,
            aggfunc='sum'
            )
    
    # rename the value column (AreaHa, or the sum field when one is used) to the feature name
    output = output.rename(columns={'AreaHa': data['name'], data.get('sumField'): data['name']})
    output = output.transpose() # transpose so the index becomes columns
    return output
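
The intersect-and-measure step can be checked on two toy rectangles. The geometries and attribute values below are invented; the sketch assumes a projected CRS in metres (EPSG:3005, BC Albers, is used here), since dividing the area by 10,000 only gives hectares in that case.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy zone: a 200 m x 100 m rectangle
zones = gpd.GeoDataFrame(
    {"Name": ["Toy area"], "Zone": ["Protection"]},
    geometry=[box(0, 0, 200, 100)], crs="EPSG:3005",
)
# Toy input layer: overlaps the right-hand half of the zone
layer = gpd.GeoDataFrame(
    {"value": [1]},
    geometry=[box(100, 0, 300, 100)], crs="EPSG:3005",
)

# Keep only the shared part, carrying attributes from both inputs
intersected = gpd.overlay(zones, layer, how="intersection", keep_geom_type=True)

# Square metres to hectares
intersected["AreaHa"] = intersected.geometry.area / 10_000
print(intersected[["Name", "Zone", "value", "AreaHa"]])
```

The shared part is a 100 m x 100 m square, so the single output row should report 1 ha.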

In [None]:
# read all inputs one by one, create each table, and append all tables together
final_df = pd.concat([summarizeVectorArea(lyr) for lyr in input_list])


In [None]:
import numpy as np

# save the real table to Excel
final_df.to_excel('path\\to\\output.xlsx')

# Here is an example of the output, but with the numbers "scrambled" for confidentiality.
# apply() returns a new table, so final_df and the saved file keep the real values.
final_df.apply(lambda row: row * np.random.uniform(-10, 10, size=len(row)), axis=1)

Name,North Aitken,North Aitken,Wolf/Davis Trapping 1,Wolf/Davis Trapping 1
Zone,Development,Protection,Development,Protection
Overlap with top 20% WV map,-3217.773275,-21268.480544,725.213542,-17449.298566
Overlap with top 40% WV map,-784.040208,-6466.121917,6794.369924,31169.320395
Recruitment Class 1.0,-0.0,29.266562,0.0,-302.958085
Recruitment Class 2.0,-565.979085,-22495.364116,-456.093387,8862.963328
Recruitment Class 3.0,-2149.72021,-2016.902439,560.200466,-13940.957509
Recruitment Class 4.0,-1017.127808,-4529.968107,-1951.118037,-3242.536161
Recruitment Class 5.0,-0.000538,9.403993,0.0,39.876374
Broad Contiguous Habitat Top 25,-811.743133,3282.430965,1495.189814,-30297.381245
Broad Contiguous Habitat Top 50,77.544569,-44.458595,-716.295276,9554.472228
Refined Contiguous Habitat Top 25,-29.784211,2944.706664,207.858859,-2943.081597
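
`DataFrame.apply` returns a new table rather than modifying the original, so a scrambled copy can be shown without touching the real values. A small sketch with invented numbers; the seeded generator is an addition here, used only so the example is repeatable:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed is arbitrary, for repeatability only

real = pd.DataFrame(
    {"Development": [10.0, 20.0], "Protection": [30.0, 40.0]},
    index=["Layer 1", "Layer 2"],
)

# Multiply every cell by its own random factor between -10 and 10
scrambled = real.apply(lambda row: row * rng.uniform(-10, 10, size=len(row)), axis=1)

print(scrambled)  # randomised copy
print(real)       # original values are untouched
```

Because the factors can be negative, scrambled areas may come out below zero, as in the sample output above.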
