# Appendix

## Appendix 1

In the analysis, we defined a Site of Biological Importance (SBI) as highly at risk if 15 or more development sites were located within 1 kilometre of the SBI, with distances measured between site centroids. The following table shows how the total number of SBIs categorised as highly at risk changes if we modify:
    
- The number of development sites that must be located within a given distance of the SBI
- The maximum distance from the SBI where a development site is counted
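
Before varying these two thresholds, the baseline rule itself can be illustrated on toy coordinates. This is a minimal, dependency-free sketch that assumes planar coordinates in metres; the function names `count_sites_within` and `is_highly_at_risk` and the sample points are hypothetical and are not part of the analysis code.

```python
import math


def count_sites_within(sbi_xy, site_xys, radius):
    """Count sites whose centroid lies strictly closer than `radius` to the SBI centroid."""
    return sum(1 for xy in site_xys if math.dist(sbi_xy, xy) < radius)


def is_highly_at_risk(sbi_xy, site_xys, radius=1000, min_sites=15):
    """Apply the rule: at least `min_sites` development sites within `radius`."""
    return count_sites_within(sbi_xy, site_xys, radius) >= min_sites


# Hypothetical coordinates (metres): 15 sites spaced 50 m apart, plus one far away
sbi = (0.0, 0.0)
sites = [(50.0 * i, 0.0) for i in range(1, 16)] + [(5000.0, 0.0)]

print(count_sites_within(sbi, sites, 1000))       # 15
print(is_highly_at_risk(sbi, sites))              # True
print(is_highly_at_risk(sbi, sites, radius=500))  # False: only 9 sites are closer than 500 m
```

Shrinking the radius or raising the minimum count can only remove SBIs from the highly at risk category, which is the pattern the sensitivity table is meant to quantify.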

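The code below recomputes the number of highly at risk SBIs for each combination of thresholds, and the results are tabulated after it: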

import geopandas as gpd
import pandas as pd

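# Wrap the analysis in a function so it can be re-run with different thresholds
def produce_analysis(dist, grading):
    """Return the number of SBIs with at least `grading` development sites
    whose centroids lie closer than `dist` to the SBI centroid."""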
    # Load and join GMCA housing, industrial and office supply data
    housing_supply_gdf = gpd.read_file("data/gmca_data/2024 GM Housing Land Supply GIS.shp")
    industrial_supply_gdf = gpd.read_file("data/gmca_data/2024 GM Industrial-warehousing Land Supply GIS.shp")
    offices_supply_gdf = gpd.read_file("data/gmca_data/2024 GM Offices Land Supply GIS.shp")
    
    total_supply_gdf = pd.concat(
        [housing_supply_gdf, industrial_supply_gdf, offices_supply_gdf]
    )
    
    # Load and tidy GMEU Sites of Biological Importance data
    sbi_gdf = gpd.read_file("data/gmeu_data/gm_sbi.shp")
    sbi_gdf["Category"] = "Site of Biological Importance"
    sbi_gdf = sbi_gdf.rename(columns = {"district": "LAName", "site_nam": "SiteRef"})
    
    # Join GMCA and GMEU data
    full_data_gdf = pd.concat(
        [total_supply_gdf, sbi_gdf[["SiteRef", "LAName", "Category", "geometry"]]]
    )
    
    # Use geopandas to get the centroid of every site
    full_data_gdf["centroid"] = full_data_gdf.centroid
    full_data_gdf["ref"] = range(len(full_data_gdf))
    
    # Split into SBIs and development sites (everything that is not an SBI)
    sbi = full_data_gdf[full_data_gdf["Category"] == "Site of Biological Importance"]
    non_sbi = full_data_gdf[full_data_gdf["Category"] != "Site of Biological Importance"]
    
    # Count, for each SBI, the development sites whose centroid lies closer
    # than `dist` to the SBI centroid. Distances are in the units of the
    # layers' coordinate reference system (assumed here to be metres).
    sbinames = list(sbi["SiteRef"])
    indexes = list(sbi["ref"])

    sites_within_dist = []
    for sbi_centroid in sbi["centroid"]:
        # Distances from this SBI centroid to every development-site centroid
        site_distances = non_sbi["centroid"].distance(sbi_centroid)
        sites_within_dist.append(int((site_distances < dist).sum()))

    # One row per SBI with its count of nearby development sites
    nearby_counts = pd.DataFrame(
        {"SiteRef": sbinames, "No. Sites within dist": sites_within_dist, "ref": indexes}
    )

    # Keep only the SBIs with at least `grading` development sites nearby
    nearby_counts = nearby_counts[nearby_counts["No. Sites within dist"] >= grading]

    return len(nearby_counts)
    

# Set levels for the distance cutoff and the grading criteria
heuristics_dist = [250, 500, 750, 1000, 1250, 1500]
heuristics_grade = [5, 10, 15, 20]

table = []

# Loop through the distance and grading criteria levels
for d in heuristics_dist:
    table_row = []
    for g in heuristics_grade:
        table_row.append(produce_analysis(d, g))
    table.append(table_row)

table

[[0, 0, 0, 0],
 [31, 5, 1, 0],
 [149, 29, 8, 4],
 [289, 129, 45, 19],
 [393, 258, 153, 80],
 [451, 361, 279, 172]]
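
The centroid-to-centroid distances do not depend on either threshold, so in principle they only need to be computed once, whereas `produce_analysis` reloads the shapefiles and recomputes the distances for each of the 24 combinations. The sketch below shows the idea in plain Python on hypothetical toy coordinates; `sweep_thresholds` is an illustrative name, and a real version would take its points from the GeoDataFrame centroids rather than from tuples.

```python
import math


def sweep_thresholds(sbi_xys, site_xys, radii, min_site_counts):
    """Return a table (rows: radii, columns: site thresholds) of flagged SBI counts."""
    # Distances from every SBI to every development site, computed once and reused
    all_distances = [[math.dist(s, d) for d in site_xys] for s in sbi_xys]
    table = []
    for radius in radii:
        # Number of development sites closer than this radius, per SBI
        nearby = [sum(dist < radius for dist in row) for row in all_distances]
        # Number of SBIs meeting each minimum-site threshold
        table.append([sum(n >= k for n in nearby) for k in min_site_counts])
    return table


# Hypothetical toy data: two SBIs and four development sites (metres)
sbis = [(0.0, 0.0), (2000.0, 0.0)]
sites = [(100.0, 0.0), (200.0, 0.0), (300.0, 0.0), (2100.0, 0.0)]

result = sweep_thresholds(sbis, sites, radii=[250, 500], min_site_counts=[1, 3])
print(result)  # [[2, 0], [2, 1]]
```

The output has the same layout as the notebook's `table`: one row per distance cutoff and one column per grading threshold.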

| Distance cutoff | ≥ 5 sites | ≥ 10 sites | ≥ 15 sites | ≥ 20 sites |
|-------|-----|-----|-----|-----|
| 250m  | 0   | 0   | 0   | 0   |
| 500m  | 31  | 5   | 1   | 0   |
| 750m  | 149 | 29  | 8   | 4   |
| 1000m | 289 | 129 | 45  | 19  |
| 1250m | 393 | 258 | 153 | 80  |
| 1500m | 451 | 361 | 279 | 172 |
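
For reference, the nested list printed by the notebook can be converted into a Markdown table like the one above with a few lines of Python. This is a minimal sketch that reuses the values printed above; the helper name `to_markdown_table` is hypothetical.

```python
def to_markdown_table(rows, row_labels, col_labels):
    """Render a list of rows as a Markdown table with labelled rows and columns."""
    header = "| | " + " | ".join(col_labels) + " |"
    separator = "|" + "---|" * (len(col_labels) + 1)
    body = [
        "| " + label + " | " + " | ".join(str(value) for value in row) + " |"
        for label, row in zip(row_labels, rows)
    ]
    return "\n".join([header, separator] + body)


# Values copied from the notebook output above
table = [
    [0, 0, 0, 0],
    [31, 5, 1, 0],
    [149, 29, 8, 4],
    [289, 129, 45, 19],
    [393, 258, 153, 80],
    [451, 361, 279, 172],
]
heuristics_dist = [250, 500, 750, 1000, 1250, 1500]
heuristics_grade = [5, 10, 15, 20]

markdown = to_markdown_table(
    table,
    [f"{d}m" for d in heuristics_dist],
    [f"{g} sites" for g in heuristics_grade],
)
print(markdown)
```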

As expected, making the categorisation more stringent (requiring more development sites, or requiring them to be closer to the SBI) reduces the number of SBIs categorised as highly at risk. Interestingly, with a distance cutoff of 250 metres, no SBIs are categorised as highly at risk, even when only 5 sites are required within that distance. The table suggests that the results from our chosen heuristic (15 sites located within 1000 metres) are more robust to changes in the number of sites required than to changes in the radius. Holding the radius at 1000 metres and moving the site threshold one step, to 10 or 20 sites, changes the count from 45 to 129 or 19. Holding the threshold at 15 sites and moving the radius one step, to 750 or 1250 metres, changes it to 8 or 153.
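
To make the comparison concrete, one can read off how the baseline count changes when a single threshold moves by one step while the other is held fixed. The following is a small sketch using the table values above; the variable names are illustrative only.

```python
# Values copied from the table above (rows: distance cutoff, columns: site threshold)
table = [
    [0, 0, 0, 0],
    [31, 5, 1, 0],
    [149, 29, 8, 4],
    [289, 129, 45, 19],
    [393, 258, 153, 80],
    [451, 361, 279, 172],
]
heuristics_dist = [250, 500, 750, 1000, 1250, 1500]
heuristics_grade = [5, 10, 15, 20]

# Locate the chosen heuristic: 15 sites within 1000 metres
i = heuristics_dist.index(1000)
j = heuristics_grade.index(15)
baseline = table[i][j]

# Neighbouring cells: vary one threshold by one step, hold the other fixed
vary_sites = {heuristics_grade[k]: table[i][k] for k in (j - 1, j + 1)}
vary_radius = {heuristics_dist[k]: table[k][j] for k in (i - 1, i + 1)}

print(baseline)     # 45
print(vary_sites)   # {10: 129, 20: 19}
print(vary_radius)  # {750: 8, 1250: 153}
```

A one-step change in the site threshold moves the count by a factor of roughly 2 to 3 in either direction, while a one-step change in the radius moves it by a factor of roughly 3 to 6.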