# Calculating park entrances

Using the calculated park geojson file (produced by the [Processing parks data notebook](01_Processing_parks_data.ipynb)), this script calculates park entrances, producing a more extensive/true-to-life dataset than the entrances provided with osm data.

- In general, OSM data includes in-park gates etc., which are not as useful for our visibility analyses.
- OSM data is missing some boundary entrances because they are not gated etc.

Note that the park geojson file has some parks just marked as single points due to data sourcing issues.

The following algorithm and code was written by Maeve Murphy Quinlan.

In [2]:
from shapely import Point, LineString, remove_repeated_points, unary_union
import numpy as np
import matplotlib.pyplot as plt
import osmnx as ox
import geopandas as gpd
import pandas as pd
import json
from pathlib import Path

## Load in datasets

Load in the data set of parks produced in the last notebook.

Also set up save paths for result files.

In [None]:
# Read in the park geometries data set

# New calculation with improved parks dataset
parks_path = "../data/results/testing/bradford_all_parks_comprehensive.geojson"
output_path_1 = "../data/results/testing/calculated_park_entrances.geojson"
output_path_2 = "../data/results/testing/merged_park_entrances.geojson"

parks_gdf = gpd.read_file(parks_path)

In [4]:
parks_gdf.explore()

## Check CRS

Check coordinate reference system to ensure that it is compatible with other datasets/analysis.

In [5]:
# Check crs
parks_gdf.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [6]:
print(f"Loaded {len(parks_gdf)} parks from GeoJSON")
print("Park names:", parks_gdf['Park Name'].tolist() if 'Park Name' in parks_gdf.columns else "No names available")

Loaded 71 parks from GeoJSON
Park names: ['Russell Hall Park', 'Greenwood Park', 'Grange Park', 'Foxhill Park', 'Knowles Park', 'Seymour Street Recreation Ground', 'Eccleshill Park', 'Cross Roads Park', 'Wibsey Park', 'Prince of Wales Park', 'Peel Park', 'Lund Park', 'Littlemoor Park', 'Ladyhill Park', 'Horton Park', 'Foster Park', 'Devonshire Park', 'Crowgill Park', 'Cross Roads Park', 'Brackenhill Park', 'Lister Park', 'Bradford Moor Park', 'Parkside Park', 'Idle Recreation Ground', 'Judy Woods', 'Griff Wood', 'Ferniehurst Dell', 'Silsden Park', 'Shipley Central Park', 'Roberts Park', 'Myrtle Park', 'Harold Park', 'Victoria Park, Oakenshaw', 'West Park, Girlington', 'Crabtree Ghyll', 'Riverside Gardens, Ilkley', 'Cliffe Castle Museum & Park', 'Central Park, Haworth', 'Middleton Woods', 'Northcliffe Park', 'Victoria Park, Keighley', 'Victoria Park, Clayton', 'Menston Park', 'Holden Park (Oakworth Park)', 'Hirst Woods', 'Shipley Glen', 'St. Ives Country Park', 'Ben Rhydding Gravel Pits

## Analyze parks to extract entrances

In [None]:
def extract_park_entrances(park_polygon, buffer_distance=0.001):
    """
    Extract entrance points for a park polygon using network intersection analysis.
    
    Parameters:
    - park_polygon: Shapely polygon representing the park boundary
    - buffer_distance: Buffer distance for network extraction (default: 0.001 degrees)
    
    Returns:
    - entrance_points: List of raw entrance points (duplicated)
    - merged_points: Deduplicated entrance points as MultiPoint geometry
    - G: OSMnx graph for visualization
    """
    
    # Create buffered polygon for network extraction
    network_buffer = park_polygon.buffer(buffer_distance)
    
    try:
        # Extract walking network around the park
        # G = ox.graph_from_polygon(network_buffer, network_type='walk') # we may want to update this to also include bike paths
        G = ox.graph_from_polygon(network_buffer, network_type='all')
    except Exception as e:
        print(f"Error extracting network: {e}")
        return [], None, None
    
    # Find entrance points where edges cross the park boundary
    entrance_points = []
    for u, v in G.edges():
        u_pt = Point(G.nodes[u]['x'], G.nodes[u]['y'])
        v_pt = Point(G.nodes[v]['x'], G.nodes[v]['y'])
        
        # Check if edge crosses park boundary
        if park_polygon.contains(u_pt) != park_polygon.contains(v_pt):
            # Edge crosses boundary, interpolate intersection
            line = LineString([u_pt, v_pt])
            intersection = line.intersection(park_polygon.boundary)
            
            if not intersection.is_empty:
                if intersection.geom_type == 'Point':
                    entrance_points.append(intersection)
                elif intersection.geom_type == 'MultiPoint':
                    entrance_points.extend(intersection.geoms)
    
    # Deduplicate entrance points
    if entrance_points:
        merged_points = unary_union(entrance_points)
    else:
        merged_points = None
    
    return entrance_points, merged_points, G

In [None]:
# Initialize results storage
results = []
failed_parks = []

# Process each park
for idx, park_row in parks_gdf.iterrows():
    park_polygon = park_row.geometry
    park_name = park_row.get('Park Name')
    park_id = park_row.get('park_id', f'park_{idx}')
    
    print(f"\nProcessing {park_name}...")
    
    # Extract entrances
    entrance_points, merged_points, G = extract_park_entrances(park_polygon)
    
    if merged_points is not None and hasattr(merged_points, 'geoms'):
        num_entrances = len(merged_points.geoms)
        print(f"  Found {len(entrance_points)} raw entrance nodes")
        print(f"  Deduplicated to {num_entrances} unique entrances")
        
        # Store results
        for i, point in enumerate(merged_points.geoms):
            results.append({
                'park_name': park_name,
                'park_id': park_id,
                'entrance_id': f"{park_id}_entrance_{i+1}",
                'longitude': point.x,
                'latitude': point.y,
                'geometry': point
            })
    else:
        print(f"  No entrances found or error occurred")
        failed_parks.append(park_name)

print(f"\n=== Processing Complete ===")
print(f"Successfully processed: {len(parks_gdf) - len(failed_parks)} parks")
print(f"Failed: {len(failed_parks)} parks")
if failed_parks:
    print(f"Failed parks: {failed_parks}")


Processing Russell Hall Park...
  Found 6 raw entrance nodes
  Deduplicated to 3 unique entrances

Processing Greenwood Park...
  Found 6 raw entrance nodes
  Deduplicated to 3 unique entrances

Processing Grange Park...
  Found 34 raw entrance nodes
  Deduplicated to 17 unique entrances

Processing Foxhill Park...
  Found 4 raw entrance nodes
  Deduplicated to 2 unique entrances

Processing Knowles Park...
  Found 14 raw entrance nodes
  Deduplicated to 7 unique entrances

Processing Seymour Street Recreation Ground...
  Found 6 raw entrance nodes
  Deduplicated to 3 unique entrances

Processing Eccleshill Park...
  Found 10 raw entrance nodes
  Deduplicated to 5 unique entrances

Processing Cross Roads Park...
  Found 4 raw entrance nodes
  Deduplicated to 2 unique entrances

Processing Wibsey Park...
  Found 12 raw entrance nodes
  Deduplicated to 6 unique entrances

Processing Prince of Wales Park...
  Found 14 raw entrance nodes
  Deduplicated to 7 unique entrances

Processing Pe

In [None]:
# gdf.crs = "OGC:CRS84"

if results:
    results_gdf = gpd.GeoDataFrame(results)
    results_gdf.crs = "OGC:CRS84"
    print(f"\nTotal entrances found: {len(results_gdf)}")
else:
    print("\nNo entrances found for any parks")


Total entrances found: 518


In [None]:
results_gdf

Unnamed: 0,park_name,park_id,entrance_id,longitude,latitude,geometry
0,Russell Hall Park,park_0,park_0_entrance_1,-1.849552,53.765560,POINT (-1.84955 53.76556)
1,Russell Hall Park,park_0,park_0_entrance_2,-1.848694,53.764210,POINT (-1.84869 53.76421)
2,Russell Hall Park,park_0,park_0_entrance_3,-1.847720,53.765366,POINT (-1.84772 53.76537)
3,Greenwood Park,park_1,park_1_entrance_1,-1.834189,53.815265,POINT (-1.83419 53.81526)
4,Greenwood Park,park_1,park_1_entrance_2,-1.833978,53.813631,POINT (-1.83398 53.81363)
...,...,...,...,...,...,...
513,Emsley Recreation Ground,park_52,park_52_entrance_2,-1.731981,53.838076,POINT (-1.73198 53.83808)
514,Emsley Recreation Ground,park_52,park_52_entrance_3,-1.729438,53.837101,POINT (-1.72944 53.8371)
515,Emsley Recreation Ground,park_52,park_52_entrance_4,-1.728215,53.838423,POINT (-1.72822 53.83842)
516,Emsley Recreation Ground,park_52,park_52_entrance_5,-1.727898,53.836872,POINT (-1.7279 53.83687)


In [None]:
results_gdf.explore()

## Save out calculated entrances

- Entrances that have been algorithmically determined can be exported separately from OSM tagged entrances that will also be collected.

In [None]:
# Save the results as a geojson file

results_gdf.to_file(output_path_1, driver='GeoJSON')
print(f"Results exported to: {output_path_1}")


Results exported to: Data/park_entrances_improved.geojson


## Extract OSM-tagged entrances

Now we'll extract the OSM-supplied park entrances using entrance and barrier tags, and save them as a separate GeoJSON file for comparison.

In [None]:
def extract_osm_entrances(park_polygon, park_name=None, buffer_distance=300):
    """
    Extract OSM-tagged entrance points for a park using the park's center point.
    
    Parameters:
    - park_polygon: Shapely polygon representing the park boundary
    - park_name: Name of the park for location-based query
    - buffer_distance: Search radius in meters (default: 300)
    
    Returns:
    - osm_entrances_gdf: GeoDataFrame of OSM-tagged entrances within/near the park
    """
    
    # Tags likely to denote entrances
    entrance_tags = {
        "entrance": True,
        "barrier": ["gate", "entrance"]
    }
    
    try:
        # Get park centroid for location-based search
        centroid = park_polygon.centroid
        location_query = f"{centroid.y}, {centroid.x}"  # lat, lon format
        
        # Pull geometries tagged as entrances
        entrances = ox.features.features_from_point(
            (centroid.y, centroid.x), 
            tags=entrance_tags, 
            dist=buffer_distance
        )
        
        if entrances.empty:
            return gpd.GeoDataFrame()
        
        # Filter to point geometries (actual entrance nodes)
        entrance_points = entrances[entrances.geom_type == "Point"].copy()
        
        # Filter to entrances that are within or very close to the park
        # Create a small buffer around the park to catch entrances at the boundary
        park_buffer = park_polygon.buffer(0.0005)  # Small buffer in degrees
        
        # Keep only entrances within the buffered park area
        entrance_points = entrance_points[entrance_points.geometry.within(park_buffer)]
        
        print(f"  Found {len(entrance_points)} OSM-tagged entrances")
        return entrance_points
        
    except Exception as e:
        print(f"  Error extracting OSM entrances: {e}")
        return gpd.GeoDataFrame()

In [None]:
# Initialize results storage for OSM entrances
osm_results = []
osm_failed_parks = []

# Process each park for OSM entrances
for idx, park_row in parks_gdf.iterrows():
    park_polygon = park_row.geometry
    park_name = park_row.get('Park Name')
    park_id = park_row.get('park_id', f'park_{idx}')
    
    print(f"\nExtracting OSM entrances for {park_name}...")
    
    # Extract OSM entrances
    osm_entrances = extract_osm_entrances(park_polygon, park_name)
    
    if not osm_entrances.empty:
        # Store results
        for i, (entrance_idx, entrance_row) in enumerate(osm_entrances.iterrows()):
            # Get relevant attributes from the OSM data
            entrance_attrs = {
                'park_name': park_name,
                'park_id': park_id,
                'entrance_id': f"{park_id}_osm_entrance_{i+1}",
                'longitude': entrance_row.geometry.x,
                'latitude': entrance_row.geometry.y,
                'geometry': entrance_row.geometry,
                'osm_entrance_type': entrance_row.get('entrance', 'unknown'),
                'osm_barrier_type': entrance_row.get('barrier', 'unknown'),
                'osm_id': entrance_row.get('osmid', 'unknown')
            }
            osm_results.append(entrance_attrs)
    else:
        print(f"  No OSM entrances found")
        osm_failed_parks.append(park_name)

print(f"\n=== OSM Entrance Processing Complete ===") 
print(f"Successfully found OSM entrances for: {len(parks_gdf) - len(osm_failed_parks)} parks")
print(f"No OSM entrances found for: {len(osm_failed_parks)} parks")
if osm_failed_parks:
    print(f"Parks without OSM entrances: {osm_failed_parks}")


Extracting OSM entrances for Russell Hall Park...
  Found 1 OSM-tagged entrances

Extracting OSM entrances for Greenwood Park...
  Error extracting OSM entrances: No matching features. Check query location, tags, and log.
  No OSM entrances found

Extracting OSM entrances for Grange Park...
  Found 0 OSM-tagged entrances
  No OSM entrances found

Extracting OSM entrances for Foxhill Park...
  Error extracting OSM entrances: No matching features. Check query location, tags, and log.
  No OSM entrances found

Extracting OSM entrances for Knowles Park...
  Found 1 OSM-tagged entrances

Extracting OSM entrances for Seymour Street Recreation Ground...
  Found 1 OSM-tagged entrances

Extracting OSM entrances for Eccleshill Park...
  Found 0 OSM-tagged entrances
  No OSM entrances found

Extracting OSM entrances for Cross Roads Park...
  Found 0 OSM-tagged entrances
  No OSM entrances found

Extracting OSM entrances for Wibsey Park...
  Found 4 OSM-tagged entrances

Extracting OSM entrances 

In [None]:
# Create GeoDataFrame for OSM entrances
if osm_results:
    osm_results_gdf = gpd.GeoDataFrame(osm_results)
    osm_results_gdf.crs = "OGC:CRS84"
    print(f"\nTotal OSM entrances found: {len(osm_results_gdf)}")
    
    # Display summary by park
    entrance_summary = osm_results_gdf.groupby('park_name').size().reset_index(name='osm_entrance_count')
    print("\nOSM entrances per park:")
    for _, row in entrance_summary.iterrows():
        print(f"  {row['park_name']}: {row['osm_entrance_count']} entrances")
else:
    osm_results_gdf = gpd.GeoDataFrame()
    print("\nNo OSM entrances found for any parks")


Total OSM entrances found: 93

OSM entrances per park:
  Ben Rhydding Gravel Pits: 4 entrances
  Bowling Park: 3 entrances
  Brackenhill Park: 3 entrances
  Central Park, Haworth: 1 entrances
  Cliffe Castle Museum & Park: 4 entrances
  Crabtree Ghyll: 3 entrances
  Emsley Recreation Ground: 2 entrances
  Esholt Woods: 1 entrances
  Foster Park: 1 entrances
  Hirst Woods: 3 entrances
  Horton Park: 1 entrances
  Idle Recreation Ground: 2 entrances
  Knowles Park: 1 entrances
  Ladyhill Park: 2 entrances
  Lister Park: 9 entrances
  Lund Park: 1 entrances
  Menston Park: 4 entrances
  Middleton Woods: 1 entrances
  Myrtle Park: 11 entrances
  Peel Park: 2 entrances
  Riverside Gardens, Ilkley: 14 entrances
  Roberts Park: 5 entrances
  Russell Hall Park: 1 entrances
  Seymour Street Recreation Ground: 1 entrances
  Shipley Central Park: 2 entrances
  Shipley Glen: 1 entrances
  Victoria Park, Oakenshaw: 6 entrances
  Wibsey Park: 4 entrances


In [None]:
# Display the OSM entrances dataframe
if not osm_results_gdf.empty:
    display(osm_results_gdf.head())
else:
    print("No OSM entrances to display")

Unnamed: 0,park_name,park_id,entrance_id,longitude,latitude,geometry,osm_entrance_type,osm_barrier_type,osm_id
0,Russell Hall Park,park_0,park_0_osm_entrance_1,-1.847718,53.76537,POINT (-1.84772 53.76537),unknown,entrance,unknown
1,Knowles Park,park_4,park_4_osm_entrance_1,-1.719305,53.771784,POINT (-1.71931 53.77178),unknown,gate,unknown
2,Seymour Street Recreation Ground,park_5,park_5_osm_entrance_1,-1.733258,53.794143,POINT (-1.73326 53.79414),unknown,gate,unknown
3,Wibsey Park,park_8,park_8_osm_entrance_1,-1.783554,53.766637,POINT (-1.78355 53.76664),unknown,gate,unknown
4,Wibsey Park,park_8,park_8_osm_entrance_2,-1.784978,53.767246,POINT (-1.78498 53.76725),unknown,gate,unknown


In [None]:
osm_results_gdf.explore(marker_kwds={'radius': 8, 'color': 'yellow'}, 
                           tooltip=['park_name', 'osm_entrance_type', 'osm_barrier_type'])

## If wished, save out OSM entrances

Additionally, the OSM tagged entrances can be exported separately.

In [None]:
# # Save the OSM entrances as a geojson file
# if not osm_results_gdf.empty:
#     osm_results_gdf.to_file('Data/osm_park_entrances.geojson', driver='GeoJSON')
#     print("OSM entrances exported to: Data/osm_park_entrances.geojson")
# else:
#     print("No OSM entrances to export")

## Compare both datasets

In [None]:
# Compare the two datasets
print("\n=== COMPARISON SUMMARY ===")
print(f"Calculated entrances (network intersection): {len(results_gdf) if 'results_gdf' in locals() and not results_gdf.empty else 0}")
print(f"OSM-tagged entrances: {len(osm_results_gdf) if not osm_results_gdf.empty else 0}")

if 'results_gdf' in locals() and not results_gdf.empty and not osm_results_gdf.empty:
    # Calculate average entrances per park for each method
    calc_avg = len(results_gdf) / len(parks_gdf)
    osm_avg = len(osm_results_gdf) / len(parks_gdf)
    print(f"Average calculated entrances per park: {calc_avg:.1f}")
    print(f"Average OSM entrances per park: {osm_avg:.1f}")
    
    # Parks with both types of entrances
    calc_parks = set(results_gdf['park_name'].unique())
    osm_parks = set(osm_results_gdf['park_name'].unique())
    both_parks = calc_parks.intersection(osm_parks)
    print(f"Parks with both calculated and OSM entrances: {len(both_parks)}")
    print(f"Parks with only calculated entrances: {len(calc_parks - osm_parks)}")
    print(f"Parks with only OSM entrances: {len(osm_parks - calc_parks)}")


=== COMPARISON SUMMARY ===
Calculated entrances (network intersection): 518
OSM-tagged entrances: 93
Average calculated entrances per park: 9.8
Average OSM entrances per park: 1.8
Parks with both calculated and OSM entrances: 26
Parks with only calculated entrances: 23
Parks with only OSM entrances: 2


## Merge and deduplicate entrance datasets

Combine calculated and OSM entrances, removing duplicates and tagging the source of each entrance.

In [None]:
def merge_and_deduplicate_entrances(calculated_gdf, osm_gdf, distance_threshold=0.0001):
    """
    Merge calculated and OSM entrance datasets, removing duplicates based on spatial proximity.
    
    Parameters:
    - calculated_gdf: GeoDataFrame of calculated entrances
    - osm_gdf: GeoDataFrame of OSM-tagged entrances
    - distance_threshold: Distance threshold in degrees for considering entrances as duplicates (default: 0.0001 ≈ 10m)
    
    Returns:
    - merged_gdf: GeoDataFrame with merged and deduplicated entrances
    """
    
    # Prepare calculated entrances
    if not calculated_gdf.empty:
        calc_df = calculated_gdf.copy()
        calc_df['source'] = 'calculated'
        calc_df['entrance_type'] = 'network_intersection'
        calc_df['osm_entrance_type'] = None
        calc_df['osm_barrier_type'] = None
        calc_df['osm_id'] = None
    else:
        calc_df = gpd.GeoDataFrame()
    
    # Prepare OSM entrances
    if not osm_gdf.empty:
        osm_df = osm_gdf.copy()
        osm_df['source'] = 'osm_tagged'
        osm_df['entrance_type'] = 'osm_tagged'
    else:
        osm_df = gpd.GeoDataFrame()
    
    # If one or both datasets are empty, return the non-empty one or empty GeoDataFrame
    if calc_df.empty and osm_df.empty:
        return gpd.GeoDataFrame()
    elif calc_df.empty:
        return osm_df
    elif osm_df.empty:
        return calc_df
    
    # Ensure both have the same columns
    common_columns = ['park_name', 'park_id', 'entrance_id', 'longitude', 'latitude', 
                     'geometry', 'source', 'entrance_type', 'osm_entrance_type', 
                     'osm_barrier_type', 'osm_id']
    
    # Add missing columns with None values
    for col in common_columns:
        if col not in calc_df.columns:
            calc_df[col] = None
        if col not in osm_df.columns:
            osm_df[col] = None
    
    # Select only common columns
    calc_df = calc_df[common_columns]
    osm_df = osm_df[common_columns]
    
    # Start with calculated entrances
    merged_list = []
    
    # Add all calculated entrances first
    for idx, calc_row in calc_df.iterrows():
        merged_list.append(calc_row.to_dict())
    
    # Check OSM entrances against calculated ones for duplicates
    for idx, osm_row in osm_df.iterrows():
        osm_point = osm_row.geometry
        
        # Find if there's a nearby calculated entrance
        is_duplicate = False
        duplicate_idx = None
        
        for i, merged_entrance in enumerate(merged_list):
            if merged_entrance['source'] == 'calculated':
                calc_point = merged_entrance['geometry']
                distance = osm_point.distance(calc_point)
                
                if distance <= distance_threshold:
                    is_duplicate = True
                    duplicate_idx = i
                    break
        
        if is_duplicate:
            # Mark as both calculated and OSM
            merged_list[duplicate_idx]['source'] = 'both'
            merged_list[duplicate_idx]['entrance_type'] = 'calculated_and_osm'
            merged_list[duplicate_idx]['osm_entrance_type'] = osm_row['osm_entrance_type']
            merged_list[duplicate_idx]['osm_barrier_type'] = osm_row['osm_barrier_type']
            merged_list[duplicate_idx]['osm_id'] = osm_row['osm_id']
            # Update entrance_id to reflect both sources
            orig_id = merged_list[duplicate_idx]['entrance_id']
            merged_list[duplicate_idx]['entrance_id'] = f"{orig_id}_osm_{osm_row['entrance_id'].split('_')[-1]}"
        else:
            # Add as unique OSM entrance
            merged_list.append(osm_row.to_dict())
    
    # Create final GeoDataFrame
    if merged_list:
        merged_gdf = gpd.GeoDataFrame(merged_list)
        merged_gdf.crs = "OGC:CRS84"
        return merged_gdf
    else:
        return gpd.GeoDataFrame()

In [None]:
# Merge and deduplicate the entrance datasets
print("\n=== MERGING AND DEDUPLICATING ENTRANCES ===")

if 'results_gdf' in locals() and not results_gdf.empty:
    calc_entrances = results_gdf
else:
    calc_entrances = gpd.GeoDataFrame()

if 'osm_results_gdf' in locals() and not osm_results_gdf.empty:
    osm_entrances = osm_results_gdf
else:
    osm_entrances = gpd.GeoDataFrame()

# Perform the merge and deduplication
merged_entrances_gdf = merge_and_deduplicate_entrances(calc_entrances, osm_entrances)

if not merged_entrances_gdf.empty:
    print(f"\nMerged dataset statistics:")
    print(f"Total unique entrances: {len(merged_entrances_gdf)}")
    
    # Count by source
    source_counts = merged_entrances_gdf['source'].value_counts()
    print(f"\nEntrances by source:")
    for source, count in source_counts.items():
        print(f"  {source}: {count}")
    
    # Count by park
    park_counts = merged_entrances_gdf.groupby(['park_name', 'source']).size().unstack(fill_value=0)
    print(f"\nEntrances per park by source:")
    print(park_counts)
    
else:
    print("No entrances to merge")


=== MERGING AND DEDUPLICATING ENTRANCES ===

Merged dataset statistics:
Total unique entrances: 593

Entrances by source:
  calculated: 500
  osm_tagged: 75
  both: 18

Entrances per park by source:
source                            both  calculated  osm_tagged
park_name                                                     
Ben Rhydding Gravel Pits             1          21           3
Bowling Park                         1          27           2
Brackenhill Park                     0           0           3
Bradford Moor Park                   0           2           0
Buck Wood                            0          12           0
Central Park, Haworth                1           8           0
Chellow Dean Woods                   0          16           0
Cliffe Castle Museum & Park          0          10           4
Crabtree Ghyll                       1           6           2
Cross Roads Park                     0           4           0
Crowgill Park                        0      

In [None]:
# Display the merged entrances dataframe
if not merged_entrances_gdf.empty:
    print("\nSample of merged entrances:")
    display(merged_entrances_gdf.head(10))
    
    print("\nColumn information:")
    print(f"Columns: {list(merged_entrances_gdf.columns)}")
else:
    print("No merged entrances to display")


Sample of merged entrances:


Unnamed: 0,park_name,park_id,entrance_id,longitude,latitude,geometry,source,entrance_type,osm_entrance_type,osm_barrier_type,osm_id
0,Russell Hall Park,park_0,park_0_entrance_1,-1.849552,53.76556,POINT (-1.84955 53.76556),calculated,network_intersection,,,
1,Russell Hall Park,park_0,park_0_entrance_2,-1.848694,53.76421,POINT (-1.84869 53.76421),calculated,network_intersection,,,
2,Russell Hall Park,park_0,park_0_entrance_3_osm_1,-1.84772,53.765366,POINT (-1.84772 53.76537),both,calculated_and_osm,unknown,entrance,unknown
3,Greenwood Park,park_1,park_1_entrance_1,-1.834189,53.815265,POINT (-1.83419 53.81526),calculated,network_intersection,,,
4,Greenwood Park,park_1,park_1_entrance_2,-1.833978,53.813631,POINT (-1.83398 53.81363),calculated,network_intersection,,,
5,Greenwood Park,park_1,park_1_entrance_3,-1.83356,53.815116,POINT (-1.83356 53.81512),calculated,network_intersection,,,
6,Grange Park,park_2,park_2_entrance_1,-1.750345,53.913192,POINT (-1.75035 53.91319),calculated,network_intersection,,,
7,Grange Park,park_2,park_2_entrance_2,-1.750084,53.913579,POINT (-1.75008 53.91358),calculated,network_intersection,,,
8,Grange Park,park_2,park_2_entrance_3,-1.749857,53.913264,POINT (-1.74986 53.91326),calculated,network_intersection,,,
9,Grange Park,park_2,park_2_entrance_4,-1.749828,53.912403,POINT (-1.74983 53.9124),calculated,network_intersection,,,



Column information:
Columns: ['park_name', 'park_id', 'entrance_id', 'longitude', 'latitude', 'geometry', 'source', 'entrance_type', 'osm_entrance_type', 'osm_barrier_type', 'osm_id']


In [None]:

merged_entrances_gdf.explore(
    column='source',
    categorical=True,
    cmap='Set1',
    marker_kwds={'radius': 6},
    tooltip=['park_name', 'source', 'entrance_type', 'osm_entrance_type'],
    legend=True
)


In [None]:
merged_entrances_gdf.to_file(output_path_2, driver='GeoJSON')
print(f"Merged entrances exported to: {output_path_2}")


Merged entrances exported to: Data/merged_park_entrances_improved.geojson


In [None]:
# Final summary
print("\n=== FINAL SUMMARY ===")
if 'calc_entrances' in locals() and not calc_entrances.empty:
    calc_count = len(calc_entrances)
else:
    calc_count = 0
    
if 'osm_entrances' in locals() and not osm_entrances.empty:
    osm_count = len(osm_entrances)
else:
    osm_count = 0

print(f"Original calculated entrances: {calc_count}")
print(f"Original OSM entrances: {osm_count}")
print(f"Total before deduplication: {calc_count + osm_count}")
print(f"Final merged and deduplicated: {len(merged_entrances_gdf) if not merged_entrances_gdf.empty else 0}")

if not merged_entrances_gdf.empty:
    duplicates_found = (calc_count + osm_count) - len(merged_entrances_gdf)
    print(f"Duplicates removed: {duplicates_found}")
    
    both_count = len(merged_entrances_gdf[merged_entrances_gdf['source'] == 'both'])
    print(f"Entrances confirmed by both methods: {both_count}")
    
    # Breakdown by source
    source_breakdown = merged_entrances_gdf['source'].value_counts()
    print(f"\nBreakdown by source:")
    for source, count in source_breakdown.items():
        percentage = (count / len(merged_entrances_gdf)) * 100
        print(f"  {source}: {count} ({percentage:.1f}%)")


=== FINAL SUMMARY ===
Original calculated entrances: 518
Original OSM entrances: 93
Total before deduplication: 611
Final merged and deduplicated: 593
Duplicates removed: 18
Entrances confirmed by both methods: 18

Breakdown by source:
  calculated: 500 (84.3%)
  osm_tagged: 75 (12.6%)
  both: 18 (3.0%)


In [None]:
# Analyze spatial agreement between methods
if not merged_entrances_gdf.empty and 'both' in merged_entrances_gdf['source'].values:
    print("\n=== SPATIAL AGREEMENT ANALYSIS ===")
    
    # Parks with entrances from both methods
    parks_with_both = merged_entrances_gdf[merged_entrances_gdf['source'] == 'both']['park_name'].unique()
    print(f"Parks with spatially overlapping entrances: {len(parks_with_both)}")
    
    if len(parks_with_both) > 0:
        print("Parks with confirmed entrances:")
        for park in parks_with_both:
            count = len(merged_entrances_gdf[(merged_entrances_gdf['park_name'] == park) & 
                                           (merged_entrances_gdf['source'] == 'both')])
            print(f"  {park}: {count} confirmed entrance(s)")
    
    # Calculate agreement rate
    if calc_count > 0 and osm_count > 0:
        agreement_rate = both_count / min(calc_count, osm_count) * 100
        print(f"\nAgreement rate: {agreement_rate:.1f}% (confirmed entrances / smaller dataset)")
else:
    print("\nNo spatially overlapping entrances found between methods.")


=== SPATIAL AGREEMENT ANALYSIS ===
Parks with spatially overlapping entrances: 11
Parks with confirmed entrances:
  Russell Hall Park: 1 confirmed entrance(s)
  Peel Park: 2 confirmed entrance(s)
  Lister Park: 4 confirmed entrance(s)
  Crabtree Ghyll: 1 confirmed entrance(s)
  Riverside Gardens, Ilkley: 2 confirmed entrance(s)
  Menston Park: 1 confirmed entrance(s)
  Central Park, Haworth: 1 confirmed entrance(s)
  Middleton Woods: 1 confirmed entrance(s)
  Myrtle Park: 3 confirmed entrance(s)
  Ben Rhydding Gravel Pits: 1 confirmed entrance(s)
  Bowling Park: 1 confirmed entrance(s)

Agreement rate: 19.4% (confirmed entrances / smaller dataset)
