# Spatial join between regions and hospitals in Italy

Import the necessary libraries and modules.

In [1]:
from libadalina_core.readers import geopackage_to_dataframe
import pathlib
import os


Read the GeoPackages containing the locations of hospitals and regions in Italy and load them as geopandas DataFrames.

In [2]:
base_path = pathlib.Path(os.environ.get("SAMPLES_DIR", ""))

hospitals = geopackage_to_dataframe(
    str(base_path / "healthcare" / "EU_healthcare.gpkg"),
    "EU"
)[["hospital_name", "geometry", "city", "cap_beds"]]

regions = geopackage_to_dataframe(
    str(base_path / "regions" / "NUTS_RG_20M_2024_4326.gpkg"),
    "NUTS_RG_20M_2024_4326.gpkg"
)[["LEVL_CODE", "NUTS_NAME", "CNTR_CODE", "geometry"]]
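Because `SAMPLES_DIR` falls back to an empty string, the paths above become relative to the current working directory when the variable is unset. The sketch below shows that behaviour and one way to fail early with a clearer message. The helpers `resolve_sample` and `require_file` are hypothetical names used for illustration; they are not part of libadalina-core.

```python
import os
import pathlib


def resolve_sample(*parts: str, env_var: str = "SAMPLES_DIR") -> pathlib.Path:
    """Build a path under the directory named by env_var (hypothetical helper).

    If the variable is unset, the base is Path(""), so the result is
    relative to the current working directory.
    """
    base = pathlib.Path(os.environ.get(env_var, ""))
    return base.joinpath(*parts)


def require_file(path: pathlib.Path) -> pathlib.Path:
    """Raise a descriptive error if the file does not exist (hypothetical helper)."""
    if not path.is_file():
        raise FileNotFoundError(f"Sample file not found: {path.resolve()}")
    return path


hospitals_path = resolve_sample("healthcare", "EU_healthcare.gpkg")
print(hospitals_path)
```

Calling `require_file(hospitals_path)` before passing the path to the reader would surface a wrong or missing `SAMPLES_DIR` immediately.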

Import libadalina-core spatial operators for performing spatial joins and aggregations.

In [3]:
from libadalina_core.spatial_operators import (
    spatial_join,
    JoinType,
    spatial_aggregation,
    AggregationType,
    AggregationFunction,
)

For the sake of this example, filter the regions to keep only those corresponding to the provinces of Milan and Cremona.
At this step, both `regions` and `filtered_regions` are geopandas DataFrames.

In [4]:
# select province of Milan and Cremona
filtered_regions = regions[
    (regions['LEVL_CODE'] == 3) &
    (regions['CNTR_CODE'] == "IT") &
    (regions['NUTS_NAME'].str.contains('Milano|Cremona', case=False))
]
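The same boolean-mask pattern can be tried on a small pandas DataFrame without any geometry. This is a toy sketch with made-up rows, not the NUTS dataset. It also passes `na=False` to `str.contains`, an addition not present in the cell above, so that rows with a missing name are treated as non-matching instead of propagating a missing value into the mask.

```python
import pandas as pd

# Made-up rows that mimic the columns used by the filter.
toy_regions = pd.DataFrame({
    "LEVL_CODE": [3, 3, 2, 3, 3],
    "CNTR_CODE": ["IT", "IT", "IT", "FR", "IT"],
    "NUTS_NAME": ["Milano", "Cremona", "Lombardia", "Paris", None],
})

# Each condition yields a boolean Series; `&` combines them element-wise.
mask = (
    (toy_regions["LEVL_CODE"] == 3)
    & (toy_regions["CNTR_CODE"] == "IT")
    & toy_regions["NUTS_NAME"].str.contains("Milano|Cremona", case=False, na=False)
)

selected_names = toy_regions[mask]["NUTS_NAME"].tolist()
print(selected_names)
```

Only the level-3 Italian rows whose name matches the regular expression survive; the row with a missing name is dropped.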

Join provinces and hospitals so that each province is paired with the hospitals located within its boundaries.
The `result` is a PySpark DataFrame with one row for each pair of province and hospital in that province.

In [5]:
result = (spatial_join(filtered_regions, hospitals, join_type=JoinType.LEFT)
          # join operator renames the geometries adding suffixes _left and _right to avoid conflicts
          .withColumnRenamed('geometry_left', 'geometry'))
result.show(truncate=False)


+---------+---------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------+---------------------------------------------+-------------------+--------+
|LEVL_CODE|NUTS_NAME|CNTR_CODE|geometry                                                                                                                                                                                                                                          
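To make the left-join semantics concrete, here is a pure-Python toy sketch. It is not how libadalina-core or Sedona implement the join, and the point-in-polygon predicate is an assumption based on the description above. Every region is kept; a region with no hospital gets a single row with `None`, and a hospital outside all regions is dropped.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: toggle `inside` at each edge crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge straddles the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def left_spatial_join(
    regions: Dict[str, Polygon], points: Dict[str, Point]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each region with the points inside it, keeping unmatched regions."""
    rows: List[Tuple[str, Optional[str]]] = []
    for region_name, polygon in regions.items():
        matches = [name for name, pt in points.items() if point_in_polygon(pt, polygon)]
        if matches:
            rows.extend((region_name, match) for match in matches)
        else:
            rows.append((region_name, None))  # left join keeps the region
    return rows


# Made-up square regions and hospital locations.
toy_regions = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "C": [(4, 0), (6, 0), (6, 2), (4, 2)],
}
toy_hospitals = {"h1": (1, 1), "h2": (0.5, 1.5), "h3": (3, 1), "h4": (10, 10)}

rows = left_spatial_join(toy_regions, toy_hospitals)
for row in rows:
    print(row)
```

Region `C` appears once with no hospital, which is the case a left join preserves and an inner join would discard.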

Aggregate the results to obtain the number of hospitals, the total number of beds, and the average number of beds in each province.
The aggregation groups rows by the `geometry` column, which at this point holds the province geometry.

In [6]:
result = spatial_aggregation(result, aggregate_functions=[
    AggregationFunction("hospital_name", AggregationType.COUNT, 'hospitals'),
    AggregationFunction("cap_beds", AggregationType.SUM, 'total_beds'),
    AggregationFunction("cap_beds", AggregationType.AVG, 'average_beds'),
])
result.show(truncate=False)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+---------+---------+-----------------------------------+-------------+---------+----------+------------------+
|geometry                                                                                                                                                                                                                                                                                                  
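The three aggregates can be illustrated with a pandas group-by on made-up joined rows. This is a sketch of the semantics only: it groups by a name column as a stand-in for the geometry, and it uses pandas rather than the PySpark DataFrame that `spatial_aggregation` works on, so the handling of missing values may differ from the real output.

```python
import pandas as pd

# Made-up result of a left join: region C has no hospital.
joined = pd.DataFrame({
    "NUTS_NAME": ["A", "A", "B", "C"],
    "hospital_name": ["h1", "h2", "h3", None],
    "cap_beds": [100.0, 300.0, 50.0, None],
})

# Named aggregation: output column = (input column, function).
summary = joined.groupby("NUTS_NAME", as_index=False).agg(
    hospitals=("hospital_name", "count"),   # counts non-missing names only
    total_beds=("cap_beds", "sum"),
    average_beds=("cap_beds", "mean"),
)
print(summary)
```

Counting `hospital_name` rather than counting rows is what keeps a province without hospitals at zero after a left join. In pandas the sum over an all-missing group is `0.0` and the mean is `NaN`.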