# Graph building and enrichment tools

Set the base path where the sample datasets are stored.

In [2]:
import pathlib
import os

base_path = pathlib.Path(os.environ.get("SAMPLES_DIR", ""))
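
Note that `pathlib.Path("")` resolves to the current directory, so an unset `SAMPLES_DIR` does not fail here but only later, when a file is not found. A minimal sketch of a stricter lookup is shown below; the helper name `resolve_samples_dir` is hypothetical and not part of `libadalina_core`.

```python
import os
import pathlib


def resolve_samples_dir(env_var: str = "SAMPLES_DIR") -> pathlib.Path:
    """Return the directory named by env_var, or raise if it is unusable."""
    raw = os.environ.get(env_var, "")
    if not raw:
        # pathlib.Path("") would silently become the current directory (".")
        raise RuntimeError(f"{env_var} is not set")
    path = pathlib.Path(raw).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"{env_var} points to a missing directory: {path}")
    return path
```

With this helper, `base_path = resolve_samples_dir()` could replace the assignment above when an early, explicit failure is preferred.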

Read and parse the OpenStreetMap dataset containing the roads of Milano.
The road dataset is filtered to remove minor and pedestrian-only roads.

In [3]:
from libadalina_core.graph_extraction.readers import OpenStreetMapReader, RoadTypes

osm_df = OpenStreetMapReader(RoadTypes.CAR_ONLY).read(str(base_path / 'road_maps' / 'Milano.gpkg'))
osm_df.head()


Unnamed: 0,geometry,id,name,oneway
0,"LINESTRING (9.16836 45.47604, 9.16827 45.47595...",4011790,Via Antonio Canova,forward
1,"LINESTRING (9.15073 45.45995, 9.15066 45.45993...",4011792,Via Costanza,forward
2,"LINESTRING (9.1711 45.47085, 9.17106 45.47089,...",4011793,Viale Pietro e Maria Curie,forward
3,"LINESTRING (9.15084 45.46181, 9.15076 45.46187...",4011799,Via Marchesi de' Taddei,forward
4,"LINESTRING (9.14642 45.47797, 9.14631 45.47805...",4011800,Via Monte Bianco,forward


A `networkx` `DiGraph` is built from the geometries in the OpenStreetMap dataset.

In [4]:
from libadalina_core.graph_extraction.builders import build_graph

graph = build_graph(osm_df, name='milan_road')
graph.number_of_nodes(), graph.number_of_edges() # number of nodes and edges

(146883, 220423)
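
To give an intuition for how line geometries can become a directed graph, here is a minimal, library-free sketch. It is not the implementation of `build_graph`: it assumes that every pair of consecutive coordinates becomes one edge, that nodes are identified by their exact coordinates, and that any `oneway` value other than `'forward'` means the road is two-way. The function name `build_adjacency` and the sample data are hypothetical.

```python
from collections import defaultdict


def build_adjacency(roads):
    """Build a directed adjacency map from (road_id, oneway, coords) tuples."""
    node_ids = {}                  # coordinate -> integer node id
    adjacency = defaultdict(dict)  # source node -> {target node: edge attributes}

    def node_for(coord):
        # Reuse the id of an already seen coordinate, otherwise assign the next one
        return node_ids.setdefault(coord, len(node_ids))

    for road_id, oneway, coords in roads:
        # Each pair of consecutive points becomes one directed edge
        for start, end in zip(coords, coords[1:]):
            u, v = node_for(start), node_for(end)
            adjacency[u][v] = {"id": road_id}
            if oneway != "forward":
                # Assumption: anything other than 'forward' is treated as two-way
                adjacency[v][u] = {"id": road_id}
    return node_ids, adjacency


# Two toy roads sharing the point (1.0, 0.0)
roads = [
    ("a", "forward", [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]),
    ("b", "both", [(1.0, 0.0), (1.0, 1.0)]),
]
nodes, adjacency = build_adjacency(roads)
```

Splitting each line into its individual segments is one plausible reason for the graph having many more edges than the dataset has roads.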

Edge data includes information from the OpenStreetMap dataset, such as the id of the geometry and the name of the road.
Each edge is also enriched with the geometry of the corresponding line on the map and its length in meters (the `distance` attribute).

In [4]:
graph.edges(0, data=True) # show data of the first edge

OutEdgeDataView([(0, 223338299393, {'geometry': <LINESTRING (9.151 45.462, 9.151 45.462)>, 'id': '4011799', 'name': "Via Marchesi de' Taddei", 'distance': 13.853932347639198})])
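
As a rough illustration of how a length in meters can be derived from longitude/latitude coordinates, the sketch below sums haversine distances along a line. This is an assumption for illustration only: the library may instead project the geometries or use a geodesic formula, so values can differ slightly from the `distance` attribute. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def linestring_length_m(coords):
    """Sum the segment lengths of a line given as [(lon, lat), ...]."""
    return sum(haversine_m(*p, *q) for p, q in zip(coords, coords[1:]))
```

The coordinates printed in the output above are rounded for display, so they cannot be used to reproduce the stored distance exactly.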

Graphs can be further enriched with external datasets to enhance their attributes. This example demonstrates how to incorporate population data from a grid dataset, where each cell contains information about the number of residents.


In [5]:
from libadalina_core.readers import geopackage_to_dataframe
population = geopackage_to_dataframe(
    str(base_path / "population-north-italy" / "Milano.gpkg"),
    "dataframe"
)[['T', 'geometry']]
population.head()

Unnamed: 0,T,geometry
0,85,"POLYGON ((9.03109 45.3828, 9.04383 45.38292, 9..."
1,10,"POLYGON ((9.04383 45.38292, 9.05658 45.38303, ..."
2,0,"POLYGON ((9.05658 45.38303, 9.06933 45.38315, ..."
3,4,"POLYGON ((9.06933 45.38315, 9.08208 45.38326, ..."
4,46,"POLYGON ((9.08208 45.38326, 9.09483 45.38337, ..."


The population data is joined with the road network during graph building, so that each edge is enriched with an estimate of the population living within a 1 km radius of the corresponding road segment.


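In [6]:
from libadalina_core.graph_extraction.builders import build_graph
from libadalina_core.spatial_operators import AggregationFunction, AggregationType

# Rebuild the graph, joining each road segment with the population cells
# that fall within a 1 km buffer around it.
graph = build_graph(
    osm_df,
    name='milan_road_with_population',
    joined_df=population,
    buffer_radius_meters=1000,  # 1 km
    aggregate_functions=[
        # Sum column "T" into a new 'population' edge attribute
        AggregationFunction("T", AggregationType.SUM, 'population', proportional='geometry_right')
    ],
)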

Edge data now includes the estimated population living nearby.

In [7]:
graph.edges(0, data=True) # show data of the first edge

OutEdgeDataView([(0, 17179869283, {'geometry': <LINESTRING (9.151 45.462, 9.151 45.462)>, 'id': '4011799', 'name': "Via Marchesi de' Taddei", 'distance': 13.853932347639198, 'population': 447.0405512516792})])
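
The fractional `population` value suggests an area-weighted sum. The sketch below illustrates one way such a proportional aggregation can work, under the assumption that `proportional='geometry_right'` weights each cell's value by the share of the cell covered by the buffer; this reading is not confirmed by the library documentation. For simplicity both the buffer and the grid cells are axis-aligned boxes `(min_x, min_y, max_x, max_y)` in a projected coordinate system, and all function names are hypothetical.

```python
def box_area(box):
    """Area of an axis-aligned box; empty or inverted boxes have area 0."""
    min_x, min_y, max_x, max_y = box
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def box_intersection(a, b):
    """Intersection of two boxes (may be inverted when they do not overlap)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def proportional_sum(buffer_box, cells):
    """Sum cell values weighted by the fraction of each cell inside the buffer."""
    total = 0.0
    for value, cell_box in cells:
        cell_area = box_area(cell_box)
        if cell_area == 0:
            continue  # skip degenerate cells
        overlap = box_area(box_intersection(buffer_box, cell_box))
        total += value * overlap / cell_area
    return total


# A buffer covering half of two cells and none of a third
cells = [
    (100, (0.0, 0.0, 1.0, 1.0)),
    (40, (1.0, 0.0, 2.0, 1.0)),
    (7, (5.0, 5.0, 6.0, 6.0)),
]
estimate = proportional_sum((0.5, 0.0, 1.5, 1.0), cells)
```

Here half of the first two cells fall inside the buffer, so they contribute half of their residents each, while the distant third cell contributes nothing.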