# NETWORK CONNECTIONS

For the tracing use case, connecting networks across borders is fundamental to the objectives. A logical approach is applied: a network connection object is created to indicate that two nodes in different networks refer to the same point in the real world.

To achieve this, the nodes that refer to the same real-world object, or that at least indicate the flow of water from one region or country to another, are identified.

This entails:
- Extracting the endpoints of one network that have no begin points and fall along the border.
- Extracting the begin points of another network that fall along the same border.
- Performing an **sjoin_nearest** operation. The results will include multiple output records for a single input record where there are multiple equidistant nearest or intersected neighbors.
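
The tie behaviour of a nearest join can be illustrated without any GIS library. The sketch below is a simplified stand-in, not the geopandas implementation: the function `nearest_pairs`, the tolerance `tol` and the node coordinates are all hypothetical. It only shows why one input node can produce several output rows.

```python
import math

def nearest_pairs(left, right, tol=1e-9):
    """Return (left_id, right_id, distance) for every nearest match.

    `left` and `right` map node ids to (x, y) tuples in a projected CRS.
    All right-hand nodes tied for the minimum distance (within `tol`)
    are kept, so one left node can yield several rows.
    """
    rows = []
    for lid, lxy in left.items():
        dists = {rid: math.dist(lxy, rxy) for rid, rxy in right.items()}
        best = min(dists.values())
        for rid, d in dists.items():
            if d - best <= tol:
                rows.append((lid, rid, d))
    return rows

# Hypothetical nodes: A has two neighbours at exactly 5 m, B has a single nearest.
left_nodes = {"A": (0.0, 0.0), "B": (100.0, 0.0)}
right_nodes = {"R1": (3.0, 4.0), "R2": (-3.0, 4.0), "R3": (100.0, 2.0)}

pairs = nearest_pairs(left_nodes, right_nodes)
for row in pairs:
    print(row)
```

Node `A` appears twice in the result because `R1` and `R2` are equidistant. Duplicates of this kind in the real join output may need to be reviewed or de-duplicated before the connections are exported.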

Because of possible precision errors, a buffer along the border is used to identify the two sets of nodes.
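
As a rough, library-free illustration of the buffer idea, the sketch below keeps only nodes that lie within a given distance of a border line. It is an approximation under stated assumptions: the notebook itself intersects two buffered region polygons, whereas here the border is a hypothetical polyline and the helper names (`distance_to_segment`, `nodes_near_border`) and all coordinates are invented for the example.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nodes_near_border(nodes, border, buffer_length):
    """Keep the ids of nodes lying within buffer_length of any border segment."""
    kept = []
    for node_id, xy in nodes.items():
        d = min(distance_to_segment(xy, a, b) for a, b in zip(border, border[1:]))
        if d <= buffer_length:
            kept.append(node_id)
    return kept

# Hypothetical border running along y = 0 and three candidate nodes (metres).
border = [(0.0, 0.0), (1000.0, 0.0)]
nodes = {"N1": (100.0, 0.5), "N2": (500.0, -30.0), "N3": (700.0, 400.0)}

near = nodes_near_border(nodes, border, buffer_length=50)
print(near)
```

A node that sits half a metre off the border because of digitising or reprojection error is still selected, which is the purpose of the tolerance.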

These nodes then form the network connection object, typed as a cross-border connected, cross-border identical or intermodal connection.

This code was developed using the three regions of Belgium, and can be reused for other regions or countries by declaring the appropriate labels and columns.

In [1]:
import os
import sys
path = os.path.dirname(os.path.abspath(''))
os.chdir(path)
print(path)
sys.path.insert(0, path)

c:\Workdir\Develop\repository\go-peg


In [3]:
import geopandas as gpd
import pandas as pd

from shapely.geometry import Point, LineString, MultiLineString, MultiPoint, Polygon
from shapely import wkt
from shapely.ops import nearest_points

from src.config import config

### 1. Load datasets

In [4]:
PROJ_CRS = 'EPSG:31370'
FINAL_CRS= 'EPSG:3035'
connection_type = "cross-border connected"

In [5]:
def load_datasets(path, PROJ_CRS):
    """
    Loads the data from the given path
    and reprojects it to the given CRS.
    """
    data = gpd.read_file(path)
    data = data.to_crs(PROJ_CRS)

    return data

In [6]:
vl_nodes = config.data_dest / 'vl_nodes_PROCESSED.shp'
bxl_water = config.data_dest / 'bxl_waterNodes.shp'
vl_border = config.data_src / 'BE_boundaries/flanders.shp'
bxl_border = config.data_src / 'BE_boundaries/bruxelles.shp'
wal_nodes = config.data_dest / 'wal_waternodesPROCESSED.shp'
wal_border = config.data_src / 'BE_boundaries/wallonie.shp'

In [7]:
df1 = load_datasets(wal_nodes, PROJ_CRS) #wal_points
df2 = load_datasets(vl_nodes, PROJ_CRS) #vl_points
dataset1_border = load_datasets(wal_border, PROJ_CRS)
dataset2_border = load_datasets(vl_border, PROJ_CRS)

In [8]:
buffer_length = 50  # in metres, the unit of EPSG:31370
dataset1_buffer = dataset1_border.buffer(buffer_length)
dataset2_buffer = dataset2_border.buffer(buffer_length)

#intersection of the two buffers
buffer_intersection = dataset1_buffer.intersection(dataset2_buffer)

#extract the point data within the border buffer strip
df1_points = df1.clip(buffer_intersection)
df2_points = df2.clip(buffer_intersection)

In [9]:
network_conn = (gpd.sjoin_nearest(df1_points, df2_points)
                    .merge(df2[["node_id", "geometry"]], left_on="node_id_right", right_on="node_id", how="left"))

In [10]:
network_conn.head(2)

Unnamed: 0,node_id_left,source_left,sewernode__left,geometry_x,index_right,node_id_right,source_right,sewernode__right,STATUS,LBLTYPE,node_id,geometry_y
0,WAL_HN1320,water_node,,POINT (258446.469 158897.404),18904,VL_HN18905,water_node,,,,VL_HN18905,POINT (258449.365 158902.070)
1,WAL_HN1269,water_node,,POINT (247909.886 160543.643),34851,VL_HN34852,water_node,,,,VL_HN34852,POINT (247905.672 160535.281)
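
Because a nearest join always returns a match, even when the closest candidate is far away along the border strip, it can be worth checking the length of each link. The sketch below is one possible sanity check, using the rounded coordinates of the two rows printed above. The threshold `MAX_LINK_LENGTH` is a hypothetical value, not one taken from the project. In geopandas itself, `sjoin_nearest` accepts `max_distance` and `distance_col` arguments that serve a similar purpose.

```python
import math

# Coordinates copied (rounded) from the two rows printed above.
pairs = [
    ("WAL_HN1320", (258446.469, 158897.404), "VL_HN18905", (258449.365, 158902.070)),
    ("WAL_HN1269", (247909.886, 160543.643), "VL_HN34852", (247905.672, 160535.281)),
]

MAX_LINK_LENGTH = 100.0  # hypothetical threshold in metres

def link_lengths(pairs):
    """Map (left_id, right_id) to the straight-line distance between the nodes."""
    return {(l_id, r_id): math.dist(l_xy, r_xy) for l_id, l_xy, r_id, r_xy in pairs}

lengths = link_lengths(pairs)
too_long = [key for key, d in lengths.items() if d > MAX_LINK_LENGTH]

for key, d in lengths.items():
    print(key, round(d, 2))
print("links above threshold:", too_long)
```

Both sample links are only a few metres long, so neither is flagged. Links that exceed the chosen threshold would be candidates for manual review.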


In [11]:
def make_connection_lines(df, from_point, to_point):
    """
    Builds one LineString per row, running from the point in column
    `from_point` to the point in column `to_point`.
    """
    lines = []
    for index, row in df.iterrows():
        p_1 = Point(row[from_point])
        p_2 = Point(row[to_point])
        # straight connection line between the two matched nodes
        lines.append(LineString([p_1, p_2]))
    return lines

network_conn['connection_lines'] = make_connection_lines(network_conn, 'geometry_x', 'geometry_y')
network_conn.head(2)

Unnamed: 0,node_id_left,source_left,sewernode__left,geometry_x,index_right,node_id_right,source_right,sewernode__right,STATUS,LBLTYPE,node_id,geometry_y,connection_lines
0,WAL_HN1320,water_node,,POINT (258446.469 158897.404),18904,VL_HN18905,water_node,,,,VL_HN18905,POINT (258449.365 158902.070),LINESTRING (258446.4693398135 158897.404485286...
1,WAL_HN1269,water_node,,POINT (247909.886 160543.643),34851,VL_HN34852,water_node,,,,VL_HN34852,POINT (247905.672 160535.281),LINESTRING (247909.88613982164 160543.64278520...


In [12]:
connection_links = gpd.GeoDataFrame(network_conn[["node_id_left", "node_id_right", "connection_lines"]]
                    .rename(columns={"node_id_left":"idElement1", "node_id_right":"idElement2", "connection_lines":"geometry"}))

In [13]:
connection_links

Unnamed: 0,idElement1,idElement2,geometry
0,WAL_HN1320,VL_HN18905,"LINESTRING (258446.469 158897.404, 258449.365 ..."
1,WAL_HN1269,VL_HN34852,"LINESTRING (247909.886 160543.643, 247905.672 ..."
2,WAL_HN1374,VL_HN30138,"LINESTRING (216027.598 156590.331, 216027.598 ..."
3,WAL_HN790,VL_HN60308,"LINESTRING (216055.971 156616.522, 216055.972 ..."
4,WAL_HN1375,VL_HN46290,"LINESTRING (216257.428 156753.467, 216257.428 ..."
...,...,...,...
102,WAL_HN1192,VL_HN42133,"LINESTRING (54279.784 165549.628, 54513.399 16..."
103,WAL_HN1454,VL_HN59960,"LINESTRING (50810.867 166309.407, 50812.130 16..."
104,WAL_HN1064,VL_HN41692,"LINESTRING (54234.079 166330.121, 54232.946 16..."
105,WAL_HN135,VL_HN59983,"LINESTRING (53835.561 167280.712, 53839.584 16..."


In [14]:
import uuid
connection_links['UUID'] = [uuid.uuid4().hex for _ in range(len(connection_links.index))]

connection_links['watercourse_namespace'] = "gopeg.eu/tracing"
connection_links['connectionType'] = connection_type
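
Note that `uuid.uuid4()` generates a new random identifier on every run, so re-running the notebook gives each connection a different UUID. If stable identifiers across runs are wanted, one option is a name-based `uuid.uuid5` derived from the two node ids. The sketch below is an assumption, not the project's scheme: the function `stable_link_id` and the way the name string is composed are hypothetical.

```python
import uuid

NAMESPACE = "gopeg.eu/tracing"  # namespace string used in the notebook

def stable_link_id(id_element1, id_element2):
    """Derive a repeatable 32-character hex id from the two node ids (assumed scheme)."""
    name = f"{NAMESPACE}/{id_element1}/{id_element2}"
    return uuid.uuid5(uuid.NAMESPACE_URL, name).hex

first = stable_link_id("WAL_HN1320", "VL_HN18905")
again = stable_link_id("WAL_HN1320", "VL_HN18905")
other = stable_link_id("WAL_HN1269", "VL_HN34852")
random_id = uuid.uuid4().hex  # what the notebook currently uses

print(first == again, first != other, len(first), len(random_id))
```

The same node pair always maps to the same id, while different pairs get different ids. Both variants produce 32-character hex strings, so the column format is unchanged.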


In [17]:
connection_links['fictitious'] = 'true'

In [15]:
connection_links = connection_links.set_crs(PROJ_CRS)
connection_links = connection_links.to_crs(FINAL_CRS)

In [18]:
connection_links.crs

<Derived Projected CRS: EPSG:3035>
Name: ETRS89-extended / LAEA Europe
Axis Info [cartesian]:
- Y[north]: Northing (metre)
- X[east]: Easting (metre)
Area of Use:
- name: Europe - European Union (EU) countries and candidates. Europe - onshore and offshore: Albania; Andorra; Austria; Belgium; Bosnia and Herzegovina; Bulgaria; Croatia; Cyprus; Czechia; Denmark; Estonia; Faroe Islands; Finland; France; Germany; Gibraltar; Greece; Hungary; Iceland; Ireland; Italy; Kosovo; Latvia; Liechtenstein; Lithuania; Luxembourg; Malta; Monaco; Montenegro; Netherlands; North Macedonia; Norway including Svalbard and Jan Mayen; Poland; Portugal including Madeira and Azores; Romania; San Marino; Serbia; Slovakia; Slovenia; Spain including Canary Islands; Sweden; Switzerland; Turkey; United Kingdom (UK) including Channel Islands and Isle of Man; Vatican City State.
- bounds: (-35.58, 24.6, 44.83, 84.73)
Coordinate Operation:
- name: Europe Equal Area 2001
- method: Lambert Azimuthal Equal Area
Datum: Europ

In [19]:
connection_links.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
Int64Index: 107 entries, 0 to 106
Data columns (total 7 columns):
 #   Column                 Non-Null Count  Dtype   
---  ------                 --------------  -----   
 0   idElement1             107 non-null    object  
 1   idElement2             107 non-null    object  
 2   geometry               107 non-null    geometry
 3   UUID                   107 non-null    object  
 4   watercourse_namespace  107 non-null    object  
 5   connectionType         107 non-null    object  
 6   fictitious             107 non-null    object  
dtypes: geometry(1), object(6)
memory usage: 6.7+ KB


In [20]:
# connection_links.to_file("harmonized_data/NetworkConnections.gpkg", layer="wal_vl_connections", driver='GPKG')