# Import networks data to the 3DCityDB

This notebook shows how the **IntegrCiTy Data Access Layer** (DAL) can be used to import data to the 3DCityDB. 
In this specific example, **geometric data** (2D line diagram) of a gas network is **extracted** from separate sources (shapefiles), **consolidated** and **stored** in the database.

## Extracting the data

In this example, we extract and consolidate data from several files.
All the data is stored in two separate shapefiles:
* **Gas network pipes**: The positions of all pipes are stored as 2D line diagrams, where each pipe is defined by 2 points (start and end point of a straight pipe).
* **Gas network nodes**: Nodes are points in space where either a pipe is connected to a network boundary (substation, building) or where two or more pipes are joined.

In this example, both data sets use a consistent set of IDs to refer to pipes and nodes, which makes it possible to easily link the data from both sets.
For instance, there is a pipe called *gas_pipe83-224*, whose starting and ending points coincide with *gas_node83* and *gas_node224*, respectively.
By checking the data of these nodes, it is possible to see that this pipe is connected to a building on one side (*gas_node83* is connected to *building126*) and that two other pipes are joined to it (*gas_pipe223-224* and *gas_pipe224-225* via *gas_node224*).
In real-life applications it might take another pre-processing step to link the available data.
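
The naming convention described above can be cross-checked programmatically. The following is a minimal sketch, assuming that every pipe name follows the pattern `gas_pipe<from>-<to>` seen in the example name *gas_pipe83-224*. The helper `node_names_from_pipe` and the pattern are hypothetical and not part of package dblayer.

```python
import re

# Assumed naming pattern, inferred from the example name 'gas_pipe83-224'.
PIPE_NAME_PATTERN = re.compile(r'^gas_pipe(\d+)-(\d+)$')

def node_names_from_pipe(pipe_name):
    """Return the names of the start and end node implied by a pipe name."""
    match = PIPE_NAME_PATTERN.match(pipe_name)
    if match is None:
        raise ValueError('unexpected pipe name: {}'.format(pipe_name))
    start, end = match.groups()
    return 'gas_node' + start, 'gas_node' + end

print(node_names_from_pipe('gas_pipe83-224'))
```

A check like this is only useful for validation: data sets that do not encode the node IDs in the pipe names need an explicit link between pipes and nodes.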

The image below visualizes part of the network data from the shapefile (with the help of [QGIS](https://www.qgis.org/)).

<img src="./img/network.png" style="height:14cm">

Before we start, let's define a few tuples and dicts that will be helpful to collect and process the data from the shapefiles:

In [1]:
from collections import namedtuple

# This tuple and dict are used to collect data common to all nodes.
NodeData = namedtuple( 'NodeData', [ 'level', 'x' ,'y' ] )
nodes = {}

# This tuple and dict are used to collect data common to nodes that represent network sinks (i.e., connections to buildings).
SinkData = namedtuple( 'SinkData', [ 'level', 'building_name' ] )
sinks = {}

# This tuple and dict are used to collect data common to nodes that represent network sources (i.e., connections to substations).
SourceData = namedtuple( 'SourceData', [ 'level', 'p_lim_kw', 'p_pa' ] )
sources = {}

# This tuple and dict are used to collect data about pipes.
PipeData = namedtuple( 'PipeData', [ 'length', 'diameter_m', 'from_node', 'to_node' ] )
pipes = {}
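
To illustrate how these containers are meant to be used, the sketch below fills a dict with a single entry and reads it back. The coordinates and the level are made-up values for illustration only, and the dict is called `example_nodes` (a hypothetical name) so that it does not interfere with the dicts defined above.

```python
from collections import namedtuple

NodeData = namedtuple('NodeData', ['level', 'x', 'y'])
example_nodes = {}

# Illustrative values only; the real values come from the shapefile.
example_nodes['gas_node83'] = NodeData(level=0, x=10.0, y=20.0)

node = example_nodes['gas_node83']
print(node.x, node.y)   # fields can be accessed by name ...
level, x, y = node      # ... or unpacked like a regular tuple
```

Using the node name as dict key makes it cheap to look up a node later on, when pipes refer to their end nodes by name.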

As in the previous example, the [Python Shapefile Library (PyShp)](https://pypi.org/project/pyshp/) is used to extract the shapefile data for the nodes:

In [2]:
import shapefile
import os, os.path

gas_nodes_data = shapefile.Reader( os.path.join( os.getcwd(), '..', '1_data', 'shapefiles', 'gas_network_nodes' ) )
assert( gas_nodes_data.shapeType == shapefile.POINT )

Let's go through and extract the nodes data.
We use the tuples and dicts defined above to extract data common to all the nodes as well as data specific to sources and sinks:

In [3]:
for data in gas_nodes_data:
    # Retrieve 2D coordinates.
    x = data.shape.points[0][0]
    y = data.shape.points[0][1]

    # Extract data common to all nodes.
    nodes[data.record['name']] = NodeData( data.record['level'], x, y )
    
    # Extract data only relevant for sinks and sources.
    if data.record['type'] == 'SINK':
        sinks[data.record['name']] = SinkData( data.record['level'], data.record['build_id'] )
    elif data.record['type'] == 'SRCE':
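        # Store the power limit and the pressure (assuming the shapefile field for the latter is named 'p_pa').
        sources[data.record['name']] = SourceData( data.record['level'], data.record['p_lim_kw'], data.record['p_pa'] )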

Now load the pipes data:

In [4]:
gas_pipes_data = shapefile.Reader( os.path.join( os.getcwd(), '..', '1_data', 'shapefiles', 'gas_network_pipes' ) )
assert( gas_pipes_data.shapeType == shapefile.POLYLINE )

Now go through and extract the pipes data, using the dict and tuple defined above:

In [5]:
from math import sqrt, pow

for data in gas_pipes_data:
    # Retrieve the pipe's start and end point.
    p1 = data.shape.points[0]
    p2 = data.shape.points[1]
    
    # Calculate the pipe's length.
    length = sqrt( pow( p1[0] - p2[0], 2 ) + pow( p1[1] - p2[1], 2) )
    
    # Extract the relevant data.
    pipes[data.record['name']] = \
        PipeData( length, data.record['diameter_m'], data.record['from_node'], data.record['to_node'] )
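
The length calculation above relies on every pipe being a straight line between exactly 2 points. As a hedged sketch of how this could be generalized, the hypothetical helper `polyline_length` below sums the lengths of all segments of a 2D polyline, so it would also cover pipes with intermediate points. The result is expressed in the coordinate units of the input points.

```python
from math import hypot

def polyline_length(points):
    """Sum the Euclidean lengths of all segments of a 2D polyline."""
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

print(polyline_length([(0.0, 0.0), (3.0, 4.0)]))               # 5.0
print(polyline_length([(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]))  # 11.0
```

For a pipe with 2 points this gives the same result as the formula used in the loop above.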

## Accessing the 3DCityDB through the IntegrCiTy DAL

As in the previous examples, load [package dblayer](https://github.com/IntegrCiTy/dblayer).
The following lines import the core of the package (*dblayer*) for accessing the database:

In [6]:
from dblayer import *

connect = PostgreSQLConnectionInfo(
    user = 'postgres',
    pwd = 'postgres',
    host = 'localhost',
    port = '5432',
    dbname = 'citydb'
    )

db_access = DBAccess()
db_access.connect_to_citydb( connect )
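
The connection parameters above are hard-coded for a local default installation. One possible alternative, sketched below, is to read them from environment variables. The variable names (`CITYDB_USER` etc.) are hypothetical, and the defaults mirror the values used above.

```python
import os

# Hypothetical environment variable names; defaults mirror the values used above.
connection_params = dict(
    user=os.environ.get('CITYDB_USER', 'postgres'),
    pwd=os.environ.get('CITYDB_PWD', 'postgres'),
    host=os.environ.get('CITYDB_HOST', 'localhost'),
    port=os.environ.get('CITYDB_PORT', '5432'),
    dbname=os.environ.get('CITYDB_NAME', 'citydb'),
)

# The dict could then be passed on as keyword arguments, for example:
# connect = PostgreSQLConnectionInfo(**connection_params)
print({key: value for key, value in connection_params.items() if key != 'pwd'})
```

The keys match the keyword arguments used in the cell above, which keeps passwords out of the notebook itself.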

Specify the spatial reference identifier (SRID) used by the 3DCityDB instance. If you have used the setup scripts for installing the extended 3DCityDB provided as part of package dblayer, then the default SRID is 4326.

In [7]:
srid = 4326

## Consolidating the network data and adding it to the database

Networks can be represented in the 3DCityDB with the help of the [Utility Network ADE](https://github.com/TatjanaKutzner/CityGML-UtilityNetwork-ADE) (UNADE).
This CityGML domain extension provides a very flexible framework to store both **topographical data** (e.g., coordinates and shapes) and **topological data** (e.g., functional connections between network features) of various types of networks.
However, this flexibility comes with complexity, which makes using the UNADE a non-trivial task for new users.
To ease this, [package dblayer](https://github.com/IntegrCiTy/dblayer) provides helper functions (*dblayer.helpers.utn*) that will be used further down in this notebook.

Start by adding a new network object to the 3DCityDB called *gas_network*. The returned values are the IDs of the associated (but still empty) *Network* and *NetworkGraph* objects, which represent the topographical and topological aspects of the network, respectively.

In [8]:
from dblayer.helpers.utn.gas_network import *

( ntw_id, ntw_graph_id ) = write_network_to_db(
    db_access,
    name = 'gas_network',
    id = 3000
    )

We start building up the network contents in the 3DCityDB by adding the nodes using function *write_network_node_to_db*.
For every new node added to the network, this function returns an instance of class *GasNetworkNodeData*, which holds the most relevant information of the associated database object (see [here](https://github.com/IntegrCiTy/dblayer/blob/master/dblayer/helpers/utn/gas_network.py) for details).
For further processing, we also store this information in a separate dict:

In [9]:
# Dict for storing information returned from database.
nodes_db_data = {}

for name, data in nodes.items():
    nodes_db_data[name] = write_network_node_to_db(
        db_access,
        name,
        data.level,
        Point2D( data.x, data.y ),
        srid,
        ntw_id,
        ntw_graph_id )

Next, we add the pipes to the database with the help of function *write_round_pipe_to_db*.
For this, we can already make good use of the information returned in the previous step:

In [10]:
for name, data in pipes.items():

    pipe_id = write_round_pipe_to_db(
        db_access,
        name,
        # Provide information about database object associated with the starting node:
        nodes_db_data[data.from_node], 
        # Provide information about database object associated with the end node:
        nodes_db_data[data.to_node],
        srid,
        ntw_id,
        ntw_graph_id,
        int_diameter = data.diameter_m,
        int_diameter_unit = 'm',
    )
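
The loop above looks up both end nodes of every pipe by name, so a pipe that refers to an unknown node would raise a `KeyError`. The sketch below shows one way to detect such inconsistencies beforehand. The helper `find_dangling_pipes` and the sample data are hypothetical and only serve as illustration.

```python
from collections import namedtuple

PipeData = namedtuple('PipeData', ['length', 'diameter_m', 'from_node', 'to_node'])

def find_dangling_pipes(pipes, known_nodes):
    """Map each inconsistent pipe name to the node names it refers to but that are unknown."""
    dangling = {}
    for name, data in pipes.items():
        missing = [n for n in (data.from_node, data.to_node) if n not in known_nodes]
        if missing:
            dangling[name] = missing
    return dangling

# Illustrative sample data; the second pipe refers to a node that does not exist.
sample_pipes = {
    'gas_pipe83-224': PipeData(12.5, 0.1, 'gas_node83', 'gas_node224'),
    'gas_pipe224-999': PipeData(8.0, 0.1, 'gas_node224', 'gas_node999'),
}
sample_nodes = {'gas_node83', 'gas_node224'}

print(find_dangling_pipes(sample_pipes, sample_nodes))
```

In the notebook, such a check could be called with the dicts `pipes` and `nodes_db_data` before any pipe is written to the database.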


We also want to link the network sinks to the buildings already stored in the database.
For this, we retrieve the data of all buildings and map each building name to its database ID:

In [11]:
buildings_db_data = db_access.get_citydb_objects( 
    'Building',
    table_name='building', 
    schema='citydb_view' 
)

buildings_db_id = { b.name: b.id for b in buildings_db_data}
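
Linking sinks to buildings only works if every building name referenced in the shapefile also exists in the database. The following minimal sketch checks this with a hypothetical helper `find_unlinked_sinks`; the building names and IDs in the sample data are made up for illustration.

```python
from collections import namedtuple

SinkData = namedtuple('SinkData', ['level', 'building_name'])

def find_unlinked_sinks(sinks, buildings_db_id):
    """Return the names of all sinks whose building is not known to the database."""
    return sorted(
        name for name, data in sinks.items()
        if data.building_name not in buildings_db_id
    )

# Illustrative sample data; 'building999' is missing from the building mapping.
sample_sinks = {
    'gas_node83': SinkData(0, 'building126'),
    'gas_node90': SinkData(0, 'building999'),
}
sample_buildings_db_id = {'building126': 42}

print(find_unlinked_sinks(sample_sinks, sample_buildings_db_id))
```

An empty result means that every sink can be associated with a building ID.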

Network sinks can be represented by objects of type *TerminalElement*, which are associated with both a network node and another CityGML object (a building in this case).
To make this association, we can use function *write_gas_sink_to_db*:

In [12]:
for name, data in sinks.items():

    write_gas_sink_to_db(
        db_access,
        name,
        # Provide information about database object associated with the network node:
        nodes_db_data[name],
        # Provide default value for consumption (static) of this terminal element:
        10.,
        'kW',
        srid,
        ntw_id,
        ntw_graph_id,
        # Provide ID of city object that should be linked:
        buildings_db_id[data.building_name]
        )

Similarly, we can represent sources (e.g., substations) by objects of type *TerminalElement*.
To add them to the database and associate them with their network nodes, we can use function *write_feeder_to_db*:

In [13]:
for name, data in sources.items():

    write_feeder_to_db(
        db_access,
        name,
        # Provide information about database object associated with the network node:
        nodes_db_data[name],
        # Provide default value for generation (rating, static) of this terminal element:
        data.p_lim_kw,
        data.p_pa,
        srid,
        ntw_id,
        ntw_graph_id
        )

## Storing the data to the 3DCityDB

Above, the data was *added* to the database session. In order to make it persistent, i.e., to store it permanently in the database, it has to be *committed* to the 3DCityDB.
This is done via *commit_citydb_session*:

In [14]:
db_access.commit_citydb_session()

Finally, delete the instance of class DBAccess to close the session.

In [15]:
del db_access

Next up is notebook [3a_sim_setup.ipynb](../3_simulate/3a_sim_setup.ipynb), which demonstrates how to use the IntegrCiTy DAL to create a simulation setup for analyzing this gas network.