# ULTImodel tutorial: Create cost matrices

The distribution step of the gravity model is based on the cost of travel between traffic analysis zones (TAZ). In ULTImodel, the cost of travel comprises travel times and distances. The connectors set in tutorial2 are therefore used as start and end coordinates for OSRM requests in the `Matrix` class.

The inputs are:
- TAZ as GeoDataFrame
- Connectors as GeoDataFrame (from tutorial2)

## Import packages

In [1]:
# for cost matrix creation
from ultimodel import Matrices

In [2]:
# for reading and saving files etc.
import geopandas as gpd
import pandas as pd
import numpy as np
# for time tracking
from datetime import datetime

## Read input: TAZ and connectors

The input includes georeferenced TAZ in `EPSG:4326` with the following _required_ attributes (columns):

* __ID__ | field including a unique ID, e.g. the NUTS ID
* __Country__ | field containing the ISO-2 code of the respective country

Other attributes like name, population etc. can be added, but are not required for the following steps.

The connector nodes per TAZ in `EPSG:4326`, as created in `tutorial2`, are also imported.

In [3]:
# load taz from database or local
taz = gpd.GeoDataFrame.from_file('tutorial-files/_input/taz-tutorial.gpkg')

# defining ID, country and geometry column names of taz
taz_id = "nuts_id"
taz_cn = "cntr_code"
taz_geo = "geometry"

taz.head()

| | nuts_id | cntr_code | nuts_name | geometry |
|---|---|---|---|---|
| 0 | FI193 | FI | Keski-Suomi | MULTIPOLYGON (((26.13865 63.45759, 26.16055 63... |
| 1 | FI194 | FI | Etelä-Pohjanmaa | MULTIPOLYGON (((21.64783 62.01959, 21.63880 62... |
| 2 | FI195 | FI | Pohjanmaa | MULTIPOLYGON (((21.64783 62.01959, 21.52578 62... |
| 3 | FI196 | FI | Satakunta | MULTIPOLYGON (((21.41993 61.04744, 21.42015 61... |
| 4 | FI197 | FI | Pirkanmaa | MULTIPOLYGON (((22.83124 62.27089, 22.90118 62... |


In [4]:
path_import = 'tutorial-files/tutorial2/'

In [5]:
connectors = gpd.GeoDataFrame.from_file(path_import + 'connectors.gpkg')
connectors.head()

| | node_id | nuts_id | c_n | weight | geometry |
|---|---|---|---|---|---|
| 0 | 10002098 | FI193 | 0 | 0.300759 | POINT (26.13884 62.24631) |
| 1 | 10001123 | FI193 | 1 | 0.059247 | POINT (24.59288 62.58921) |
| 2 | 10001498 | FI193 | 2 | 0.027741 | POINT (25.06013 63.01714) |
| 3 | 10001991 | FI193 | 3 | 0.041949 | POINT (25.86502 63.07544) |
| 4 | 10001575 | FI193 | 4 | 0.097001 | POINT (25.17530 61.85061) |


## Get Matrices

Using OSRM requests, the travel times and distances between all connector points are extracted. The result is a NumPy array (`np.ndarray`) with the shape `(len(connectors), len(connectors), 2)`. Afterwards, this matrix is aggregated to a matrix between TAZ, using the connector weights as weighting factors.
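For orientation, the public OSRM HTTP API offers a `table` service that returns duration and distance matrices for a list of coordinates. The sketch below only builds such a request URL and sends nothing. It is not ULTImodel's internal code: the helper name `build_osrm_table_url`, the demo server address and the choice of three sample points are assumptions for illustration, and the requests issued by `osrm_request_nodes` may be structured differently.

```python
def build_osrm_table_url(coords, sources=None, destinations=None,
                         host="http://router.project-osrm.org", profile="driving"):
    """Build an OSRM table-service URL for (lon, lat) pairs (hypothetical helper)."""
    # OSRM expects coordinates as "lon,lat", joined by semicolons
    coord_str = ";".join(f"{lon:.5f},{lat:.5f}" for lon, lat in coords)
    # request both durations (s) and distances (m)
    url = f"{host}/table/v1/{profile}/{coord_str}?annotations=duration,distance"
    # optionally restrict origins / destinations to subsets of the coordinate list
    if sources is not None:
        url += "&sources=" + ";".join(str(i) for i in sources)
    if destinations is not None:
        url += "&destinations=" + ";".join(str(i) for i in destinations)
    return url

# three connector points from the connectors table (lon, lat in EPSG:4326)
points = [(26.13884, 62.24631), (24.59288, 62.58921), (25.06013, 63.01714)]
url = build_osrm_table_url(points, sources=[0])
print(url)
```

Restricting `sources` to a subset of points is one way to split a large connector set into several smaller requests.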

In [12]:
path_export = 'tutorial-files/tutorial4/'

In [6]:
mx = Matrices.Matrix(conn=connectors, zone_col=taz_id, conn_geo='geometry', id_col='c_n')
mx.osrm_request_nodes(save_np=False)
zone_matrix, zone_ids = mx.transform_to_taz()

335 connector point coordinates
start requests: 2023-03-21 11:05:36.594881


100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [01:30<00:00, 22.72s/it]
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.conn[self.id_col] = np.arange(len(self.conn))


total time requests: 90.86675477027893 s (1.51 min)
2023-03-21 11:07:07.461636


100%|████████████████████████████████████████████████████████████████████████████████| 335/335 [01:01<00:00,  5.48it/s]


start aggregating zone matrix: 2023-03-21 11:08:08.553333


100%|██████████████████████████████████████████████████████████████████████████████████| 69/69 [00:02<00:00, 23.57it/s]

total time zones: 2.9304826259613037 s (0.05 min)
2023-03-21 11:08:11.483816
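To illustrate what aggregating with connector weights can look like, the following sketch computes a weighted mean of connector-to-connector costs for every zone pair. It is a minimal example under assumptions: the function name `aggregate_to_zones`, the normalisation of weights per zone and the toy data are hypothetical, and `transform_to_taz` may use a different weighting scheme.

```python
import numpy as np

def aggregate_to_zones(conn_matrix, zone_index, weights):
    """Weighted mean of connector costs per zone pair (illustrative sketch).

    Assumes zone_index holds integers 0..n_zones-1 and every zone has
    at least one connector with a positive weight.
    """
    n_zones = int(zone_index.max()) + 1
    # membership[z, i] holds the weight of connector i if it belongs to zone z
    membership = np.zeros((n_zones, len(zone_index)))
    membership[zone_index, np.arange(len(zone_index))] = weights
    # normalise so that the weights within each zone sum to 1
    membership /= membership.sum(axis=1, keepdims=True)
    # zone[a, b, k] = sum_i sum_j w[a, i] * w[b, j] * conn[i, j, k]
    return np.einsum("ai,bj,ijk->abk", membership, membership, conn_matrix)

# toy data: 3 connectors, the first two in zone 0, the third in zone 1
conn = np.zeros((3, 3, 2))
conn[..., 0] = [[0, 10, 30], [10, 0, 20], [30, 20, 0]]  # travel time
conn[..., 1] = conn[..., 0] * 100                        # distance
zones = np.array([0, 0, 1])
w = np.array([0.75, 0.25, 1.0])

zone_cost = aggregate_to_zones(conn, zones, w)
print(zone_cost.shape)     # (2, 2, 2)
print(zone_cost[0, 1, 0])  # 27.5 = 0.75 * 30 + 0.25 * 20
```

In this formulation, intrazonal costs (the diagonal) are the weighted mean over all connector pairs inside the same zone.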





The final matrix has the shape `(len(taz), len(taz), 2)` with the dimensions `(origin, destination, cost)`. Two costs are given:

1) Travel time in seconds, index `[0]` on the last axis
2) Distance in metres, index `[1]` on the last axis
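The layout described above can be read with ordinary NumPy slicing. The sketch below uses a small made-up array (`zone_matrix_demo`, an assumption standing in for `zone_matrix`) to show how the two cost layers might be separated and converted to other units.

```python
import numpy as np

# toy stand-in for zone_matrix with 2 zones: [..., 0] = seconds, [..., 1] = metres
zone_matrix_demo = np.array([
    [[600.0, 8000.0], [5400.0, 120000.0]],
    [[5400.0, 120000.0], [900.0, 15000.0]],
])

time_h = zone_matrix_demo[:, :, 0] / 3600   # travel time in hours
dist_km = zone_matrix_demo[:, :, 1] / 1000  # distance in kilometres

# average speed in km/h for every origin-destination pair
speed_kmh = dist_km / time_h
print(speed_kmh[0, 1])  # 80.0
```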

In [13]:
# save the zone cost matrix as a NumPy binary file
np.save(path_export + "cost-matrices.npy", zone_matrix)

## Assign matrix index to TAZ

To link the cost matrices to the TAZ, the index assigned to each TAZ while creating the cost matrices is added as a column to `taz`.

In [14]:
# add the matrix index ('id') of each zone to taz, then drop the duplicate key column
taz = taz.merge(zone_ids, how='left', left_on=taz_id, right_on='zone')
taz.drop(columns=['zone'], inplace=True)
taz.head()

| | nuts_id | cntr_code | nuts_name | geometry | id |
|---|---|---|---|---|---|
| 0 | FI193 | FI | Keski-Suomi | MULTIPOLYGON (((26.13865 63.45759, 26.16055 63... | 11 |
| 1 | FI194 | FI | Etelä-Pohjanmaa | MULTIPOLYGON (((21.64783 62.01959, 21.63880 62... | 12 |
| 2 | FI195 | FI | Pohjanmaa | MULTIPOLYGON (((21.64783 62.01959, 21.52578 62... | 13 |
| 3 | FI196 | FI | Satakunta | MULTIPOLYGON (((21.41993 61.04744, 21.42015 61... | 14 |
| 4 | FI197 | FI | Pirkanmaa | MULTIPOLYGON (((22.83124 62.27089, 22.90118 62... | 15 |
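With the `id` column in place, costs between two zones can be looked up by their TAZ IDs. The sketch below is a minimal illustration on made-up data: `taz_demo`, `cost_demo` and the helper `cost_between` are hypothetical stand-ins, and it assumes that `id` is the row and column position of the zone in the cost matrix.

```python
import numpy as np
import pandas as pd

# toy stand-ins: taz_demo mimics the merged table, cost_demo the cost matrix
taz_demo = pd.DataFrame({"nuts_id": ["FI193", "FI194", "FI195"], "id": [0, 1, 2]})
cost_demo = np.arange(18, dtype=float).reshape(3, 3, 2)

# map each TAZ ID to its row/column index in the matrix
idx = taz_demo.set_index("nuts_id")["id"]

def cost_between(origin, destination):
    """Return (travel time, distance) between two TAZ IDs (hypothetical helper)."""
    time_s, dist_m = cost_demo[idx[origin], idx[destination]]
    return float(time_s), float(dist_m)

print(cost_between("FI193", "FI195"))  # (4.0, 5.0)
```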


In [16]:
taz.to_file(path_export + 'taz-tutorial-id.gpkg', driver='GPKG')