Usage: 2.1. Reading Data: MATSim

Kasia Kozlowska edited this page Jul 14, 2022 · 5 revisions

Reading a MATSim network

This page goes through methods for reading in MATSim networks. It is available as a Jupyter notebook or a wiki page.

from genet import read_matsim
import os
from pprint import pprint
path_to_matsim_network = '../example_data/pt2matsim_network'

network = os.path.join(path_to_matsim_network, 'network.xml')
schedule = os.path.join(path_to_matsim_network, 'schedule.xml')
vehicles = os.path.join(path_to_matsim_network, 'vehicles.xml')

We can read the network, schedule and vehicles XML files. You can read only the network, without the schedule, but be wary that some of the operations you can perform in GeNet may have an impact on the schedule file. For example, simplifying the Network graph will result in a lot of new, simplified links with different ids. This means the network routes stored for services in the schedule need to be updated and validated.

n = read_matsim(
    path_to_network=network, 
    epsg='epsg:27700', 
    path_to_schedule=schedule, 
    path_to_vehicles=vehicles
)
n.print()
Graph info: Name: 
Type: MultiDiGraph
Number of nodes: 1662
Number of edges: 3166
Average in degree:   1.9049
Average out degree:   1.9049 
Schedule info: Schedule:
Number of services: 9
Number of routes: 68
Number of stops: 118
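
To illustrate the earlier point about network routes needing validation after link ids change, here is a minimal sketch in plain Python. It does not use GeNet: missing_route_links is a hypothetical helper, and the link and route ids are made up for the example.

```python
def missing_route_links(network_link_ids, routes):
    """Return, per route id, the link ids that are absent from the network."""
    link_ids = set(network_link_ids)
    missing = {}
    for route_id, route in routes.items():
        absent = [link_id for link_id in route if link_id not in link_ids]
        if absent:
            missing[route_id] = absent
    return missing


# Made-up ids: links '2' and '3' were merged into 'simplified_1'.
links_after_simplification = {"1", "simplified_1", "4"}
routes = {
    "route_a": ["1", "2", "3", "4"],  # still refers to the old ids
    "route_b": ["1", "4"],            # unaffected by the change
}

stale = missing_route_links(links_after_simplification, routes)
print(stale)  # {'route_a': ['2', '3']}
```

Any route reported here would need its link sequence updated before the schedule is used with the modified network.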

Calling the plot method on the network will plot the graph and highlight the schedule's network routes.

# n.plot()

You can also plot the graph on its own using plot_graph.

# n.plot_graph()

You can plot the schedule, showing stop-to-stop connections, using plot_schedule.

# n.plot_schedule()

We can check what kind of data is stored for nodes:

n.node_attribute_summary(data=True)
attribute
├── id: ['9521031', '3826581164', '1678452821', '4074522300', '185620606']
├── x: [528387.4250512555, 528391.4406755936, 528393.2742107178, 528396.6287644263, 528396.3513181042]
├── y: [181547.5850354673, 181552.72935927223, 181558.10532352765, 181559.970402835, 181562.0370527053]
├── lon: [-0.15178558709839862, -0.135349787087776, -0.122919287085967, -0.13766218709633904, -0.14629008709559344]
├── lat: [51.52643403323907, 51.51609983324067, 51.51595583324104, 51.5182034332405, 51.52410423323943]
└── s2_id: [5221390710015643649, 5221390314367946753, 5221366508477440003, 5221390682291777543, 5221390739236081673]

s2_id refers to the S2 Geometry id of that point. We can check what kind of data is stored for links:

n.link_attribute_summary(data=False)
attribute
├── id
├── from
├── to
├── freespeed
├── capacity
├── permlanes
├── oneway
├── modes
├── s2_from
├── s2_to
├── attributes
│   ├── osm:way:access
│   ├── osm:way:highway
│   ├── osm:way:id
│   ├── osm:way:name
│   ├── osm:relation:route
│   ├── osm:way:lanes
│   ├── osm:way:oneway
│   ├── osm:way:tunnel
│   ├── osm:way:psv
│   ├── osm:way:vehicle
│   ├── osm:way:traffic_calming
│   ├── osm:way:junction
│   └── osm:way:service
└── length

A MATSim network will often have additional data stored under link attributes, e.g.

<link id="1" from="1" to="2" length="3" freespeed="4" capacity="600.0" permlanes="1.0" oneway="1" modes="car" >
  <attributes>
    <attribute name="osm:way:highway" class="java.lang.String" >unclassified</attribute>
    <attribute name="osm:way:id" class="java.lang.Long" >26997928</attribute>
    <attribute name="osm:way:name" class="java.lang.String" >Brunswick Place</attribute>
  </attributes>
</link>

GeNet handles this as a nested attributes dictionary saved on the links, i.e.

pprint(n.link('1'))
{'attributes': {'osm:way:access': 'permissive',
                'osm:way:highway': 'unclassified',
                'osm:way:id': 26997928.0,
                'osm:way:name': 'Brunswick Place'},
 'capacity': 600.0,
 'freespeed': 4.166666666666667,
 'from': '25508485',
 'id': '1',
 'length': 52.765151087870265,
 'modes': {'car'},
 'oneway': '1',
 'permlanes': 1.0,
 's2_from': 5221390301001263407,
 's2_to': 5221390302696205321,
 'to': '21667818'}
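
To make the nesting concrete, here is a minimal sketch, using only Python's standard library, of how a link element could be turned into such a dictionary. It is not GeNet's own reader: link_to_dict is a hypothetical helper, the XML string is the sample link from this page with a closing tag, and values are left as strings rather than cast to Python types.

```python
import xml.etree.ElementTree as ET

# Sample MATSim link element, with the closing </link> tag included.
LINK_XML = """
<link id="1" from="1" to="2" length="3" freespeed="4" capacity="600.0" permlanes="1.0" oneway="1" modes="car">
  <attributes>
    <attribute name="osm:way:highway" class="java.lang.String">unclassified</attribute>
    <attribute name="osm:way:id" class="java.lang.Long">26997928</attribute>
    <attribute name="osm:way:name" class="java.lang.String">Brunswick Place</attribute>
  </attributes>
</link>
"""


def link_to_dict(link_elem):
    """Hypothetical helper: turn a <link> element into a nested dictionary."""
    # XML attributes on <link> (id, from, to, ...) become top-level keys.
    data = dict(link_elem.attrib)
    # Child <attribute> elements are grouped under a single 'attributes' key.
    nested = {
        attr.get("name"): attr.text
        for attr in link_elem.findall("./attributes/attribute")
    }
    if nested:
        data["attributes"] = nested
    return data


link = link_to_dict(ET.fromstring(LINK_XML.strip()))
print(link["attributes"]["osm:way:name"])  # Brunswick Place
```

Every value in this sketch stays a string (for example, link["length"] is "3"), whereas the GeNet output above shows values already converted to floats and sets.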

GeNet infers the Python data type of each attribute value from the Java class declared in the file. Below are the mappings responsible for these assumptions:

from genet.utils.java_dtypes import JAVA_DTYPE_MAP, PYTHON_DTYPE_MAP
from pprint import pprint

pprint(JAVA_DTYPE_MAP)
{'java.lang.Array': <class 'list'>,
 'java.lang.Boolean': <class 'bool'>,
 'java.lang.Byte': <class 'int'>,
 'java.lang.Char': <class 'str'>,
 'java.lang.Double': <class 'float'>,
 'java.lang.Float': <class 'float'>,
 'java.lang.Integer': <class 'int'>,
 'java.lang.Long': <class 'float'>,
 'java.lang.Short': <class 'int'>,
 'java.lang.String': <class 'str'>}
pprint(PYTHON_DTYPE_MAP)
{<class 'bool'>: 'java.lang.Boolean',
 <class 'float'>: 'java.lang.Float',
 <class 'list'>: 'java.lang.Array',
 <class 'int'>: 'java.lang.Integer',
 <class 'set'>: 'java.lang.Array',
 <class 'str'>: 'java.lang.String'}
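
One consequence of these two mappings is that a round trip is not always type-preserving. The sketch below uses local copies of the dictionaries printed above, so it runs without GeNet. read_value and java_class_for are hypothetical helpers written for this illustration, not GeNet functions, and how GeNet applies the mappings internally is not shown here.

```python
# Local copies of the mappings printed above.
JAVA_DTYPE_MAP = {
    'java.lang.Array': list,
    'java.lang.Boolean': bool,
    'java.lang.Byte': int,
    'java.lang.Char': str,
    'java.lang.Double': float,
    'java.lang.Float': float,
    'java.lang.Integer': int,
    'java.lang.Long': float,
    'java.lang.Short': int,
    'java.lang.String': str,
}
PYTHON_DTYPE_MAP = {
    bool: 'java.lang.Boolean',
    float: 'java.lang.Float',
    list: 'java.lang.Array',
    int: 'java.lang.Integer',
    set: 'java.lang.Array',
    str: 'java.lang.String',
}


def read_value(text, java_class):
    """Hypothetical helper: cast XML text with the type mapped to java_class.

    A naive cast like this is only safe for numbers and strings;
    for example, bool('false') is True in Python.
    """
    return JAVA_DTYPE_MAP[java_class](text)


def java_class_for(value):
    """Hypothetical helper: look up the Java class for a Python value."""
    return PYTHON_DTYPE_MAP[type(value)]


osm_id = read_value('26997928', 'java.lang.Long')
print(osm_id)                   # 26997928.0
print(java_class_for(osm_id))   # java.lang.Float
print(java_class_for({'car'}))  # java.lang.Array
```

This matches the link output earlier on this page, where osm:way:id is declared as java.lang.Long in the XML but appears as the float 26997928.0 in Python.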