# MATSim-specific validation

You can generate a validation report for a genet `Network` that covers the validity of the network graph, the schedule, and the routing (of the transit services in the schedule onto the network). It aims to collect checks for problems known to have affected MATSim simulations in the past. The report is a simple dictionary with keys `graph`, `schedule` and `routing`.

In [1]:
# read sample network
from genet import read_matsim
import os

path_to_matsim_network = '../example_data/pt2matsim_network'

network = os.path.join(path_to_matsim_network, 'network.xml')
schedule = os.path.join(path_to_matsim_network, 'schedule.xml')
vehicles = os.path.join(path_to_matsim_network, 'vehicles.xml')
n = read_matsim(
    path_to_network=network, 
    epsg='epsg:27700', 
    path_to_schedule=schedule, 
    path_to_vehicles=vehicles
)
# you don't need to read the vehicles file, but doing so ensures all vehicles
# in the schedule are of the expected type and the definition of the vehicle
# is preserved
n.print()

Graph info: Name: Network graph
Type: MultiDiGraph
Number of nodes: 1662
Number of edges: 3166
Average in degree:   1.9049
Average out degree:   1.9049 
Schedule info: Schedule:
Number of services: 9
Number of routes: 68
Number of stops: 45


In [2]:
report = n.generate_validation_report()

2021-03-31 11:57:59,027 - Checking validity of the Network
2021-03-31 11:57:59,033 - Checking validity of the Network graph
2021-03-31 11:57:59,034 - Checking network connectivity for mode: car
2021-03-31 11:57:59,389 - Checking network connectivity for mode: walk
2021-03-31 11:57:59,441 - Checking network connectivity for mode: bike
2021-03-31 11:58:01,552 - Checking validity of the Schedule


The `graph` section describes the strongly connected components of the modal subgraphs for the modes that agents in MATSim need to find routes on: `car`, plus `walk` and `bike` if using the `multimodal.contrib`. It also flags links of length 1 km or longer, which can be inspected separately.

In [3]:
from pprint import pprint
pprint(report['graph'])

{'graph_connectivity': {'bike': {'number_of_connected_subgraphs': 0,
                                 'problem_nodes': {'dead_ends': [],
                                                   'unreachable_node': []}},
                        'car': {'number_of_connected_subgraphs': 1,
                                'problem_nodes': {'dead_ends': [],
                                                  'unreachable_node': []}},
                        'walk': {'number_of_connected_subgraphs': 0,
                                 'problem_nodes': {'dead_ends': [],
                                                   'unreachable_node': []}}},
 'link_attributes': {'links_over_1km_length': {'link_ids': [],
                                               'number_of': 0,
                                               'percentage': 0.0},
                     'zero_attributes': {}}}
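
To pick out the modes worth a closer look, the `graph_connectivity` entry can be scanned for any mode whose subgraph is not a single strongly connected component. The sketch below is a minimal illustration on a hand-written dictionary shaped like the output above; the helper name `modes_needing_attention` is hypothetical and not part of genet. A count of 0 presumably means that no links carry that mode, which is an assumption based on the sample output rather than documented behaviour.

```python
def modes_needing_attention(graph_report):
    """Return {mode: count} for modes that do not form exactly one
    strongly connected subgraph. Hypothetical helper, not a genet method."""
    connectivity = graph_report["graph_connectivity"]
    return {
        mode: details["number_of_connected_subgraphs"]
        for mode, details in connectivity.items()
        if details["number_of_connected_subgraphs"] != 1
    }


# Hand-written sample mirroring the structure of report['graph']
no_problem_nodes = {"dead_ends": [], "unreachable_node": []}
sample_graph_report = {
    "graph_connectivity": {
        "bike": {"number_of_connected_subgraphs": 0, "problem_nodes": no_problem_nodes},
        "car": {"number_of_connected_subgraphs": 1, "problem_nodes": no_problem_nodes},
        "walk": {"number_of_connected_subgraphs": 0, "problem_nodes": no_problem_nodes},
    },
    "link_attributes": {
        "links_over_1km_length": {"link_ids": [], "number_of": 0, "percentage": 0.0},
        "zero_attributes": {},
    },
}

print(modes_needing_attention(sample_graph_report))
# {'bike': 0, 'walk': 0}
```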


The `schedule` section describes the correctness of the schedule on three levels:

- `schedule_level`: An overall look at the validity of the schedule. A `Schedule` is valid if:
    - all of its services are valid
    - its services are uniquely indexed

    A schedule `has_valid_services` if all services within the schedule are deemed valid. The invalid services are
    flagged in `invalid_services` and the failed stages of schedule validity are flagged in `invalid_stages`.
- `service_level`: A look at the validity of the services within the schedule, indexed by service id. Each
`Service` is valid if:
    - each of its routes is valid
    - its routes are uniquely indexed

    A service `has_valid_routes` if all routes within the service are deemed valid. The invalid routes are
    flagged in `invalid_routes` and the failed stages of service validity are flagged in `invalid_stages`.
- `route_level`: A look at the validity of each route within each service, indexed by service id and route id
(or by service id and the index in the `Service.routes` list if the routes are not uniquely indexed). Each `Route` is valid if it:
    - has more than one `Stop`
    - has a correctly ordered route (the stops, via their link reference ids, and the links the route refers to are in the same
    order)
    - has correct arrival and departure offsets (each stop has one of each and they are correctly ordered in time)
    - does not have self loops (there are no trips such as: Stop A -> Stop A)

    If a route satisfies all of the above, `is_valid_route` is `True`. If not, `invalid_stages` lists the validity
    conditions that the route did not satisfy.

(N.B. The same dictionary can be generated using the `Schedule` object's own `generate_validation_report` method.)
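
Because `route_level` is nested by service id and then route id, finding the routes that failed takes a two-level loop. The following is a minimal sketch over a hand-written dictionary with the structure described above. The helper `invalid_routes`, the ids and the stage label are all hypothetical; the real labels used in `invalid_stages` are not shown in this document.

```python
def invalid_routes(schedule_report):
    """Collect (service_id, route_id, invalid_stages) for every route
    whose 'is_valid_route' flag is False. Hypothetical helper."""
    found = []
    for service_id, routes in schedule_report["route_level"].items():
        for route_id, result in routes.items():
            if not result["is_valid_route"]:
                found.append((service_id, route_id, result["invalid_stages"]))
    return found


# Hand-written sample; ids and the stage label are placeholders
sample_schedule_report = {
    "route_level": {
        "service_A": {
            "route_1": {"invalid_stages": [], "is_valid_route": True},
            "route_2": {
                "invalid_stages": ["hypothetical_stage_name"],
                "is_valid_route": False,
            },
        }
    }
}

print(invalid_routes(sample_schedule_report))
# [('service_A', 'route_2', ['hypothetical_stage_name'])]
```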


In [4]:
pprint(report['schedule'])

{'route_level': {'12430': {'VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3': {'invalid_stages': [],
                                                                          'is_valid_route': True},
                           'VJ06cd41dcd58d947097df4a8f33234ef423210154': {'invalid_stages': [],
                                                                          'is_valid_route': True},
                           'VJ0f3c08222de16c2e278be0a1bf0f9ea47370774e': {'invalid_stages': [],
                                                                          'is_valid_route': True},
                           'VJ15419796737689e742962a625abcf3fd5b3d58b1': {'invalid_stages': [],
                                                                          'is_valid_route': True},
                           'VJ235c8fca539cf931b3c673f9b056606384aff950': {'invalid_stages': [],
                                                                          'is_valid_route': True},
                         

Finally, the `routing` section describes the routing of the transit schedule services onto the network graph.
- `services_have_routes_in_the_graph`: all routes have network routes, and the links they refer to exist in the graph,
are connected (the to node of the preceding link is the from node of the next link in the chain) and the `modes` saved on the
link data accept the mode of the route.
- `service_routes_with_invalid_network_route`: flags routes not satisfying the above.
- `route_to_crow_fly_ratio`: gives the ratio of the length of the route to the crow-fly distance between each of the stops along
the route. If the route is invalid, the result is 0. If the route has only one stop, the result is
`'Division by zero'`.
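
The three conditions on a network route (links exist, links form a connected chain, link modes accept the route mode) can be illustrated with a small stand-alone check. This is a sketch under stated assumptions, not genet's implementation: links are represented here as a plain dictionary with hypothetical `from`, `to` and `modes` keys, and the function name `network_route_is_valid` is made up for the example.

```python
def network_route_is_valid(network_route, route_mode, links):
    """Check a list of link ids against the three conditions above.

    links: {link_id: {"from": node, "to": node, "modes": set_of_modes}}
    (an assumed, simplified representation of link data).
    """
    if not network_route:
        return False  # the route has no network route at all
    if any(link_id not in links for link_id in network_route):
        return False  # a referenced link is missing from the graph
    if any(route_mode not in links[link_id]["modes"] for link_id in network_route):
        return False  # a link does not accept the mode of the route
    # The to node of each link must be the from node of the next link
    return all(
        links[prev]["to"] == links[nxt]["from"]
        for prev, nxt in zip(network_route, network_route[1:])
    )


sample_links = {
    "l1": {"from": "n1", "to": "n2", "modes": {"car", "bus"}},
    "l2": {"from": "n2", "to": "n3", "modes": {"car", "bus"}},
    "l3": {"from": "n4", "to": "n5", "modes": {"car"}},
}

print(network_route_is_valid(["l1", "l2"], "bus", sample_links))  # True
print(network_route_is_valid(["l1", "l3"], "car", sample_links))  # False: chain is broken
print(network_route_is_valid(["l1", "l2"], "rail", sample_links))  # False: mode not accepted
```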

In [5]:
pprint(report['routing'])

{'route_to_crow_fly_ratio': {'12430': {'VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3': 0.846235591692864,
                                       'VJ06cd41dcd58d947097df4a8f33234ef423210154': 0.846235591692864,
                                       'VJ0f3c08222de16c2e278be0a1bf0f9ea47370774e': 0.6847798803333932,
                                       'VJ15419796737689e742962a625abcf3fd5b3d58b1': 0.846235591692864,
                                       'VJ235c8fca539cf931b3c673f9b056606384aff950': 0.6847798803333932,
                                       'VJ8f9aea7491080b0137d3092706f53dc11f7dba45': 0.6847798803333932,
                                       'VJ948e8caa0f08b9c6bf6330927893942c474b5100': 0.6847798803333932,
                                       'VJ95b4c534d7c903d76ec0340025aa88b81dba3ce4': 0.6847798803333932,
                                       'VJeae6e634f8479e0b6712780d5728f0afca964e64': 0.846235591692864,
                                       'VJeb72539d69ddf8e29
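
The ratio itself can be sketched as the route length divided by the summed straight-line distances between consecutive stops. The code below is an illustration only and may differ from how genet computes it: it assumes stop coordinates in a projected system with metre units (such as `epsg:27700` used above) so that Euclidean distance is meaningful, and the function signature is hypothetical. It mirrors the two special cases described in the list: 0 for an invalid route and `'Division by zero'` when there is no crow-fly distance to divide by.

```python
import math


def route_to_crow_fly_ratio(route_length, stop_coords, route_is_valid=True):
    """Ratio of network route length to crow-fly distance between stops.

    route_length: total length of the links in the network route (metres)
    stop_coords: ordered list of (x, y) stop coordinates (metres, assumed)
    """
    if not route_is_valid:
        return 0
    crow_fly = sum(
        math.dist(a, b) for a, b in zip(stop_coords, stop_coords[1:])
    )
    if crow_fly == 0:
        # e.g. a route with a single stop has no stop-to-stop distance
        return "Division by zero"
    return route_length / crow_fly


stops = [(0, 0), (3, 4), (3, 10)]  # crow-fly distances: 5 and 6
print(route_to_crow_fly_ratio(16.5, stops))
print(route_to_crow_fly_ratio(16.5, stops, route_is_valid=False))  # 0
print(route_to_crow_fly_ratio(16.5, [(0, 0)]))  # Division by zero
```

Under this definition, a ratio far above 1 would suggest a network route that detours heavily between stops.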

The above report relies on many convenience methods that can also be used on their own. For example, you can list all routes with invalid network routes, or check the validity of the schedule, using:

In [6]:
n.invalid_network_routes()

[]

In [7]:
n.schedule.is_valid_schedule()

True

Something that is not included in the validation report (because MATSim doesn't insist on it being satisfied) is strong connectivity of PT. You can call `is_strongly_connected` on a `Schedule` or on the schedule components `Service` and `Route`. The check uses an underlying directed graph of stop connections, which you can access by calling the `graph` method on a schedule-type element; e.g. if `s` is a `genet.Service` object, `s.graph()` will give you this directed graph.

In [8]:
n.schedule.is_strongly_connected()

False

In [9]:
n.schedule.graph().is_directed()

True
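
To see why a schedule can be valid yet not strongly connected, consider a one-way route A -> B -> C with no return journey: C can be reached from A, but A cannot be reached from C. The sketch below checks strong connectivity on a plain list of directed stop-to-stop edges using one forward and one backward traversal. It is a stand-alone illustration with hypothetical function names, not the code genet runs.

```python
def reachable(adjacency, start):
    """Return the set of nodes reachable from start via depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def edges_are_strongly_connected(edges):
    """True if every stop can reach every other stop along directed edges."""
    nodes = {node for edge in edges for node in edge}
    if not nodes:
        return False
    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)
    start = next(iter(nodes))
    # Strongly connected iff start reaches all nodes and all nodes reach start
    return reachable(forward, start) == nodes and reachable(backward, start) == nodes


one_way = [("A", "B"), ("B", "C")]
with_return = one_way + [("C", "B"), ("B", "A")]

print(edges_are_strongly_connected(one_way))      # False
print(edges_are_strongly_connected(with_return))  # True
```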