# MATSim-specific validation

You can generate a validation report for a genet `Network`. It covers the validity of the network graph, the schedule, and the routing of the schedule's transit services on the network. The report aims to collect checks for problems known to have affected MATSim simulations in the past. It is a plain dictionary with three keys: `graph`, `schedule` and `routing`.

In [1]:
# read sample network
from genet import Network
import os

n = Network('epsg:27700')
path_to_matsim_network = '../tests/test_data/matsim'
n.read_matsim_network(os.path.join(path_to_matsim_network, 'network.xml'))
n.read_matsim_schedule(os.path.join(path_to_matsim_network, 'schedule.xml'))
n.print()

Graph info: Name: Network graph
Type: MultiDiGraph
Number of nodes: 2
Number of edges: 1
Average in degree:   0.5000
Average out degree:   0.5000 
Schedule info: Schedule:
Number of services: 1
Number of unique routes: 1
Number of stops: 2


In [2]:
report = n.generate_validation_report()

2020-12-17 12:21:35,216 - Checking validity of the Network
2020-12-17 12:21:35,218 - Checking validity of the Network graph
2020-12-17 12:21:35,221 - Checking network connectivity for mode: car
2020-12-17 12:21:35,222 - Checking network connectivity for mode: walk
2020-12-17 12:21:35,224 - Checking network connectivity for mode: bike
2020-12-17 12:21:35,226 - Checking validity of the Schedule
2020-12-17 12:21:35,228 - Not all stops reference network link ids.
2020-12-17 12:21:35,231 - Not all stops reference network link ids.
2020-12-17 12:21:35,232 - Not all stops reference network link ids.
2020-12-17 12:21:35,233 - This schedule is not valid
2020-12-17 12:21:35,234 - Not all stops reference network link ids.
2020-12-17 12:21:35,236 - Not all stops reference network link ids.
2020-12-17 12:21:35,238 - Service id=10314 is not valid
2020-12-17 12:21:35,239 - Not all stops reference network link ids.
2020-12-17 12:21:35,240 - Not all stops reference network link ids.
2020-12-17 12:21:35

The `graph` section describes the strongly connected components of the modal subgraphs for the modes on which MATSim agents need to find routes: `car`, plus `walk` and `bike` if you use the `multimodal.contrib`. It also flags links of length 1 km or longer, which can then be inspected separately.

In [3]:
from pprint import pprint
pprint(report['graph'])

{'graph_connectivity': {'bike': {'number_of_connected_subgraphs': 0,
                                 'problem_nodes': {'dead_ends': [],
                                                   'unreachable_node': []}},
                        'car': {'number_of_connected_subgraphs': 2,
                                'problem_nodes': {'dead_ends': ['21667818'],
                                                  'unreachable_node': ['25508485']}},
                        'walk': {'number_of_connected_subgraphs': 2,
                                 'problem_nodes': {'dead_ends': ['21667818'],
                                                   'unreachable_node': ['25508485']}}},
 'links_over_1km_length': []}
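
To pull out only the modes that need attention, the dictionary above can be filtered with plain Python. This is a minimal sketch that assumes the report layout shown in the output. `modes_with_problem_nodes` is a hypothetical helper, not part of genet, and the `graph_report` literal copies the output above so that the snippet runs on its own.

```python
# Copy of the `report['graph']` output above, so the snippet is self-contained.
graph_report = {
    'graph_connectivity': {
        'bike': {'number_of_connected_subgraphs': 0,
                 'problem_nodes': {'dead_ends': [], 'unreachable_node': []}},
        'car': {'number_of_connected_subgraphs': 2,
                'problem_nodes': {'dead_ends': ['21667818'],
                                  'unreachable_node': ['25508485']}},
        'walk': {'number_of_connected_subgraphs': 2,
                 'problem_nodes': {'dead_ends': ['21667818'],
                                   'unreachable_node': ['25508485']}},
    },
    'links_over_1km_length': [],
}


def modes_with_problem_nodes(graph_report):
    """Hypothetical helper: map each mode to its non-empty problem node lists."""
    flagged = {}
    for mode, details in graph_report['graph_connectivity'].items():
        # keep only the problem categories that actually list some nodes
        problems = {kind: nodes
                    for kind, nodes in details['problem_nodes'].items()
                    if nodes}
        if problems:
            flagged[mode] = problems
    return flagged


flagged_modes = modes_with_problem_nodes(graph_report)
print(sorted(flagged_modes))
```

For the sample network this keeps `car` and `walk` and drops `bike`, whose problem node lists are empty.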


The `schedule` section describes the correctness of the schedule on three levels:

- `schedule_level`: An overall look at schedule validity. A `Schedule` is valid if:
    - all of its services are valid
    - its services are uniquely indexed

    The schedule `has_valid_services` if all services within it are deemed valid. Invalid services are
    flagged in `invalid_services`, and the failed stages of schedule validity are flagged in `invalid_stages`.
- `service_level`: A look at the validity of the services within the schedule, indexed by service id. Each
  `Service` is valid if:
    - each of its routes is valid
    - its routes are uniquely indexed

    A service `has_valid_routes` if all routes within it are deemed valid. Invalid routes are
    flagged in `invalid_routes`, and the failed stages of service validity are flagged in `invalid_stages`.
- `route_level`: A look at the validity of each route within each service, indexed by service id and route id
  (or by service id and the index in the `Service.routes` list if the routes are not uniquely indexed). Each
  `Route` is valid if it:
    - has more than one `Stop`
    - has a correctly ordered route (the stops, through their link reference ids, and the links the route refers
      to are in the same order)
    - has correct arrival and departure offsets (each stop has one and they are correctly ordered temporally)
    - does not have self loops (there are no trips such as: Stop A -> Stop A)

    If a route satisfies all of the above, `is_valid_route` is `True`. If not, `invalid_stages` lists the
    conditions the route did not satisfy.

(N.B. The same dictionary can be generated with the `Schedule` object's own `generate_validation_report` method.)
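
Some of the route-level conditions are easy to restate in a few lines of plain Python. The sketch below is not genet's implementation. `failed_route_checks`, its arguments and the names of the failed checks are all hypothetical, offsets are assumed to be `HH:MM:SS` strings, and "correctly ordered temporally" is interpreted as: arrive at a stop no later than you depart from it, and depart no later than you arrive at the next stop. The check on the order of stops and links is left out because it needs the network route.

```python
def to_seconds(offset):
    """Convert an 'HH:MM:SS' offset string to seconds."""
    hours, minutes, seconds = (int(part) for part in offset.split(':'))
    return hours * 3600 + minutes * 60 + seconds


def failed_route_checks(stop_ids, arrival_offsets, departure_offsets):
    """Hypothetical helper: return illustrative names of the checks a route fails."""
    failed = []
    if len(stop_ids) < 2:
        failed.append('more_than_one_stop')
    # a self loop is a trip between a stop and itself: Stop A -> Stop A
    if any(a == b for a, b in zip(stop_ids, stop_ids[1:])):
        failed.append('no_self_loops')
    one_offset_per_stop = (len(arrival_offsets) == len(stop_ids)
                           and len(departure_offsets) == len(stop_ids))
    if not one_offset_per_stop:
        failed.append('valid_offsets')
    else:
        arrivals = [to_seconds(o) for o in arrival_offsets]
        departures = [to_seconds(o) for o in departure_offsets]
        # assumed ordering: arrival <= departure at a stop,
        # and departure <= arrival at the following stop
        ordered = (all(a <= d for a, d in zip(arrivals, departures))
                   and all(d <= a for d, a in zip(departures, arrivals[1:])))
        if not ordered:
            failed.append('valid_offsets')
    return failed


print(failed_route_checks(['A', 'A'],
                          ['00:00:00', '00:05:00'],
                          ['00:00:00', '00:06:00']))
```

An empty list means the route passed every check in the sketch.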


In [4]:
pprint(report['schedule'])

{'route_level': {'10314': {'VJbd8660f05fe6f744e58a66ae12bd66acbca88b98': {'invalid_stages': ['not_has_correctly_ordered_route'],
                                                                          'is_valid_route': False}}},
 'schedule_level': {'has_valid_services': False,
                    'invalid_services': ['10314'],
                    'invalid_stages': ['not_has_valid_services'],
                    'is_valid_schedule': False},
 'service_level': {'10314': {'has_valid_routes': False,
                             'invalid_routes': ['VJbd8660f05fe6f744e58a66ae12bd66acbca88b98'],
                             'invalid_stages': ['not_has_valid_routes'],
                             'is_valid_service': False}}}
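
Because the report is a nested dictionary, it can be flattened into one row per invalid route. This is a minimal sketch that assumes the layout shown in the output above. `invalid_routes_with_reasons` is a hypothetical helper, not a genet method, and `schedule_report` copies only the `route_level` part of the output so that the snippet runs on its own.

```python
# Copy of the `route_level` part of `report['schedule']` shown above.
schedule_report = {
    'route_level': {
        '10314': {
            'VJbd8660f05fe6f744e58a66ae12bd66acbca88b98': {
                'invalid_stages': ['not_has_correctly_ordered_route'],
                'is_valid_route': False,
            }
        }
    }
}


def invalid_routes_with_reasons(schedule_report):
    """Hypothetical helper: list (service id, route id, invalid stages) per invalid route."""
    rows = []
    for service_id, routes in schedule_report['route_level'].items():
        for route_id, result in routes.items():
            if not result['is_valid_route']:
                rows.append((service_id, route_id, result['invalid_stages']))
    return rows


for service_id, route_id, stages in invalid_routes_with_reasons(schedule_report):
    print(service_id, route_id, stages)
```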


Finally, the `routing` section describes the routing of the transit schedule services onto the network graph:

- `services_have_routes_in_the_graph`: whether all routes have network routes and the links they refer to exist in the graph, are connected (the to node of the preceding link is the from node of the next link in the chain), and the `modes` saved on the link data accept the mode of the route.
- `service_routes_with_invalid_network_route`: flags the routes that do not satisfy the above.
- `route_to_crow_fly_ratio`: gives the ratio of the length of the route to the crow-fly distance between the stops along the route. If the route is invalid, the ratio is 0. If the route has only one stop, the result is `'Division by zero'`.
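
The conditions behind `services_have_routes_in_the_graph` can be sketched for a single route as follows. This is an illustration under assumptions, not genet's code. `network_route_problems` is a hypothetical function, links are assumed to be stored as a plain dictionary of `from`, `to` and `modes` values, and the toy link data is invented.

```python
def network_route_problems(route_mode, network_route, links):
    """Hypothetical check, not genet's implementation.

    `links` maps link id -> {'from': node id, 'to': node id, 'modes': set of modes}.
    Returns a list of human-readable problems; an empty list means the route passed.
    """
    if not network_route:
        return ['route has no network route']
    missing = [link_id for link_id in network_route if link_id not in links]
    if missing:
        return ['links not in the graph: {}'.format(missing)]
    problems = []
    # consecutive links must chain: the `to` node of one is the `from` node of the next
    for previous, following in zip(network_route, network_route[1:]):
        if links[previous]['to'] != links[following]['from']:
            problems.append('links {} and {} are not connected'.format(previous, following))
    # every link must accept the mode of the route
    for link_id in network_route:
        if route_mode not in links[link_id]['modes']:
            problems.append('link {} does not accept mode {}'.format(link_id, route_mode))
    return problems


# Toy data, invented for illustration
toy_links = {
    'l1': {'from': 'n1', 'to': 'n2', 'modes': {'car', 'bus'}},
    'l2': {'from': 'n2', 'to': 'n3', 'modes': {'car'}},
    'l3': {'from': 'n4', 'to': 'n5', 'modes': {'car', 'bus'}},
}
print(network_route_problems('bus', ['l1', 'l2'], toy_links))
```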

In [5]:
pprint(report['routing'])

{'route_to_crow_fly_ratio': {'10314': {'VJbd8660f05fe6f744e58a66ae12bd66acbca88b98': 'Division '
                                                                                     'by '
                                                                                     'zero'}},
 'service_routes_with_invalid_network_route': [('10314',
                                                'VJbd8660f05fe6f744e58a66ae12bd66acbca88b98')],
 'services_have_routes_in_the_graph': False}
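
The `route_to_crow_fly_ratio` entry can be read as a simple quotient. The sketch below shows one way such a ratio could be computed. It is an assumption about the calculation, not genet's code. It assumes stop coordinates in a projected coordinate system measured in metres (such as `epsg:27700`, used for the sample network), so that straight-line distance is a reasonable crow-fly measure, and it sums the distances between consecutive stops. The function name, its arguments and the numbers are hypothetical.

```python
import math


def crow_fly_ratio(route_length, stop_coordinates):
    """Hypothetical sketch: route length divided by summed stop-to-stop distances.

    `route_length` is the total length of the route's links in metres.
    `stop_coordinates` is an ordered list of (x, y) tuples in metres.
    """
    crow_fly_distance = sum(
        math.hypot(bx - ax, by - ay)
        for (ax, ay), (bx, by) in zip(stop_coordinates, stop_coordinates[1:])
    )
    if crow_fly_distance == 0:
        # e.g. a single stop: there are no stop pairs to measure
        return 'Division by zero'
    return route_length / crow_fly_distance


# Invented numbers: the stops are 50 m and 100 m apart, the route is 300 m long
print(crow_fly_ratio(300.0, [(0, 0), (30, 40), (30, 140)]))
print(crow_fly_ratio(300.0, [(0, 0)]))
```

In this sketch a ratio close to 1 means the network route follows the stops almost directly, and a route length of 0 gives a ratio of 0.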


The report relies on many convenience methods that can also be used on their own. For example, you can list all routes with an invalid network route using:

In [6]:
n.invalid_network_routes()

2020-12-17 12:21:35,288 - Some link ids in Route: VJbd8660f05fe6f744e58a66ae12bd66acbca88b98 don't accept the route's mode: bus


[('10314', 'VJbd8660f05fe6f744e58a66ae12bd66acbca88b98')]

In [7]:
n.schedule.is_valid_schedule()

2020-12-17 12:21:35,299 - Not all stops reference network link ids.


False

Something that is not included in the validation report (because MATSim doesn't insist on it being satisfied) is strong connectivity of public transport (PT). You can call `is_strongly_connected` on a `Schedule` or on the schedule components `Service` and `Route`. The check uses an underlying directed graph of stop connections, which you can access by calling the `graph` method on a schedule-type element. For example, if `s` is a `genet.Service` object, `s.graph()` returns this directed graph.

In [8]:
n.schedule.is_strongly_connected()

False

In [9]:
n.schedule.graph().is_directed()

True
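
To make the idea of strong connectivity concrete: a directed graph is strongly connected when every node can reach every other node. The sketch below tests this in plain Python on a list of stop-to-stop edges. It does not use genet, `stops_strongly_connected` is a hypothetical function, and the stop ids are invented. It relies on the fact that a graph is strongly connected exactly when one start node reaches all nodes both in the graph and in the graph with all edges reversed.

```python
def reachable_from(start, edges):
    """Return the set of nodes reachable from `start` along directed `edges`."""
    adjacency = {}
    for origin, destination in edges:
        adjacency.setdefault(origin, []).append(destination)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def stops_strongly_connected(stops, edges):
    """Hypothetical sketch: True if every stop can reach every other stop.

    Assumes every edge endpoint is listed in `stops`.
    An empty set of stops is treated as not connected in this sketch.
    """
    stops = set(stops)
    if not stops:
        return False
    start = next(iter(stops))
    reversed_edges = [(destination, origin) for origin, destination in edges]
    return (reachable_from(start, edges) == stops
            and reachable_from(start, reversed_edges) == stops)


# A one-way trip A -> B cannot get back to A
print(stops_strongly_connected(['A', 'B'], [('A', 'B')]))
# A loop A -> B -> C -> A lets every stop reach every other stop
print(stops_strongly_connected(['A', 'B', 'C'], [('A', 'B'), ('B', 'C'), ('C', 'A')]))
```

A schedule with a single one-directional route, like the one-way trip in the first call, is not strongly connected.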