Bring graph compression down to Cython #399
Conversation
Locally I have two tests failing, due to this conversion (see `aequilibrae/aequilibrae/transit/map_matching_graph.py`, lines 293 to 305 in 5dc6246).
It would be interesting to see benchmarking of the overall graph creation time as well, as that is what matters in the end (even though the proportional gain would look the same, given the preponderance of the compression step). On the parallelization note, I do have a few ideas on how to go about it, but it might be too big of a lift to do it now. Let's discuss them and write them down for later, though. On types, I am happy to bring the direction to a smaller type early on.
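As a rough sketch of the smaller-type idea (the column name `direction`, the value range and the target dtype `np.int8` are assumptions for illustration only, not taken from the PR):

```python
import numpy as np
import pandas as pd

# Hypothetical network table; "direction" holding -1, 0 or 1 is assumed.
network = pd.DataFrame({"link_id": [1, 2, 3], "direction": [1, -1, 0]})

# Casting the direction column down early means every temporary array
# derived from it can also be allocated at the smaller width.
network["direction"] = network["direction"].astype(np.int8)
print(network.dtypes)
```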
For reference, here's a benchmark of the current and improved graph compression: `./benchmark.py --without-project-init -m ./models -p sioux_falls chicago_sketch LongAn Arkansas australia -l aequilibrae_graph_creation -r 3 -i 1 -c 0 --filename graph_creation_after --details after`
Currently this yields an approximate 2.7x increase in performance (10s down to 3.7s on a 5600x) when using the Long An model. More testing and proper benchmarking are to come. I've looked into parallelising this code, but if done naively, a thread-local `slink` variable, acting as a counter, would provide random access to the `compression_.*` arrays. I'm not exactly sure how to move forward with parallelising it without some more invasive changes. Some clean-up commits are to come regarding the setting of members, which is currently done at the end of the cythonised function.
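To make the parallelisation concern concrete, here is a minimal, purely illustrative sketch of the pattern: a single running counter decides where each compressed link is written, so splitting the loop across threads with thread-local counters would scatter the writes. The names only mirror the `slink` / `compression_.*` naming; the loop body is invented and not the actual compression algorithm.

```python
import numpy as np

def compress_links(chains, n_links):
    """Toy version of the compression loop: each chain of simple links is
    collapsed into one super-link, indexed by a running counter."""
    compression_a_node = np.full(n_links, -1, dtype=np.int64)
    compression_b_node = np.full(n_links, -1, dtype=np.int64)

    slink = 0  # shared counter: the write slot for the next super-link
    for a_node, b_node in chains:
        # The slot written here depends on how many chains were processed
        # before this one, so a per-thread slink would write each result
        # to an essentially arbitrary position in the output arrays.
        compression_a_node[slink] = a_node
        compression_b_node[slink] = b_node
        slink += 1

    return compression_a_node[:slink], compression_b_node[:slink]
```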
- Previously only the directed graph had its columns cast to the correct types; now that is also applied to the network before any construction. Temporary arrays have also been adjusted to fit the smaller type.
- Removed the `.astype(int)` call from `link_idx` to avoid invalid cast warnings.
- Removed the cast back to `int` for the compressed graph.
- Prefer `np.full` over `np.repeat` and `.fill`, and enforce `np.int64` as well (see the sketch below).
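A small illustration of the `np.full` point (the array name and size are made up; only the allocation pattern and the explicit dtype matter):

```python
import numpy as np

n = 5

# Single allocation with the fill value and the dtype fixed up front.
ids = np.full(n, -1, dtype=np.int64)

# The patterns this replaces: allocate-then-fill, or repeating a scalar,
# both of which leave the dtype to NumPy's defaults unless forced.
ids_fill = np.empty(n)
ids_fill.fill(-1)
ids_repeat = np.repeat(-1, n)
```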
Edit: this now yields an approximate 3.9x increase in performance (10s down to 2.55s on a 5600x) when using the Long An model. More testing and proper benchmarking are to come. Some clean-up commits are to come regarding the setting of members, which is currently done at the end of the cythonised function.