Bring graph compression down to Cython #399
Conversation
Locally I have two tests failing, due to this conversion (see `aequilibrae/aequilibrae/transit/map_matching_graph.py`, lines 293 to 305 in 5dc6246).
It would be interesting to see benchmarking of the overall graph creation time as well, as that is what matters in the end (even though the proportional gain would look the same, given the preponderance of the compression step). On the parallelization note, I do have a few ideas on how to go about it, but it might be too big of a lift to do it now. Let's discuss them and write them down for later, though. On types, I am happy to bring the direction to a smaller type early on.
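As a rough sketch of the smaller-type idea (the column name `direction`, the value range and the target dtype `np.int8` are assumptions for illustration only, not taken from the PR):

```python
import numpy as np
import pandas as pd

# Hypothetical network table; "direction" holding -1, 0 or 1 is assumed.
network = pd.DataFrame({"link_id": [1, 2, 3], "direction": [1, -1, 0]})

# Casting the direction column down early means every temporary array
# derived from it can also be allocated at the smaller width.
network["direction"] = network["direction"].astype(np.int8)
print(network.dtypes)
```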
For reference, here's a benchmark of the current and improved graph compression: `./benchmark.py --without-project-init -m ./models -p sioux_falls chicago_sketch LongAn Arkansas australia -l aequilibrae_graph_creation -r 3 -i 1 -c 0 --filename graph_creation_after --details after`
Currently this yields an approximate 2.7x increase in performance (10s down to 3.7s on a 5600x) when using the Long An model. More testing and proper benchmarking are to come. I've looked into parallelising this code, but if done naively, a thread-local `slink` variable, acting as a counter, would provide random access to the `compression_.*` arrays. I'm not exactly sure how to move forward with parallelising it without some more invasive changes. Some clean-up commits are to come regarding the setting of members, which is currently done at the end of the cythonised function.
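To make the parallelisation concern concrete, here is a minimal, purely illustrative sketch of the pattern: a single running counter decides where each compressed link is written, so splitting the loop across threads with thread-local counters would scatter the writes. The names only mirror the `slink` / `compression_.*` naming; the loop body is invented and not the actual compression algorithm.

```python
import numpy as np

def compress_links(chains, n_links):
    """Toy version of the compression loop: each chain of simple links is
    collapsed into one super-link, indexed by a running counter."""
    compression_a_node = np.full(n_links, -1, dtype=np.int64)
    compression_b_node = np.full(n_links, -1, dtype=np.int64)

    slink = 0  # shared counter: the write slot for the next super-link
    for a_node, b_node in chains:
        # The slot written here depends on how many chains were processed
        # before this one, so a per-thread slink would write each result
        # to an essentially arbitrary position in the output arrays.
        compression_a_node[slink] = a_node
        compression_b_node[slink] = b_node
        slink += 1

    return compression_a_node[:slink], compression_b_node[:slink]
```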
- Previously only the directed graph had its columns cast to the correct types; now that is also applied to the network before any construction. Temporary arrays have also been adjusted to fit the smaller type.
- Removed the `.astype(int)` call from `link_idx` to avoid invalid cast warnings.
- Removed the cast back to `int` for the compressed graph.
- Prefer `np.full` over `np.repeat` and `.fill`, and enforce `np.int64` as well (see the sketch below).
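A small illustration of the `np.full` point (the array name and size are made up; only the allocation pattern and the explicit dtype matter):

```python
import numpy as np

n = 5

# Single allocation with the fill value and the dtype fixed up front.
ids = np.full(n, -1, dtype=np.int64)

# The patterns this replaces: allocate-then-fill, or repeating a scalar,
# both of which leave the dtype to NumPy's defaults unless forced.
ids_fill = np.empty(n)
ids_fill.fill(-1)
ids_repeat = np.repeat(-1, n)
```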
Edit: this now yields an approximate 3.9x increase in performance (10s down to 2.55s on a 5600x) when using the Long An model. More testing and proper benchmarking are to come. Some clean-up commits are to come regarding the setting of members, which is currently done at the end of the cythonised function.