Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bring graph compression down to Cython #399

Merged
merged 3 commits into from Apr 21, 2023

Conversation

Jake-Moss
Copy link
Contributor

@Jake-Moss Jake-Moss commented Apr 14, 2023

Currently this yields an approximate 2.7x 3.9x increase in performance (10s down to 3.7 2.55s on a 5600x) when using the Long An model. More testing and proper benchmarking is to come.

I've looked into parallelising this code but if done naively a thread local slink variable, acting as a counter, would provide random access to the compression_.* arrays. I'm not exactly sure how to move forward with parallelising it without some more invasive changes.

Some clean up commits are to come regarding the setting of members which is currently done at the end of the cythonised function.

@Jake-Moss
Copy link
Contributor Author

Locally I have two tests failing, due to this conversion to np.int8 on line 296, as direction is only ever in {-1, 0, 1} should we look to convert the current usage of long to a smaller data type?

FAILED tests/aequilibrae/transit/test_lib_gtfs.py::test_map_match - ValueError: Buffer dtype mismatch, expected 'long' but got 'signed char'
FAILED tests/aequilibrae/transit/test_pattern.py::test_map_match - ValueError: Buffer dtype mismatch, expected 'long' but got 'signed char'

net_data = pd.DataFrame(
{
"distance": net.distance.astype(float),
"direction": net.direction.astype(np.int8),
"a_node": net.a_node.astype(int),
"b_node": net.b_node.astype(int),
"link_id": net.link_id.astype(int),
"is_connector": net.is_connector.astype(int),
"original_id": net.original_id.astype(int),
"closest": net.closest.astype(int),
"free_flow_time": net.free_flow_time.astype(float),
}
)

@pedrocamargo
Copy link
Contributor

It would be interesting to see benchmarking of the overall graph creating time as well, as that is that what matters in the end (even though the proportional gain would look the same given the preponderance of the compression step).

On the parallelization note, I do have a few ideas on how to go about it, but it might be too big of a lift to do it now. Let's discuss it and write them down for later, though.

On types, I am happy to bring the direction to a smaller type early on.

@Jake-Moss
Copy link
Contributor Author

For reference, here's a benchmark of the current and improved graph compression
image
This was done using https://github.com/outerl/aequilibrae_performance_tests with

./benchmark.py --without-project-init -m ./models -p sioux_falls chicago_sketch LongAn Arkansas australia -l aequilibrae_graph_creation -r 3 -i 1 -c 0 --filename graph_creation_after --details after

Currently this yields an approximate 2.7x increase in performance (10s down to 3.7 on a 5600x) when using the Long An
model. More testing and proper benchmarking is to come.

I've looked into parallelising this code but if done naively a thread local `slink` variable, acting as a counter,
would provide random access to the `compression_.*` arrays. I'm not exactly sure how to move forward with parallelising
it without some more invasive changes.

Some clean up commits are to come regarding the setting of members which is currently done at the end of the cythonised
funciton.
Previous only the directed graph had its columns cast to the correct
values. Now that is also applied to the network before any construction.
Temporary arrays have also been adjusted to fit the smaller type.

Removed the `.astype(int)` call from `link_idx` to avoid invalid cast
warnings.

Removed cast back to `int` for the compressed graph

Prefer np.full over np.repeat and .fill, enforce np.int64 as well
@Jake-Moss Jake-Moss marked this pull request as ready for review April 21, 2023 04:43
@pedrocamargo pedrocamargo merged commit 8670b9e into AequilibraE:develop Apr 21, 2023
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants