Performance-improvement: Combine boolean masks #39

Open
Croydon-Brixton opened this issue Mar 3, 2021 · 1 comment
Labels
concerns: GeoGraph kind: enhancement New feature or request kind: performance performance improvement requires: benchmarks Requires benchmarking of code
Projects

Comments

@Croydon-Brixton
Collaborator

If numpy does short-circuit evaluation on these things, it'd be slightly faster to combine boolean masks.

Does anyone know how numpy handles this type of case (below)?

Case: select_from_array[np.logical_or(condition_array1, condition_array2)]
Does it first evaluate both condition_array1 and condition_array2 in full inside the index [ ... ] and then OR the conditions? (In which case it'd probably be slower, because we would calculate the geometry overlaps for shapes which don't agree in class label.)
Or does it calculate the first element of condition_array1 and then short-circuit to decide whether that element of condition_array2 even needs to be calculated? (In which case I think it should be slightly faster.)

Originally posted by @Croydon-Brixton in #28 (comment)

@Croydon-Brixton Croydon-Brixton added concerns: GeoGraph kind: enhancement New feature or request kind: performance performance improvement requires: benchmarks Requires benchmarking of code labels Mar 3, 2021
@Croydon-Brixton Croydon-Brixton added this to Sidelined in GeoGraph Mar 3, 2021
@herbiebradley
Member

I did some tests and it looks like numpy does not short-circuit:

[image: shortc]

This issue seems to confirm it: numpy/numpy#3446
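
A minimal sketch of how this can be checked, and of one way to emulate short-circuiting by hand. The condition functions, the `calls` counter and the `lazy_or` helper are hypothetical names made up for illustration; they are not GeoGraph code. Python evaluates both arguments in full before `np.logical_or` is even called, so the second condition is computed for every element. The manual version only computes it where the first condition has not already decided the result:

```python
import numpy as np

# Count how many elements each (hypothetical) condition function processes.
calls = {"cheap": 0, "expensive": 0}

def cheap_condition(values):
    calls["cheap"] += values.size
    return values > 5

def expensive_condition(values):
    # Stand-in for something costly, e.g. a geometry overlap test.
    calls["expensive"] += values.size
    return values % 2 == 0

values = np.arange(10)

# Eager: both condition arrays are fully built before logical_or runs.
eager_mask = np.logical_or(cheap_condition(values), expensive_condition(values))
eager_calls = dict(calls)

def lazy_or(values, first, second):
    """Element-wise OR that evaluates `second` only where `first` is False."""
    mask = first(values).copy()
    undecided = ~mask
    mask[undecided] = second(values[undecided])
    return mask

calls["cheap"] = 0
calls["expensive"] = 0
lazy_mask = lazy_or(values, cheap_condition, expensive_condition)
lazy_calls = dict(calls)

selected = values[lazy_mask]
```

Whether the manual version is actually faster depends on how expensive the second condition is relative to the extra indexing, so it would still need benchmarking.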

However, there may be performance improvements from switching to np.where, or bigger ones from using numba, as tested here: https://stackoverflow.com/questions/58422690/filtering-a-numpy-array-what-is-the-best-approach
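
As a starting point for the benchmarks this issue asks for, here is a rough sketch comparing three equivalent ways of filtering with a combined mask. The data, the two conditions and the function names are assumptions for illustration only. numba is left out to keep the example dependency-free, and no timings are claimed here since they vary by machine and array size:

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=100_000)

# Two hypothetical precomputed boolean conditions.
cond1 = data < 10
cond2 = data > 90

def filter_with_mask():
    # Plain boolean-mask indexing.
    return data[cond1 | cond2]

def filter_with_where():
    # np.where with a single argument returns a tuple of index arrays.
    return data[np.where(cond1 | cond2)]

def filter_with_flatnonzero():
    # Integer indices of the True entries of the combined mask.
    return data[np.flatnonzero(cond1 | cond2)]

# Total seconds for 20 runs of each variant.
timings = {
    func.__name__: timeit.timeit(func, number=20)
    for func in (filter_with_mask, filter_with_where, filter_with_flatnonzero)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 20 runs")
```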
