Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further Speeding Up Routing Weight Calculations #22

Closed
elbeejay opened this issue Dec 2, 2020 · 0 comments · Fixed by #25
Closed

Further Speeding Up Routing Weight Calculations #22

elbeejay opened this issue Dec 2, 2020 · 0 comments · Fixed by #25
Labels
enhancement New feature or request

Comments

@elbeejay
Copy link
Member

elbeejay commented Dec 2, 2020

From #16, @wrightky said:

So, looking at this code, I see that the main structure of the weight computation hasn't changed. We're still constructing small sub-arrays at each index, doing a few quick operations (max, cleaning, multiplications, summing), and saving the 9 resulting weights in a weight array. Surprised we hadn't tried this sooner given how much better it performs.

It looks like the key reason the runtime scales better in this model isn't anything about how this computation works, it's how many times we do it. Originally, we performed these operations locally once for every particle at every iteration. So, the runtime scaled as Np_tracer * iterations_per_particle. Now, we perform this once for every cell, so it scales with domain size (L-2) * (W-2). I bet if you checked the example cases you benchmarked, the ratio of these values should give roughly the speedup you observed.

One thing I wonder, though, is whether we could obtain even faster runtimes by modifying the structure of this computation itself, by switching from local array operations to global. Whatever is causing this function to take so long must be due to the fact that we're repeatedly constructing many sub-arrays in a loop and operating on them, instead of performing a few big global array operations (inside which we'd be taking advantage of all the fast numpy broadcasting stuff). In principal, if we broke up these operations (max, cleaning, multiplication, summing) into a loop over each of the D8 directions, with each being a global matrix operation, instead of a loop over each cell, we could reduce the amount of overhead repeatedly calling these functions. I don't know exactly how that scales, but it'd be the difference between time(np.max(small array)) * (L-2) * (W-2) and time(np.max(big array)) * 9. Does that scale better? Not sure.

So that is a potential route for further speeding up the routing weight calculation.

@elbeejay elbeejay added the enhancement New feature or request label Dec 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant