So, looking at this code, I see that the main structure of the weight computation hasn't changed. We're still constructing small sub-arrays at each index, doing a few quick operations (max, cleaning, multiplications, summing), and saving the 9 resulting weights in a weight array. Surprised we hadn't tried this sooner given how much better it performs.
It looks like the key reason the runtime scales better in this model isn't anything about how this computation works; it's how many times we do it. Originally, we performed these operations locally once for every particle at every iteration, so the runtime scaled as Np_tracer * iterations_per_particle. Now, we perform this once for every cell, so it scales with the domain size, (L-2) * (W-2). I bet if you checked the example cases you benchmarked, the ratio of these two counts would roughly match the speedup you observed.
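As a back-of-envelope check of that scaling argument, here is a quick sketch. The counts below are hypothetical stand-ins, not values from an actual benchmark:

```python
# Hypothetical example values -- substitute counts from a real benchmark case.
Np_tracer = 2000               # number of tracer particles
iterations_per_particle = 500  # routing iterations per particle
L, W = 200, 100                # domain dimensions in cells

# Old scheme: one weight computation per particle per iteration.
old_cost = Np_tracer * iterations_per_particle
# New scheme: one weight computation per interior cell.
new_cost = (L - 2) * (W - 2)

# The ratio is the predicted speedup, assuming the weight computation
# dominates the runtime in both schemes.
print(old_cost / new_cost)
```

If the measured speedup lands near this ratio, that supports the idea that the win came purely from calling the same computation fewer times.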
One thing I wonder, though, is whether we could obtain even faster runtimes by restructuring this computation itself, switching from local array operations to global ones. What likely makes this function slow is that we repeatedly construct many small sub-arrays in a loop and operate on each of them, instead of performing a few big global array operations (which would take full advantage of fast NumPy broadcasting). In principle, if we broke up these operations (max, cleaning, multiplication, summing) into a loop over the 9 D8 directions, with each step being a global matrix operation, instead of a loop over each cell, we could cut the overhead of repeatedly calling these functions on tiny arrays. I don't know exactly how that scales, but it'd be the difference between time(np.max(small array)) * (L-2) * (W-2) and time(np.max(big array)) * 9. Does that scale better? Not sure.
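To make the idea concrete, here's a minimal sketch of what "loop over the 9 directions, not over the cells" could look like. Everything here is hypothetical: the function name `d8_weights_global`, the `stage` input, and the drop/distance weighting are stand-ins for whatever the real weight computation does; the point is only the access pattern, where each direction is handled with one shifted global-array operation:

```python
import numpy as np

def d8_weights_global(stage):
    """Sketch: compute 9 directional weight planes with global array ops.

    `stage` is a placeholder 2D field; the drop-based weighting below is
    illustrative, not the actual model's formula.
    """
    L, W = stage.shape
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]
    weights = np.zeros((9, L - 2, W - 2))
    interior = stage[1:-1, 1:-1]
    for k, (di, dj) in enumerate(offsets):
        # Shifted view covering the (di, dj) neighbor of every interior
        # cell at once -- one global operation per direction, instead of
        # one 3x3 sub-array per cell.
        neighbor = stage[1 + di : L - 1 + di, 1 + dj : W - 1 + dj]
        drop = np.maximum(interior - neighbor, 0)  # the "max" + "cleaning" step
        dist = np.hypot(di, dj) or 1.0             # avoid /0 for the center cell
        weights[k] = drop / dist
    total = weights.sum(axis=0)
    # Normalize where any weight is nonzero; flat cells stay all-zero.
    np.divide(weights, total, out=weights, where=total > 0)
    return weights

stage = np.random.rand(50, 40)
w = d8_weights_global(stage)
print(w.shape)  # (9, 48, 38)
```

Timing this against the per-cell version on a realistic domain size would answer the scaling question directly: the shifted-slice version does the same arithmetic but amortizes the NumPy call overhead over the whole array rather than paying it (L-2) * (W-2) times.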
So that is a potential route for further speeding up the routing weight calculation.
(Originally posted by @wrightky in #16.)