Ungridded to ungridded performance #9

Open
duncanwp opened this issue Jan 27, 2017 · 3 comments

@duncanwp
Member

This originated here: https://jira.ceh.ac.uk/browse/JASCIS-305

One way to speed up collocation (and perhaps other operations) is to output the set of values as a ragged array, then perform the kernel as an array operation.

So, for example, output a numpy array whose length is the number of output values, but whose elements are themselves numpy arrays of the values to operate on. The kernel could then be applied to those values as an array-wise numpy operation. I'm not sure how much time this would save, but it could be much quicker in some situations.
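A minimal sketch of the idea, assuming the constraint step has already produced the group of data values for each sample point (the values and the mean kernel here are purely illustrative, not the CIS internals):

```python
import numpy as np

# Hypothetical output of the constraint step: for each output (sample) point,
# the data values that satisfy the constraint. The groups have different
# lengths, so the result is ragged.
groups = [np.array([1.0, 3.0, 8.1]), np.array([2.5]), np.array([4.2, 0.7])]

# The ragged array itself: a numpy array of length n_output whose elements
# are numpy arrays of the values to operate on.
ragged = np.empty(len(groups), dtype=object)
ragged[:] = groups

# One way to apply a simple kernel (here a mean) array-wise rather than via a
# per-sample-point kernel call: flatten the groups and reduce over boundaries.
flat = np.concatenate(groups)
counts = np.array([len(g) for g in groups])
starts = np.concatenate(([0], np.cumsum(counts)[:-1]))
means = np.add.reduceat(flat, starts) / counts
```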

@duncanwp
Member Author

I'm currently thinking that we would be better off converting the sample-points loop to C, through Cython or PyPy (or Numba?), rather than trying to vectorise it, because I don't think we'll get the flexibility we want that way.
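For illustration, a hedged sketch of what moving the per-sample-point loop into compiled code might look like with Numba (the function, its nearest-neighbour kernel and its signature are hypothetical, not the CIS internals):

```python
import numpy as np
from numba import njit

@njit
def nearest_neighbour_collocate(sample_x, sample_y, data_x, data_y, data_vals):
    # For each sample point, find the closest data point by squared Euclidean
    # distance and take its value - a stand-in for the kind of flexible,
    # per-sample-point loop we'd compile rather than vectorise.
    out = np.empty(sample_x.shape[0])
    for i in range(sample_x.shape[0]):
        best = np.inf
        best_j = 0
        for j in range(data_x.shape[0]):
            d = (sample_x[i] - data_x[j]) ** 2 + (sample_y[i] - data_y[j]) ** 2
            if d < best:
                best = d
                best_j = j
        out[i] = data_vals[best_j]
    return out
```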

@duncanwp
Member Author

Profiling an example CloudSat-onto-aircraft collocation shows that the biggest bottleneck is actually the constraint step within the loop. Underneath we use the query_ball_point method to find the nearby points for each sample point, but the query_ball_tree method should be much faster since we can pass it all the sample points in one go. This moves the constraint step outside the sample-point loop.
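A minimal sketch of the two scipy calls, with made-up 2D coordinates standing in for the data and sample points (CIS works on its own point representation, so this only illustrates the shape of the change):

```python
import numpy as np
from scipy.spatial import cKDTree

data_points = np.random.rand(100000, 2)   # e.g. CloudSat data locations
sample_points = np.random.rand(5000, 2)   # e.g. aircraft sample locations

data_tree = cKDTree(data_points)
sample_tree = cKDTree(sample_points)

# Current approach: one query_ball_point call per sample point, inside the loop.
per_point = [data_tree.query_ball_point(p, r=0.01) for p in sample_points]

# Proposed approach: a single query_ball_tree call returns, for every sample
# point at once, the indices of the data points within the radius, moving the
# constraint step outside the sample-point loop.
all_neighbours = sample_tree.query_ball_tree(data_tree, r=0.01)
```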

I've created a branch for this work here: https://github.com/cedadev/cis/tree/optimize_ug_col

@duncanwp
Member Author

duncanwp commented Mar 3, 2017

I've had a bit more of a play with this today. Using query_ball_tree does help (a speed-up of about 50%) but not as much as I'd hoped - the bottleneck just gets moved to the loop over the constrained points.

I've had a look at ways of dealing with the staggered (jagged) array which you end up with. I think a scipy sparse array is probably the way to do this, but how to construct it and then actually use it to calculate the kernel values is going to need a bit more thought.
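As a hedged sketch of how the jagged constraint output could be packed into a scipy sparse matrix and a simple kernel evaluated from it (the neighbour lists, values and mean kernel are illustrative only):

```python
import numpy as np
from scipy import sparse

# Hypothetical constraint output: for each sample point (row), the indices of
# the constrained data points (columns).
neighbours = [[0, 2, 5], [1], [3, 4]]
data_vals = np.array([1.0, 2.5, 3.0, 4.2, 0.7, 8.1])

rows = np.repeat(np.arange(len(neighbours)), [len(n) for n in neighbours])
cols = np.concatenate(neighbours)

# One row per sample point, one column per data point, with the constrained
# values stored as the non-zero entries.
mat = sparse.csr_matrix((data_vals[cols], (rows, cols)),
                        shape=(len(neighbours), len(data_vals)))

# A mean kernel over each row, using only the stored (constrained) entries.
counts = np.diff(mat.indptr)
means = np.asarray(mat.sum(axis=1)).ravel() / counts
```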

The other option is to calculate the whole sparse_distance_matrix in the first place. I've created an implementation of this, but it's very slow once there are many constrained points to calculate distances for. We'll need it when calculating weighted averages anyway, so I'll probably concentrate on making this faster - using Cython for the main loop, I think.
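For reference, a minimal sketch of the scipy call for computing the whole sparse distance matrix up front (the coordinates, sizes and cut-off are arbitrary, not the CIS data):

```python
import numpy as np
from scipy.spatial import cKDTree

sample_tree = cKDTree(np.random.rand(5000, 2))
data_tree = cKDTree(np.random.rand(100000, 2))

# Returns a scipy.sparse.dok_matrix whose (i, j) entry is the distance from
# sample point i to data point j, for every pair within the cut-off. These
# distances are what a distance-weighted average kernel would need.
distances = sample_tree.sparse_distance_matrix(data_tree, max_distance=0.01)
```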
