Use a temporary array to read detector data with pulse selection #220
This aims to speed up reading only selected frames from XTDF detector data, using the LPD/AGIPD/DSSC `.get_array()` method with `pulses=`. This was prompted by @daviddoji's use case reading LPD data.

I believe this is working around a performance issue in HDF5, which appears to have been around for years. It looks like HDF5 doesn't realise that it can copy large blocks of data, and falls back to copying point by point. By using a temporary array, we can persuade it that the source and destination are sufficiently similar that it works more efficiently. Then we copy the selected data to the output array with `numpy.compress()`.

This involves some extra temporary memory use for the intermediate array. It should never be more than the size of a single file, and it should be possible for the operating system to do virtual memory tricks and only allocate memory for the frames we're actually reading, not the gaps between them.
The timings below are for reading LPD parallel gain data with different numbers of pulses selected (e.g. `.get_array('image.data', pulses=np.s_[:1])`). There are 100 pulses per train, but parallel gain mode records all 3 gain stages as separate frames, so we're actually reading 3n of 300 frames in each case. Times are per train; I used 10 or 25 trains to get a better average, and I ran each one a couple of times to ensure data was cached.

Without this change, you can see that reading even a single frame is slow, and the time scales linearly with the number of frames to read (except when we read all of them). With this change, reading a subset of frames per train is much faster.
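For reference, a minimal usage sketch along the lines of the benchmark above, assuming the EXtra-data components API (the proposal and run numbers are placeholders, not the ones used for the timings):

```python
import numpy as np
from extra_data import open_run
from extra_data.components import LPD1M

# Open a run and wrap the LPD detector modules as one virtual detector.
run = open_run(proposal=700000, run=1)
lpd = LPD1M(run)

# Read only the first pulse of each train. With parallel gain data this
# still corresponds to 3 frames (one per gain stage) out of 300 per train.
arr = lpd.get_array('image.data', pulses=np.s_[:1])
print(arr.dims, arr.shape)
```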