New implementation of LoadEventNexus with compression #36594

Open
peterfpeterson opened this issue Dec 29, 2023 · 0 comments
Labels
ORNL Team Issues and pull requests managed by the ORNL development team

peterfpeterson commented Dec 29, 2023

Motivation

It has been frequently observed that LoadEventNexus is one of the slowest steps in any workflow. With the recent addition of disk read/write speed to mantid-profiler, it can be more clearly seen that LoadEventNexus is spending much of its time doing things other than reading from disk (see screenshot in the pull request). Some observations:

  • dd for a ~10GB file can copy to /dev/null at approximately 9Gbps
  • according to mantid-profiler, peak disk read for this file is closer to 15Gbps
  • the loadEvents portion of LoadEventNexus has an overall average throughput of 4.7Gbps
  • profilers (perf and Intel's VTune) show that ~50% of the time in this portion of the code is spent in std::vector::emplace_back creating events and allocating memory
  • many workflows do not require event filtering and would work perfectly well with compressed events from the loaded data
  • other software (in Python and IDL) histograms as it reads the file and demonstrates significantly better performance

The current method when CompressTolerance > 0 is to create all the EventLists with EventType::TOF, then call EventList::compressEvents(). EventList::compressEvents() sorts the events by time-of-flight and then compresses them. While this creates a relatively small EventWorkspace, it still allocates memory for all of the EventType::TOF events (a large temporary allocation), then sorts each EventList serially (which is slow). These shortcomings were the main motivation for creating the LoadEventAndCompress algorithm, which loads the file in chunks that are compressed and accumulated.

Based on conversations with various CIS at ORNL, more than 2/3 of measurements with "large" files could use this method.

Suggested solution

Rather than allocating all of the events and then re-using the existing code for sorting and compressing, create a histogram while reading through the events, then convert the data to an EventList of EventType::WEIGHTED_NOTIME. This can be done by making a separate implementation of the ProcessBankData class which would be selected when a compression tolerance is specified and the file has no periods (see note below). The new ProcessBankCompressed class will:

  • introspect the time-of-flights to determine the full range (taking into account a user-specified reduced range)
  • configure an object that can calculate the bin index for a given time-of-flight - look at EventList::generateCountsHistogram() for details on the calculation. The method may need to be refactored to aid code re-use.
  • configure an object that stores the temporary histogram for each detector-id
    • has a std::vector<float> to store the sum of the time-of-flight values in each bin; the effective time-of-flight is this sum divided by the number of events in the bin
    • has a std::vector<int> to store the number of events in each bin
  • once all of the events have been processed, this temporary histogram will create weighted events via a method that is supplied the EventList to append them to. This should generate events in a similar manner to how the CompressEvents algorithm does (a sketch follows this list):
    • [tof] = [sum of contributing tof] / [number of events] <- convert to double
    • [weight] = [number of events]
    • [errorSquared] = [number of events]
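
To make the accumulation above concrete, here is a minimal sketch of the two phases, using hypothetical stand-in types rather than Mantid classes (SpectrumAccumulator and WeightedEventNoTime below are illustrative names only) and assuming constant-width linear bins whose width equals the compression tolerance:

```cpp
// Illustrative sketch, not Mantid API: accumulate raw events into a fine
// histogram for one spectrum, then emit weighted no-time events following
// the rules in the list above.
#include <cmath>
#include <cstddef>
#include <vector>

struct WeightedEventNoTime {
  double tof;         // effective time-of-flight of the compressed event
  float weight;       // number of raw events in the bin
  float errorSquared; // same as weight, per the rules above
};

class SpectrumAccumulator {
public:
  SpectrumAccumulator(double tofMin, double tofMax, double tolerance)
      : m_tofMin(tofMin), m_binWidth(tolerance),
        m_numBins(static_cast<std::size_t>(std::ceil((tofMax - tofMin) / tolerance))),
        m_tofSum(m_numBins, 0.0f), m_counts(m_numBins, 0) {}

  // Phase 1: add a raw event; the pulse time is discarded (no-time events).
  void addEvent(double tof) {
    if (tof < m_tofMin)
      return;
    const auto bin = static_cast<std::size_t>((tof - m_tofMin) / m_binWidth);
    if (bin >= m_numBins)
      return;
    m_tofSum[bin] += static_cast<float>(tof);
    m_counts[bin] += 1;
  }

  // Phase 2: convert the non-empty bins into weighted events and append them
  // to the supplied container (an EventList in the real code).
  void appendWeightedEvents(std::vector<WeightedEventNoTime> &out) const {
    for (std::size_t i = 0; i < m_numBins; ++i) {
      if (m_counts[i] == 0)
        continue;
      const auto n = static_cast<float>(m_counts[i]);
      out.push_back({static_cast<double>(m_tofSum[i]) / n, n, n});
    }
  }

private:
  double m_tofMin;
  double m_binWidth;
  std::size_t m_numBins;
  std::vector<float> m_tofSum; // sum of contributing time-of-flight per bin
  std::vector<int> m_counts;   // number of events per bin
};
```

In the real implementation the bin lookup would come from the refactored EventList::generateCountsHistogram()/FindLinearBin() logic rather than the hard-coded linear formula shown here.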

By adding a single class, the changes can be localized to that new class plus an option in LoadBankFromDiskTask that selects the correct bank processor. The downside is that event_index and pulse information will be read from disk even though they will not be used.

There will be cases for which this method will not be used:

  • "Small" event files this will be significantly more memory intensive than loading the events and compressing them. It is up to the user to use LoadEventNexus + CompressEvents instead. The cross-over point will be related to when the number of events is equal to the number of bins in the temporary histogram.
  • Files with period data would require a temporary histogram for each period. This will default to the current method.
  • Files with weighted events will default to the current method.

This could be used by even more workflows if FilterBadPulses (default off) were included as an additional parameter in LoadEventNexus. Similarly, it would be more useful if handling of veto pulses were included. Both can be added in later versions by creating a TimeROI in LoadEventNexus and supplying it to the underlying code.

New classes

There will be a number of classes created in an effort to make this more organized and maintainable.

  • ProcessBankCompressed is mostly described above. Additionally, it will provide whatever functionality is necessary to filter events (e.g. based on time-of-flight-range or wall-clock time). It is also responsible for handling the life-cycle of classes it uses to process the individual events.
  • CompressedEventsAccumulator is the concept of something that takes events and updates the fine-bin histogram. It is unlikely to be an actual concrete or abstract class. The idea is that, after being configured, it is used in two phases: (1) events are added and stored in a form that can be used later, and (2) the stored information is converted into actual events that are appended to an EventList.
  • CompressedEventBankAccumulator will contain information about the fine histogram parameters and a collection of CompressedEventSpectrumAccumulators. There will be a single CompressedEventBankAccumulator in each ProcessBankCompressed task.
  • CompressedEventSpectrumAccumulator will accumulate the supplied events for a single spectrum into a fine histogram. Above, this is described as two parallel vectors, a std::vector<int> and a std::vector<float>, together with the information needed by EventList::FindLinearBin(). Measurements should be made to determine whether the fine histogram could instead be stored as a std::map<int, std::pair<int, float>> and still perform well enough across a variety of cases (a sketch of the sparse layout follows this list). If the performance is equal, storing the temporary fine histogram as a map is preferred since it will only contain bins where events exist.
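
For the measurement suggested in the last bullet, here is a sketch of the sparse alternative (again with an illustrative name only): only bins that actually receive events occupy memory, at the cost of a logarithmic lookup per event.

```cpp
// Illustrative sketch, not Mantid API: the per-spectrum fine histogram kept
// as a sparse map of bin index -> (event count, sum of time-of-flight)
// instead of two dense parallel vectors.
#include <map>
#include <utility>

class SparseSpectrumAccumulator {
public:
  SparseSpectrumAccumulator(double tofMin, double binWidth)
      : m_tofMin(tofMin), m_binWidth(binWidth) {}

  void addEvent(double tof) {
    if (tof < m_tofMin)
      return;
    const auto bin = static_cast<int>((tof - m_tofMin) / m_binWidth);
    auto &entry = m_bins[bin];               // value-initialized to {0, 0.f} on first use
    entry.first += 1;                        // number of events in this bin
    entry.second += static_cast<float>(tof); // running sum of time-of-flight
  }

private:
  double m_tofMin;
  double m_binWidth;
  std::map<int, std::pair<int, float>> m_bins;
};
```

Whether the extra indirection per event is acceptable is exactly what the proposed measurement would decide.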

Describe alternatives you've considered

If the individual Event objects had accessors and mutators, then rather than reserving the correct amount of space in the EventList and using std::vector::emplace_back, it would be possible to create a vector of the correct size with the std::vector constructor and then modify the time-of-flight and pulse-time information in place. This technique has not been explored and its benefits are unknown. Creating a fine histogram during load has been demonstrated by other software.
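
A sketch of that allocation pattern with a hypothetical stand-in Event type (the real event classes currently lack the mutators this relies on): the vector is constructed once at its final size and the elements are mutated in place, avoiding the per-event growth of repeated emplace_back calls.

```cpp
// Illustrative sketch, not Mantid API: construct the event vector at its
// final size in one allocation, then fill in the per-event fields.
#include <cstddef>
#include <cstdint>
#include <vector>

struct Event {
  double tof = 0.0;      // time-of-flight
  int64_t pulseTime = 0; // pulse time, e.g. as a nanosecond offset
  void setTof(double t) { tof = t; }
  void setPulseTime(int64_t p) { pulseTime = p; }
};

void fillEvents(std::vector<Event> &events, const std::vector<double> &tofs,
                const std::vector<int64_t> &pulseTimes) {
  events.assign(tofs.size(), Event{}); // single allocation for the whole bank
  for (std::size_t i = 0; i < tofs.size(); ++i) {
    events[i].setTof(tofs[i]);
    events[i].setPulseTime(pulseTimes[i]);
  }
}
```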

Additional context

While much of this information is anecdotal, the performance for a particular "large" event file is easily measured. The numbers at the top of this issue were observed on a laptop loading VULCAN_218092, which is 10 GiB in size.
