Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add chunk yielding plugin and tests #769

Merged
merged 9 commits into from Nov 23, 2023
Merged

Conversation

WenzDaniel
Copy link
Collaborator

What is the problem / what does the code in this PR do
Adds new type of plugin which allows compute method to return a generator instead of chunks. This allows to return sub-chunks early, in case the chunk of the new data types becomes too large. This is needed for example when simulating events and computing from a list of photon hits raw waveforms.

@github-actions
Copy link

github-actions bot commented Oct 24, 2023

Coverage Status

coverage: 91.536% (+0.08%) from 91.456%
when pulling 6893bf2 on add_chunk_yielding_plugin
into 72f935c on master.

@WenzDaniel
Copy link
Collaborator Author

Tested processing on a small run 025423 results are equivalent to strax master.

@JelleAalbers JelleAalbers self-assigned this Nov 8, 2023
Copy link
Member

@JelleAalbers JelleAalbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implements a new DownChunkingPlugin base class, for plugins that produce multiple chunks iteratively from a single compute call.

The main use case is plugins that create data that is much more itense in memory/time than the input, e.g. simulating waveforms from photon arrival times. Currently, you instead have to ensure the input is read in in small enough chunks. If memory isn't an issue but you want smaller files on disk, I suppose you could also do that with a modified Saver. But since memory is smaller than the disk, I guess that wouldn't help you much.

I left some comments but I definitely like the idea. This just makes a very small modification to the plugin base class so there should be no impact on other things.

There may be some ways to generalize this in the future, e.g. track the last produced chunk time in the base class so the user doesn't have to, or make a plugin type where compute gets iterators in its arguments too (like the iters in Plugin.iter, but time-synchronized), so it can keep state in local variables rather than instance variables.

(I didn't look at the code style since #762 is likely incoming, but happy to do that if you like.)

strax/plugins/down_chunking_plugin.py Outdated Show resolved Hide resolved
strax/plugins/down_chunking_plugin.py Show resolved Hide resolved
strax/plugins/plugin.py Outdated Show resolved Hide resolved
strax/testutils.py Show resolved Hide resolved
tests/test_context.py Outdated Show resolved Hide resolved
@JelleAalbers JelleAalbers removed their assignment Nov 9, 2023
@dachengx
Copy link
Collaborator

I will not merge master into this now since that will mess up Jelle's review. Let's merge master into this PR later.

@WenzDaniel
Copy link
Collaborator Author

Thank you very much @JelleAalbers for your thorough review. I agree with all your suggestions and I think I adopted all of them. Most of your points raised are originating from the fact that first I modified the strax.Plugin class in #751, before we decided to move it into a separate plugin class. (Except for the organization of our tests here I was simply lazy, but I moved the tests now in a separate file like we did with CutPlugins.)

Copy link
Member

@JelleAalbers JelleAalbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Daniel -- Looks good!

@WenzDaniel WenzDaniel merged commit 21cc96e into master Nov 23, 2023
11 checks passed
@WenzDaniel WenzDaniel deleted the add_chunk_yielding_plugin branch November 23, 2023 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants