Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Request all arrays from uproot at once inside dask task #1076

Merged
merged 2 commits into from Apr 15, 2024

Conversation

nsmith-
Copy link
Member

@nsmith- nsmith- commented Apr 12, 2024

So we can do more IO optimizations inside uproot's source (e.g. coalesced reads, more concurrent decompression)

FYI @jpivarski while working on this I found that uproot 5.3.3 broke test_nanoevents_delphes.py and test_nanoevents_treemaker.py.
scikit-hep/uproot5@v5.3.2...v5.3.3 doesn't suggest anything obvious

@nsmith- nsmith- changed the title Request all arrays from uproot at once inside dask task feat: Request all arrays from uproot at once inside dask task Apr 12, 2024
@nsmith-
Copy link
Member Author

nsmith- commented Apr 12, 2024

In the longer term we should really refactor nanoevents/mapping. It is a bit convoluted mostly due to being designed to work with https://github.com/CoffeaTeam/columnservice with globally unique form keys. I think we will find a different implementation of column caching. Then, more of the form key logic can be left to uproot (as it already is implemented for the sake of uproot.dask) and we can simplify the mapping to just implement the DSL for things like counts2offsets, parents, etc.

@jpivarski
Copy link
Contributor

@jpivarski while working on this I found that uproot 5.3.3 broke test_nanoevents_delphes.py and test_nanoevents_treemaker.py.
scikit-hep/uproot5@v5.3.2...v5.3.3 doesn't suggest anything obvious

Does Delphes have any std::bitvector TBranches? Before, they would have been skipped; now we'd be trying to interpret them. Maybe that's going wrong?

@nsmith- nsmith- requested a review from lgray April 15, 2024 13:38
@lgray
Copy link
Collaborator

lgray commented Apr 15, 2024

closes #1077

@lgray lgray merged commit b8d296c into master Apr 15, 2024
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants