fix: use ak.merge_union_of_records to generate input data format #1017
@nsmith- this workaround was the best I could come up with for now. This at least lets us deal with changing forms in a dataset in a way that doesn't mess up the […]. I think it keeps the leak fairly contained, but if you can think of a cleaner solution I'll happily accept it.
Why not […]?
Because I can't get it to work properly.
After letting this simmer all day in the back of my head, I think I have a way to get it to do option arrays. I'll give it a try.
@nsmith- ready for a review.
…present computed form
Fixes #1014
This PR may not need to be merged if the current pre-processing scheme already functions for this task.
As it is right now, this PR uses ak.merge_union_of_records to tag fields that are not shared across files in a dataset.
When a file is parsed, if a given key is not available in that file, a buffer is generated as an array of None using an IndexedOptionArray and the expected length of the step in that form key.
The only kind of arrays supported by this workaround at present are flat arrays of booleans. The user may choose the default behavior for the boolean with ak.fill_none to match their needs.
If any other kind of array is encountered, the file is considered not openable as nanoevents and emits errors either in preprocessing or within nanoevents itself.