Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement unflatten function #583

Merged
merged 5 commits into from
Dec 10, 2020
Merged

Conversation

jrueb
Copy link
Contributor

@jrueb jrueb commented Dec 9, 2020

This implements an unflatten function for awkward as described in #555.
As I really needed this function and the issue was marked as "good first issue" I thought I would give it a try.
Would you like to have a unit test for the function? I'm not sure as this is my first pull request, but if needed I can write one.

I noticed one problem with cumsum while writing this. However, this might need to go into a separate issue. Consider

arr = ak.Array(cupy.array([1,2,3]))
ak.nplike.of(arr.layout).cumsum(arr.layout)

will raise a TypeError. However it will work fine if you change cupy to numpy.

Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! I have only small (explicit) corrections.

One question: do you want to allow missing values (None) in counts and pass them through to the output? If so, that's most of what the corrections do: they handle that in more generality. (IndexedOptionArray isn't the only node class that provides option-type, there's also ByteMaskedArray, BitMaskedArray, and UnmaskedArray. You don't have to explicitly check for all of these cases; the to_numpy function with allow_missing=True does that. See the comments.)

If you don't actually want this feature, if you want to require counts to only contain integers, then it's simpler (and I describe that, too).

In the corresponding issue, we talked about adding an axis parameter, but this PR doesn't have to address that. We can add a placeholder and have it raise NotImplementedError for all but the default value.

src/awkward/operations/structure.py Outdated Show resolved Hide resolved
src/awkward/operations/structure.py Outdated Show resolved Hide resolved
src/awkward/operations/structure.py Show resolved Hide resolved
src/awkward/operations/structure.py Show resolved Hide resolved
@jpivarski jpivarski linked an issue Dec 9, 2020 that may be closed by this pull request
@jrueb
Copy link
Contributor Author

jrueb commented Dec 9, 2020

Thanks for this long comment. I'm learning a lot. Will add the changes later.

For the missing values: Yes, I intended to pass them to the output. This is to undo ak.flatten that was called on a masked array.

I did not include an axis parameter, because of the statement "Such a function wouldn't take an axis" in #555. Was this parameter discussed somewhere else? Anyhow, I like the idea of having an axis parameter in parallel to the ones of ak.num and ak.flatten. Ultimately I think it would be nice if unflatten could take axis=None just as flatten does. However for that to be possible probably also num would need to learn to accept None.

@jpivarski
Copy link
Member

I did not include an axis parameter, because of the statement "Such a function wouldn't take an axis" in #555. Was this parameter discussed somewhere else?

Nope, I just forgot! No need to include axis, and I'll resolve the conversations about anything to do with axis.

Improve documentation
Raise errors if counts is valid
Handle all option types
@jrueb jrueb marked this pull request as ready for review December 10, 2020 18:31
@jrueb
Copy link
Contributor Author

jrueb commented Dec 10, 2020

Marked it a moment too early as "ready for review" but now everything should be ready for a review.

@jpivarski
Copy link
Member

I noticed that it didn't have tests, so I added them, and that gave me more things to check. Some of these appeared in the ak.unflatten definition, but a few others were in nplike and to_layout. Unless you have any complaints, I'll merge this when the tests pass.

Thanks!

@jpivarski jpivarski merged commit b6720c3 into scikit-hep:main Dec 10, 2020
@jpivarski
Copy link
Member

I know this was 4 years ago, but thanks for doing this and I'm sorry that I forgot to add you to the all-contributors list!

@all-contributors please add @jrueb for code

Copy link
Contributor

@jpivarski

I couldn't determine any contributions to add, did you specify any contributions?
Please make sure to use valid contribution names.

I've put up a pull request to add @jrueb! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ak.unflatten: an operation like JaggedArray.fromcounts
2 participants