ak.stack, ak.unstack #200

Open
nsmith- opened this issue Apr 2, 2020 · 2 comments
Comments

nsmith- (Contributor) commented Apr 2, 2020

Proposing a new structure operation, akin to pandas' stack and unstack, that pivots a record structure into a jagged array and vice versa. The main difference is that we do not have a non-trivial row index, so the ak.unstack operation would create a tuple RecordArray rather than labeled columns, as is done for structure operations like ak.cross. Similarly, it might make sense to require that ak.stack only operate on a tuple RecordArray. As in pandas, the default axis (level) should probably be -1 for this operation.

Examples:

a = ak.Array([[(1, 2), (1, 3), (2, 3)], [], [(4, 5), (6, 7, 8)]])
assert ak.tolist(a.stack()) == [[[1, 2], [1, 3], [2, 3]], [], [[4, 5], [6, 7, 8]]]
assert ak.tolist(a.stack(dropna=False)) == [[[1, 2, None], [1, 3, None], [2, 3, None]], [], [[4, 5, None], [6, 7, 8]]]

a = ak.Array([[[1, 2], [1, 3], [2, 3]], [], [[4, 5], [6]]])
assert ak.tolist(a.unstack()) == [[(1, 2), (1, 3), (2, 3)], [], [(4, 5), (6, None)]]

a = ak.Array([{'x': (2, 3), 'y': 1}, {'x': (4, 5), 'y': 0}])
assert ak.tolist(a.stack()) == [{'x': [2, 3], 'y': [[0, 1], [2, 3]]}, {'x': [4, 5], 'y': [[0, 1], [2, 3]]}]

# given
a = ak.zip({'x': [[1, 2, 3], [4, 5, 6, 7]], 'y': [[1, 2, 3], [4, 5, 6, 7]]})
# the following sum over the three fields of each combination
b = ak.choose(a, 3).i0 + ak.choose(a, 3).i1 + ak.choose(a, 3).i2
# could be written as
b = ak.sum(ak.stack(ak.choose(a, 3)), -1)
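
The proposed semantics in the examples above can be mimicked on plain Python lists and tuples. This is only a sketch under stated assumptions: `stack_innermost` and `unstack_innermost` are hypothetical helper names, not Awkward Array functions, and they handle only a tuple (or list) level nested inside one level of lists. The last part uses plain numbers rather than the zipped records from the example.

```python
from itertools import combinations


def stack_innermost(rows, dropna=True):
    """Hypothetical helper: turn each tuple into a list of its fields."""
    # Widest tuple anywhere in the input; shorter tuples are padded up to it.
    width = max((len(t) for row in rows for t in row), default=0)
    out = []
    for row in rows:
        new_row = []
        for t in row:
            items = list(t) + [None] * (width - len(t))
            if dropna:
                # Mirror pandas' dropna=True: discard missing entries.
                items = [x for x in items if x is not None]
            new_row.append(items)
        out.append(new_row)
    return out


def unstack_innermost(rows):
    """Hypothetical helper: turn each innermost list into a fixed-width tuple."""
    width = max((len(x) for row in rows for x in row), default=0)
    return [
        [tuple(list(x) + [None] * (width - len(x))) for x in row]
        for row in rows
    ]


tuples = [[(1, 2), (1, 3), (2, 3)], [], [(4, 5), (6, 7, 8)]]
stacked = stack_innermost(tuples)
padded = stack_innermost(tuples, dropna=False)
unstacked = unstack_innermost([[[1, 2], [1, 3], [2, 3]], [], [[4, 5], [6]]])

# Sum over all 3-combinations per row: stack the tuple, then reduce the last axis.
values = [[1, 2, 3], [4, 5, 6, 7]]
triples = [list(combinations(row, 3)) for row in values]
triple_sums = [[sum(items) for items in row] for row in stack_innermost(triples)]
```

Stacking first and reducing afterwards avoids naming each tuple field (i0, i1, i2) explicitly, which is the convenience the last example is after.
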
jpivarski (Member) commented

I'm not sure if you've seen this, but this feature has been asked for:

https://github.com/scikit-hep/awkward-1.0/blob/docs/0198-tutorial-documentation-1/studies/how-to-questions-survey.md

Thanks for making an issue!

@jpivarski jpivarski added the feature New feature or request label Apr 2, 2020
nsmith- (Contributor, Author) commented Apr 3, 2020

Interestingly, ak.concatenate with axis=-1 would be implementable as ak.stack(ak.zip([a, b, c])). For other axes, I think the more general pandas pivot method would need to be implemented: it can map an arbitrary set of column index levels to new row index levels.
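
The zip-then-stack composition can be sketched on plain Python lists. The helpers `zip_innermost` and `stack_tuples` are hypothetical stand-ins for ak.zip and the proposed ak.stack, and the sketch assumes a, b, and c have identical jagged shapes. Under this reading each element contributes a length-1 list, so the result gains a new innermost axis.

```python
def zip_innermost(*arrays):
    """Hypothetical stand-in for ak.zip: pair up elements row by row."""
    return [list(zip(*rows)) for rows in zip(*arrays)]


def stack_tuples(rows):
    """Hypothetical stand-in for the proposed ak.stack: tuple -> list."""
    return [[list(t) for t in row] for row in rows]


a = [[1, 2], [], [3]]
b = [[10, 20], [], [30]]
c = [[100, 200], [], [300]]

via_stack = stack_tuples(zip_innermost(a, b, c))

# Direct version: wrap each element in a length-1 list, then concatenate
# those lists along the innermost axis.
via_concat = [
    [[x] + [y] + [z] for x, y, z in zip(row_a, row_b, row_c)]
    for row_a, row_b, row_c in zip(a, b, c)
]
```

Both routes give the same nested lists here, with empty rows staying empty.
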

@jpivarski jpivarski added this to Idea in Long-term projects Dec 11, 2020
@jpivarski jpivarski moved this from Python ideas to C++ ideas in Long-term projects Dec 15, 2020
@jpivarski jpivarski added this to To-do: hard in Prioritized issues Apr 15, 2022
@ioanaif ioanaif self-assigned this Jan 5, 2023
@jpivarski jpivarski added this to Unprioritized in Finalization Jan 19, 2024
@jpivarski jpivarski moved this from Unprioritized to P6 (lowest) in Finalization Jan 19, 2024
@jpivarski jpivarski moved this from P6 (lowest) to Set aside in Finalization May 7, 2024