docs: add guide on unflattening and grouping #2622

Merged
agoose77 merged 2 commits into main from agoose77/docs-unflatten-group on Aug 8, 2023

Conversation

agoose77
Collaborator

@agoose77 agoose77 commented Aug 8, 2023

This PR adds an initial implementation of the https://awkward-array.org/doc/main/user-guide/how-to-create-unflatten-group.html tutorial page.

It's a hybrid of a user guide and a tutorial, on the premise that we don't have enough resources to write both.

@agoose77 agoose77 requested a review from jpivarski August 8, 2023 14:01
@agoose77 agoose77 temporarily deployed to docs-preview August 8, 2023 14:10 — with GitHub Actions Inactive
Member

@jpivarski jpivarski left a comment

This looks good! The index heading still has a "[todo]" in it:

image

For group-by, I've been thinking that we might want to lean on Arrow even more and use its group by aggregations. Same for SQL-like joins.

Being NumPy-like, we don't intend to implement these features ourselves in Awkward Array, but we can make the best of both available in the same interface by passing Arrow's functionality through. If so, then the documentation on grouping would want to show the "Awkward native" way, involving flattening, run-lengths, and unflattening, as well as the "Arrow Compute" way, involving a direct call to that function.
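
As a rough illustration of the "flattening, run-lengths, and unflattening" idea mentioned above, the sketch below groups values by key in plain Python. It is not Awkward Array's implementation: `run_lengths` and `unflatten` here are hypothetical stand-ins written for this example, loosely mirroring what `ak.run_lengths` and `ak.unflatten` do on one-dimensional data, and the `keys`/`values` data is made up.

```python
from itertools import groupby


def run_lengths(sorted_keys):
    """Return the length of each run of equal consecutive values (hypothetical helper)."""
    return [sum(1 for _ in run) for _, run in groupby(sorted_keys)]


def unflatten(flat, counts):
    """Split a flat list into sublists whose lengths are given by counts (hypothetical helper)."""
    out, start = [], 0
    for n in counts:
        out.append(flat[start:start + n])
        start += n
    return out


# Made-up flat data: one key per value.
keys = [2, 1, 2, 3, 1, 2]
values = [10, 20, 30, 40, 50, 60]

# Step 1: stable argsort by key, so equal keys become adjacent
# while keeping their original relative order.
order = sorted(range(len(keys)), key=keys.__getitem__)
sorted_keys = [keys[i] for i in order]
sorted_values = [values[i] for i in order]

# Step 2: the run lengths of the sorted keys are the group sizes.
counts = run_lengths(sorted_keys)

# Step 3: unflatten the sorted values with those sizes.
grouped = unflatten(sorted_values, counts)

print(counts)   # [2, 3, 1]
print(grouped)  # [[20, 50], [10, 30, 60], [40]]
```

In Awkward Array itself, the same three steps would presumably be expressed with `ak.argsort`, `ak.run_lengths`, and `ak.unflatten`; the "Arrow Compute" route would hand the whole group-by to Arrow instead.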

@agoose77 agoose77 enabled auto-merge (squash) August 8, 2023 14:43
Collaborator Author

agoose77 commented Aug 8, 2023

> For group-by, I've been thinking that we might want to lean on Arrow even more and use its group by aggregations. Same for SQL-like joins.

Agreed. It doesn't make sense for us to re-invent the wheel unless we have to. The downside with Arrow is that it doesn't support all of the backends that we use, but there may be places that specialised Arrow support is worth having.

@agoose77 agoose77 temporarily deployed to docs-preview August 8, 2023 14:51 — with GitHub Actions Inactive
@agoose77 agoose77 disabled auto-merge August 8, 2023 14:53
@agoose77 agoose77 merged commit f1f93fc into main Aug 8, 2023
14 checks passed
@agoose77 agoose77 deleted the agoose77/docs-unflatten-group branch August 8, 2023 14:54

2 participants