Add basic support for Grouping::GtoFEW #3370

Merged 7 commits into main from ok/categories-col on Oct 15, 2022

@oleksiyskononenko oleksiyskononenko commented Oct 14, 2022

It seems that `dt.categories()` is the first datatable function that needs to produce `Grouping::GtoFEW` columns, because the number of categories in a categorical column can be anything between `0` and `nrows - 1`. Currently, datatable doesn't really support `Grouping::GtoFEW`, but it may need it when `dt.categories()` is combined with other f-expressions, or when `dt.categories()` is applied to columns that have different numbers of underlying categories.

In this PR we

  • add some basic support for the `Grouping::GtoFEW` grouping mode;
  • adjust `dt.categories()` to produce `Grouping::GtoFEW` columns, which are promoted to `Grouping::GtoALL` when their numbers of rows are uneven;
  • do minor refactoring in the `dt.alias()` function.
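
As a rough illustration of the promotion rule described above, here is a toy pure-Python model. It is not datatable's implementation: the `Grouping` enum values, the `categories()` helper and the `common_grouping()` function are all hypothetical names made up for this sketch, and "promotion" is modelled only as picking a mode from the column lengths.

```python
from enum import IntEnum


class Grouping(IntEnum):
    # Names mirror the modes mentioned in this PR; the numeric values
    # are an assumption made for this sketch only.
    GtoFEW = 2
    GtoALL = 3


def categories(values):
    """Toy stand-in for dt.categories(): distinct non-missing values.

    Sorting is only there to make the output deterministic.
    """
    return sorted({v for v in values if v is not None})


def common_grouping(columns):
    """Pick a grouping mode for a set of category columns.

    Assumption: columns of equal length can stay GtoFEW, while
    columns of uneven length are promoted to GtoALL.
    """
    lengths = {len(col) for col in columns}
    return Grouping.GtoFEW if len(lengths) <= 1 else Grouping.GtoALL


# Two 4-row columns with different numbers of underlying categories.
a = categories(["x", "y", "x", None])   # 2 categories
b = categories(["p", "q", "r", "p"])    # 3 categories

print(common_grouping([a, a]).name)  # even row counts
print(common_grouping([a, b]).name)  # uneven row counts
```

The point of the sketch is only that the number of output rows depends on the data in each column, so two category columns cannot in general be assumed to line up.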

WIP for #1691

@oleksiyskononenko oleksiyskononenko added the improve Improvement of an existing functionality label Oct 14, 2022
@oleksiyskononenko oleksiyskononenko added this to the Release 1.1.0 milestone Oct 14, 2022
@oleksiyskononenko oleksiyskononenko self-assigned this Oct 14, 2022
@oleksiyskononenko oleksiyskononenko changed the title Add some support for Grouping::GtoFEW Add basic support for Grouping::GtoFEW Oct 14, 2022
@oleksiyskononenko

Merging; the error is not relevant to this PR.

@oleksiyskononenko oleksiyskononenko merged commit 20f1ec8 into main Oct 15, 2022
@oleksiyskononenko oleksiyskononenko deleted the ok/categories-col branch October 15, 2022 07:59
samukweku pushed a commit that referenced this pull request Jan 3, 2023