Implement equivalent of Pandas qcut() #1680
Comments
In pandas, In terms of API, I'm thinking of a simple method that applies to a column expression and produces a quantile encoding:
@st-pasha Yes. However, I am not sure how integer codes would work in this case. Can you give an example of what you expect the output to look like given the current datatable framework?
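For illustration only, here is a minimal sketch of what integer codes look like in pandas (not datatable), assuming `labels=False` is an acceptable stand-in for the encoding being discussed. The sample values are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])

# labels=False makes qcut return the integer index of the quantile bin
# (0 for the first quartile, 3 for the fourth) instead of interval labels.
codes = pd.qcut(values, q=4, labels=False)

print(codes.tolist())  # [0, 0, 1, 1, 2, 2, 3, 3]
```

Each value is replaced by the index of the quartile it falls into, so no categorical column type is needed to hold the result.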
What about using Aggregator for this purpose?
@oleksiyskononenko Aggregator doesn't have such precise semantics. For example,
Note that in this case the fourth quartile isn't even produced.
@navdeep-G Will integer codes work for you, or should we wait with this issue until we have categorical column type?
@st-pasha I am not sure if integer codes would work for me. I think waiting for categorical types would be best. Is there an issue for this? We can tag it here as a dependency.
Prerequisite: #1691
@st-pasha Yes, that’s because I compress it at the end. This could easily be disabled or enabled via an input parameter.
@navdeep-G @st-pasha Actually, I think we (or at least MLI) should be OK with anything that describes which group a given value belongs to, so integers should be fine. Currently, we're using the group boundaries, i.e.
will be produced.
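To make the relationship between integer codes and group boundaries concrete, here is a hedged sketch using pandas (not datatable). It assumes `retbins=True` output is comparable to the boundaries mentioned above; the data and the name `bounds_per_value` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])

# retbins=True additionally returns the bin edges, so the integer codes
# can be mapped back to the (lower, upper) boundaries of each group.
codes, edges = pd.qcut(values, q=4, labels=False, retbins=True)

# For every value, look up the boundaries of the group it was assigned to.
bounds_per_value = [(float(edges[c]), float(edges[c + 1])) for c in codes]

print(list(edges))
print(bounds_per_value[0], bounds_per_value[-1])
```

With both the codes and the edges available, a consumer can work with either representation.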