Create Drop Duplicates Method #10207
Comments
Michal Malohlava commented: There is a workaround (CC: [~accountid:557058:89297402-cb5a-4710-9511-20f42b25451a])
Navdeep commented: [~accountid:557058:389d9607-5bd8-4611-8c6a-755fe9295223] Oh, I'm intrigued.
Michal Malohlava commented: Oki, so drop_duplicates() removes rows - at the end it is providing the same functionality as
Lauren DiPerna commented: 1) One workaround is to sort the frame first; duplicated rows then sit next to each other and are easy to check (there is no built-in support for that right now).
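The sort-based workaround can be sketched in plain Python as follows. This is only an illustration of the idea, not H2O code: the helper name `find_duplicates_by_sorting` and the sample rows are hypothetical, and rows are modeled as dictionaries rather than as an H2OFrame.

```python
def find_duplicates_by_sorting(rows, key_columns):
    """Return rows that repeat the previous row's key after sorting."""
    # Sort so that rows with equal keys end up adjacent.
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in key_columns))
    duplicates = []
    # Compare each row with its predecessor in the sorted order.
    for previous, current in zip(ordered, ordered[1:]):
        if all(previous[c] == current[c] for c in key_columns):
            duplicates.append(current)
    return duplicates


# Hypothetical sample data.
rows = [
    {"id": 2, "grp": "B"},
    {"id": 1, "grp": "A"},
    {"id": 2, "grp": "B"},
    {"id": 1, "grp": "C"},
]
dups = find_duplicates_by_sorting(rows, ["id", "grp"])
```

Sorting changes the row order, so this approach finds duplicates but does not by itself say which occurrence came first in the original data.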
Neema Mashayekhi commented: As with pandas.drop_duplicates() or data.table's unique(), the customer would like the ability to: (1) keep either the first occurrence or the last one (fromLast = FALSE or fromLast = TRUE); (2) handle duplicates across multiple keys/columns. h2o.unique() only handles one key. https://rdrr.io/rforge/data.table/man/duplicated.html
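To make the requested semantics concrete, here is a minimal pure-Python sketch of keeping the first or last occurrence across several key columns. The function `drop_duplicate_rows` and its `subset`/`keep` parameters are hypothetical names chosen to mirror the pandas options; this is not an existing H2O method.

```python
def drop_duplicate_rows(rows, subset, keep="first"):
    """Drop rows whose values in `subset` repeat, keeping first or last."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    indexed = list(enumerate(rows))
    # Walking in reverse makes the last occurrence the one seen first.
    if keep == "last":
        indexed.reverse()
    seen = set()
    kept_positions = []
    for position, row in indexed:
        key = tuple(row[c] for c in subset)
        if key not in seen:
            seen.add(key)
            kept_positions.append(position)
    # Restore the original row order for the surviving rows.
    kept_positions.sort()
    return [rows[i] for i in kept_positions]


# Hypothetical sample data.
rows = [
    {"id": 1, "grp": "A", "val": 10},
    {"id": 1, "grp": "A", "val": 20},
    {"id": 1, "grp": "B", "val": 30},
]
first = drop_duplicate_rows(rows, ["id", "grp"], keep="first")
last = drop_duplicate_rows(rows, ["id", "grp"], keep="last")
```

Here `first` keeps the rows with `val` 10 and 30, while `last` keeps the rows with `val` 20 and 30.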
Neema Mashayekhi commented: Consider concatenating multiple columns into a single column: [1], ['A'] → ['1_A']
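A small sketch of the concatenation idea, again in plain Python rather than against an H2OFrame. The helper `composite_key`, the column names, and the `_` separator default are assumptions for illustration.

```python
def composite_key(row, columns, sep="_"):
    """Join the chosen column values into one string key."""
    return sep.join(str(row[c]) for c in columns)


# The example from the comment: 1 and 'A' become '1_A'.
key = composite_key({"num": 1, "letter": "A"}, ["num", "letter"])

# Caveat: values that already contain the separator can collide.
collision_a = composite_key({"num": "1_", "letter": "A"}, ["num", "letter"])
collision_b = composite_key({"num": "1", "letter": "_A"}, ["num", "letter"])
same_key = collision_a == collision_b  # distinct rows, identical key
```

Because of that collision risk, the separator should be a string that cannot occur in any of the key columns; otherwise distinct rows could be treated as duplicates.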
JIRA Issue Migration Info Jira Issue: PUBDEV-3292 Linked PRs from JIRA
Create a method similar to pandas drop_duplicates() for H2OFrame.