Using complete with a subset of levels silently drops observations #493

Closed
gowerc opened this issue Sep 10, 2018 · 3 comments
Labels
feature a feature request or enhancement help wanted ❤️ we'd love your help! pivoting ♻️ pivot rectangular data to different "shapes" tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day wip work in progress

Comments

@gowerc

gowerc commented Sep 10, 2018

Based on its description, the purpose of complete() is to add missing observations for specific combinations of variable values. I therefore don't think it is intuitive for it to silently drop observations whose values are not among the levels supplied. For example:

library(tidyr)
library(dplyr)

# Convert Species to character, then ask complete() for only two of its three values
iris %>%
    as_tibble() %>%
    mutate(Species = as.character(Species)) %>%
    complete(Species = c("versicolor", "virginica"))

This code silently drops all observations with Species == "setosa".

I would propose either changing the function to use a full_join() to ensure the original data is preserved or at the very least adding a warning/message so the user is aware that data has been lost.

I am happy to make a PR if people agree this is a beneficial change.
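
To make the full_join() idea concrete, here is a minimal user-level sketch. It is not the internals of complete(); the names `df`, `grid` and `kept`, and the extra value "unknown", are made up for illustration. It builds the grid of requested values by hand and joins it to the data, so rows for unrequested values survive and requested values that are absent from the data get a new row of NAs.

```r
library(tidyr)
library(dplyr)

df <- iris %>%
    as_tibble() %>%
    mutate(Species = as.character(Species))

# Hypothetical grid of requested values; "unknown" does not occur in the data
grid <- tibble(Species = c("versicolor", "virginica", "unknown"))

# full_join() keeps unmatched rows from both sides:
# the setosa rows from df and the "unknown" row from grid
kept <- full_join(grid, df, by = "Species")

nrow(kept)            # 150 original rows + 1 new row for "unknown"
unique(kept$Species)  # setosa is still present
```

A left_join() from the grid, by contrast, would keep only the requested values, which is the kind of data loss described above.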

@gowerc gowerc changed the title Using complete with missing levels silently drops observations Using complete with a subset of levels silently drops observations Sep 10, 2018
@hadley hadley added feature a feature request or enhancement pivoting ♻️ pivot rectangular data to different "shapes" help wanted ❤️ we'd love your help! labels Jan 4, 2019
@hadley
Member

hadley commented Jan 4, 2019

I think the right way to handle this is to union() the supplied value with the existing values. A PR would be greatly appreciated 😄
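
The comment above does not spell out an implementation, so the following is only a sketch of the union() idea applied at the call site, not the change that was eventually committed. It assumes that expressions passed to complete() can refer to columns of the data, as in tidyr's documented full_seq(year, 1) style of usage. The names `df` and `completed`, and the extra value "unknown", are illustrative.

```r
library(tidyr)
library(dplyr)

df <- iris %>%
    as_tibble() %>%
    mutate(Species = as.character(Species))

# Combine the requested values with those already in the column,
# so that no existing value is left out of the expansion
completed <- df %>%
    complete(Species = union(c("versicolor", "virginica", "unknown"), Species))

nrow(completed)            # 150 original rows + 1 new row for "unknown"
unique(completed$Species)  # includes setosa as well as "unknown"
```

Because the expansion already contains every existing value, the result should not depend on which kind of join complete() performs afterwards.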

@Ryo-N7
Contributor

Ryo-N7 commented Jan 19, 2019

Going to work on this for tidy-dev-day.
@batpigandme please add a tidy-dev-day label.

@batpigandme batpigandme added the tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day label Jan 19, 2019
@batpigandme
Contributor

@Ryo-N7 All set 👍

Ryo-N7 added a commit to Ryo-N7/tidyr that referenced this issue Jan 19, 2019
…ata is preserved even if certain levels not specified. fixes tidyverse#493
@hadley hadley added the wip work in progress label Feb 28, 2019
@hadley hadley closed this as completed in c119635 Mar 2, 2019
4 participants