dcast may incorrectly assign column names when the argument 'drop = F' #85

Open
hzarkoob opened this issue Mar 19, 2018 · 0 comments
hzarkoob commented Mar 19, 2018

When the argument 'drop' is set to FALSE, in some cases dcast incorrectly assigns the column names.

Note 1. I understand that the reshape2 package is retired, but I think this might still be important because, if I am not mistaken, some data manipulation tasks, including those that use formulas, are currently available only through reshape2 and not through the newer tidyr package.

Note 2. If anyone knows an alternative way to accomplish the task shown in the example below, I would appreciate hearing about it.

Example to demonstrate the issue:

a = data.frame(Year = c(2000, 2001, 2000, 2001), Country = c("A", "B", "B", NA), City = c("A1", "B1", NA, "C1"), Cost = c(10, 20, 50, 30))

print(a)

  Year Country City Cost
1 2000       A   A1   10
2 2001       B   B1   20
3 2000       B <NA>   50
4 2001    <NA>   C1   30

Now we apply the dcast function with the argument 'drop = F':

dcast(a, "Year ~ Country + City", aggregate.fun = sum, value.var = "Cost", drop = F)

Output:
  Year A_A1 A_B1 A_C1 B_A1 B_B1 B_C1 NA NA NA NA NA NA
1 2000   10   NA   NA   NA   NA   NA NA 50 NA NA NA NA
2 2001   NA   NA   NA   NA   NA   20 NA NA NA NA 30 NA

Note that for year 2001 the value 20 is shown under B_C1, which is incorrect: in the input that cost belongs to Country B and City B1, so it should appear under B_B1. I think this is just because the column names are mistakenly assigned.

Things look good with the argument 'drop = T'.
dcast(a, "Year ~ Country + City", aggregate.fun = sum, value.var = "Cost", drop = T)

Output:

  Year A_A1 B_B1 B_NA NA_C1
1 2000   10   NA   50    NA
2 2001   NA   20   NA    30
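
To check which label each value belongs to without going through dcast, here is a minimal base-R sketch. It is only a cross-check and not how dcast works internally. The names `label` and `expected` are made up for this illustration, and it assumes that turning NA into the string "NA" in the column label is acceptable.

```r
# Base-R cross-check, independent of reshape2.
a <- data.frame(
  Year    = c(2000, 2001, 2000, 2001),
  Country = c("A", "B", "B", NA),
  City    = c("A1", "B1", NA, "C1"),
  Cost    = c(10, 20, 50, 30)
)

# paste() converts NA to the string "NA", so rows with a missing
# Country or City still get their own label (e.g. "B_NA").
label <- paste(a$Country, a$City, sep = "_")

# Sum Cost within each (Year, label) cell; cells with no rows are NA.
expected <- tapply(a$Cost, list(Year = a$Year, Column = label), sum)
print(expected)

# Look up a single cell by its labels.
expected["2001", "B_B1"]
```

Because the labels here are built directly from each row, a value cannot end up under a label that does not match its Country and City. This only covers combinations that occur in the data, so it corresponds to the 'drop = T' case rather than the full grid.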

hzarkoob changed the title from "dcast incorrectly assigns the column names with the argument 'drop = F'" to "dcast may incorrectly assign column names when the argument 'drop = F'" on Sep 26, 2018