# tidyverse/tidyr

# spread does not appear to work with a `value` parameter of type factor #35

Closed
sjackman opened this issue on Oct 6, 2014 · 5 comments

## Comments

### sjackman commented Oct 6, 2014

See this example:

```
data <- data.frame(x = c("a", "a", "b", "b"),
                   y = c("c", "d", "c", "d"),
                   z = c("w", "x", "y", "z"))
spread.factor <- data %>% spread(x, z)
data$z <- as.integer(data$z)
spread.integer <- data %>% spread(x, z)
str(spread.factor)
str(spread.integer)
```
Author

### sjackman commented Oct 6, 2014

In the following example, `spread.integer` has the expected shape (three columns: y, a and b), but `spread.factor` does not. Instead, it has two columns, and the second column is a matrix.

```
> str(spread.integer)
'data.frame': 2 obs. of 3 variables:
 $ y: Factor w/ 2 levels "c","d": 1 2
 $ a: int 1 2
 $ b: int 3 4
> str(spread.factor)
'data.frame': 2 obs. of 2 variables:
 $ y : Factor w/ 2 levels "c","d": 1 2
 $ ordered: factor [1:2, 1:2] w x y z
  ..- attr(*, "levels")= chr "w" "x" "y" "z"
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr "a" "b"
```
Member

### hadley commented Oct 7, 2014

Slightly simpler MRE:

```
library(dplyr)
library(tidyr)

data <- data.frame(x = c("a", "a", "b", "b"),
                   y = c("c", "d", "c", "d"),
                   z = c("w", "x", "y", "z"))

data %>% spread(x, z) %>% str()
data %>% mutate(z = as.integer(z)) %>% spread(x, z) %>% str()
```
Member

### hadley commented Oct 7, 2014

Root cause is that `as.data.frame()` does not work nicely on 2-d factors:

```
x <- factor(letters[1:4])
dim(x) <- c(2, 2)
as.data.frame(x)
```
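One way to sidestep the problem is to split a 2-d factor into one ordinary factor per matrix column before building the data frame. The sketch below is only an illustration in base R: `matrix_factor_to_df()` is a hypothetical helper, not the code from the referenced commits, and the `V1`, `V2`, ... fallback column names are an assumption.

```r
# Hypothetical helper (not tidyr's actual fix): turn a 2-d factor into a
# data frame with one factor column per matrix column, all sharing the
# original levels.
matrix_factor_to_df <- function(x) {
  stopifnot(is.factor(x), length(dim(x)) == 2)
  lev <- levels(x)
  d <- dim(x)
  col_names <- colnames(x)

  # as.integer() drops the dim attribute, so rebuild the code matrix.
  codes <- matrix(as.integer(x), nrow = d[1], ncol = d[2])

  # Rebuild each column as a plain 1-d factor with the full level set.
  cols <- lapply(seq_len(d[2]), function(j) {
    factor(lev[codes[, j]], levels = lev)
  })
  names(cols) <- if (is.null(col_names)) paste0("V", seq_len(d[2])) else col_names

  as.data.frame(cols, stringsAsFactors = FALSE)
}

x <- factor(letters[1:4])
dim(x) <- c(2, 2)
out <- matrix_factor_to_df(x)
str(out)
```

Each resulting column is a regular factor carrying all of the input levels, so the columns stay comparable with one another.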

### hadley added a commit that referenced this issue Oct 7, 2014

Better coercion for factors. #35 (`f236577`)
Member

### hadley commented Oct 7, 2014

Better now, although the output columns are characters. I'm not sure if this is reasonable, or if each column should be a factor with the same levels.
Author

### sjackman commented Oct 7, 2014

Thanks, Hadley. It would be better if the output columns were factors with the same levels as the input.

### hadley closed this in `a37274c` on Aug 24, 2015
