summarize() gives unexpected result - seems to corrupt the data frame #300

Closed
ijlyttle opened this issue Mar 5, 2014 · 6 comments
@ijlyttle
Contributor

@ijlyttle ijlyttle commented Mar 5, 2014

I'll allow for the possibility I am not using summarize() as I should, but...

library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
mtcars_slim <- summarize(mtcars, mpg, cyl)
summary(mtcars_slim) # ok
##       mpg            cyl      
##  Min.   :10.4   Min.   :4.00  
##  1st Qu.:15.4   1st Qu.:4.00  
##  Median :19.2   Median :6.00  
##  Mean   :20.1   Mean   :6.19  
##  3rd Qu.:22.8   3rd Qu.:8.00  
##  Max.   :33.9   Max.   :8.00
str(mtcars_slim) # not ok
## 'data.frame':    1 obs. of  2 variables:
##  $ mpg: num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl: num  6 6 4 6 8 6 8 4 4 6 ...
mtcars_slim # just weird
## Warning: corrupt data frame: columns will be truncated or padded with NAs
##    mpg cyl
## 1 21.0   6
mtcars_slim[3, ] # even weirder
##     mpg cyl
## NA 22.8   4
sessionInfo()
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] dplyr_0.1.2  knitr_1.5.22
## 
## loaded via a namespace (and not attached):
## [1] assertthat_0.1 evaluate_0.5.1 formatR_0.10   Rcpp_0.11.0   
## [5] stringr_0.6.2  tools_3.0.2
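
The symptoms above (str() reporting 1 obs. while each column still prints 32 values) are consistent with a data frame whose row names have length 1 while its columns have length 32. A minimal base R sketch that builds such an object by hand follows; it is an illustration only, and the name `corrupt` is hypothetical. Nothing here is a claim about what dplyr 0.1.2 did internally.

```r
# Assumption: illustration only, not dplyr's internals.
# Build a data frame by hand whose row.names attribute (length 1)
# disagrees with the length of its columns (32).
corrupt <- structure(
  list(mpg = mtcars$mpg, cyl = mtcars$cyl),
  row.names = 1L,
  class = "data.frame"
)

nrow(corrupt)        # row count is taken from the row names
length(corrupt$mpg)  # the column itself still holds every value
```

Because the row count and the column lengths disagree, functions that trust one or the other can give answers that contradict each other.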
@ijlyttle
Contributor Author

@ijlyttle ijlyttle commented Mar 5, 2014

@hs3180
Contributor

@hs3180 hs3180 commented Mar 5, 2014

The summarise() function is designed to aggregate the columns of a data frame, as in the examples from the documentation:

summarise(mtcars, mean(disp))
summarise(group_by(mtcars, cyl), mean(disp))

Your code:

summarise(mtcars, mpg, cyl)

just selects two columns and doesn't aggregate anything.

I guess what you want is:

select(mtcars, mpg, cyl)

You can then call summary() on the result directly:

summary(select(mtcars, mpg, cyl))
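
To make the distinction concrete, here is a small sketch that puts the two verbs side by side. The output column names `mean_mpg` and `mean_cyl` are only illustrative choices, and the row counts assume summarise() is given expressions that each return a single value.

```r
library(dplyr)

# summarise() collapses the (ungrouped) data frame to a single row
# when every expression returns one value.
agg <- summarise(mtcars, mean_mpg = mean(mpg), mean_cyl = mean(cyl))
nrow(agg)   # one row of aggregates

# select() keeps every row and only narrows the set of columns.
slim <- select(mtcars, mpg, cyl)
dim(slim)   # all 32 rows of mtcars, two columns
```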

@ijlyttle
Contributor Author

@ijlyttle ijlyttle commented Mar 5, 2014

Hi hs3180,

Thanks for putting my head on straight.

I'll leave the issue open, for now, to allow the powers that be to determine if what I tried to do should throw an error.

I'll head over to the manipulatr Google group to better understand the philosophical issues.

Thanks again,

Ian

@hadley
Member

@hadley hadley commented Mar 5, 2014

It should definitely throw an error.

@hadley hadley added the bug label Mar 5, 2014
@hadley hadley added this to the v0.2 milestone Mar 17, 2014
@romainfrancois
Member

@romainfrancois romainfrancois commented Mar 24, 2014

It does now, i.e.:

> mtcars_slim <- summarize(mtcars, mpg, cyl)
Error : expecting result of length one, got : 32

@ijlyttle
Contributor Author

@ijlyttle ijlyttle commented Mar 27, 2014

Thanks!

@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018