Mutate returns nothing from data tables #1431

Closed
dbuijs opened this issue Sep 30, 2015 · 5 comments
@dbuijs

dbuijs commented Sep 30, 2015

When I pipe a data.table into mutate, it returns nothing and breaks the rest of the chain.

I observed this for the first time this morning, and I'm at a loss to pinpoint the specific change that caused it. I'm inclined to believe it was the data.table update to 1.9.6 on Sept 19, but I can't be sure.

On a fresh Docker image of rocker/ropensci, with data.table v1.9.6 and dplyr v0.4.3, the following is reproducible:

 library(data.table)
 library(dplyr)

 test <- data.table(a = 1:10, b = letters[1:10])

 # These should give the same result, a data table/dataframe with a new column called "c"

 # This gives a blank result, no warnings, no errors
 test %>% mutate(c = "New")

 # This gives the expected output
 test %>% as.data.frame() %>% mutate(c = "New")
@dbuijs
Author

dbuijs commented Sep 30, 2015

Found out from data.table that it's a display issue. The object still exists and you can still access its elements directly, but when you call it from the console, it won't print. Not sure if this is a data.table issue or a dplyr issue.

See the referenced data.table issue above.

@hadley hadley added the feature a feature request or enhancement label Oct 21, 2015
@hadley hadley added this to the 0.5 milestone Oct 21, 2015
@hadley
Member

hadley commented Oct 21, 2015

I think this should be straightforward to fix. Minimal reprex:

library(data.table)
library(dplyr)

mutate(data.table(x = 1), y = 2)

@mattmalin

Note that the data.table object fails to display only the first time it is printed. If the result is assigned, it will display the second time it is printed.

Example console output:

> library(data.table)
> library(dplyr)
> foo <- mutate(data.table(x = 1), y = 2)
> foo
> foo
   x y
1: 1 2

For others waiting for this to be fixed: if you want to see the output of a mutate call, e.g. during exploratory analysis, calling .Last.value will show the last returned value, even if it was a data.table that failed to display:

> library(data.table)
> library(dplyr)
> mutate(data.table(x = 1), y = 2)
> .Last.value
   x y
1: 1 2

@cderv
Contributor

cderv commented Jan 18, 2016

I think it is more than just a display problem.
When you add a column in data.table with := operator, it currently (data.table 1.9.6) displays nothing but it modifies the input by reference.

library(data.table)
DT <- data.table(x = 1)
DT
#>   x
#>1: 1
address(DT)
#> [1] "0x5e67190"
DT[, y := 2] # Displays nothing, but y is added to DT
DT
#>   x y
#>1: 1 2
address(DT) # And it is modified by reference (same address)
#> [1] "0x5e67190"

When you add a column in data.table with mutate, it displays nothing and it modifies nothing.

library(data.table)
library(dplyr)
DT <- data.table(x = 1)
DT
#>   x
#>1: 1
mutate(DT, y = 2) # Displays nothing
DT
#>   x
#>1: 1

Yet, the result has the correct class data.table.

class(mutate(DT, y = 2))
#>[1] "data.table" "data.frame"

And if we force evaluation the data.table way with [], it displays the right result, but DT is not modified by reference.

mutate(DT, y = 2)[]
#>    x y
#>1: 1 2
DT
#>   x
#>1: 1
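
The two workarounds mentioned in this thread (assigning the result and appending []) can be combined. The following is only a minimal sketch, assuming data.table and dplyr are installed; the name res is made up for illustration, and the behaviour may differ between package versions.

```r
library(data.table)
library(dplyr)

DT <- data.table(x = 1)

# Assign the result; mutate() returns a new object rather than modifying DT.
res <- mutate(DT, y = 2)

# A trailing [] forces the result to print, as in the example above.
res[]

# Compare the column names of the original table and the result.
names(DT)
names(res)
```

Assigning the result also sidesteps the question of modification by reference, since later steps work on res rather than relying on DT having changed.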

If we assign, the result is correct, but it is indeed a new object:

DT <- data.table(x = 1)
DT
#>   x
#>1: 1
address(DT)
#>[1] "0x5e20600"
DT <- mutate(DT, y = 2)
DT # displays nothing - see data.table bug fix
DT # displays results
#>   x y
#>1: 1 2
address(DT) # new object so new address
#>[1] "0x3f9bf40"

The behavior differs depending on whether data.table objects are manipulated with the dplyr package or with the data.table package (related to #614?).
I do not know where this comes from; if I am able to investigate, I will try. I understand that data.table does not print in all cases when the := operator is used, as seen in my example. That could be a cause.

Perhaps it is intentional, because modification by reference as in data.table is not how dplyr does things. In that case, I missed that... It is still disturbing when you are used to working with data.table and dplyr together to take advantage of both packages.

@mattmalin

This has not yet been resolved; the issue has moved to the dtplyr (data.table backend of dplyr) issue tracker: tidyverse/dtplyr#11

@lock lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018