Avoid dispatch when manipulating df proxies #1129
Comments
I think this is a bug on our end. We work on the proxy without removing the class, which means that we then dispatch on all the vctrs methods while working with the proxy (e.g.

We can't really remove the class for all proxies because atomic vectors can't be shallow-duplicated (even with ALTREP on recent R there is some overhead). We should at least do it with data frames though.

The other solution is to duplicate our internal API with proxy variants that never dispatch. In the meantime I'll look into stripping the class of data frame subclasses in
@earowang Can you please add this method in your package while we figure this out?

#' @export
vec_proxy.tbl_ts <- function(x, ...) {
  new_data_frame(x)
}

This strips all the attributes of your data frame and prevents dispatching again during internal memory manipulations.
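To make the dispatch problem concrete, here is a minimal base R sketch. The generic `toy_op()` and the class `toy_df` are hypothetical stand-ins, not part of vctrs or tsibble, and only the class is stripped here (the `vec_proxy()` method above is described as stripping all attributes). It only illustrates that an object which keeps its subclass is routed to the subclass method, while a bare data frame is not.

```r
# Hypothetical generic standing in for any S3 generic that internal
# code might call while manipulating a proxy
toy_op <- function(x) UseMethod("toy_op")
toy_op.default <- function(x) "default method"
toy_op.toy_df <- function(x) "toy_df method"

# Hypothetical data frame subclass (a stand-in for a class like tbl_ts)
x <- structure(
  data.frame(a = 1:3, b = c("x", "y", "z")),
  class = c("toy_df", "data.frame")
)

# Same columns, but only the "data.frame" class is kept
proxy <- structure(x, class = "data.frame")

toy_op(x)      # routed to toy_op.toy_df
toy_op(proxy)  # no toy_df class, so toy_op.default is used
```

The columns of `proxy` are untouched; only the class that triggers dispatch is gone.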
@earowang, I've also been working on vctrs/dplyr compatibility for rsample and dials. One thing that I have found there is that the
For example, with the current implementation of

# devtools::install_github("tidyverts/tsibble", ref = "vctrs-vec-restore")
library(vctrs)
library(tsibble)
# falls back to tibble b/c of duplicates in index
pedestrian[c(1, 1, 2),]
#> # A tibble: 3 x 5
#> Sensor Date_Time Date Time Count
#> <chr> <dttm> <date> <int> <int>
#> 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630
#> 2 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630
#> 3 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826
# oops
vec_slice(pedestrian, c(1, 1, 2))
#> # A tsibble: 3 x 5 [1h] <Australia/Melbourne>
#> # Key: Sensor [1]
#> Sensor Date_Time Date Time Count
#> <chr> <dttm> <date> <int> <int>
#> 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630
#> 2 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630
#> 3 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826

Created on 2020-05-29 by the reprex package (v0.3.0)

Here are the rsample and dials PRs. I've added a lot of notes that you might find useful:

The general pattern that is emerging would be for you to have a

Then you have a

tsibble_reconstruct <- function(x, to) {
  if (tsibble_reconstructable(x, to)) {
    # reconstruct the tsibble
  } else {
    # "upcast" `x` to a bare tibble
  }
}

Then you could consistently use this helper in the dplyr 1.0.0 methods and

vec_restore.tsibble <- function(x, to, ...) {
  tsibble_reconstruct(x, to)
}

dplyr_reconstruct.tsibble <- function(x, to) {
  tsibble_reconstruct(x, to)
}
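The reconstruct pattern above can be sketched end to end in base R. Everything here is an assumption made for illustration: `toy_ts` is a made-up class whose only invariant is that its index column has no `NA` and no duplicates, and `new_toy_ts()`, `toy_reconstructable()` and `toy_reconstruct()` are hypothetical names, not tsibble functions.

```r
# Hypothetical constructor: a data frame subclass that records which
# column is the index
new_toy_ts <- function(df, index) {
  structure(df, index = index, class = c("toy_ts", "data.frame"))
}

# TRUE when `x` still satisfies the invariants described by `to`
toy_reconstructable <- function(x, to) {
  idx <- attr(to, "index")
  is.data.frame(x) &&
    idx %in% names(x) &&
    !anyNA(x[[idx]]) &&
    anyDuplicated(x[[idx]]) == 0L
}

toy_reconstruct <- function(x, to) {
  if (toy_reconstructable(x, to)) {
    # Invariants hold: rebuild the subclass from the template `to`
    new_toy_ts(x, index = attr(to, "index"))
  } else {
    # Invariants broken: "upcast" to a bare data frame
    attr(x, "index") <- NULL
    class(x) <- "data.frame"
    x
  }
}

bare <- data.frame(index = 1:3, value = c(10, 20, 30))
to <- new_toy_ts(bare, index = "index")

ok <- toy_reconstruct(bare[c(1, 2), ], to)      # unique index: stays toy_ts
dup <- toy_reconstruct(bare[c(1, 1, 2), ], to)  # duplicated index: bare data frame
```

Keeping the check in a single helper means every restore-style method makes the same keep-or-upcast decision.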
Much appreciated for all your responses.
@lionel- yep, adding
@DavisVaughan Actually, a bit of correctness has been sacrificed for performance. My understanding is that

A lot of the time it is called with empty data (when doing
I think an overall goal for 0.4.0 is to make sure restore is only called on complete data.

I passed https://win-builder.r-project.org/incoming_pretest/tsibble_0.9.0_20200531_234022/Windows/00check.log

This is very strange. For some reason it looks like your

Which version of vctrs are you using locally? Note that I sent vctrs 0.3.1 to win-builder last week, and it "remembers" the last sent version, even for other packages. So maybe tsibble passes tests with 0.3.1, but not 0.3.0?

Oh, it would make sense that the proxy thing in vec-rbind does not work on 0.3.0. @DavisVaughan changed the implementation to take the proxy before the loop, rather than inside the loop. In that case, you'll need to wait until vctrs 0.3.1 is on CRAN. I was holding it for r-spatial/sf#1390, but I might send it anyway since it's not about a regression.

Oh yeah, I was using 0.3.1, and thought 0.3.1 was on CRAN. Indeed, it doesn't work with vctrs 0.3.0.

When do you plan to send 0.3.1 to CRAN? I'll have to submit tsibble to fix CRAN check errors with dplyr v1.0.0.

@earowang on CRAN!

I think I need some help here.

I have defined vec_restore() for the tsibble object, which solves the attribute updating issue when calling vec_slice() on a tsibble. But it breaks bind_rows(), which uses vec_rbind() internally.

In the tsibble code, I need to make sure that there's no NA in my index column, and validate_index() does the job. I have no idea when the NA is introduced in vec_restore(). When I debug through my validate_index(), it does not trigger that error because there is no NA.

Created on 2020-05-29 by the reprex package (v0.3.0)

The vec_restore() implementation sits in this branch: https://github.com/tidyverts/tsibble/tree/vctrs-vec-restore