Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement drive_id class more fully #364

Merged
merged 17 commits into from
Jul 7, 2021
Merged

Implement drive_id class more fully #364

merged 17 commits into from
Jul 7, 2021

Conversation

jennybc
Copy link
Member

@jennybc jennybc commented Jun 28, 2021

Closes #93

Some aspects of this PR were done in a certain way, in order to generate discussion for a live code review in a group meeting


googledrive has always had drive_id, a lightweight S3 class to mark a string (or character vector) as file ID, instead of a filename.

drive_id pre-dated vctrs and was implemented in a very minimal way.

Overall, it’s been OK, but there are annoying downsides:

  • File IDs are almost meaningless to humans, but they take up lots of valuable
    space when printing dribbles, often at the expense of name, which is
    often more interesting to the user.
  • Any sort of manipulation tends to lose the drive_id class.
  • You can put strings in a drive_id object that are definitely not valid, like
    "".

Here’s what I mean:

drive_find(n_max = 3)
#> # A dribble: 3 x 3
#>   name                        id                                   drive_resource  
#> * <chr>                       <chr>                                <list>          
#> 1 foo_sheet-TEST-drive_publi… 1CEefQCUc5T7B4yrawnNfqwdWEbiDyDs9E5… <named list [36…
#> 2 foo_doc-TEST-drive_publish  1oUrQNg-2lcAieZyCqoQ_vDwYLMVzhN4-oO… <named list [36…
#> 3 imdb_latin1_csv-TEST-drive… 1o_7f6WRXN-CxgijwgfpDiust8A3zKcNI    <named list [39…

(ids <- as_id(month.abb))
#>  [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
#> attr(,"class")
#> [1] "drive_id"
just_two <- ids[c(2, 5)]
just_two
#> [1] "Feb" "May"
class(just_two)
#> [1] "character"

as_id(c("", "lskdjflslsei"))
#> [1] ""             "lskdjflslsei"
#> attr(,"class")
#> [1] "drive_id"

This PR implements drive_id more fully and uses the capabilities in vctrs and pillar to address the above.
Resources consulted:

Another relevant resource is Davis’s notes on
“adding dplyr and vctrs support to your tibble subclass so that it works with the tidyverse”.
I recently updated the technology behind the dribble subclass and, since the id column of a dribble should be a drive_id (but previously was not), these two efforts intersect.

TL;DR here’s how things work now:

The final drive_id truncation strategy ended up being different. There is less truncation than shown here.

dat <- drive_find("jenny", n_max = 5)
dat
#> # A dribble: 5 x 3
#>   name                                         id       drive_resource   
#>   <chr>                                        <drv_id> <list>           
#> 1 DESCRIPTION-TEST-drive_upload-jenny-8fb0b018 19TUFZL… <named list [40]>
#> 2 DESC-TEST-drive_mv-jenny-9db391f6            1YbwaJe… <named list [40]>
#> 3 DESC-TEST-drive_mv-jenny-4be97296            18wab1k… <named list [40]>
#> 4 DESC-TEST-drive_mv-jenny-4be97296            1IH6z0z… <named list [40]>
#> 5 DESC-TEST-drive_mv-jenny-b4c90052            1qQMpEJ… <named list [40]>

(ids <- as_id(month.abb))
#> <drive_id[12]>
#>  [1] Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
just_two <- ids[c(2, 5)]
just_two
#> <drive_id[2]>
#> [1] Feb May
class(just_two)
#> [1] "drive_id"   "vctrs_vctr"

as_id(c("", "lskdjflslsei"))
#> Error: A <drive_id> must match this regular expression: `^[a-zA-Z0-9_-]+$`
#> Invalid input:
#> x '""'

Problems and observations

The pillar::pillar_shaft() method isn’t working how I want.
Current work-around is to always truncate the drive_ids in a tibble.
This is not great, because sometimes you really do need access to the whole id for copy/paste. r-lib/pillar#331

That 👆 was resolved by settling for the truncation normally done on character columns. It's not ideal but it's better than nothing.

Lots of as.character(id) is necessary when incorporating drive_ids into requests, because of:

ids
#> <drive_id[12]>
#>  [1] Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
jsonlite::toJSON(ids)
#> Error: No method asJSON S3 class: vctrs_vctr

That 👆 was resolved by inheriting base class (character) in new_drive_id().

Created on 2021-06-28 by the reprex package (v2.0.0.9000)

R/dribble.R Outdated Show resolved Hide resolved
R/drive_cp.R Outdated Show resolved Hide resolved
Comment on lines 78 to 83
# full <- format(x)
# trunc <- format(paste0(substr(x, 1, 7), cli::symbol$continue))
# pillar::new_pillar_shaft(
# list(full = full, trunc = trunc),
# width = pillar::get_max_extent(full),
# min_width = pillar::get_max_extent(trunc),
# class = "pillar_shaft_drive_id"
# )
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish this worked but it does not quite do so (yet?).

@jennybc jennybc marked this pull request as ready for review June 28, 2021 16:13
R/drive_cp.R Outdated Show resolved Hide resolved
R/drive_id-class.R Outdated Show resolved Hide resolved
R/drive_id-class.R Show resolved Hide resolved
R/drive_id-class.R Outdated Show resolved Hide resolved
@jennybc jennybc merged commit 84d5bfe into master Jul 7, 2021
@jennybc jennybc deleted the embrace-drive-id-class branch July 7, 2021 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply drive_id class to the id variable in a dribble
2 participants