Intermittent failure in all googledrive functions #279

Closed
JMKelleher opened this issue Oct 7, 2019 · 14 comments
Comments

@JMKelleher

JMKelleher commented Oct 7, 2019

When using any googledrive function (downloading, uploading, sharing, etc.) I will intermittently receive the following error:

Error in add_id_path(nodes, root_id = root_id, leaf = leaf) :
!anyDuplicated(nodes$id) is not TRUE
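The wording of this message is what base R's `stopifnot()` produces when a condition evaluates to `FALSE`, so it reads like an internal check that every row of `nodes` has a unique `id`. A minimal sketch of that kind of check, using a made-up `nodes` data frame (the real structure inside googledrive is not shown in this thread and is assumed here):

```r
# Toy stand-in for `nodes`; the ids and names below are invented for illustration.
nodes <- data.frame(
  id = c("a1", "b2", "a1"),
  name = c("reports", "data", "reports"),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns the index of the first repeated value, or 0 if none.
first_dup <- anyDuplicated(nodes$id)

# A uniqueness assertion fails as soon as any id appears twice;
# the error message is captured instead of aborting the script.
msg <- tryCatch(
  stopifnot(!anyDuplicated(nodes$id)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this reading is right, the failure means the same file id showed up more than once in the listing that googledrive assembled while resolving the path.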

This occurs anywhere from 25% to 80% of the time, and the call will eventually work after running the function multiple times. It appears to occur more frequently after running several commands in quick succession, e.g. in a function making multiple calls to the same folder. I've attached an example with drive_ls, though it is not particularly informative. For context, this is a shared drive folder owned by me in my home directory containing several other shared folders. No other folders in any directory have the same name.

googledriveExample

Unsure how to provide a reprex in this case as it appears that this issue is not occurring for most people, but would be happy to provide one if given instruction.
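Since the call reportedly succeeds when it is simply run again, one stopgap is to wrap it in a retry loop. The sketch below is a hypothetical helper, not part of googledrive; the name `retry_call`, the default of five attempts, and the growing pause between attempts are all assumptions chosen for illustration.

```r
# Hypothetical helper (not a googledrive function): call f() until it
# succeeds or max_tries attempts have failed.
retry_call <- function(f, max_tries = 5, wait = 1) {
  last_error <- NULL
  for (i in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(last_error))
    # Pause a little longer after each failure, but not after the last one.
    if (i < max_tries) Sys.sleep(wait * i)
  }
  # Every attempt failed: re-signal the last error that was caught.
  stop(last_error)
}

# Example call (requires an authorised googledrive session):
# files <- retry_call(function() googledrive::drive_ls(path = "WeeklyLCAReports"))
```

This only masks the symptom. It does not address whatever causes the duplicated ids in the first place.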

@jennybc
Member

jennybc commented Oct 7, 2019

Sounds related to #277

I am beginning to suspect that something has changed on the backend (Google side) re: the timescale of "eventual consistency".

@JMKelleher
Author

It seems like the issue has actually gotten worse over the past days/weeks. I've tried to run drive_mkdir() a dozen times with no success within a function where it at least occasionally worked last week. Is it possible that this is account-related to some degree?

@jennybc
Member

jennybc commented Oct 9, 2019

What error do you see with drive_mkdir()? Because that sounds unrelated to me.

But my years of wrapping Google APIs definitely confirm that certain APIs seem to go through "troubled periods" that are never acknowledged publicly and often just clear up on their own. I assume it means there is some challenge in the back-end systems that probably causes lots of downstream infelicities, and it doesn't really get identified as a specific problem with individual products.

@JMKelleher
Author

Same error w/ the duplicate nodes. Interestingly, I reran this without the path argument and it was successful in creating the folder in my home directory. I tried it without path again and it failed the second time though. For reference 'WeeklyLCAReports' is a shared drive owned by me.

image

@jennybc
Member

jennybc commented Oct 9, 2019

The inclusion of overwrite = TRUE means we are actually doing quite a bit of path work in the background, FYI. So that could easily be the connection to #277.

@athena-small

I am experiencing what seems like a related issue. Calls to several googledrive functions, including drive_update(), drive_ls(), and drive_trash(), get stuck in I/O-forever-land, requiring a manual abort.

The example below includes four identical calls to drive_put(). The first two fail, the third succeeds, the fourth fails:

image

@athena-small

May or may not be related: For me the errors began appearing immediately after I upgraded macOS on my MacBook Pro to Catalina.

After the OS upgrade, googledrive began requiring reauthorization of credentials to execute operations on Google Drive, multiple times in succession. It also started throwing the "Error in add_id_path()" errors that @JMKelleher and others have noted, as well as getting stuck in I/O land.

Shortly after this, I updated googledrive to v. 1.0.0. This update did not resolve the various errors I was experiencing with googledrive functions.

My guess (it is only a guess): the various errors people are experiencing have something to do with authorizations. Specifically, they may have something to do with the switch under googledrive v 1.0.0 to using gargle to handle Auth functionality.

Update: This morning my code ran fine, without any glitches from googledrive functions. In previous recent days drive_update()'s performance has been hit-or-miss, mostly miss.

@jsstanley

Similar issues I think - #281 (comment)

@dirkseidensticker

I experienced the error earlier today with drive_download(). It turned out that the error occurred only at times when my coworker, with whom I share access to the file I intended to download, had it open. Thus it seemed to us that functions cannot be executed while the backend is expecting transactions. Not sure if this is applicable to the problems described here, but it might be worth keeping in mind while troubleshooting.

@millerjef

I am experiencing a similar issue with drive_ls. The code last worked in November.

df_q <- drive_ls(path = d_phs_q)

I get the following error message:

Error in add_id_path(nodes, root_id = root_id, leaf = leaf) : 
  !anyDuplicated(nodes$id) is not TRUE

Where d_phs_q refers to a Google Drive directory shared with me. I verified that I have access.

I also attempted to use drive_ls on one of my personal Google Drive directories. This worked once and has given me the same error message on each subsequent attempt this evening.

This issue occurs on Windows and Mac. My session_info() on the Windows platform follows:

- Session info -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.6.1 (2019-07-05)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       America/New_York            
 date     2020-01-05                  

- Packages ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ! package      * version    date       lib source        
   anytime      * 0.3.6      2019-08-29 [1] CRAN (R 3.6.1)
   assertthat     0.2.1      2019-03-21 [1] CRAN (R 3.6.0)
   backports      1.1.5      2019-10-02 [1] CRAN (R 3.6.1)
   broom          0.5.3      2019-12-14 [1] CRAN (R 3.6.1)
   callr          3.4.0      2019-12-09 [1] CRAN (R 3.6.1)
   cellranger     1.1.0      2016-07-27 [1] CRAN (R 3.6.1)
   cli            2.0.0      2019-12-09 [1] CRAN (R 3.6.1)
   clipr          0.7.0      2019-07-23 [1] CRAN (R 3.6.1)
   colorspace     1.4-1      2019-03-18 [1] CRAN (R 3.6.0)
   crayon         1.3.4      2017-09-16 [1] CRAN (R 3.6.1)
   curl           4.3        2019-12-02 [1] CRAN (R 3.6.1)
   DBI            1.1.0      2019-12-15 [1] CRAN (R 3.6.1)
   dbplyr         1.4.2      2019-06-17 [1] CRAN (R 3.6.0)
   desc           1.2.0      2018-05-01 [1] CRAN (R 3.6.1)
   devtools     * 2.2.1      2019-09-24 [1] CRAN (R 3.6.1)
   digest         0.6.23     2019-11-23 [1] CRAN (R 3.6.1)
   dplyr        * 0.8.3      2019-07-04 [1] CRAN (R 3.6.1)
   ellipsis       0.3.0      2019-09-20 [1] CRAN (R 3.6.1)
   fansi          0.4.0      2018-10-05 [1] CRAN (R 3.6.1)
   forcats      * 0.4.0      2019-02-17 [1] CRAN (R 3.6.1)
   fs           * 1.3.1      2019-05-06 [1] CRAN (R 3.6.0)
   gargle         0.4.0      2019-10-04 [1] CRAN (R 3.6.1)
   generics       0.0.2      2018-11-29 [1] CRAN (R 3.6.1)
   ggplot2      * 3.2.1      2019-08-10 [1] CRAN (R 3.6.1)
   glue           1.3.1      2019-03-12 [1] CRAN (R 3.6.1)
   googledrive  * 1.0.0      2019-08-19 [1] CRAN (R 3.6.1)
   gtable         0.3.0      2019-03-25 [1] CRAN (R 3.6.0)
   haven          2.2.0      2019-11-08 [1] CRAN (R 3.6.1)
   hms            0.5.2      2019-10-30 [1] CRAN (R 3.6.1)
   httpuv         1.5.2      2019-09-11 [1] CRAN (R 3.6.1)
   httr           1.4.1      2019-08-05 [1] CRAN (R 3.6.1)
   janitor      * 1.2.0      2019-04-21 [1] CRAN (R 3.6.0)
   jsonlite       1.6        2018-12-07 [1] CRAN (R 3.6.1)
   knitr          1.26       2019-11-12 [1] CRAN (R 3.6.1)
   later          1.0.0      2019-10-04 [1] CRAN (R 3.6.1)
   lattice        0.20-38    2018-11-04 [1] CRAN (R 3.6.1)
   lazyeval       0.2.2      2019-03-15 [1] CRAN (R 3.6.1)
   lifecycle      0.1.0      2019-08-01 [1] CRAN (R 3.6.1)
   lubridate    * 1.7.4      2018-04-11 [1] CRAN (R 3.6.1)
   magrittr       1.5        2014-11-22 [1] CRAN (R 3.6.1)
   memoise        1.1.0      2017-04-21 [1] CRAN (R 3.6.1)
   modelr         0.1.5      2019-08-08 [1] CRAN (R 3.6.1)
   munsell        0.5.0      2018-06-12 [1] CRAN (R 3.6.1)
 R nlme           3.1-140    <NA>       [2] <NA>          
   openxlsx     * 4.1.4      2019-12-06 [1] CRAN (R 3.6.1)
   packrat        0.5.0      2018-11-14 [1] CRAN (R 3.6.1)
 P phsAthletics * 0.0.0.9000 2018-12-13 [?] local         
   pillar         1.4.3      2019-12-20 [1] CRAN (R 3.6.2)
   pkgbuild       1.0.6      2019-10-09 [1] CRAN (R 3.6.1)
   pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 3.6.1)
   pkgload        1.0.2      2018-10-29 [1] CRAN (R 3.6.1)
   prettyunits    1.0.2      2015-07-13 [1] CRAN (R 3.6.1)
   processx       3.4.1      2019-07-18 [1] CRAN (R 3.6.1)
   promises       1.1.0      2019-10-04 [1] CRAN (R 3.6.1)
   ps             1.3.0      2018-12-21 [1] CRAN (R 3.6.1)
   purrr        * 0.3.3      2019-10-18 [1] CRAN (R 3.6.1)
   R6             2.4.1      2019-11-12 [1] CRAN (R 3.6.1)
   Rcpp           1.0.3      2019-11-08 [1] CRAN (R 3.6.1)
   readr        * 1.3.1      2018-12-21 [1] CRAN (R 3.6.1)
   readxl         1.3.1      2019-03-13 [1] CRAN (R 3.6.1)
   remotes        2.1.0      2019-06-24 [1] CRAN (R 3.6.1)
   reprex         0.3.0      2019-05-16 [1] CRAN (R 3.6.0)
   rlang          0.4.2      2019-11-23 [1] CRAN (R 3.6.1)
   rprojroot      1.3-2      2018-01-03 [1] CRAN (R 3.6.1)
   rstudioapi     0.10       2019-03-19 [1] CRAN (R 3.6.0)
   rvest          0.3.5      2019-11-08 [1] CRAN (R 3.6.1)
   scales         1.1.0      2019-11-18 [1] CRAN (R 3.6.1)
   sessioninfo    1.1.1      2018-11-05 [1] CRAN (R 3.6.1)
   stringi        1.4.3      2019-03-12 [1] CRAN (R 3.6.0)
   stringr      * 1.4.0      2019-02-10 [1] CRAN (R 3.6.1)
   testthat       2.3.1      2019-12-01 [1] CRAN (R 3.6.1)
   tibble       * 2.1.3      2019-06-06 [1] CRAN (R 3.6.1)
   tidyr        * 1.0.0      2019-09-11 [1] CRAN (R 3.6.1)
   tidyselect     0.2.5      2018-10-11 [1] CRAN (R 3.6.1)
   tidyverse    * 1.3.0      2019-11-21 [1] CRAN (R 3.6.1)
   usethis      * 1.5.1      2019-07-04 [1] CRAN (R 3.6.1)
   vctrs          0.2.1      2019-12-17 [1] CRAN (R 3.6.2)
   withr          2.1.2      2018-03-15 [1] CRAN (R 3.6.1)
   xfun           0.11       2019-11-12 [1] CRAN (R 3.6.1)
   xml2           1.2.2      2019-08-09 [1] CRAN (R 3.6.1)
   zeallot        0.1.0      2018-01-28 [1] CRAN (R 3.6.0)
   zip            2.0.4      2019-09-01 [1] CRAN (R 3.6.1)

[1] C:/Users/c-jeffmil/Documents/R/win-library/3.6
[2] C:/Program Files/R/R-3.6.1/library

 P -- Loaded and on-disk path mismatch.
 R -- Package was removed from disk.

@millerjef

As an update to my message above, the following is a valid workaround, but I needed to traverse the nested directory structure to find the right object id: drive_ls(as_id(id_to_path_object))
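Spelled out, the workaround might look like the sketch below. The wrapper name `list_folder_by_id` and the placeholder id are made up for illustration; only `drive_ls()` and `as_id()` come from googledrive. Whether listing by id avoids the failing check in every case is an assumption based on the report above.

```r
# Sketch only: list a folder by its Drive file id instead of by its path.
# `folder_id` is a placeholder for the id of the folder you want to list.
list_folder_by_id <- function(folder_id) {
  googledrive::drive_ls(googledrive::as_id(folder_id))
}

# Example call (requires an authorised googledrive session):
# files <- list_folder_by_id("your-folder-id-here")
```

The id of a folder is the final segment of its URL when the folder is open in the Drive web interface.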

@millerjef

As I use drive_ls to drill down the directory structure using as_id, as noted above, I am getting another warning (Unknown or uninitialised column: 'email'.) that I don't understand. I am wondering if it reflects something about the directory structure that is contributing to the above issue:

> drive_ls(as_id(my_obj_id))
# A tibble: 15 x 3
   name                                                         id                                                                       drive_resource  
...
Warning messages:
1: Unknown or uninitialised column: 'email'. 
2: Unknown or uninitialised column: 'email'.

@jennybc
Member

jennybc commented Jan 14, 2020

I still have yet to experience this phenomenon or get enough data to truly study it.

But I have formed an untestable hypothesis about the root cause and installed a fix 🤞

Needless to say, please open a new issue if you update to this dev version and still see the phenomenon.

@sudheshshetty

Just in case anyone is still looking for an answer: I was facing the same issue as well. I updated R to the latest version, 3.6.3, and it resolved this issue.
