Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in googledrive::drive_auth(): scheduled jobs accessing googlesheets are failing on Rstudio-server #425

Closed
naveenbussari opened this issue Apr 18, 2023 · 17 comments

Comments

@naveenbussari
Copy link

Hi @jennybc ,
continuation from #424

I am using rstudio-server, community version, installed on ubuntu 20.04.
I am running an R script which reads an excel file. I have done the authentication and tokens are generated. I am using googledrive and googlesheets4 libraries.

`library(gargle)
library(cronR)
library(googlesheets4)
library(googledrive)

options(gargle_oath_cache = "~/.cache/gargle",
        gargle_oauth_email = "bussari.xxxxxx.com",
        use_oob = TRUE)

googledrive::drive_auth(email = "bussari.xxxxxx.com",
                        cache = "~/.cache/gargle",
                        use_oob = TRUE)

googlesheets4::gs4_auth(email = "bussari.xxxxxx.com",
                        cache = "~/.cache/gargle",
                        use_oob = TRUE)

x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')

x
`

We installed the latest version of gargle(1.4.0).
This time , jobs that access googlesheets are running fine from console/terminal but failing when we schedule them from rstudio-server
image
image

logs of the scheduled job

`Attaching package: ‘googlesheets4’
The following object is masked from ‘package:gargle’:
    request_make
Attaching package: ‘googledrive’
The following objects are masked from ‘package:googlesheets4’:
    request_generate, request_make
The following object is masked from ‘package:gargle’:
    request_make
Error in `googledrive::drive_auth()`:
! Can't get Google credentials.
ℹ Are you running googledrive in a non-interactive session? Consider:
• Call `drive_deauth()` to prevent the attempt to get credentials.
• Call `drive_auth()` directly with all necessary specifics.
ℹ See gargle's "Non-interactive auth" vignette for more details:
ℹ <https://gargle.r-lib.org/articles/non-interactive-auth.html>
Backtrace:
    ▆
 1. └─googledrive::drive_auth(...)
 2.   └─googledrive:::drive_abort(...)
 3.     └─cli::cli_abort(message = message, ..., .envir = .envir)
 4.       └─rlang::abort(...)
Execution halted
`

Thanks in advance.

@jennybc
Copy link
Member

jennybc commented Apr 18, 2023

options(gargle_oath_cache = "~/.cache/gargle",
        gargle_oauth_email = "bussari.xxxxxx.com",
        use_oob = TRUE)

Looks like there's a typo in the option name.

BAD, NO: gargle_oath_cache
GOOD, YES: gargle_oauth_cache

More observations:

  • You can probably delete use_oob from all of this code.
  • It doesn't make sense to set the cache path, email, and use_oob as global options AND then set them in explicit calls to auth functions.

I suspect the code should look more like this (untested, obviously):

library(googlesheets4)
library(googledrive)

options(gargle_oath_cache = "~/.cache/gargle", gargle_oauth_email = "bussari.xxxxxx.com")

x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')

@naveenbussari
Copy link
Author

naveenbussari commented Apr 19, 2023

`
library(googlesheets4)
library(googledrive)

options(gargle_oauth_cache = "~/.cache/gargle",gargle_oauth_email = "bussari.xxxxxx.com")

x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')

x
`

When I run the above code console/terminal it works fine. But when I schedule it, the logs are:

`
Attaching package: ‘googledrive’
The following objects are masked from ‘package:googlesheets4’:
    request_generate, request_make
Error in `gs4_auth()`:
! Can't get Google credentials.
ℹ Are you running googlesheets4 in a non-interactive session? Consider:
• Call `gs4_deauth()` to prevent the attempt to get credentials.
• Call `gs4_auth()` directly with all necessary specifics.
ℹ See gargle's "Non-interactive auth" vignette for more details:
ℹ <https://gargle.r-lib.org/articles/non-interactive-auth.html>
Backtrace:
     ▆
  1. └─googlesheets4::read_sheet("https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0")
  2.   └─googlesheets4:::get_cells(...)
  3.     └─googlesheets4::gs4_get(ssid)
  4.       └─googlesheets4:::gs4_get_impl_(as_sheets_id(ss))
  5.         └─googlesheets4::request_generate(...)
  6.           ├─gargle::request_build(...)
  7.           └─googlesheets4::gs4_token()
  8.             └─googlesheets4::gs4_auth()
  9.               └─googlesheets4:::gs4_abort(...)
 10.                 └─cli::cli_abort(...)
 11.                   └─rlang::abort(...)
Execution halted
`

if I schedule my previous code

`
library(gargle)
library(cronR)
library(googlesheets4)
library(googledrive)

options(gargle_oauth_cache = "~/.cache/gargle",gargle_oauth_email = "bussari.xxxxxx.com")

drive_auth(email = "bussari.xxxxxx.com",cache = "~/.cache/gargle")
gs4_auth(email = "bussari.xxxxxx.com",cache = "~/.cache/gargle")

x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')

x

`

the logs are

`
Attaching package: ‘googlesheets4’
The following object is masked from ‘package:gargle’:
    request_make
Attaching package: ‘googledrive’
The following objects are masked from ‘package:googlesheets4’:
    request_generate, request_make
The following object is masked from ‘package:gargle’:
    request_make
Error in `drive_auth()`:
! Can't get Google credentials.
ℹ Are you running googledrive in a non-interactive session? Consider:
• Call `drive_deauth()` to prevent the attempt to get credentials.
• Call `drive_auth()` directly with all necessary specifics.
ℹ See gargle's "Non-interactive auth" vignette for more details:
ℹ <https://gargle.r-lib.org/articles/non-interactive-auth.html>
Backtrace:
    ▆
 1. └─googledrive::drive_auth(email = "bussari.xxxxxx.com", cache = "~/.cache/gargle")
 2.   └─googledrive:::drive_abort(...)
 3.     └─cli::cli_abort(message = message, ..., .envir = .envir)
 4.       └─rlang::abort(...)
Execution halted
`

I have another doubt. How was it able to run in console/terminal even with incorrect spelling of function name?

@jennybc
Copy link
Member

jennybc commented Apr 19, 2023

How was it able to run in console/terminal even with incorrect spelling of function name?

It was a misspelled option name, not function name. You can set any options you want, but if the name is misspelled, it just won't have the effect you intend.

I think you should add some print statements to the script to confirm the existence of the cache path and list its contents. Is the cron job's idea of ~/ (home directory) the same as yours?

@naveenbussari
Copy link
Author

naveenbussari commented Apr 19, 2023

This time I tried giving full path to gargle cache and also few print statements along with some info.

code:

`
library(gargle)
library(cronR)
library(googlesheets4)
library(googledrive)

gargle::gargle_oauth_sitrep()
drive_user()
gargle::gargle_oauth_cache()
drive_auth()
print("current working directory is " )
getwd()
print("before options gargle")
options(gargle_oauth_cache = "/home/bussari.naveen/.cache/gargle",gargle_oauth_email = "bussari.xxxxxx.com")

list.files("~/.cache/gargle/")
print("before drive_oauth")
drive_auth(email = "bussari.xxxxxx.com",cache = "/home/bussari.naveen/.cache/gargle")
print("before drive_oauth")
gs4_auth(email = "bussari.xxxxxx.com",cache = "/home/bussari.naveen/.cache/gargle")

x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')

x

`

logs when run in console:( this time it threw some warning kind of stufff. Hopefully this leads to something)

`
> library(gargle)
> library(cronR)
> library(googlesheets4)
> library(googledrive)
> 
> gargle::gargle_oauth_sitrep()
ℹ Taking cache location from the `"gargle_oauth_cache"` option.
2 tokens found in this gargle OAuth cache:
~/.cache/gargle

email                          app             scopes          hash...   
______________________________ _______________ _______________ __________
bussari.xxxxxx.com tidyverse-erato ...spreadsheets 54a075e...
bussari.xxxxxx.com tidyverse-erato ...drive        928dfaa...
> drive_user()
attempt to access internal gargle data from: googledrive
Logged in as:
• displayName: Bussari Naveen
• emailAddress: bussari.xxxxxx.com
> gargle::gargle_oauth_cache()
[1] "~/.cache/gargle"
> drive_auth()
> print("current working directory is " )
[1] "current working directory is "
> getwd()
[1] "/home/bussari.naveen"
> print("before options gargle")
[1] "before options gargle"
> options(gargle_oauth_cache = "/home/bussari.naveen/.cache/gargle",gargle_oauth_email = "bussari.xxxxxx.com")
> 
> list.files("~/.cache/gargle/")
[1] "54a075ef043e32727d17eb41a3d80974_bussari.xxxxxx.com"
[2] "928dfaa8a7345b1e2252d3bb85b912c6_bussari.xxxxxx.com"
> print("before drive_oauth")
[1] "before drive_oauth"
> drive_auth(email = "bussari.xxxxxx.com",cache = "/home/bussari.naveen/.cache/gargle")
> print("before drive_oauth")
[1] "before drive_oauth"
> gs4_auth(email = "bussari.xxxxxx.com",cache = "/home/bussari.naveen/.cache/gargle")
trying `token_fetch()`
Trying `credentials_byo_oauth()` ...
Error caught by `token_fetch()`:
inherits(token, "Token2.0") is not TRUE
trying `credentials_service_account()`
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_external_account()`
adding "userinfo.email" scope
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_app_default()`
Trying `credentials_gce()` ...
✖ We don't seem to be on GCE.
trying `credentials_user_oauth2()`
attempt to access internal gargle data from: googlesheets4
Gargle2.0 initialize
adding "userinfo.email" scope
loading token from the cache
matching token found in the cache
> 
> x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')
attempt to access internal gargle data from: googlesheets4
✔ Reading from rstudio_test.
✔ Range Sheet1.
attempt to access internal gargle data from: googlesheets4
> 
> x
# A tibble: 2 × 2
  Name      ID
  <chr>  <dbl>
1 Vinay      1
2 Naveen     2
`

logs when scheduled the job, the job failed at drive_auth()

`
The following object is masked from ‘package:gargle’:
    request_make
Attaching package: ‘googledrive’
The following objects are masked from ‘package:googlesheets4’:
    request_generate, request_make
The following object is masked from ‘package:gargle’:
    request_make
2 tokens found in this gargle OAuth cache:
'~/.cache/gargle'

email                          app             scopes          hash...   
______________________________ _______________ _______________ __________
bussari.xxxxxx.com tidyverse-erato ...spreadsheets 54a075e...
bussari.xxxxxx.com tidyverse-erato ...drive        928dfaa...
ℹ Not logged in as any specific Google user.
[1] NA
Error in `drive_auth()`:
! Can't get Google credentials.
ℹ Are you running googledrive in a non-interactive session? Consider:
• Call `drive_deauth()` to prevent the attempt to get credentials.
• Call `drive_auth()` directly with all necessary specifics.
ℹ See gargle's "Non-interactive auth" vignette for more details:
ℹ <https://gargle.r-lib.org/articles/non-interactive-auth.html>
Backtrace:
    ▆
 1. └─googledrive::drive_auth()
 2.   └─googledrive:::drive_abort(...)
 3.     └─cli::cli_abort(message = message, ..., .envir = .envir)
 4.       └─rlang::abort(...)
Execution halted
`

@jennybc
Copy link
Member

jennybc commented Apr 19, 2023

Assuming you're running the same code in both scenarios, we're getting a lot less information from the cron job, which is where we really need the info. How are you capturing this? Feels like a standard out vs standard error problem (re: this missing info).

There's still a lot extraneous code, along the lines of what I pointed out before (#425 (comment)), which just adds noise and is making this harder.

This is still my leading hypothesis:

I think you should add some print statements to the script to confirm the existence of the cache path and list its contents. Is the cron job's idea of ~/ (home directory) the same as yours?

I see you changed to an absolute path, but this could still be a permissions problem, if the cron job is running as a user that doesn't have access to the cache. That would be revealed if we saw what happened when listing the files in the cache.

@naveenbussari
Copy link
Author

The cron generates a log file for each job that's scheduled. the logs I shared are from one such log file.

I tried printing the info of existence of .cache folder including its location. And we got desired output in the scheduler logs also
gargle::gargle_oauth_sitrep()
drive_user()
gargle::gargle_oauth_cache()
getwd()
list.files("~/.cache/gargle/")

image

regarding permissions , we can see in the below image, all other users also have read permissions. I think read permission should do it.
image

Is there a way to print logs of what's actually happening when driver_auth() is run?

@naveenbussari
Copy link
Author

I tried giving executable permissions to .cache directory. Still the error remains.
image

@naveenbussari
Copy link
Author

in the code the "drive_user() and gargle::gargle_oauth_cache()" gives the following output
Screenshot 2023-04-20 at 3 20 41 PM

Where as in the scheduler logs, we can see
Screenshot 2023-04-20 at 3 15 19 PM

@jennybc
Copy link
Member

jennybc commented Apr 20, 2023

Can you print the gargle version? Could the cron job somehow be using an older version of gargle?

@naveenbussari
Copy link
Author

I edited the job to print gargle version. It uses latest version 1.4.0

Screenshot 2023-04-20 at 11 47 49 PM

image

@jennybc
Copy link
Member

jennybc commented Apr 20, 2023

I would get even more debugging info to compare by setting:

options(gargle_verbosity = "debug")

@naveenbussari
Copy link
Author

naveenbussari commented Apr 21, 2023

tried options(gargle_verbosity = "debug") before and after drive_auth()

code:

library(gargle)
library(cronR)
library(googlesheets4)
library(googledrive)

options(gargle_oauth_cache = "~/.cache/gargle",
        gargle_oauth_email = "bussari.xxxxxx.com")
gargle::gargle_oauth_sitrep()
options(gargle_verbosity = "debug")
drive_user()
gargle::gargle_oauth_cache()
# drive_auth()
options(gargle_verbosity = "debug")
googledrive::drive_auth(email = "bussari.xxxxxx.com",
                        cache = "~/.cache/gargle")

googlesheets4::gs4_auth(email = "bussari.xxxxxx.com",
                        cache = "~/.cache/gargle")
options(gargle_verbosity = "debug")
x <-read_sheet('https://docs.google.com/spreadsheets/d/1VWS3hbSP6Q5Q25YG16dqgBAejy_eojYzN_V-2Q9epFk/edit#gid=0')

x

The logs look hopeful

Attaching package: ‘googlesheets4’
The following object is masked from ‘package:gargle’:
    request_make
Attaching package: ‘googledrive’
The following objects are masked from ‘package:googlesheets4’:
    request_generate, request_make
The following object is masked from ‘package:gargle’:
    request_make

ℹ Taking cache location from the `"gargle_oauth_cache"` option.
2 tokens found in this gargle OAuth cache:
'~/.cache/gargle'

email                          app             scopes          hash...   
______________________________ _______________ _______________ __________
bussari.xxxxxx.com tidyverse-erato ...spreadsheets 54a075e...
bussari.xxxxxx.com tidyverse-erato ...drive        928dfaa...
ℹ Not logged in as any specific Google user.
[1] "~/.cache/gargle"
trying `token_fetch()`
Trying `credentials_byo_oauth()` ...
Error caught by `token_fetch()`:
inherits(token, "Token2.0") is not TRUE
trying `credentials_service_account()`
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_external_account()`
adding "userinfo.email" scope
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_app_default()`
Trying `credentials_gce()` ...
✖ We don't seem to be on GCE.
trying `credentials_user_oauth2()`
attempt to access internal gargle data from: googledrive
Gargle2.0 initialize
adding "userinfo.email" scope
loading token from the cache
no matching token in the cache
initiating new token
Error caught by `token_fetch()`:
OAuth2 flow requires an interactive session.
Error in `googledrive::drive_auth()`:
! Can't get Google credentials.
ℹ Are you running googledrive in a non-interactive session? Consider:
• Call `drive_deauth()` to prevent the attempt to get credentials.
• Call `drive_auth()` directly with all necessary specifics.
ℹ See gargle's "Non-interactive auth" vignette for more details:
ℹ <https://gargle.r-lib.org/articles/non-interactive-auth.html>
Backtrace:
    ▆
 1. └─googledrive::drive_auth(email = "bussari.xxxxxx.com", cache = "~/.cache/gargle")
 2.   └─googledrive:::drive_abort(...)
 3.     └─cli::cli_abort(message = message, ..., .envir = .envir)
 4.       └─rlang::abort(...)
Execution halted

@jennybc
Copy link
Member

jennybc commented Apr 21, 2023

My leading hypothesis is that, in the scheduled context, its defaulting to a different OAuth client, which would explain the cache miss:

trying `credentials_user_oauth2()`
attempt to access internal gargle data from: googledrive
Gargle2.0 initialize
adding "userinfo.email" scope
loading token from the cache
no matching token in the cache
initiating new token

However the known cause for that problem has been fixed:
r-lib/gargle@fa49d20

Maybe your cron job is not a sub-process of RStudio at all?

You could try setting this option explicitly to force the use of a web OAuth client, which I can tell is what your cached tokens are associated with.

options(gargle_oauth_client_type = "web")

@naveenbussari
Copy link
Author

Finally some good news!

Added the line options(gargle_oauth_client_type = "web") before loading gargle cache.
image

It worked. The scheduled job ran and returned the desired output(reading and printing data from excel sheet)

Attaching package: ‘googlesheets4’
The following object is masked from ‘package:gargle’:
    request_make
Attaching package: ‘googledrive’
The following objects are masked from ‘package:googlesheets4’:
    request_generate, request_make
The following object is masked from ‘package:gargle’:
    request_make
ℹ Taking cache location from the `"gargle_oauth_cache"` option.
2 tokens found in this gargle OAuth cache:
'~/.cache/gargle'

email                          app             scopes          hash...   
______________________________ _______________ _______________ __________
bussari.xxxxxx.com tidyverse-erato ...spreadsheets 54a075e...
bussari.xxxxxx.com tidyverse-erato ...drive        928dfaa...
ℹ Not logged in as any specific Google user.
[1] "~/.cache/gargle"
trying `token_fetch()`
Trying `credentials_byo_oauth()` ...
Error caught by `token_fetch()`:
inherits(token, "Token2.0") is not TRUE
trying `credentials_service_account()`
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_external_account()`
adding "userinfo.email" scope
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_app_default()`
Trying `credentials_gce()` ...
✖ We don't seem to be on GCE.
trying `credentials_user_oauth2()`
attempt to access internal gargle data from: googledrive
Gargle2.0 initialize
adding "userinfo.email" scope
loading token from the cache
matching token found in the cache
trying `token_fetch()`
Trying `credentials_byo_oauth()` ...
Error caught by `token_fetch()`:
inherits(token, "Token2.0") is not TRUE
trying `credentials_service_account()`
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_external_account()`
adding "userinfo.email" scope
Error caught by `token_fetch()`:
Argument 'txt' must be a JSON string, URL or file.
trying `credentials_app_default()`
Trying `credentials_gce()` ...
✖ We don't seem to be on GCE.
trying `credentials_user_oauth2()`
attempt to access internal gargle data from: googlesheets4
Gargle2.0 initialize
adding "userinfo.email" scope
loading token from the cache
matching token found in the cache
attempt to access internal gargle data from: googlesheets4
✔ Reading from "rstudio_test".
✔ Range 'Sheet1'.
attempt to access internal gargle data from: googlesheets4
# A tibble: 2 × 2
  Name      ID
  <chr>  <dbl>
1 Vinay      1
2 Naveen     2

But one thing that still worries is , as it can seen in the logs that says 'Not logged in as any specific Google user'.
Because we are looking forward to schedule jobs to trigger emails

@jennybc
Copy link
Member

jennybc commented Apr 21, 2023

drive_user() does not trigger auth, so if auth hasn't happened yet, this is what it says.

Once you do auth explicitly or some action that requires auth implicitly, drive_user() will tell you who you are auth'ed as.

@HugoGit39
Copy link

HugoGit39 commented May 24, 2023

Thanks a lot! the options(gargle_oauth_client_type = "web") did the trick with me too! Cost me a whole day but finally have it to work!

@jennybc
Copy link
Member

jennybc commented May 24, 2023

The possibility of setting options(gargle_oauth_client_type = "web") is now mentioned in more places in gargle and packages that derive their docs from it. So I think we're done here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants