Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gargle2.0 class #19

Merged
merged 14 commits into from
Nov 15, 2017
Merged

Gargle2.0 class #19

merged 14 commits into from
Nov 15, 2017

Conversation

jennybc
Copy link
Member

@jennybc jennybc commented Nov 7, 2017

Please consider this an actual PR now.

  • Gargle2.0 extends Token2.0.
    • endpoint and a few other things are hard-wired to values appropriate to Google APIs
    • email is a new field, used to distinguish tokens for different Google users that otherwise have same endpoint + app + scopes. With regular httr caching, these clobber each other, which is how I ended up with so much DIY token stuff in googlesheets originally.

I have a corresponding branch of googledrive (https://github.com/tidyverse/googledrive/tree/gargle) where I've removed all token-fetching logic and tests are passing.

I have a corresponding branch of gmailr (r-lib/gmailr#87) where I've removed all token-fetching logic and tests are passing (caveat: gmailr's tests don't cover auth). Anecdotally, I can send email, switch users, switch apps.

@hadley has already said I should rework caching more thoroughly. I'd like to do that in a second PR. Mandate:

  • Cache gargle-fetched tokens only or primarily at the user level.
  • Be able to auto discover tokens (already true with httr caching), including service tokens (not already true).
  • Consider designating a primary Google identity.
  • Store tokens ... in a tibble instead of a list? Lookup based on variables that hold app, scopes, email versus current hashing scheme? Remember you don't need an exact match on scopes, rather you just need to verify the needed scopes are present.

Back to this PR: here's what direct usage looks like:

> devtools::load_all(".")
Loading gargle
> unlink(".httr-oauth")

> x <- token_fetch(scopes = "https://www.googleapis.com/auth/drive")
Waiting for authentication in browser...
Press Esc/Ctrl + C to abort
Authentication complete.
> ## select jenny.f.bryan here in the browser

> x             ## note the modified print method
<Token (via gargle)>
  <oauth_endpoint> google
       <oauth_app> gargle-demo
           <email> jenny.f.bryan@...
          <scopes> ...drive, email
     <credentials> access_token, expires_in, id_token, refresh_token, token_type
                   (expires in 59.98333 mins)
---

> x$email       ## note the new field
[1] "jenny.f.bryan@..."

> ## inspect cache file, specifically the hash
> y <- readRDS(".httr-oauth")
> y             ## note the addition of '-EMAIL'
$`48e7...62a0-jenny.f.bryan@...`
<Token (via gargle)>
  <oauth_endpoint> google
       <oauth_app> gargle-demo
           <email> jenny.f.bryan@...
          <scopes> ...drive, email
     <credentials> access_token, expires_in, id_token, refresh_token, token_type
                   (expires in 59.98333 mins)
---

> purrr::map(y, c("params", "scope")) ## note the inclusion of "email" scope
$`48e7...62a0-jenny.f.bryan@...`
[1] "https://www.googleapis.com/auth/drive"
[2] "email"                                

> ## if I ask for another token with same scopes, existing is a match
> ## we use it, ie no oauth dance
> x2 <- token_fetch(scopes = "https://www.googleapis.com/auth/drive")
> all.equal(x, x2)
[1] TRUE

> ## if I ask for another token with same scopes and same email,
> ## existing is a match, we use it, ie no oauth dance
> x3 <- token_fetch(
+   email = "jenny.f.bryan@...",
+   scopes = "https://www.googleapis.com/auth/drive"
+ )
> all.equal(x, x3)
[1] TRUE

> ## if I ask for another token with same scopes and DIFFERENT email,
> ## we do the oauth dance
> x4 <- token_fetch(
+   email = "jenny@...",
+   scopes = "https://www.googleapis.com/auth/drive"
+ )
Waiting for authentication in browser...
Press Esc/Ctrl + C to abort
Authentication complete.
> all.equal(x, x4)
[1] "Component \"credentials\": Component \"access_token\": 1 string mismatch"                  
[2] "Component \"credentials\": Component \"expires_in\": Mean relative difference: 0.000277855"
[3] "Component \"credentials\": Component \"id_token\": 1 string mismatch"                      
[4] "Component \"credentials\": Component \"refresh_token\": 1 string mismatch"                 
[5] "Component \"email\": 1 string mismatch"                                                    
> x$email
[1] "jenny.f.bryan@..."
> x4$email
[1] "jenny@..."

> ## if I don't specify email but scope is available from multiple tokens, we
> ## get a chooser
> token_fetch(scopes = "https://www.googleapis.com/auth/drive")
Multiple cached tokens exist. Pick the one you want to use.
Or enter '0' to obtain a new token.
1: jenny.f.bryan@...
2: jenny@...

Selection: 2
<Token (via gargle)>
  <oauth_endpoint> google
       <oauth_app> gargle-demo
           <email> jenny@...
          <scopes> ...drive, email
     <credentials> access_token, expires_in, id_token, refresh_token, token_type
                   (expires in 60 mins)
---

My own observations:

The email entered above, if novel, is not really connected to anything:

  • It is only consulted insofar as "is it equal to an existing email?" Should there be an easier way to force oauth dance?
  • It does not preselect a Google identity in the browser. Hmmm ... is that even possible? Anyhow, if someone enters an email, they might expect this. Connected to notion of having a primary Google identity.
  • It is not compared to the email user actually chooses, which is what gets baked into the token. Should I compare and message for mismatch?

Make it easier to get an overview of the tokens in .httr-outh? To delete a token or some tokens but not all?

if (is.null(token$credentials$access_token) ||
!nzchar(token$credentials$access_token)) {
NULL
} else {
## TODO(jennybc) this needs to be baked into the sub-classed service token
## object, once such exists
message("email: ", info[["client_email"]])
token
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure the email needs to be baked in here, because service tokens aren't cached like OAuth tokens. Is there a reason to bother?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only strong feeling is that there should be a uniform way to see the email attached to any Gargle2.0 token. The emails are long enough that one would never type one, but you'd definitely confirm one matches what you're seeing in the Cloud Console.

R/Gargle-class.R Outdated
@@ -0,0 +1,213 @@
#' Generate a Gargle token
#'
#' Constructor function for objects of class [`Gargle2.0`]. The `"email"` scope
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be `[Gargle2.0]`

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I tried that but it breaks the links. I'm doing what it says in the roxygen md vignette, which, as you know, I visit frequently!

https://github.com/klutometis/roxygen/blame/e2ed231a17644fb85b73a1d31bb1c84cb3bb7893/vignettes/markdown.Rmd#L149-L152

Copy link
Member Author

@jennybc jennybc Nov 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although that bullet point seems to contradict itself 🤔 The example puts the backticks outside but the prose says to put them inside. In my hands, they have to go inside.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it should just be [Gargle2.0]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

R/Gargle-class.R Outdated
user_params = user_params,
type = type,
use_oob = use_oob,
as_header = TRUE, # hard-wired, ok?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes to all hard-wiring questions

R/Gargle-class.R Outdated
)
cat(
" (expires in ",
self$credentials$expires_in / 60,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs format() to eliminate extra dps

R/oauth-app.R Outdated
#' key = "123456789.apps.googleusercontent.com",
#' secret = "abcdefghijklmnopqrstuvwxyz"
#' )
oauth_app <- function(appname = NULL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to make this a new function: oauth_app_from_json(path)

R/oauth-app.R Outdated

if (!is.null(path)) {
stopifnot(is.character(path), length(path) == 1)
info <- jsonlite::fromJSON(readChar(path, nchars = 1e5))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_json()?

@craigcitro
Copy link
Collaborator

Comments in response to the initial questions:

  1. I think "force a re-auth" only needs code wrapping it if we think people will want to just delete one auth entry (as opposed to wiping the whole cache).
  2. Choosing the user is delicate, but should be easy if you have an email: this page suggests you just want to include this as the login_hint. There is also the hd param, which lets you restrict to a domain, which is not the common case, but super useful when you need it. 😉
  3. I would compare emails.

In particular, I think the question of how much fidelity token management needs is closely related to the question of the default local file we cache in: if it's ./.httr-oauth, people probably won't build up a huge number of tokens; if it's ~/.httr-oauth, we'll probably end up with a fair number.

One other subtlety that I want to mention, but I don't think we should worry about code wise: some APIs accept multiple scopes. So basically anywhere you need a bigquery scope, it suffices to pass along a cloud-platform-equipped token. There are lots of gory details I'd like to spare you, but the reason I mention is that it may be the case that multiple sets of scopes suffice for talking to a given API. (So if you see weird behavior while testing in terms of what an API accepts, have a low bar for asking.)

Line-by-line comments in a sec, though I suspect @hadley has already had way more insightful comments than I will. 😁

@jennybc
Copy link
Member Author

jennybc commented Nov 14, 2017

I pushed the changes in response to @hadley. Will make a second pass when you're done @craigcitro.

@jennybc
Copy link
Member Author

jennybc commented Nov 14, 2017

I should add some tests around finding matching token(s).

if (is.null(token$credentials$access_token) ||
!nzchar(token$credentials$access_token)) {
NULL
} else {
## TODO(jennybc) this needs to be baked into the sub-classed service token
## object, once such exists
message("email: ", info[["client_email"]])
token
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only strong feeling is that there should be a uniform way to see the email attached to any Gargle2.0 token. The emails are long enough that one would never type one, but you'd definitely confirm one matches what you're seeing in the Cloud Console.

@@ -3,6 +3,7 @@ Title: Easier Auth for Google Services
Version: 0.0.0.9000
Authors@R: c(
person("Craig", "Citro", , "craigcitro@google.com", c("aut", "cre")),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I almost wonder if we should drop aut for my entry. 😉

R/Gargle-class.R Outdated
#' Constructor function for objects of class [`Gargle2.0`]. The `"email"` scope
#' is always added if not already present. This is needed to retrieve the email
#' address associated with the token. This is considered a "low value" scope and
#' does not appear on the consent screen.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd throw a link to https://developers.google.com/+/web/api/rest/oauth#email or the like in here? (Maybe I'm link-happy.)

print = function(...) {
cat("<Token (via gargle)>\n", sep = "")
# print(self$endpoint) ## this is TMI IMO
# also, it's boring --> always google for us
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on this being TMI

"\n", sep = ""
)
cat(
" (expires in ",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that code will autorefresh, should we just elide the expiration?

In particular, I feel like this info is relevant for one of us, but likely noise to the user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right long-term. I sense this print method will change even more before we're done. I'm going to leave it for now because I find it useful re: sanity checking that things I think are happening (or not) are actually happening (or not).

token <- httr::TokenServiceAccount$new(
endpoint = NULL,
secrets = info,
## TODO(jennybc) Is is really true I can't add the "email" scope here?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be super annoying if so. 😁

(Also, Is is -> Is it)

@jennybc
Copy link
Member Author

jennybc commented Nov 15, 2017

I think "force a re-auth" only needs code wrapping it if we think people will want to just delete one auth entry (as opposed to wiping the whole cache)

It also comes up if you've gotten a token with one identity and now you want to get a token with another. I'm going to think on making it easier to say "new OAuth dance" and will pursue @craigcitro's leads re: using a login hint.

@jennybc jennybc merged commit 06e7f23 into master Nov 15, 2017
@jennybc jennybc deleted the gargle2.0-class branch November 15, 2017 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants