Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a whoami function (user, token, scopes); closes #39 #51

Merged
merged 15 commits into from
Dec 10, 2016
Merged

Add a whoami function (user, token, scopes); closes #39 #51

merged 15 commits into from
Dec 10, 2016

Conversation

jennybc
Copy link
Member

@jennybc jennybc commented Dec 8, 2016

Are you willing to have such a function here? I think it would be handy for new users and troubleshooting. But I could put it in githug instead, if you'd prefer not.

If you are receptive, I will flesh out the "HERE'S HOW TO GET A TOKEN AND WHERE TO STICK IT" message and add a test. And do anything else you suggest.

@codecov-io
Copy link

codecov-io commented Dec 8, 2016

Current coverage is 67.87% (diff: 96.00%)

Merging #51 into master will increase coverage by 2.95%

@@             master        #51   diff @@
==========================================
  Files             7          8     +1   
  Lines           228        249    +21   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            148        169    +21   
  Misses           80         80          
  Partials          0          0          

Powered by Codecov. Last update 23308b6...3b00a4b

@gaborcsardi
Copy link
Member

Hmmm. I can see that this is very useful, so it is tempting.

I am also not sure if gh is the right place for it. Why do you think it is better placed here than in githug?

TBH, I still haven't given up the hope of having a proper higher level GH API package, and I think gh_whoami could go there. But we don't have that package right now, and I won't write it anytime soon, so if you think it is better placed here than in githug, I'll be happy to merge it. :)

@jennybc
Copy link
Member Author

jennybc commented Dec 8, 2016

Why do you think it is better placed here than in githug?

Short-term, this is much closer to going to CRAN! Even long-term, gh should be the least common denominator for GitHub API work from R. I'd recommend anyone doing this in a script or package use gh. So it seems like the right place for this sort of diagnostic.

I will finish off the PR today and you can render your final judgment 🙏.

@gaborcsardi
Copy link
Member

I see gh as the basic machinery of a GH API, and gh_whoami does not qualify as basic to me. :)

But it will be fine here, for now. (Which, as we know, likely means forever. :)

@jennybc
Copy link
Member Author

jennybc commented Dec 8, 2016

I also just realized this sort of gets at addresses #39:

I'm mostly imagining this would be a place to hang the documentation for how to set up your GITHUB_PAT env var, and to give users a way to check that they're correctly configured.

@jennybc jennybc changed the title Add a whoami function (user, token, scopes) Add a whoami function (user, token, scopes); closes #39 Dec 8, 2016
@jennybc
Copy link
Member Author

jennybc commented Dec 8, 2016

My roxygen seems different from your roxygen.

Testing: shall I assume nothing, i.e. no token necessarily available? Or take advantage of whatever you've rigged up with httrmock (which I have not yet digested)?

@jennybc
Copy link
Member Author

jennybc commented Dec 8, 2016

BTW I have used successfully with GHE.

@jennybc
Copy link
Member Author

jennybc commented Dec 9, 2016

From another conversation:

I can create a token for the gh-testing user and share that with you using hadley/secure, then you can actually run the tests. I'll try to get to it today evening.

I basically see how this works. But yes I need that token. I played around with my own to get this far. I had some very puzzling times until I discovered httrmock::mocking_status()! I clearly don't understand the httrmock workflow.

I obviously won't leave the test like this, with my own info in there and skipping on travis and appveyor.

@gaborcsardi
Copy link
Member

Yes, sorry, I realized that hadley/secure is not going to be a long term solution, because it is not going to CRAN, so I started to look for other options. But for now I could just share the token with you. Based on this: https://gist.github.com/jeroenooms/1ffbfbdad40f4aad6657c337a4924f0e I used this code:

github_pubkey <- function(user){
  url <- sprintf("https://api.github.com/users/%s/keys", user)
  keys <- jsonlite::fromJSON(url)
  lapply(keys$key, openssl::read_pubkey)
}

jenny <- github_pubkey('jennybc')
pubkey <- jenny[[1]] # in case there are multiple keys

token <- readLines("~/works/r-pkgs/gh/tests/testthat/github-token.txt")
buf <- openssl::rsa_encrypt(charToRaw(token), pubkey)
cat(openssl::base64_encode(buf))

And here is the encrypted token:

MLDsFiOAVtdbbORHOCvfbTDMW6k1eQGPu7PfIXqzDe4SagO49eMAsv0f1g03eGORS+QU53MllUDPmYzlYczUVIdnNPjK1yygqS4NJDJNsWsbsDU1I19+KIFDqQ2lkOqq1/3fQeyvtmr7B4FHAbPMAtFLz/OFdbCPbcHurNCtIs9lck3YXp8IZBO9IaKzEEA1M+i+bnWBXxTbU8E0E1PDGPFXTAY34/GLsm7Th2Nz9OZ+EkZC7tLMHFtxuUYd8qi7lOZ6IxMYPs7b6t6QQ0l0QprHOUQWEVe7Ra/2m9Q5OWpvzoHNDkXFBwdSiNVQb0UhUpOfHL3M4yTjaZ/tYIZ3ZgE4nhKsogGLqhdPhuTXsFxbIqH1AxS+1ugOKcZ+f9VDdN2m/tdmjooxUITJI5+bNBSV9iawSaikD6nLPydqLIdH3a3UJ6akcMbBcjw10jvC67Q8JJhHiJVXjcIHKqwPfnVkwpPlAp6Lnn1Y3NNuzQsnIQJeRjQStd8kTxKNBake3X077d4PF7VloBghu396Waudjqmjy5D/VIIs+tvRlMkGDnf45r/MwjFPENIz9vXesNxSvmm7QdBULRJKcbrM62/BgVntEDUy2sF7nTFCaLnzbpRmz2H+msAAkugSyem3GGrkSScWz1iAGIQRrXFvM5AXQJGirVsM2XZdAu+c6ig=

You can just decrypt it with your private key.

@gaborcsardi
Copy link
Member

CC-ing @jimhester, who might be interested in doing this kind of mocking for devtools test cases, for GH queries and even downloads.

As for the httrmock workflow, I am also just working it out, so it is quite experimental. Right now we have this:

if (file.exists("github-token.txt")) {
  Sys.setenv(GITHUB_TOKEN = readLines("github-token.txt", n = 1))
  Sys.setenv(DEBUGME = "httrmock")
  httrmock::start_recording()
  httrmock::start_replaying()

} else {
  httrmock::stop_recording()
  httrmock::start_replaying()
}

which basically means, reading from the back, that if you don't have a token, then you just want to replay the recorded responses. This is how the tests run on CRAN. Everything that was recorded before is just replayed (start_replaying) and requests that were not recorded before are performed, but not recorded (stop_recording).

If you do have a token, then, 1) everything that was recorded before is replayed (start_replaying, so that a test run does not perform all GH requests), and 2) everything that was not recorded before is performed and then recorded (start_recording).

The idea is that most of the time you don't want to perform all GH requests while developing gh, only the new ones. While developing, you might want to change the first block, actually. E.g. to re-record everything, you would do start_recording and stop_replaying. Or, while adding some features or new API points, or tests for them, you might want stop_recording and start_replaying, so that all new test requests are performed, but the old ones are not.

Does it make sense? I realize that it is a bit messy, and it would be great to improve it. E.g. we could have some interactive mode chooser, that you could use to select the desired behavior for the current session. I would even show it in my R prompt.

@jimhester
Copy link
Member

I think one additional thing we need to make this nice is the ability to run a teardown file(s) in testthat / devtools. Because ideally you want recording and replaying on when you are running the tests, but off when you are developing. Also r-lib/devtools#1202, r-lib/devtools#1169 would be useful for the same reasons.

Also in lookup I had planned on writing a function with instructions on setting up a Github Token, as that package won't work well without one. I briefly looked into using https://developer.github.com/v3/oauth_authorizations/#create-a-new-authorization to request a token from the user automatically based on basic authentication, but I am not quite clear how the flow works with two factor authentication. Something like that might be useful for GitHug or here as well.

@gaborcsardi
Copy link
Member

I think one additional thing we need to make this nice is the ability to run a teardown file(s) in testthat / devtools.

Exactly. The code I cited is in "setup", i.e. helper.R, but there is currently no teardown in testthat.....

Something like that might be useful for GitHug or here as well.

Maybe in githug. :) I think gh should only contain the basic infrastructure, ideally.

@jennybc
Copy link
Member Author

jennybc commented Dec 9, 2016

Success with the token! Thanks.

ideally you want recording and replaying on when you are running the tests, but off when you are developing

Right now helper.R is run every time you load_all(), so it's easy to get into a recording/replaying state without really thinking about it. This is how I puzzled myself because I inadvertently recorded some >400s, tinkering with gh_whoami() and nonexistent/bad tokens. Then of course they got replayed until I figured out what I had done. I am wondering if the condition for changing the recording/replaying state should be more than just "github-token.txt exists".

Also in lookup I had planned on writing a function with instructions on setting up a Github Token, as that package won't work well without one. I briefly looked into using https://developer.github.com/v3/oauth_authorizations/#create-a-new-authorization to request a token from the user automatically based on basic authentication, but I am not quite clear how the flow works with two factor authentication. Something like that might be useful for GitHug or here as well.

Yes I would also like to help people store a PAT somehow, from one of these packages. However, given the way the Authorizations API works, it almost feels like we could create as much friction as we remove by using it. In Happy Git, I have instructions to obtain PAT in the browser and offer this code snippet to help store it:

cat("GITHUB_PAT=8c70...adf2\n",
    file = file.path(normalizePath("~/"), ".Renviron"), append = TRUE)

I know that's very low tech 😔.

@gaborcsardi
Copy link
Member

Right now helper.R is run every time you load_all(), so it's easy to get into a recording/replaying state without really thinking about it. This is how I puzzled myself because I inadvertently recorded some >400s, tinkering with gh_whoami() and nonexistent/bad tokens. Then of course they got replayed until I figured out what I had done. I am wondering if the condition for changing the recording/replaying state should be more than just "github-token.txt exists".

Agreed. How about controlling the behavior via environment variables? Then we could have a default behavior, for the user, and developers could change it via a simple command that would just get/set an environment variable?

})

test_that("whoami works in absence of PAT", {
expect_message(res <- gh_whoami(.token = ""),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimhester Re: #50. I know it's regarded as good practice to include a specific message inside expect_error() and friends. Do you think the HTTP version of this mindset is that one should write expectations in this context for a specific HTTP status? I'm trying to understand if your motivation for #50 is something you want to do in lookup or for testing or ....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think would be great if expect_error retained the error object, and we could test the error class. See r-lib/testthat#530

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I was using it in lookup https://github.com/jimhester/lookup/blob/e841d72819e39242e0987aa6f23b240c9d47d60c/R/rcpp.R#L3.

But I think if you are expecting a response to have a specific error class you should catch just that specific class and let any other error be signaled normally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now catching and expecting a specific error class and HTTP error.

})

test_that("whoami errors with bad PAT", {
expect_error(res <- gh_whoami(.token = NA), "Requires authentication")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimhester More examples of "should I expect a certain status?" instead of doing this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes here I would definitely check the error class rather than the message, the message could potentially be changed by GitHub any time, the HTTP error code should be more stable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now catching and expecting a specific error class and HTTP error.

@jennybc
Copy link
Member Author

jennybc commented Dec 9, 2016

How about controlling the behavior via environment variables?

That sounds great! Does the environment variable R_TESTS help us at all? It will be non-NULL when the tests are running via devtools::test().

@gaborcsardi
Copy link
Member

Yes, R_TESTS could indeed help, but I am a bit reluctant to use it, as it causes trouble for many packages, and some people (including myself) often unset it.

@jimhester
Copy link
Member

Re: environment variables @lionel- has a PR that should help some r-lib/devtools#1391, but it is not yet merged.

jennybc and others added 3 commits December 9, 2016 13:36
This allows you to catch a github_error or the exact status code
directly.
Note: there's only one new recording because the second test matches against first recording. It just happens to pass in a happy coincidence.
@gaborcsardi
Copy link
Member

Btw. if you skip() the tests, then I can merge this, and you don't have to wait for the httrmock updates.

@jennybc
Copy link
Member Author

jennybc commented Dec 9, 2016

I have skipped the one that I know must fail for now. The first one should not. Seeing if setting GITHUB_TOKEN to something other than NULL fixes that.

@jennybc
Copy link
Member Author

jennybc commented Dec 10, 2016

VICTORY. I had to set GITHUB_TOKEN to some value on Travis and Appveyor. I hope that's OK. It should never be actually used and, if it is, I guess you'd want to know and fail anyway?

@gaborcsardi
Copy link
Member

Yeah, I think that is fine.

if (token != "") auth <- c("Authorization" = paste("token", token))
if (isTRUE(token != "")) {
auth <- c("Authorization" = paste("token", token))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, can you write this with if () { ... } else { ... }? I think it shows the intent better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -21,7 +21,7 @@ gh_process_response <- function(response) {
headers = heads,
message = paste0("GitHub API error (", status_code(response), "): ",
heads$`status`, "\n ", res$message, "\n")
), class = c("condition", "error"))
), class = c("github_error", paste0("http_error_", status_code(response)), "error", "condition"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was already added by Jim's PR, no? Can you please rebase, if it is easy? If it is not easy, then don't worry about it, it is not worth getting into git trouble for this.....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I merged master into my branch in the middle of this adventure (my branch started before I pulled the commit from master with that PR). This will work out when you squash, yes?

Copy link
Member

@gaborcsardi gaborcsardi Dec 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, right, should be fine I think.

#' your planned tasks. The \code{repo} scope, for example, is one many are
#' likely to need. The token itself is a string of 40 letters and digits. You
#' can store it any way you like and provide explicitly via the \code{.token}
#' argument to \code{\link{gh}()}.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markdown roxygen parsing is on for this project, so feel free to write markdown if you want to. You would need the dev version of roxygen2, though, so if you don't want to deal with that, that's fine, too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done this here. Maybe after the merge I could do it for all exported functions at once?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to do it, just mentioned it for next time. :) I don't think rewriting it it worth the effort.

#'
#' Reports wallet name, GitHub login, and GitHub URL for the current
#' authenticated user, the first few and last characters of the token, and the
#' associated scopes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. I am not a security expert, but I would not print out anything from the token. I know that the user has access to it, but I just don't want it to appear in printouts, Rmds, etc.

Let me think about this a bit. I see that it can be useful for beginners, but I am still concerned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought showing first 4 and last 4 characters was OK. I do the same over in googlesheets. It is helpful when you have more than one token in your life and there's no obvious way to give them names. But we can kill it or reveal even less of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, TBH. Showing 8 characters makes the token 2^(8*4) times less secure. We would show 32 bits of the 160.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, this is not sg you can "crack", an attacker would need to try each remaining valid key in an API request....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we only show the first two letters now? Is that still useful for the user?

I'll also ask my security consultant, @jeroenooms. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK 2 characters it is.

#' Put a line break at the end! If you’re using an editor that shows line
#' numbers, there should be (at least) two lines, where the second one is empty.
#' Restart R for this to take effect. Call \code{gh_whoami()} to confirm
#' success.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! You really have a lot of empathy for beginners. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have answered this question a few dozen times.

"For more on what to do with the PAT, see ?gh_whoami.")
return(invisible(NULL))
}
req <- gh_build_request(endpoint = "/user", token = .token,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we do not show the token, then you don't need to build a gh_request manually, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I can avoid all the manual stuff and simply use gh(). Have simplified that.

@gaborcsardi
Copy link
Member

Thanks much, looks awesome! I added some small comments.

Most important is about showing part of the token on the screen. I am not sure if that's a good idea to be honest, but I might be paranoid.

@gaborcsardi
Copy link
Member

OK, I will merge this now, and maybe I'll change the token printout slightly.

@gaborcsardi
Copy link
Member

Thanks much!

@gaborcsardi gaborcsardi merged commit 951031e into r-lib:master Dec 10, 2016
@gaborcsardi
Copy link
Member

@jennybc FYI, I added a simple version of httrmock contexts.

@jennybc
Copy link
Member Author

jennybc commented Dec 10, 2016

Great! I will pursue that. I might be able to resume the routes.json activity this week for automating tests and I'm sure this will come up.

@gaborcsardi
Copy link
Member

Great! I'll still add better request matching. E.g. for POST we should at least use the data....

@jennybc
Copy link
Member Author

jennybc commented Dec 10, 2016

I will start with GETs anyway, as I'm sure that will be highly educational.

@jennybc
Copy link
Member Author

jennybc commented Dec 10, 2016

This PR was a good warm-up because I understand the basic mojo of httrmock now but did not before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants