Make the interaction between rowLimit and batch type clearer #17

MarkEdmondson1234 opened this issue Sep 9, 2016 · 11 comments

@MarkEdmondson1234
Owner

This means that if you don't specify rowLimit, it won't batch, which is weird:

  if(rowLimit > 5000){
    message("Batching data via method: ", walk_data)
    rowLimit0 <- rowLimit
    rowLimit <- 5000
  } else {
    walk_data <- "none"
  }
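
To make the interaction easier to see, here is a minimal standalone sketch of the same branch. The helper name `resolve_batching` is hypothetical and not part of searchConsoleR, and the default `rowLimit = 1000` is an assumption about the function's default argument:

```r
# Hypothetical helper mirroring the branch above; not part of searchConsoleR.
# Assumption: rowLimit defaults to 1000 when the caller does not set it.
resolve_batching <- function(rowLimit = 1000, walk_data = "byBatch") {
  if (rowLimit > 5000) {
    # Batching is only kept when more than 5000 rows are requested
    list(walk_data = walk_data, per_request = 5000, requested = rowLimit)
  } else {
    # Otherwise the requested walk_data is silently replaced with "none"
    list(walk_data = "none", per_request = rowLimit, requested = rowLimit)
  }
}

resolve_batching()$walk_data                  # "none", even though "byBatch" was asked for
resolve_batching(rowLimit = 15000)$walk_data  # "byBatch"
```

So the walk_data argument only takes effect once rowLimit is above 5000.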
@mischa2002k2

mischa2002k2 commented Dec 11, 2017

Hey Mark,
I cannot request more than 5,000 rows. Is this related to this issue? The "walk_data='byDate'" option won't work in any way. This is what I tried:

scPage_data <- search_analytics(siteURL = website, startDate = startDate, endDate = endDate, dimensions = download_dimensions, searchType = type, rowLimit = 15000, walk_data = "byBatch")

Results in this error:
Error: The request did not match the specified API.

This error only appears if the rowLimit is above 5000.

@MarkEdmondson1234
Owner Author

Hi @mischa2002k2 - please update to the GitHub version for now:

remotes::install_github("MarkEdmondson1234/searchConsoleR")

It's due to the updated batch policies implemented in googleAuthR 0.6.2.

@mischa2002k2

Thanks - this did fix my issue!

@daauerbach

Mark, I'm using fresh github installs of searchConsoleR and googleAuthR.

If you have a moment, can you please confirm that googleAuthR.batch_endpoint should definitely be "https://www.googleapis.com/batch/webmasters/v3" not
"https://www.googleapis.com/batch" as set on googleAuthR load?

And, this may not be worth too much thought, but do you have any insight into the Error: Invalid Credentials that I see with walk_data="byBatch", rowLimit = 10000 if I only library(searchConsoleR) and scr_auth()? There is no error for rowLimit=1000. There is also no error (batch or otherwise) if I only library(googleAuthR); gar_auth() and call searchConsoleR::search_analytics(...), which seems to update the batch endpoint option value silently.

Perhaps this belongs over at googleAuthR issues but it seems primarily searchConsoleR?

@MarkEdmondson1234
Owner Author

MarkEdmondson1234 commented Dec 14, 2017

The latest version on GitHub sets the batch endpoint itself (here), so no action from you should be needed.

There should be no difference in authentication if the rowLimit changes; perhaps you have a cached result? In any case, I would delete any .httr-oauth or similar tokens, start a new R session and reauthenticate. That should clear your problem. It should create the new name for the auth token, sc.oauth.

Also, are you running the script alongside other googleAuthR packages? There may be some compatibility issues that I need sorted if there are still problems.

@daauerbach

Thanks, that's what I assumed, and "delete-all-and-start-fresh" was my first move, thinking along those lines. I took a deeper look at search_analytics() and see where you're setting the batch endpoint.

As far as I can tell, the Error: Invalid Credentials does indeed only seem to occur when I authorize via a token that has also been used to authorize for other APIs/libraries. For example, this works:

gar_auth("sc.oauth") #freshly generated from scr_auth()
2017-12-15 16:49:32> Setting googleAuthR.client_id to 858905045851-3beqpmsufml9d7v5d1pr74m9lnbueak2.apps.googleusercontent.com
2017-12-15 16:49:32> Setting googleAuthR.client_secret to bnmF6C-ScpSR68knbGrHBQrS

But this doesn't

gar_auth("/pathToMyOtherToken/.httr-oauth")
2017-12-15 16:54:15> Setting googleAuthR.client_id to MYCLIENTID.apps.googleusercontent.com
2017-12-15 16:54:15> Setting googleAuthR.client_secret to MYCLIENTSECRET
...
Batching data via method: byBatch
Error: Invalid Credentials

Things also work if I generate a fresh .httr-oauth via gar_auth(). That is, it seems the function/filename don't matter as long as the token hasn't also stored (been overwritten?) other authentications.

Perhaps it's related somehow to googleAuthR::gar_api_generator or other underlying functions in googleAuthR::gar_batch_walk (gar_batch or makeBatchRequest)? I haven't traced where authenticated <- "Token2.0" %in% class(googleAuthR::Authentication$public_fields$token) comes in...

At this point, things run okay when I'm careful about the order of attaching and authenticating. For my present needs, it works (and is probably better practice) to keep the data access chunks separated.

And thanks again for all your work making useful tools!!!

@MarkEdmondson1234
Owner Author

MarkEdmondson1234 commented Dec 16, 2017

@daauerbach If you are using multiple libraries, then it's most likely the scopes (set via options(googleAuthR.scopes.selected)) that are causing your issues.

Could you see if this section on the googleAuthR website helps explain the solution? http://code.markedmondson.me/googleAuthR/articles/google-authentication-types.html#multiple-authentication-tokens

Essentially, since scr_auth() assumes you are authenticating with only the webmaster API, if you are using multiple googleAuthR libraries it's better to set your own scopes and then authenticate via googleAuthR::gar_auth().
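
As a rough sketch of that approach, assuming you want Search Console plus one other API in the same session (the Analytics read-only scope below is only an illustrative second scope, so substitute whatever your other libraries need):

```r
# Set every scope the session needs before authenticating.
# Assumption: the second scope is just an example of "another API".
options(googleAuthR.scopes.selected = c(
  "https://www.googleapis.com/auth/webmasters",
  "https://www.googleapis.com/auth/analytics.readonly"
))

# The OAuth flow needs a browser, so only start it in an interactive session.
if (interactive() && requireNamespace("googleAuthR", quietly = TRUE)) {
  googleAuthR::gar_auth()
}
```

A token created this way carries both scopes, so it should not need to be swapped when moving between the libraries.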

@daauerbach

I'm dealing okay with the authentication issues in interactive sessions, but getting back closer to the original issue here, in case it helps anyone else...

It looks like the API rowLimit max is specified as 5000 in the Google documentation (as well as set in the function). I wanted more per day, so I loop search_analytics over my own date vector, walking each day "byBatch" with a higher limit, and retaining the results in a list. Then bind_rows and presto, lots of data!
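
Roughly, the loop looks like the sketch below. The helper name `fetch_by_day`, the dimensions, and the row limit are placeholders rather than my exact script, and base `rbind` stands in for `bind_rows` so the snippet has no extra dependencies:

```r
# Hypothetical helper: one search_analytics() call per day, results stacked.
# The `fetch` argument defaults to the real function but can be swapped out.
fetch_by_day <- function(site, dates, row_limit = 15000,
                         fetch = searchConsoleR::search_analytics) {
  # as.character() keeps the dates as "YYYY-MM-DD" strings inside lapply()
  per_day <- lapply(as.character(dates), function(day) {
    fetch(siteURL = site, startDate = day, endDate = day,
          dimensions = c("query", "page"),
          rowLimit = row_limit, walk_data = "byBatch")
  })
  # Stack the per-day data frames (dplyr::bind_rows(per_day) also works)
  do.call(rbind, per_day)
}

# Usage, after authenticating (site URL is a placeholder):
# days <- seq(as.Date("2017-12-01"), as.Date("2017-12-07"), by = "day")
# all_rows <- fetch_by_day("https://www.example.com", days)
```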

@daauerbach

^^ (facepalm) or, it works fine to just add the "date" dimension to the single function call byBatch with an appropriately high rowLimit...
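
For anyone else landing here, a sketch of the single-call version. The wrapper name, the second dimension, and the 50000 limit are placeholders chosen for illustration:

```r
# Hypothetical wrapper: one batched call with "date" as a dimension,
# so each day's rows come back without looping over dates.
fetch_with_date_dimension <- function(site, start, end, row_limit = 50000) {
  searchConsoleR::search_analytics(
    siteURL = site,
    startDate = start,
    endDate = end,
    dimensions = c("date", "query"),
    rowLimit = row_limit,      # above 5000, so walk_data is honoured
    walk_data = "byBatch"
  )
}

# Usage, after authenticating (site URL is a placeholder):
# rows <- fetch_with_date_dimension("https://www.example.com",
#                                   "2017-12-01", "2017-12-07")
```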

@MarkEdmondson1234
Owner Author

Ok cool! Closing this as solved then.

@MarkEdmondson1234
Owner Author

But I’ll add a bit to make it easier to add all rows available.
