Support GitHub advanced search for repository selection #482
Thanks for the contribution, @faec! You should receive feedback on your pull request within a few days. If you haven't already, please read through the contributing guide, and ensure that you've signed the CLA. Did you run into any issues when creating this PR? Please describe them in an issue so we can make the experience better for the next contributor.
Tests are still pending, but the implementation is ready for review.
cmd/repo-updater/repos/github.go
var repos []*github.Repository
var rateLimitCost int
var err error
repos, hasNextPage, rateLimitCost, err = c.client.ListRepositoriesForSearch(ctx, repositoryQuery, page)
If the search endpoint has a separate rate limit, then I think there is a problem here, because the remaining rate limit is stored in the client. If a single client is shared between search and non-search requests, the stored rate limit will keep getting overwritten.
Ah, you're right. Fundamentally the client assumes that there's only one rate limit to track, and it'll use whichever one it last processed a response from.
How fancy do we want to get to do the right thing here? On the simple end is just using a fixed delay for the search queries, since we only make them from one place so far. At the fancier end the client could maintain multiple rate limits, maybe keyed by the API being called. It might also be possible to factor rate information out of the client and get it from individual responses as needed.
My inclination is to go with a simple fixed delay for the initial checkin, but I don't have much context for the importance here, so let me know if you'd prefer something more precise.
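To make the "fancier" option concrete, here is a minimal sketch of rate-limit state keyed by the API being called. The `rateLimits` type, its methods, and the category names "core" and "search" are hypothetical and are not taken from the actual client code in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// rateLimits is a hypothetical tracker that keeps one "remaining" counter
// per API category (for example "core" and "search") instead of a single
// shared value that the most recent response overwrites.
type rateLimits struct {
	mu        sync.Mutex
	remaining map[string]int
}

func newRateLimits() *rateLimits {
	return &rateLimits{remaining: make(map[string]int)}
}

// update records the remaining quota reported by a response for one category.
func (r *rateLimits) update(category string, remaining int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.remaining[category] = remaining
}

// get returns the last recorded quota for a category and whether one is known.
func (r *rateLimits) get(category string) (int, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	n, ok := r.remaining[category]
	return n, ok
}

func main() {
	limits := newRateLimits()
	// Illustrative numbers only: a search response no longer clobbers
	// the value recorded for non-search requests.
	limits.update("core", 4990)
	limits.update("search", 28)

	core, _ := limits.get("core")
	search, _ := limits.get("search")
	fmt.Println("core remaining:", core, "search remaining:", search)
}
```

The trade-off is that every caller has to name the category it is spending from, which is more plumbing than a fixed delay at the single search call site.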
A simple suggestion is to have two clients in the githubConnection struct. For this change you would add a new client c.searchClient that is only used right here. That way the rate limits are tracked independently.
When I made that change I realized that this also creates two independent repository caches, which is probably undesirable. I think either the client needs to be aware there are multiple rate limits, or the cache needs to be moved up to the connection struct. Or we could punt and put in a one-off delay here unless / until we have another caller for the search API.
My instinct would be to share the cache across two clients.
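Roughly, the shape I have in mind is sketched below: the connection builds one cache and hands it to both clients, while each client keeps its own rate-limit counter. All types and fields here (`repoCache`, `client`, `connection`, `rateLimitRemaining`) are hypothetical stand-ins, not the real definitions in github.go.

```go
package main

import (
	"fmt"
	"sync"
)

// repoCache is a hypothetical stand-in for the repository cache.
type repoCache struct {
	mu    sync.Mutex
	repos map[string]string // repository name -> description
}

func newRepoCache() *repoCache {
	return &repoCache{repos: make(map[string]string)}
}

func (c *repoCache) set(name, description string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.repos[name] = description
}

func (c *repoCache) get(name string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	d, ok := c.repos[name]
	return d, ok
}

// client is a hypothetical API client: it owns its rate-limit state
// but only borrows the cache it is given.
type client struct {
	rateLimitRemaining int
	cache              *repoCache
}

// connection holds one client for regular requests and one for search.
type connection struct {
	client       *client
	searchClient *client
}

// newConnection creates a single cache and shares it between both clients.
func newConnection() *connection {
	cache := newRepoCache()
	return &connection{
		client:       &client{cache: cache},
		searchClient: &client{cache: cache},
	}
}

func main() {
	conn := newConnection()

	// Each client tracks its own quota (illustrative numbers only).
	conn.client.rateLimitRemaining = 4990
	conn.searchClient.rateLimitRemaining = 28

	// A repository cached through the search client is visible to the other.
	conn.searchClient.cache.set("sourcegraph/sourcegraph", "found via search")
	desc, ok := conn.client.cache.get("sourcegraph/sourcegraph")
	fmt.Println(desc, ok)
}
```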
Done. I tried to do it with minimal exposure of client internals to the connection; see if this approach looks right to you.
As for tests, I tried the existing tests but the build seems to be broken:
Is this a genuine problem in the current repo or am I misconfigured somehow?
It looks like you have Sourcegraph checked out in GOPATH, which means Go modules are turned off by default. I filed #512 to fix our documentation. In the meantime you can try
Moving the repo out of GOPATH fixed the testing problem, thanks! There is still a broken test at HEAD though:
That is no longer a blocking problem for me though, so I'll proceed.
I filed an issue for the failing test
Addressed comments and added unit tests. I tried adding a test around the independent rate limits too, since
Thanks for adding tests!
Can you update the changelog and PR description?
Done on changelog, but the description looks up to date to me ("Support GitHub advanced search for repository selection" if that's what you meant), should I be looking somewhere else?
… On Oct 26, 2018, at 3:28 PM, Nick Snyder ***@***.***> wrote:
Can you update the changelog and PR description?
That is the title, I am referring to the first comment on this thread (which will become part of the commit message when we squash merge this into master). For example, you can delete this part at a minimum:
And optionally add a short description.
Ah! Sorry, I haven't done code review on GitHub before, I didn't realize which parts were included :-) Done.
… On Oct 26, 2018, at 4:41 PM, Nick Snyder ***@***.***> wrote:
Support GitHub advanced search for repository selection
That is the title, I am referring to the first comment on this thread (which will become part of the commit message when we squash merge this into master).
For example, you can delete this part at a minimum:
This is still in progress. The initial implementation uses the "orgs" property suggested by @nicksnyder <https://github.com/nicksnyder>, but the plan now is to revise it to follow @sqs <https://github.com/sqs>'s ideas later in the thread.
And optionally add a short description.
Thanks! Tests for @sqs any comments? |
Looks great!
Co-authored-by: Renovate Bot <bot@renovateapp.com>
Resolves #123.
With this change, if github.repositoryQuery is set to anything other than "public", "affiliated" or "none", it is treated as a GitHub advanced repository search.
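As a rough illustration of that rule, the dispatch could look like the sketch below. The `queryKind` function and the "search" label are hypothetical names chosen for this example, and the sketch follows the description literally, so any other value (including an empty string) falls through to search.

```go
package main

import "fmt"

// queryKind classifies a github.repositoryQuery value. The three reserved
// values are returned unchanged; anything else is treated as a GitHub
// advanced repository search. The "search" label is an assumption made
// for this sketch.
func queryKind(repositoryQuery string) string {
	switch repositoryQuery {
	case "public", "affiliated", "none":
		return repositoryQuery
	default:
		return "search"
	}
}

func main() {
	for _, q := range []string{"public", "affiliated", "none", "org:sourcegraph language:go"} {
		fmt.Printf("%q is handled as %s\n", q, queryKind(q))
	}
}
```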