New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-62479] Automatically retry rate limited Bitbucket Server requests with exponential backoff #581
Conversation
…ly on getting rate limit error.
…terruptedException
private static final Duration API_RATE_LIMIT_INITIAL_SLEEP = Duration.ofSeconds(5); | ||
private static final Duration API_RATE_LIMIT_MAX_SLEEP = Duration.ofMinutes(30); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#483 retried every 5 seconds, up to 300 times, for a total of 25 minutes of retries. This PR uses exponential backoff, and in the worst case would wait around 27 minutes, but in those 27 minutes it would only retry around 13 times.
I am not sure what settings would make the most sense here, so I just kept roughly the same timeframe from the earlier PR. Perhaps this should be lowered to only a few minutes with non-exponential backoff. Based on my testing, the in-product Bitbucket Server documentation, and https://confluence.atlassian.com/bitbucketserver0721/improving-instance-stability-with-rate-limiting-1115666697.html, it seems like anything more than a few minutes is probably unlikely to help in practice since the rate limit quota is refreshed on a per-second basis (unless maybe you have extremely bursty usage ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Given that each server can have its own settings (https://confluence.atlassian.com/bitbucketserver/improving-instance-stability-with-rate-limiting-976171954.html) I think this is a good trade-off.
@@ -335,7 +342,7 @@ public String getRepositoryUri(@NonNull BitbucketRepositoryProtocol protocol, | |||
return pullRequests; | |||
} | |||
|
|||
private void setupPullRequest(BitbucketServerPullRequest pullRequest, BitbucketServerEndpoint endpoint) throws IOException { | |||
private void setupPullRequest(BitbucketServerPullRequest pullRequest, BitbucketServerEndpoint endpoint) throws IOException, InterruptedException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of catching InterruptedException
and just propagating it as a flag with Thread.interrupt()
like in #483, I chose to just propagate the InterruptedException
directly since the BitbucketApi
interface that this class implements already has throws InterruptedException
in its signature.
...ain/java/com/cloudbees/jenkins/plugins/bitbucket/server/client/BitbucketServerAPIClient.java
Outdated
Show resolved
Hide resolved
* TODO: It would be better to log this to a context-appropriate TaskListener, e.g. an org/repo scan log. | ||
*/ | ||
LOGGER.log(Level.FINE, "Bitbucket server API rate limit reached, sleeping for {0} before retrying", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A notable side-effect of not being able to log to a TaskListener
here is that in things like org scan logs, there will be no indication that rate limiting is occurring, the log will just appear to hang. I do not see any way to improve that though without significant API changes, and the Bitbucket Cloud API client rate limit retrying has the same issue.
...ain/java/com/cloudbees/jenkins/plugins/bitbucket/server/client/BitbucketServerAPIClient.java
Outdated
Show resolved
Hide resolved
…t resets the request
private static final Duration API_RATE_LIMIT_INITIAL_SLEEP = Main.isUnitTest ? Duration.ofMillis(100) : Duration.ofSeconds(5); | ||
private static final Duration API_RATE_LIMIT_MAX_SLEEP = Duration.ofMinutes(30); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the best values here, see #581 (comment) for more details.
This would presumably fix [JENKINS-62479] Rate limit reached for Bitbucket server - Jenkins Jira. Other issues related to rate limiting: |
Thanks @dwnusbaum - this is great. I suppose at some point these changes would benefit the Cloud-Client as well (backoff) :) Will release soon. |
@KalleOlaviNiemitalo Thanks for the issue references, I updated the PR title/description. @lifeofguenter Great, let me know if you have any thoughts about the timing-related questions I noted in #581 (comment). |
See JENKINS-62479.
The Bitbucket Cloud API client in this plugin has had basic functionality to retry rate limited requests since #65, but the Bitbucket Server API client has no such functionality. This PR, built on top of #483, adds a basic retry loop when requests fail due to rate limiting, and backs off exponentially between retries.
I made some modifications to
BitbucketIntegrationClientFactory
so that I could add a very basic test inBitbucketServerAPIClientTest
that rate limits a single response to demonstrate that the basic behavior works.I also tested the PR manually against Bitbucket Server 7 using the latest Bitbucket Server 7 Docker images with a trial license. I enabled rate limiting and set very low limits to ensure that the new code was executed and verified that an org scan and multibranch project that interacted with the repo worked successfully.
Your checklist for this pull request