Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added rate-limit response error message #8826

Merged
merged 2 commits into from Jan 25, 2022
Merged

Conversation

dixler
Copy link
Contributor

@dixler dixler commented Jan 24, 2022

Description

Added handler for 429 rate limit response from pulumi service.

Fixes #7963

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • Yes, there are changes in this PR that warrants bumping the Pulumi Service API version

@codecov
Copy link

codecov bot commented Jan 24, 2022

Codecov Report

Merging #8826 (514c836) into master (1181311) will decrease coverage by 17.17%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #8826       +/-   ##
===========================================
- Coverage   59.39%   42.21%   -17.18%     
===========================================
  Files         639      632        -7     
  Lines       98051    95986     -2065     
  Branches     1386     1390        +4     
===========================================
- Hits        58235    40524    -17711     
- Misses      36533    52722    +16189     
+ Partials     3283     2740      -543     
Impacted Files Coverage Δ
pkg/backend/httpstate/client/api.go 66.18% <100.00%> (+0.98%) ⬆️
pkg/codegen/docs.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/codegen/go/gen_spill.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/codegen/dotnet/templates.go 0.00% <0.00%> (-100.00%) ⬇️
sdk/go/common/resource/stack.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/codegen/hcl2/model/format/func.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/codegen/python/python.go 0.00% <0.00%> (-97.80%) ⬇️
pkg/codegen/go/gen_program_optionals.go 0.00% <0.00%> (-94.21%) ⬇️
pkg/codegen/dotnet/gen_program.go 0.00% <0.00%> (-90.03%) ⬇️
pkg/codegen/go/utilities.go 0.00% <0.00%> (-90.00%) ⬇️
... and 166 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1181311...514c836. Read the comment docs.

@iwahbe
Copy link
Member

iwahbe commented Jan 25, 2022

Is the goal to just error out, or implement exponential backoff?
CC @EvanBoyle

@Frassle
Copy link
Member

Frassle commented Jan 25, 2022

I would of thought we'd do automatic backoff in response to this status code. Otherwise we're going to start hard failing a load of deployments that previously would of been fine.

@mikhailshilkov
Copy link
Member

@EvanBoyle Is there some kind of spec of the service behavior and expectations? Ideally, the service would return retry-after that we would respect here? I'd say we need to update #7963 with more details before we can ship this change.

@EvanBoyle
Copy link
Contributor

cc @jkisk who has been driving the service work to provide more detail via https://github.com/pulumi/pulumi-service/issues/6933

My understanding of WAF and our rate limiting is that thresholds are measured over five minute periods, meaning that exponential backoff isn't feasible. I'd expect this to be "just handle the error gracefully and return a clear message to the user about what has happened" but will let Jamie provide more detail here.

@dixler dixler changed the title added rate-limit response handling added rate-limit response ~handling~ error message Jan 25, 2022
@dixler dixler changed the title added rate-limit response ~handling~ error message added rate-limit response error message Jan 25, 2022
@dixler
Copy link
Contributor Author

dixler commented Jan 25, 2022

replaced handling in title with error message for more clarity.

@jkisk
Copy link
Contributor

jkisk commented Jan 25, 2022

I would of thought we'd do automatic backoff in response to this status code. Otherwise we're going to start hard failing a load of deployments that previously would of been fine.

Ok, I've chatted with Evan and Fraser and given that we are planning to initially set this limit at 5x or so our current highest usage, we don't expect this to be hit. There is a cloudwatch alarm attached so the service on call will be aware if it is triggered, and this error message will give useful feedback to the user in that case. There is more to do here when we start passing along 429s from third parties like Github, but at that point we can pass along a custom header to differentiate those errors.

@jkisk jkisk requested a review from Frassle January 25, 2022 17:32
@dixler dixler merged commit f99affc into master Jan 25, 2022
@pulumi-bot pulumi-bot deleted the dixler/7963/handle-ratelimit branch January 25, 2022 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gracefully handle 429 from pulumi service backend
6 participants