atomicInt64Limiter WithoutSlack doesn't block #90
Comments
Reverting - @storozhukBM we can re-do the whole PR again (possibly add some tests earlier?)
@rabbbit So we should either find a concurrent mock clock implementation, fix "github.com/benbjohnson/clock", or write our own.
I'm sorry, I won't be able to look at this in detail for probably the next ~10 days (limited laptop access). For now, I can revert the clock implementation change if you think that's better/more stable - the tests were (mostly?) stable there. I'm opposed to implementing a clock as part of this package :) Looks like we'll be stuck with the old implementation for a while - "code freeze" until we fix the tests.
@rabbbit agree, I'll look at possible options for time mocking and will fix our tests in a following PR.
This limiter was introduced and merged in PR #85. Later, @twelsh-aw found an issue with this implementation (#90), so @rabbbit reverted the change in #91. Our tests did not detect this issue, so we have a separate PR, #93, that enhances our testing approach to better detect potential errors. With this PR, we want to restore the int64-based atomic rate limiter implementation as a non-default rate limiter and then check that #93 detects the bug. Right after that, we'll open a subsequent PR to fix the bug.
Ack. This is pretty high on my priority list, will try to get to this soon. Again, sorry for the delay.
In #90 @twelsh-aw found a bug in a new implementation. This turned out to be caused by us mocking time. Since time mocking will always carry some risk, this diff proposes to expand the `examples` we're using: since they use "real" time, they should be good enough to cover most basic cases like #90.
In particular:
- we add a "withoutSlack" test so that the exact case reported in #90 doesn't happen again. The no-slack option seems common enough to warrant it.
- we update the "withSlack" example to actually show how slack operates. Due to uneven execution times I'm forced to round the times a bit.
Possible issues:
- test stability: I re-ran the test 1000 times without issues - the timing seems to be stable.
- test duration: in total we're extending the examples by 5ms, which shouldn't be noticeable to a human.
This PR fixes the issue found by @twelsh-aw with the int64-based implementation (#90). Our tests did not detect this issue, so we have a separate PR, #93, that enhances our testing approach to better detect potential errors.
@rabbbit We already have a fix in place. What do you think about making atomicInt64Limiter the default implementation again and letting people try it before cutting a new release tag? Or we can make this implementation public so that users can instantiate it. I'd like the new testing approach to be merged first, but if you're unsure about it, we can skip it for now.
The new atomic version seems to perform so much better that it feels we should enable it. However, after that, we might just have to declare a code freeze until someone else has more cycles for reviews. Sadly this includes the new time-testing; I would need to look at it in much more detail before merging it in.
@rabbbit OK, so should I make a PR now?
SGTM; also, thanks for pushing this through :)
Closing as fixed - we reverted the bad change in #91, @storozhukBM fixed it in #95, but we were defaulting to the old implementation. #101 makes the new (faster) implementation the default.
@twelsh-aw this change is now merged, can you please try again and give us your feedback?
thank you @rabbbit
First off thanks for the library :)
Saw that a new rate limiter was introduced that benchmarked a lot better and pulled it down to try it out.
Noticed that when running WithoutSlack, it just allows everything through instead of waiting, because all subsequent Take() calls fall into the `case now-timeOfNextPermissionIssue > int64(t.maxSlack)` branch.
Easiest way to repro is using your `example_test.go`:

`rl := ratelimit.New(100)` (slack=10):

    go test -run Example -count=1
    === RUN Example
    --- PASS: Example (0.09s)
    PASS
    ok command-line-arguments 0.207s

`rl := ratelimit.New(100, ratelimit.WithoutSlack)`:

    go test -run Example -count=1
    --- FAIL: Example (0.01s)
    got:
    1 10ms
    2 775µs
    3 3µs
    4 2µs
    5 10µs
    6 2µs
    7 2µs
    8 2µs
    9 2µs
    want:
    1 10ms
    2 10ms
    3 10ms
    4 10ms
    5 10ms
    6 10ms
    7 10ms
    8 10ms
    9 10ms
    FAIL
    FAIL command-line-arguments 0.126s
    FAIL

`rl := newAtomicBased(100, WithoutSlack)`:

    go test -run Example -count=1
    PASS
    ok go.uber.org/ratelimit 0.323s

I am not 100% sure why the other unit tests with mocked clocks are passing, but your example test and my application tests fail consistently with this new limiter. I'm on darwin, if that helps.