
Conversation

@constanca-m
Contributor

Issue is #619.

Comment on lines 206 to 207
current := r.addRequests(uniqueKey, hits)
delay := time.Duration(resp.GetResetTime()-createdAt+int64(current)*cfg.ThrottleInterval.Milliseconds()) * time.Millisecond
Contributor Author


@vigneshshanmugam Can I ask your opinion on this logic before I do the same for the local rate limiter?

This bug was added to GA goals: #619

This implementation has some things that worry me, but I don't know if it makes sense to be worried about this:

  1. What if we receive tons of requests? The delay will keep increasing for each request, and then we will have many processes on hold. I think it might make sense to cap the delay at a fixed time, and after that time each request starts getting rejected, regardless of the strategy. WDYT?
  2. What if we receive requests like this:
    • We get a request that takes 10 tokens
    • We then get a request that takes 4 tokens
    • Both are delayed. Should the 4-token request take priority, and thus get a shorter delay, or should the 10-token request have priority since it arrived first?

Member


I don't think you need to do this. The issue here is that we are allowing all the clients to retry and send data simultaneously once we are past the reset time, which is wrong. It assumes the reset time guarantees the availability of tokens (requests/bytes, depending on the strategy).

The ideal way to fix this is basically to loop, retry after the delay, and re-check the rate limits each time. This respects all the configuration, like strategy/algorithm, while keeping the race condition in check.

Rough pseudocode would be:

for {
    // Re-check the rate limit on every iteration.
    resp := getCurrentLimits()
    if resp.IsUnderLimit {
        // Not limited anymore, we can proceed.
        return nil
    }

    // We shouldn't use createdAt here: it's flawed because it is based on when the
    // request was made and doesn't take retries into account.
    delay := time.Duration(resp.GetResetTime()-time.Now().UnixMilli()) * time.Millisecond
    // same wait code as before, then loop and re-check
}

Hope this helps. Let me know if you want more details.
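
A slightly more concrete version of that sketch, for reference. It assumes the gubernator status constant and accessors that show up later in this thread; getCurrentLimits remains a placeholder for whatever issues the actual rate-limit request:

for {
    // Re-check the limits on every retry instead of trusting a single response.
    resp, err := getCurrentLimits() // placeholder name, not the real helper
    if err != nil {
        return err
    }
    if resp.GetStatus() == gubernator.Status_UNDER_LIMIT {
        // Tokens are available again, let the request through.
        return nil
    }

    // Compute the delay from the current time rather than createdAt,
    // so repeated retries stay aligned with the reset window.
    delay := time.Duration(resp.GetResetTime()-time.Now().UnixMilli()) * time.Millisecond
    if delay < 0 {
        delay = 0
    }
    time.Sleep(delay)
}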

Contributor Author


Thanks! This helped. I have updated the code; it should be working correctly now.

@constanca-m constanca-m marked this pull request as ready for review August 18, 2025 12:26
@constanca-m constanca-m requested a review from a team as a code owner August 18, 2025 12:26
if err := makeRateLimitRequest(); err != nil {
    return err
}
if resp.GetStatus() == gubernator.Status_UNDER_LIMIT {
Member


This feels incorrect: we are using the old response, but we need to get the new response after the retry.

Contributor Author


It's the new response; it changed in the makeRateLimitRequest() call two lines above.

Member


Ack, it was quite confusing with the diff. Can we change the function to return resp, err so it's easier to reason about this? Thanks.

Contributor Author


Sure thing, I have changed the code
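
For context, the refactor being asked for could look roughly like this sketch. Only makeRateLimitRequest and the gubernator status check come from the diff above; everything else is illustrative:

// makeRateLimitRequest now returns the fresh response, so the caller always
// checks the result of the latest retry rather than an older one.
resp, err := makeRateLimitRequest()
if err != nil {
    return err
}
if resp.GetStatus() == gubernator.Status_UNDER_LIMIT {
    // The retry succeeded: we are back under the limit.
    return nil
}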

@constanca-m constanca-m merged commit 1dc8ec4 into elastic:main Aug 26, 2025
13 checks passed
@constanca-m constanca-m deleted the ratelimiter-reset branch August 26, 2025 05:05