[Enhancement] Configurable dispatch rate limiter backoff to reduce the 1-second latency penalty when limits are reached #24036

Open
@lhotari

Description

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

The current dispatch rate limiter implementation introduces a fixed 1-second additional latency when the rate limit is reached. This delay is hardcoded as MESSAGE_RATE_BACKOFF_MS = 1000 in the PersistentTopic class:

public static final int MESSAGE_RATE_BACKOFF_MS = 1000;

Solution

Simply making MESSAGE_RATE_BACKOFF_MS configurable would be insufficient for several reasons:

  1. Token Replenishment Frequency: Tokens are currently added to the rate limiter once per second. If the backoff time were reduced (e.g., to 100ms) without changing the token addition frequency, dispatchers would check for available tokens too frequently, wasting CPU resources.

  2. Implementation Differences:

    • Classic RateLimiterImpl: Uses a scheduled job to add permits periodically (controlled by ratePeriod).
    • PIP-322 AsyncTokenBucket: Calculates tokens on-demand when the limiter is used (controlled by addTokensResolutionNanos), enabling better scaling to millions of rate limiter instances without any overhead of scheduled jobs.
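The on-demand approach can be illustrated with a minimal sketch. This is not Pulsar's AsyncTokenBucket; the class name OnDemandTokenBucket, its fields, and the injected clock are hypothetical and exist only for illustration. The point it shows is that tokens are computed from elapsed time when the limiter is used, so no scheduled job is needed, and that the resolution setting bounds how often the refill calculation actually runs.

```java
import java.util.function.LongSupplier;

/** Hypothetical on-demand token bucket (illustration only, not Pulsar code). */
public class OnDemandTokenBucket {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private final long capacity;
    private final long ratePerSecond;
    private final long resolutionNanos;   // minimum time between refill calculations
    private final LongSupplier clockNanos; // injected clock, e.g. System::nanoTime
    private long tokens;
    private long lastRefillNanos;

    public OnDemandTokenBucket(long capacity, long ratePerSecond,
                               long resolutionNanos, LongSupplier clockNanos) {
        this.capacity = capacity;
        this.ratePerSecond = ratePerSecond;
        this.resolutionNanos = resolutionNanos;
        this.clockNanos = clockNanos;
        this.tokens = capacity;
        this.lastRefillNanos = clockNanos.getAsLong();
    }

    /** Adds tokens for the elapsed time; no background thread is involved. */
    private void refill() {
        long elapsed = clockNanos.getAsLong() - lastRefillNanos;
        if (elapsed < resolutionNanos) {
            return; // too soon since the last calculation, skip the work
        }
        long newTokens = elapsed * ratePerSecond / NANOS_PER_SECOND;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            // Advance only by the time that was converted into whole tokens,
            // so the fractional remainder counts toward the next refill.
            lastRefillNanos += newTokens * NANOS_PER_SECOND / ratePerSecond;
        }
    }

    /** Consumes n tokens if available; returns false when the limit is reached. */
    public synchronized boolean tryConsume(long n) {
        refill();
        if (tokens >= n) {
            tokens -= n;
            return true;
        }
        return false;
    }
}
```

With a rate of 10 tokens per second and a 100 ms resolution, a caller that drains the bucket sees the next token after 100 ms rather than after a full second, which is the property a shorter backoff would rely on.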

The backoff mechanism is triggered in the AbstractBaseDispatcher class:

    if (readLimits.getLeft() == 0 || readLimits.getRight() == 0) {
        if (log.isDebugEnabled()) {
            log.debug("[{}] message-read exceeded {} message-rate {}/{}, schedule after {}ms", getName(),
                    limiterType.name().toLowerCase(),
                    rateLimiter.getDispatchRateOnMsg(), rateLimiter.getDispatchRateOnByte(),
                    MESSAGE_RATE_BACKOFF_MS);
        }
        reScheduleRead();
        readLimits.setLeft(-1);
        readLimits.setRight(-1L);
        return false;
    }
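The coupling between the backoff and the token replenishment interval can be made concrete with a rough sketch. It assumes the worst case, where the limit is hit immediately after a replenishment and every rescheduled read before the next replenishment finds no tokens. The class BackoffPollingSketch and the method wastedRetries are hypothetical names for illustration and are not part of Pulsar.

```java
/** Hypothetical illustration of retries that find no tokens (not Pulsar code). */
public class BackoffPollingSketch {

    /**
     * Counts rescheduled reads that fire before the next replenishment.
     * Assumes the limit was hit at t = 0, right after tokens were added,
     * and that new tokens only arrive at t = replenishIntervalMs.
     */
    public static int wastedRetries(long replenishIntervalMs, long backoffMs) {
        if (replenishIntervalMs <= 0 || backoffMs <= 0) {
            throw new IllegalArgumentException("intervals must be positive");
        }
        int wasted = 0;
        for (long t = backoffMs; t < replenishIntervalMs; t += backoffMs) {
            wasted++; // this retry runs, finds no tokens, and reschedules again
        }
        return wasted;
    }

    public static void main(String[] args) {
        // Backoff equal to the replenishment interval: no wasted retries.
        System.out.println(wastedRetries(1000, 1000)); // prints 0
        // Backoff reduced to 100 ms with tokens still added once per second.
        System.out.println(wastedRetries(1000, 100));  // prints 9
    }
}
```

Under these assumptions, a 100 ms backoff with once-per-second replenishment yields up to nine fruitless reschedules per dispatcher per second, which is why the backoff and the replenishment frequency would need to change together.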

Possible solution:

Fairness considerations (fairness as defined in fair queuing and fairness measure):

The current fixed 1-second backoff, which matches the 1-second token replenishment interval, may inadvertently provide some level of fairness in resource allocation. Changing this ratio could impact the fairness properties of the system.

Fairness is currently unaddressed in Pulsar's dispatch rate limiting. Addressing fairness is crucial to improving Pulsar's rate limiting and capacity management capabilities as described in the Pulsar 4.0 blog post.

To make Pulsar competitive with Confluent's Kora, which according to the Kora paper includes features like "backpressure and auto-tuning" and "dynamic quota management," we need to enhance Pulsar's approach to fairness in resource allocation, including in dispatch rate limiting.

Alternatives

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
