[pip] PIP-430: Pulsar Broker cache improvements: refactoring eviction and adding a new cache strategy based on expected read count #24444

lhotari · 2025-06-23T18:40:54Z

Motivation

I'd like to propose PIP-430, which addresses performance and
efficiency issues in Pulsar broker's entry cache eviction mechanisms
and introduces a more efficient caching strategy.

The current broker entry cache implementation has several
production-impacting issues. Size-based eviction doesn't guarantee
removal of the globally oldest entries, leading to suboptimal cache
utilization. More critically, timestamp-based eviction iterates
through all ManagedLedgers every 10ms by default, causing high CPU
utilization in brokers with many topics. Mixed read patterns such as
tailing, catch-up, and Key_Shared replays break the eviction assumptions,
resulting in unnecessary BookKeeper and S3 reads that increase
operational costs.

PIP-430 introduces two main improvements.
First, it adds a centralized eviction mechanism using a global
RangeCacheRemovalQueue that tracks all cached entries in insertion
order. This replaces the expensive per-ledger iteration with a single
periodic task and ensures true oldest-first eviction globally. The
implementation PR for this part is #24363.
Second, it adds a new "expected read count" cache strategy in which
entries track how many active cursors are anticipated to read them.
This allows the cache to retain entries that have higher utility,
especially in high fan-out catch-up read scenarios and Key_Shared
subscriptions.
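
To make the two ideas concrete, here is a minimal Java sketch of a single insertion-ordered removal queue combined with a per-entry expected read count. It is not the code from #24363 or the PoC: the class and method names (GlobalRemovalQueueSketch, CachedEntry, markRead, evict) are hypothetical, and the policy of treating a "max" threshold as a hard upper bound on retention is an assumption made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; all names and policy details are hypothetical. */
final class GlobalRemovalQueueSketch {

    /** A cached entry with its insertion time and the number of reads still anticipated. */
    static final class CachedEntry {
        final String topic;
        final long entryId;
        final long insertedAtMillis;
        final AtomicInteger expectedReadCount;

        CachedEntry(String topic, long entryId, long insertedAtMillis, int expectedReads) {
            this.topic = topic;
            this.entryId = entryId;
            this.insertedAtMillis = insertedAtMillis;
            this.expectedReadCount = new AtomicInteger(expectedReads);
        }

        /** Called when a cursor reads the entry; the count never drops below zero. */
        void markRead() {
            expectedReadCount.updateAndGet(c -> Math.max(0, c - 1));
        }
    }

    // One queue shared by all topics: append order equals insertion order.
    private final Deque<CachedEntry> queue = new ArrayDeque<>();

    synchronized void add(CachedEntry entry) {
        queue.addLast(entry);
    }

    /**
     * One eviction pass over the global queue, oldest entries first.
     * Assumed policy: an entry older than thresholdMillis is evicted once no
     * reads are expected; an entry older than maxThresholdMillis is evicted
     * regardless of its expected read count.
     */
    synchronized List<CachedEntry> evict(long nowMillis, long thresholdMillis,
                                         long maxThresholdMillis) {
        List<CachedEntry> evicted = new ArrayList<>();
        Iterator<CachedEntry> it = queue.iterator();
        while (it.hasNext()) {
            CachedEntry entry = it.next();
            long age = nowMillis - entry.insertedAtMillis;
            if (age < thresholdMillis) {
                break; // insertion order: every later entry is younger still
            }
            if (entry.expectedReadCount.get() == 0 || age >= maxThresholdMillis) {
                it.remove();
                evicted.add(entry);
            }
        }
        return evicted;
    }

    public static void main(String[] args) {
        GlobalRemovalQueueSketch cache = new GlobalRemovalQueueSketch();
        cache.add(new CachedEntry("topic-a", 1L, 0L, 0));  // nobody is expected to read this
        cache.add(new CachedEntry("topic-b", 7L, 10L, 2)); // two cursors still expected
        List<CachedEntry> evicted = cache.evict(1_000L, 500L, 5_000L);
        System.out.println("evicted entries: " + evicted.size());
    }
}
```

Because the queue is global and ordered by insertion time, the pass can stop at the first entry that is younger than the threshold, instead of visiting every ManagedLedger on each run.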

The benefits include reduced CPU overhead, improved cache hit rates
through better eviction decisions, and proper handling of diverse read
patterns. The new strategy is configurable via
cacheEvictionByExpectedReadCount (default: true) and maintains full
backward compatibility with no client-facing API changes.

This addresses long-standing performance issues that particularly
affect production deployments with high topic counts or diverse
consumption patterns. The refactored architecture also provides a
solid foundation for future cache optimizations.

The full proposal can be found at: #24444
Rendered PIP document:
https://github.com/lhotari/pulsar/blob/lh-pip-430/pip/pip-430.md

I welcome your feedback and discussion on this proposal. Please share
your thoughts, concerns, or suggestions.

Mailing list discussion: https://lists.apache.org/thread/o1ozbg468kxfd38pxk2ppzsstdnxnok2

Documentation

doc
doc-required
doc-not-needed
doc-complete


lhotari · 2025-06-23T18:55:12Z

Implementation related:

- The first part is implemented in [refactor][ml] Replace cache eviction algorithm with centralized removal queue and job #24363
- There's a WIP PoC of the second part in [WIP] Add cacheEvictionByExpectedReadCount solution lhotari/pulsar#209
  - This doesn't implement managedLedgerCacheEvictionTimeThresholdMillisMax yet.

[pip] PIP-430: Pulsar Broker cache improvements: refactoring eviction and adding a new cache strategy based on expected read count

549d70c

lhotari added the ready-to-test label Jun 23, 2025

github-actions bot added the PIP and doc labels Jun 23, 2025

Add mailing list discussion thread link

951f50b
