[improve] [pip] PIP-348: Trigger offload on topic load stage #22650

Merged 4 commits on May 13, 2024
40 changes: 40 additions & 0 deletions pip/pip-348.md
@@ -0,0 +1,40 @@
# PIP-348: Trigger offload on topic load stage

# Background knowledge

Pulsar tiered storage was introduced by [PIP-17](https://github.com/apache/pulsar/wiki/PIP-17:-Tiered-storage-for-Pulsar-topics) to offload cold data from BookKeeper to external storage. The ledger is the basic unit of offload, and a ledger can trigger offload only when it rolls over. Topic offload can be triggered in the following ways:
- Manually trigger offload by using the `bin/pulsar-admin` command.
- Automatically trigger offload by the offload policy.


# Motivation
The offload policy is the most common way to trigger offload. It can be defined at the cluster level, namespace level, or topic level, and it triggers offload through the following steps:
- A ledger is closed or rolled over.
- The offload policy is checked.
- If the offload policy is satisfied, offload is triggered.
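
The check in the steps above can be sketched as a small, self-contained model. This is only an illustration under stated assumptions, not Pulsar's actual code: it assumes a size-threshold policy, and `OffloadPolicySketch`, `Ledger`, and `ledgersToOffload` are hypothetical names. Treating a negative threshold as "automatic offload disabled" is also an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a size-threshold offload check (hypothetical, not Pulsar code). */
final class OffloadPolicySketch {

    /** Hypothetical view of a ledger holding only the fields the check needs. */
    static final class Ledger {
        final long id;
        final long sizeBytes;
        final boolean closed;    // a ledger that is still open is never offloaded
        final boolean offloaded; // already copied to external storage

        Ledger(long id, long sizeBytes, boolean closed, boolean offloaded) {
            this.id = id;
            this.sizeBytes = sizeBytes;
            this.closed = closed;
            this.offloaded = offloaded;
        }
    }

    /**
     * Walks the ledgers from newest to oldest and sums their sizes. Once the
     * running total exceeds thresholdBytes, every closed ledger that is not yet
     * offloaded is selected. The result is ordered oldest first.
     */
    static List<Long> ledgersToOffload(List<Ledger> oldestFirst, long thresholdBytes) {
        List<Long> result = new ArrayList<>();
        if (thresholdBytes < 0) {
            return result; // assumption: negative threshold disables the policy
        }
        long retainedBytes = 0;
        for (int i = oldestFirst.size() - 1; i >= 0; i--) {
            Ledger ledger = oldestFirst.get(i);
            if (ledger.offloaded) {
                continue; // already in external storage, nothing to do
            }
            retainedBytes += ledger.sizeBytes;
            if (retainedBytes > thresholdBytes && ledger.closed) {
                result.add(0, ledger.id); // prepend to keep oldest-first order
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Ledger> ledgers = List.of(
                new Ledger(1, 100, true, false),
                new Ledger(2, 100, true, false),
                new Ledger(3, 100, true, false),
                new Ledger(4, 50, false, false)); // current, still-open ledger
        System.out.println(ledgersToOffload(ledgers, 120));
    }
}
```

In this model the still-open ledger counts toward the retained size but is never selected, which mirrors the statement that a ledger can be offloaded only after it has been closed.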

If a topic has multiple ledgers and the latest ledger rollover triggers offload, all the previous ledgers are added to the pending offload queue and offloaded one by one. However, if the topic is unloaded and then loaded again, the offload process is interrupted and has to wait for the next ledger rollover before it is triggered again. As a result, the offload process is inefficient and is not triggered in time.


# Goals

## In Scope

Trigger offload at the topic load stage to make the offload process more efficient and to make sure it is triggered in time.


# Detailed Design

## Design & Implementation Details

When a topic is loaded, we check whether the offload policy is satisfied. If it is, we trigger offload immediately instead of waiting for the next ledger rollover. This makes the offload process more efficient and makes sure it is triggered in time.

To reduce the impact on topic loading when Pulsar is upgraded from an older version, I introduce a flag named `triggerOffloadOnTopicLoad` to control whether this feature is enabled.
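
The effect of the flag can be illustrated with a toy model. This is a sketch under stated assumptions rather than the actual implementation: `TopicLoadOffloadSketch` and its methods are hypothetical names, the pending offload queue is assumed to live only in memory (so an unload drops it), and the policy check is assumed to be satisfied for every closed ledger. Only the flag name `triggerOffloadOnTopicLoad` comes from this proposal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of triggering offload on topic load; all names are hypothetical. */
final class TopicLoadOffloadSketch {
    // Flag name taken from the PIP; how Pulsar wires it in is not modeled here.
    private final boolean triggerOffloadOnTopicLoad;
    // Closed ledgers that are still only on BookKeeper. Survives unload.
    private final List<Long> notYetOffloaded = new ArrayList<>();
    // Pending offload queue. Assumed to be in-memory only, so unload clears it.
    private final Deque<Long> pendingOffload = new ArrayDeque<>();

    TopicLoadOffloadSketch(boolean triggerOffloadOnTopicLoad, List<Long> closedLedgers) {
        this.triggerOffloadOnTopicLoad = triggerOffloadOnTopicLoad;
        this.notYetOffloaded.addAll(closedLedgers);
    }

    /** Stand-in for the policy check: assumes every closed ledger qualifies. */
    private void checkPolicyAndEnqueue() {
        pendingOffload.clear();
        pendingOffload.addAll(notYetOffloaded);
    }

    /** Existing trigger: a ledger rollover closes a ledger and runs the check. */
    void onLedgerRollover(long closedLedgerId) {
        notYetOffloaded.add(closedLedgerId);
        checkPolicyAndEnqueue();
    }

    /** Offloads one queued ledger; returns false when the queue is empty. */
    boolean offloadNext() {
        Long id = pendingOffload.poll();
        if (id == null) {
            return false;
        }
        notYetOffloaded.remove(id); // List.remove(Object), removes by value
        return true;
    }

    /** Unloading the topic interrupts offload: the in-memory queue is lost. */
    void unload() {
        pendingOffload.clear();
    }

    /** New trigger: on load, run the policy check only if the flag is enabled. */
    void load() {
        if (triggerOffloadOnTopicLoad) {
            checkPolicyAndEnqueue();
        }
    }

    int pendingCount() {
        return pendingOffload.size();
    }

    int notYetOffloadedCount() {
        return notYetOffloaded.size();
    }

    public static void main(String[] args) {
        for (boolean flag : new boolean[] {false, true}) {
            TopicLoadOffloadSketch topic = new TopicLoadOffloadSketch(flag, List.of(1L, 2L, 3L));
            topic.onLedgerRollover(4L); // queues ledgers 1 to 4
            topic.offloadNext();        // ledger 1 is offloaded
            topic.unload();             // remaining queue entries are dropped
            topic.load();
            System.out.println("flag=" + flag + " pending after reload=" + topic.pendingCount());
        }
    }
}
```

With the flag disabled, the model leaves the remaining ledgers unqueued after a reload until another rollover happens. With the flag enabled, they are queued again as soon as the topic is loaded.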

# Backward & Forward Compatibility

Fully compatible.

# Links
* Mailing List discussion thread: https://lists.apache.org/thread/2ndomp8v4wkcykzthhlyjqfmswor88kv
* Mailing List voting thread: https://lists.apache.org/thread/q4mfn8x69hbgv19nmqx4dmknl3vsn9y8