Fix lost compaction data due to compaction properties missed during reset-cursor #12698

Merged

codelipenghui
Contributor

Motivation

Fix compacted data loss caused by compaction properties being dropped during reset-cursor.

  1. The compaction reader seeks to the earliest position to read data from the topic, but the compaction properties are dropped during the cursor reset. As a result, the compaction subscription is initialized without a compaction horizon, so the compaction reader skips the last compacted data. This only happens when the compaction subscription is initialized, so it can be triggered by load balancing or by manually unloading the topic.

  2. Advancing the cursor should also keep the properties; otherwise, the properties are lost during cursor trimming.
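
The failure mode in item 1 can be sketched with a toy model. This is only an illustration under stated assumptions, not the broker or managed ledger code: `Cursor`, `resetDroppingProperties`, `hasCompactionHorizon`, and the `HORIZON_KEY` name are all hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

final class CompactionHorizonSketch {
    // Hypothetical property key, chosen for this illustration only.
    static final String HORIZON_KEY = "compactedLedgerId";

    /** Toy stand-in for a subscription cursor: a position plus properties. */
    static final class Cursor {
        long markDeletePosition;
        Map<String, Long> properties = new HashMap<>();
    }

    /** Resets to the earliest position and drops the properties (the lossy behaviour). */
    static void resetDroppingProperties(Cursor c) {
        c.markDeletePosition = -1; // -1 stands for "earliest" in this toy model
        c.properties = new HashMap<>();
    }

    /** True if a reader initialized from this cursor could still locate the compacted data. */
    static boolean hasCompactionHorizon(Cursor c) {
        return c.properties.containsKey(HORIZON_KEY);
    }

    public static void main(String[] args) {
        Cursor c = new Cursor();
        c.markDeletePosition = 100;
        c.properties.put(HORIZON_KEY, 42L);

        resetDroppingProperties(c);

        // The horizon is gone, so the compacted data would be skipped.
        System.out.println(hasCompactionHorizon(c)); // prints "false"
    }
}
```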

Changes

  1. Keep the properties when resetting the cursor if the cursor is used for data compaction.
  2. Copy the properties to the new mark-delete entry when advancing the cursor. This is triggered internally by the managed ledger, so it is not specific to compacted topics: the internal task should not lose the properties when trimming the cursor.
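
A minimal sketch of the two changes, using a toy model rather than the actual managed ledger classes. `MarkDeleteEntry`, `reset`, `advance`, and `HORIZON_KEY` are hypothetical names, and clearing the properties for non-compaction cursors on reset is an assumption of this sketch.

```java
import java.util.Map;

final class KeepCursorPropertiesSketch {
    // Hypothetical property key, chosen for this illustration only.
    static final String HORIZON_KEY = "compactedLedgerId";

    /** Toy immutable mark-delete entry: a position plus cursor properties. */
    static final class MarkDeleteEntry {
        final long position;
        final Map<String, Long> properties;

        MarkDeleteEntry(long position, Map<String, Long> properties) {
            this.position = position;
            this.properties = Map.copyOf(properties);
        }
    }

    /** Change 1: a reset keeps the properties when the cursor is used for compaction. */
    static MarkDeleteEntry reset(MarkDeleteEntry current, long newPosition, boolean isCompactionCursor) {
        // Assumption: other cursors start from empty properties after a reset.
        Map<String, Long> kept = isCompactionCursor ? current.properties : Map.<String, Long>of();
        return new MarkDeleteEntry(newPosition, kept);
    }

    /** Change 2: advancing (trimming) copies the properties to the new entry for every cursor. */
    static MarkDeleteEntry advance(MarkDeleteEntry current, long newPosition) {
        return new MarkDeleteEntry(Math.max(current.position, newPosition), current.properties);
    }

    public static void main(String[] args) {
        MarkDeleteEntry entry = new MarkDeleteEntry(100, Map.of(HORIZON_KEY, 42L));
        MarkDeleteEntry afterReset = reset(entry, -1, true);
        MarkDeleteEntry afterAdvance = advance(afterReset, 50);
        System.out.println(afterAdvance.properties.containsKey(HORIZON_KEY)); // prints "true"
    }
}
```

In this model the horizon survives both a reset of the compaction cursor and a later internal advance, which is the property the change is meant to guarantee.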

Tests

New tests were added to make sure that compaction does not lose data during topic unloading and that the reader can read all the compacted data after the compaction task completes.

Documentation

Check the box below and label this PR (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

@codelipenghui codelipenghui self-assigned this Nov 9, 2021
@codelipenghui codelipenghui added this to the 2.10.0 milestone Nov 9, 2021
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 9, 2021
@codelipenghui codelipenghui merged commit 98e2c66 into apache:master Nov 10, 2021
@codelipenghui codelipenghui deleted the penghui/compaction_properties branch November 10, 2021 13:49
codelipenghui added a commit that referenced this pull request Nov 18, 2021
(cherry picked from commit 98e2c66)
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Nov 18, 2021
eolivelli pushed a commit to eolivelli/pulsar that referenced this pull request Nov 29, 2021
codelipenghui added a commit that referenced this pull request Dec 20, 2021
(cherry picked from commit 98e2c66)
@codelipenghui codelipenghui added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Dec 20, 2021
@codelipenghui codelipenghui restored the penghui/compaction_properties branch May 17, 2022 01:20
@codelipenghui codelipenghui deleted the penghui/compaction_properties branch May 17, 2022 01:28
@Technoboy- Technoboy- added the cherry-picked/branch-2.7 Archived: 2.7 is end of life label Jul 5, 2022
Labels
area/broker area/compaction cherry-picked/branch-2.7 Archived: 2.7 is end of life cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life doc-not-needed Your PR changes do not impact docs release/2.7.5 release/2.8.2 release/2.9.2

6 participants