-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix update ledger list to znode version mismatch failed, ledger not delete #12015
Fix update ledger list to znode version mismatch failed, ledger not delete #12015
Conversation
Good catch I am not sure that this applies to2.7.x |
return; | ||
} | ||
} | ||
|
||
synchronized (ManagedLedgerImpl.this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add the exception check here? We could record the lastLedgerCreationFailureTimestamp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gaoran10 I have add record the lastLedgerCreationFailureTimestamp
into the exception block. We couldn't merge the exception block here due to the metadataMutex
unlock.
@@ -1466,6 +1452,20 @@ public void operationFailed(MetaStoreException e) { | |||
|
|||
metadataMutex.unlock(); | |||
|
|||
if (e instanceof BadVersionException) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to put this logic before releasing the lock? Now we changed the previous behavior. I'm not sure if it will cause other thread safety issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
@hangc0276 Thanks for your contribution. For this PR, do we need to update docs? (The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks) |
@Anonymitaet I added "no-need-docs", thanks |
@hangc0276 can you please answer the comments ? |
2a9b2a1
to
bd74c58
Compare
…elete (#12015) ### Motivation When Zookeeper throws `Failed to update ledger list. z-node version mismatch. Closing managed ledger` exception when update ZNode list, it will not clean up the created ledger, which will lead to the new created ledger not be indexed to the topic managedLedger list, and can't be cleanup as topic retention. What's more, it will cause ZNode number increase in Zookeeper if the `z-node version mismatch` exception keeping throw out. The exception list as follow: ``` 10:44:29.017 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Created new ledger 67311140 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Failed to update ledger list. z-node version mismatch. Closing managed ledger 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO org.apache.pulsar.broker.service.Producer - Disconnecting producer: Producer{topic=PersistentTopic{topic=persistent://test/test/test_v1-partition-4}, client=/10.1.2.3:38938, producerName=pulsar-101-1123, producerId=20} ``` ### Modification 1. When updating ZNode list failed, delete the created ledger from broker cache and BookKeeper, regardless of whether the exception type is BadVersionException or not. (cherry picked from commit e7b0e3d)
…elete (apache#12015) ### Motivation When Zookeeper throws `Failed to update ledger list. z-node version mismatch. Closing managed ledger` exception when update ZNode list, it will not clean up the created ledger, which will lead to the new created ledger not be indexed to the topic managedLedger list, and can't be cleanup as topic retention. What's more, it will cause ZNode number increase in Zookeeper if the `z-node version mismatch` exception keeping throw out. The exception list as follow: ``` 10:44:29.017 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Created new ledger 67311140 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Failed to update ledger list. z-node version mismatch. Closing managed ledger 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO org.apache.pulsar.broker.service.Producer - Disconnecting producer: Producer{topic=PersistentTopic{topic=persistent://test/test/test_v1-partition-4}, client=/10.1.2.3:38938, producerName=pulsar-101-1123, producerId=20} ``` ### Modification 1. When updating ZNode list failed, delete the created ledger from broker cache and BookKeeper, regardless of whether the exception type is BadVersionException or not.
…elete (#12015) ### Motivation When Zookeeper throws `Failed to update ledger list. z-node version mismatch. Closing managed ledger` exception when update ZNode list, it will not clean up the created ledger, which will lead to the new created ledger not be indexed to the topic managedLedger list, and can't be cleanup as topic retention. What's more, it will cause ZNode number increase in Zookeeper if the `z-node version mismatch` exception keeping throw out. The exception list as follow: ``` 10:44:29.017 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Created new ledger 67311140 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Failed to update ledger list. z-node version mismatch. Closing managed ledger 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO org.apache.pulsar.broker.service.Producer - Disconnecting producer: Producer{topic=PersistentTopic{topic=persistent://test/test/test_v1-partition-4}, client=/10.1.2.3:38938, producerName=pulsar-101-1123, producerId=20} ``` ### Modification 1. When updating ZNode list failed, delete the created ledger from broker cache and BookKeeper, regardless of whether the exception type is BadVersionException or not. (cherry picked from commit e7b0e3d)
…elete (apache#12015) ### Motivation When Zookeeper throws `Failed to update ledger list. z-node version mismatch. Closing managed ledger` exception when update ZNode list, it will not clean up the created ledger, which will lead to the new created ledger not be indexed to the topic managedLedger list, and can't be cleanup as topic retention. What's more, it will cause ZNode number increase in Zookeeper if the `z-node version mismatch` exception keeping throw out. The exception list as follow: ``` 10:44:29.017 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Created new ledger 67311140 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Failed to update ledger list. z-node version mismatch. Closing managed ledger 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO org.apache.pulsar.broker.service.Producer - Disconnecting producer: Producer{topic=PersistentTopic{topic=persistent://test/test/test_v1-partition-4}, client=/10.1.2.3:38938, producerName=pulsar-101-1123, producerId=20} ``` ### Modification 1. When updating ZNode list failed, delete the created ledger from broker cache and BookKeeper, regardless of whether the exception type is BadVersionException or not.
…elete (#12015) ### Motivation When Zookeeper throws `Failed to update ledger list. z-node version mismatch. Closing managed ledger` exception when update ZNode list, it will not clean up the created ledger, which will lead to the new created ledger not be indexed to the topic managedLedger list, and can't be cleanup as topic retention. What's more, it will cause ZNode number increase in Zookeeper if the `z-node version mismatch` exception keeping throw out. The exception list as follow: ``` 10:44:29.017 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Created new ledger 67311140 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [test/test/persistent/test_v1-partition-4] Failed to update ledger list. z-node version mismatch. Closing managed ledger 10:44:29.018 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO org.apache.pulsar.broker.service.Producer - Disconnecting producer: Producer{topic=PersistentTopic{topic=persistent://test/test/test_v1-partition-4}, client=/10.1.2.3:38938, producerName=pulsar-101-1123, producerId=20} ``` ### Modification 1. When updating ZNode list failed, delete the created ledger from broker cache and BookKeeper, regardless of whether the exception type is BadVersionException or not. (cherry picked from commit e7b0e3d)
Motivation
When Zookeeper throws
Failed to update ledger list. z-node version mismatch. Closing managed ledger
exception when update ZNode list, it will not clean up the created ledger, which will lead to the new created ledger not be indexed to the topic managedLedger list, and can't be cleanup as topic retention. What's more, it will cause ZNode number increase in Zookeeper if thez-node version mismatch
exception keeping throw out. The exception list as follow:Modification