ARTEMIS-3340 layered over ARTEMIS-2716 - activation sequence tracking to protect the journal #3646
Leaving this as a draft pending a complete test run and some conversation. These are new policies, so the behaviour is more restrictive, as indicated in the tests. I think activationSequence captures the intent nicely, but I am open to suggestions for better naming. There are some nodeManager behaviours that may need a revisit.
0f149b6 to 0ddd602
Added some more tests to validate activation tracking and extended the 'primary' policy to allow explicitly setting the nodeID. I'm not sure why that is a UUID and not human-readable; there is already a UUID identifier. In any event, this allows 'peer' primary brokers to coordinate with each other on a shared identity, which can support multi-primary or peer mode. The activation sequence ensures that they don't step on each other, as only the server with the current activation sequence will activate/go live. In the event of both crashing while replicated, any restarted server can activate.
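To make the activation-sequence rule concrete, here is a minimal sketch of how I read it, not the code in this PR. `SharedSequence` is a hypothetical in-memory stand-in for the coordinated value that peers share, `Peer` and `tryActivate` are illustrative names, and incrementing the sequence on every activation is an assumption made to keep the example small.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ActivationSequenceSketch {

    /** Hypothetical stand-in for the coordinated sequence shared by peers; not the Artemis API. */
    static final class SharedSequence {
        private final AtomicLong value = new AtomicLong();

        long get() {
            return value.get();
        }

        boolean compareAndSet(long expected, long next) {
            return value.compareAndSet(expected, next);
        }
    }

    /** A broker that remembers the activation sequence stored alongside its journal. */
    static final class Peer {
        final String name;
        long localSequence;

        Peer(String name, long localSequence) {
            this.name = name;
            this.localSequence = localSequence;
        }

        /** Go live only if the local journal carries the current coordinated sequence. */
        boolean tryActivate(SharedSequence shared) {
            long coordinated = shared.get();
            if (localSequence != coordinated) {
                return false; // journal is stale: another peer activated since we last synced
            }
            if (!shared.compareAndSet(coordinated, coordinated + 1)) {
                return false; // lost the race to a peer holding the same sequence
            }
            localSequence = coordinated + 1; // assumption: every activation claims a new sequence
            return true;
        }
    }

    public static void main(String[] args) {
        SharedSequence shared = new SharedSequence();
        Peer a = new Peer("a", 0);
        Peer b = new Peer("b", 0);
        System.out.println(a.name + " activates: " + a.tryActivate(shared));
        System.out.println(b.name + " activates: " + b.tryActivate(shared));
    }
}
```

In this model, two peers that crash while in sync hold the same sequence, so whichever restarts first can activate, and the other then sees a mismatch and has to resync as a backup.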
All the tests are good on this branch :-)
I don't want to squash; I think all of the commits Franz has are mostly isolated.
@franz1981 @michaelandrepearce bits I would like to improve:
If the values are collocated, it is trivial to check and revert an activation-sequence. I think this will help with maintainability, but it may have other ramifications and may need some consideration.
aa27651 to 7eaa6a2
Not a biggie, but I see that data rotation on backup start needs to (re)set the activation sequence to 0; otherwise, during a fail-back, the broker risks believing it has meaningful data (because sequence > 0) when it does not. It should look like an empty backup (with activation sequence == 0). In addition (but I can work on this as a separate PR), I would like to provide logs or some folder naming strategy to relate the activation sequence to the data version, to help recover from crazy bugs (which we haven't really discovered yet :P). This PR is getting into good shape, enough to merge it and move on. Well done @gtully!!!
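A rough sketch of the reset being asked for, using a purely illustrative in-memory model. `BackupJournal`, `rotateOnBackupStart` and the `rotated-N-seq-M` label format are all hypothetical names, not anything in `ReplicationBackupActivation`. The label only hints at what a folder naming strategy tying the rotated data to its activation sequence could look like.

```java
final class BackupRotationSketch {

    /** Hypothetical model of a backup's local state; not the Artemis journal API. */
    static final class BackupJournal {
        long activationSequence;
        int rotations;

        BackupJournal(long activationSequence) {
            this.activationSequence = activationSequence;
        }

        /**
         * Moves the current data aside and resets the sequence, so that the
         * broker looks like an empty backup afterwards. Returns a label for
         * the rotated copy that records the sequence the old data belonged to.
         */
        String rotateOnBackupStart() {
            rotations++;
            String label = "rotated-" + rotations + "-seq-" + activationSequence;
            activationSequence = 0; // without this reset, stale data looks meaningful on fail-back
            return label;
        }

        boolean looksLikeEmptyBackup() {
            return activationSequence == 0;
        }
    }

    public static void main(String[] args) {
        BackupJournal journal = new BackupJournal(3);
        String label = journal.rotateOnBackupStart();
        System.out.println(label + " empty=" + journal.looksLikeEmptyBackup());
    }
}
```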
.../src/main/java/org/apache/activemq/artemis/core/server/impl/ReplicationBackupActivation.java
Going to send a PR to the @gtully branch with:
These changes are related to #3646 (comment)
…ble quorum replication policies
…g data when appropriate and have backup wait for activation before restarting as backup after failback which avoids a race
* ARTEMIS-2716 - speed up failback backup behaviour by ignoring existing data when appropriate and have backup wait for activation before restarting as backup after failback which avoids a race
* ARTEMIS-3340 Implemented reusable curator primitive abstraction
Co-authored-by: gtully <gary.tully@gmail.com>
…lidate lock release on check for in sync replica
…y activation policy
…a, only using lock for activation
…on the shared id in zk (#3)
I'm going to send a new PR after squashing all the commits of @gtully's branch.
@clebertsuconic #3680 replaces this PR, which can be marked as not to be merged.
This is superseded by Franz's PR.
#3680 replaces this one. |