failed: MultiDcSplitBrainSpec #23306
Comments
Just so I remember it: I saw it fail once or twice locally as well while doing unrelated things. |
Looking into it, I think it was related to waiting on unreachable becoming empty/nonEmpty, which we do not do anymore. Will keep the ticket open for a while anyway to see if it fails again on the ci-server. |
Haven't seen it fail after more things got merged, so closing. |
Saw it (or something similar?) at https://jenkins.akka.io:8498/job/akka-nightly-2.12/332/consoleFull |
Another one: https://jenkins.akka.io:8498/job/akka-multi-node-nightly/5294/consoleFull Stack dump for searchability:
|
(Cleared the assignee since it was reopened. Maybe let's not reopen failures in the future but create new issues linking to old instances if related.) |
hmm, I don't think this is a multi-dc issue. Looks more like a remoting issue.
and it seems like it doesn't "heal" even though it could also be something with the test transport? |
Tried to reproduce without success. |
Note the latest failure doesn't have quite the most up-to-date code for this test: This fails as
The leader (third) in DC2 marks the restarted fifth as Up
We can see cross-DC heartbeats happening after that. You can see
The last logged event from
In the 3 seconds between
I also think there is an issue with the barrier, which means nodes move on before
Will add the gossip logging and fix the barrier. |
The last time this failed there was no gossip to or from a node that didn't see fifth coming back. Also note that this test doesn't quite test what it says as the split brain is repaired before starting the second actor system but without extensions to the multi jvm test kit this can't be improved. Refs akka#23306
I can't "fix" the barrier unless we can handle a remote system being restarted in the test kit. Raised: |
…cy (#24024) The last time this failed there was no gossip to or from a node that didn't see fifth coming back. Also note that this test doesn't quite test what it says as the split brain is repaired before starting the second actor system but without extensions to the multi jvm test kit this can't be improved. Refs #23306
https://jenkins.akka.io:8498/job/akka-multi-node-repeat/13905
could be real, but much has also changed that is not merged yet
@johanandren