ARTEMIS-3496 Replica connection to its live should fail fast #3771
@clebertsuconic do you remember any reason why
It was set to 1 as in not meant to retry: a single connection. It's a bug; I meant it for one connection only. Replication and clustering should get their retries through the cluster bridge and the clustered connection, not through the serverLocator. Is there a test that fails with this? Run the whole test suite just in case, and if you can add a failing test, that would be even better.
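To illustrate the distinction being made here, this is a minimal sketch in plain Java of how an attempt limit of 1 fails fast while a larger limit keeps hitting a dead target. `ConnectAttempts`, `connect` and `attemptsAgainstDeadTarget` are hypothetical names used only for illustration; this is not Artemis code and not the change in this PR.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Artemis.
class ConnectAttempts {

    /** Calls {@code attempt} up to {@code maxAttempts} times and stops at the first success. */
    static boolean connect(BooleanSupplier attempt, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    /** Counts how many attempts are spent on a target that never accepts a connection. */
    static int attemptsAgainstDeadTarget(int maxAttempts) {
        AtomicInteger calls = new AtomicInteger();
        connect(() -> {
            calls.incrementAndGet();
            return false; // the target is down, so every attempt fails
        }, maxAttempts);
        return calls.get();
    }
}
```

With a limit of 1 the caller learns about the failure after a single attempt and can leave any retry policy to a higher layer, which is the behaviour the comment above asks for.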
I'm still investigating multiple test failures after applying this change :(
Force-pushed from 188f67e to 14ebe11.
@clebertsuconic I'm going to add a separate test for this tomorrow. I've rebased, and the CI is now fully green with this change.
I guess a test would be best, but this looks like a trivial fix, and it is important for the time to recover from failure. I would like to see this in 2.19.0.
I started with Mockito and wow, was it involved. The end result is not pretty, but it does validate the fix. Comments welcome!
Main comment, without looking at the code: the tests all hung and caused the GHA job to time out after 6 hours. Each test should have an appropriate timeout so that no single test can run for an excessive time before failing (this also makes clear which one goes bang when they do, which helps the following analysis of what happened).
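As a rough sketch of the kind of guard meant here: in a JUnit 4 suite this is usually done with `@Test(timeout = ...)` or a `Timeout` rule, and the plain-Java version below only shows the underlying idea using `java.util.concurrent`. `TimeoutGuard` and `completesWithin` are hypothetical names, not code from this PR.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only.
class TimeoutGuard {

    /** Runs {@code body} on a daemon thread; returns false if it does not finish in time. */
    static boolean completesWithin(Runnable body, long timeout, TimeUnit unit)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-test-body");
            t.setDaemon(true); // a stuck body must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(body);
        try {
            result.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the body so it can stop
            return false;
        } catch (ExecutionException e) {
            // The body itself failed; surface the original cause.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A hung body then fails after the chosen limit rather than after the CI job's 6 hour cap, and the failure points at the specific test that hung.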
Fair, I guess it is related to the use of the default port; that needs sorting.
Force-pushed from daf5f5f to 650a92c.
…ased and quite involved
https://issues.apache.org/jira/browse/ARTEMIS-3496