New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WFLY-16279] Fixing intermittent test failure #15444
Conversation
@@ -82,7 +82,7 @@ private Server load(String xmlFile, String serverName) throws Exception { | |||
|
|||
assertNotNull(serverService); | |||
serverService.setMode(ServiceController.Mode.ACTIVE); | |||
final Server server = serverService.getValue(); | |||
final Server server = serverService.awaitValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ropalka This change makes sense given line 84 above, but I don't understand why L84 was needed. ServerAdd installs the service without calling ServiceBuilder.setInitialMode, so it should default to ACTIVE. And then the boot of the test ModelController should result in the normal wait for MSC stability when the boot op runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The observed regression might happen because of https://issues.redhat.com/browse/MSC-255 @bstansberry . I just little bit enhanced the code. What about line #84 it is of course useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the MSC-255 scenario, i.e.
After step 6) user code might expect Service A is in stable state.
But since StabilityMonitor doesn't block services in any way this might not be true.
Do you mean that something else might cause the service to change state? If so, that's understood, and I can see how in some scenarios that could be a problem. I don't really see it for the OperationContext tracking configuration write ops case though, at least not as something that shouldn't be very rare. We don't allow concurrent mgmt ops that alter MSC state.
We need to be sure that MSC provides something that gives us the current semantics that an OperationContext relies on when executing management operations.
That said, this change is fine, as awaitValue seems like a better choice. I think we should leave the JIRA open though until we see evidence it fixes the problem. I still don't see what logic ever ensured that the Host service was started, as nothing demands it. Unless I forgot something that depends on it and thus demands it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the StabilityMonitor is tracking only subset of transtivie closure of service dependencies graph then any service that isn't registered with StabilityMonitor and is part of that transitive closure can cause the state change (concurrently). So even if concurrent mgmt ops are forbidden iff mgmg op is not tracking all services (via StabilityMonitor) from transitive closure of dependencies graph there are no 'stability' guaranties of mgmt ops mentioned above @bstansberry .
https://issues.redhat.com/browse/WFLY-16279