Defer initialization of JGroups after logging is set up by Quarkus #29131
Closes keycloak#29129 Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
@ahus1 Thanks for investigating this, good job! It is unfortunate that these 'revert' changes are required, as the changes being reverted removed the no-work CPU gap as shown here (don't focus on the red circle, but rather on the CPU utilization during the time ~2.3s-4.2s): It was expected that the CPU usage would be higher, because these tasks run on different threads and execute more work in parallel than in the figure above. However, as you mentioned, the issue is more critical with the trace logs, presumably written into the DelayedHandler. I'm just wondering whether we have some way to avoid the changes in this PR. At the time the build step executes, the whole configuration is already initialized, so if the To summarize, we could:
@ahus1 I haven't tried it, but it might be feasible, right?
@mabartos - the pause you see, is that maybe the pause where the discovery of ISPN takes place? It only applies when the first node is started. My usual comment about optimizations applies here: if it makes things more fragile and complex, it is a trade-off with maintainability, and I'd rather not do it. Even if we fix it for the logging of the protocol, we might miss other locations, as the JGroups library is also "optimizing". @pruivo - is there a way using the regular Java APIs to get hold of the instance of
@ahus1 after the
var t = GlobalComponentRegistry.componentOf(cacheManager, Transport.class);
((JGroupsTransport) t).getChannel().getProtocolStack().getTransport().isTrace(false);
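For illustration only, here is a minimal sketch of why resetting such a flag can matter. It assumes the problem is a trace flag that a component caches while logging is still unconfigured and permissive. The class `CachedTraceFlagSketch`, its nested `Component`, and the logger name are hypothetical, and `java.util.logging` stands in for the real logging stack; this is not the JGroups or Quarkus implementation.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CachedTraceFlagSketch {

    // Hypothetical component that caches "is trace enabled?" once, at construction time.
    static final class Component {
        private boolean isTrace;

        Component(Logger log) {
            // Snapshot of the log level at construction; never refreshed automatically.
            this.isTrace = log.isLoggable(Level.FINEST);
        }

        boolean isTrace() {
            return isTrace;
        }

        // Manual override, similar in spirit to the isTrace(false) call quoted above.
        Component isTrace(boolean value) {
            this.isTrace = value;
            return this;
        }
    }

    // Assumption: before logging is configured, every level is reported as enabled.
    static Logger permissiveLogger() {
        Logger log = Logger.getLogger("sketch.transport");
        log.setLevel(Level.ALL);
        return log;
    }

    // Simulates the final logging configuration being applied.
    static void configureLogging(Logger log) {
        log.setLevel(Level.INFO);
    }

    // Component created BEFORE logging is configured: the cached flag goes stale.
    static boolean flagWhenInitializedEarly() {
        Logger log = permissiveLogger();
        Component early = new Component(log);
        configureLogging(log);
        return early.isTrace();
    }

    // Component created AFTER logging is configured: the cached flag is correct.
    static boolean flagWhenInitializedLate() {
        Logger log = permissiveLogger();
        configureLogging(log);
        return new Component(log).isTrace();
    }

    // Early component whose flag is reset by hand once logging is configured.
    static boolean flagAfterManualReset() {
        Logger log = permissiveLogger();
        Component early = new Component(log);
        configureLogging(log);
        return early.isTrace(false).isTrace();
    }

    public static void main(String[] args) {
        System.out.println("early init:   " + flagWhenInitializedEarly());
        System.out.println("late init:    " + flagWhenInitializedLate());
        System.out.println("manual reset: " + flagAfterManualReset());
    }
}
```

Deferring initialization fixes every cached value at once, whereas a manual reset only fixes the one flag it targets, which is the maintainability concern raised above.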
Yep, AFAIK, during that phase the node analyzes the cluster and checks whether a coordinator exists. If there is none, the node becomes the coordinator once the discovery timeout expires. So, it should apply only to the first node, as you've mentioned. It would probably be better to stick with the changes you've provided, as the CPU gap for the first node is not a big deal. It will also be better in terms of maintainability, as you've mentioned.
Based on my previous comment, I'm ok with these changes.
Closes #29129