WFLY-16383 mod_cluster: Do not register contexts when in suspend mode #15563
This change prevents mod_cluster from registering web contexts when the container is started in suspended mode. As a result, the contexts remain unregistered and the reverse proxy keeps serving 404 responses (rather than 503) for them. There have been previous attempts to achieve this behavior (WFLY-14121, WFLY-13074).
@rhusar @pferraro / @soul2zimate @spyrkob would you like to review this? I found these previous PRs that are related. https://github.com/jbossas/jboss-eap7/pull/3411 From what I understand from reading the customer case and the previous tickets, the desired behavior is such:
The reason why it only works partially, is that the
The My suggestion is to move the
This does not make sense to me. Such a condition should result in a 503, not a 404.
@pferraro I would say you are objecting to a different point than the one you quoted. Let me rephrase. Current behavior:
To me, this is in line with what you say about deployment-awareness: an unavailable URL is distinguishable from an unknown URL, and this would stay. What this PR changes is that when a node is started in suspend mode, i.e. we know it will never be able to respond to requests until it is restarted, it would not register its deployments with the proxy.
A server that was started and then suspended should be in the same LB state as a server started in suspended mode. A suspended server is meant to be distinguishable from a worker that is not started. A suspended server should quickly be made available by an ENABLE-APP command. It seems to me that this proposed change would break that.
Yes, it's true that a server that was started and then suspended would result in the app context being registered, while a server started in suspended mode would not result in context registration. Context registration would happen when the server is resumed for the first time (via this call: https://github.com/wildfly/wildfly/pull/15563/files#diff-35a73bec63a02dacc2ca8f93fde1fa881a68eabbba27c45bf0171fe2688106a8R212)
@TomasHofman We must establish the state of the server with the LB on startup, regardless of whether the server is suspended or not, since it may have previously crashed, in which case we must make sure to disable any contexts.
👎 for the changes.
- The proxy state should be the same regardless of whether the server was started in suspended mode or entered suspend mode later.
- A `404` denotes a resource that does not exist; in the suspended situation, however, the context does exist but is not receiving requests, in which case `503` is the correct response. Should different handling be required, the onus is on the proxy configuration to handle the situation.
- The purpose of the suspended state is to resume operation as quickly as possible, while the proposed changes would require an additional add operation for the proxy.
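The distinction the reviewers draw between an unknown context and a known but unavailable one can be sketched with a toy model. This is only an illustration of the argument in this thread, not mod_cluster's or the proxy's actual implementation: the class `ProxyContextModel`, its methods, and the two-state enum are hypothetical, `enable` merely stands in for the ENABLE-APP command mentioned above, and `200` stands in for "the request is forwarded to the worker".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of a reverse proxy's context table (not a real mod_cluster API).
class ProxyContextModel {
    // Simplified states; a real proxy tracks more than these two.
    enum ContextState { ENABLED, UNAVAILABLE }

    private final Map<String, ContextState> contexts = new HashMap<>();

    // A worker registers a context without enabling it (e.g. a suspended server).
    void register(String path) {
        contexts.put(path, ContextState.UNAVAILABLE);
    }

    // Stand-in for ENABLE-APP: only flips the state of an already registered context.
    void enable(String path) {
        contexts.computeIfPresent(path, (p, s) -> ContextState.ENABLED);
    }

    // Status the proxy would answer with for a request to the given context.
    int statusFor(String path) {
        ContextState state = contexts.get(path);
        if (state == null) {
            return 404; // unknown context: the proxy has never heard of it
        }
        return state == ContextState.ENABLED ? 200 : 503; // known, but maybe not serving
    }
}
```

In this model, a server started in suspended mode that still registers its contexts yields `503` until it is resumed, and resuming needs only the single `enable` step. Skipping registration would instead yield `404` and require an extra `register` step before the context can be served.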
@pferraro @rhusar sorry for the delay and thanks for all your input. In the meantime I reviewed the previous issues and did some more experiments.
Firstly, I misunderstood the purpose of the previous issues (I had some bias from reading the customer case): their purpose was not to prevent context registration when the server is suspended (as I thought), but to prevent the contexts from being enabled when the server is suspended. Apologies for that.
However, one of the reasons I got the wrong idea is that current WildFly does behave the wrong way in certain configurations, which I think also confused our support. Reproducers:
Case 1) Modcluster on the worker node is configured with
Case 2) Modcluster on the worker node is configured with
I think this difference in behavior is a timing issue, caused by the fact that in the first case, the connection to the proxy is (most often) not yet established at the moment when the server attempts to send the
Also, the customer still seems to expect that 404 is the right behavior with a suspended server, so this is to be cleared up with support. This PR in its current state is wrong in any case.
@TomasHofman A different resulting behaviour depending on proxy discovery configuration is definitely a bug (due to the handling of container events). Please open an issue for that. BTW, regarding the previous issues you linked, if I remember correctly, WFLY-13074 dealt with the initial suspend mode support (when a simple call to suspend was just unhandled by the mod_cluster subsystem) and WFLY-14121 dealt with a subsequent fix around starting or restarting suspended.
Follow-up ticket: https://issues.redhat.com/browse/WFLY-16416
Closing this PR, incorrect behavior.
https://issues.redhat.com/browse/WFLY-16383