Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't perform protection checks in Unix Domain Socket mode #532

Closed
wants to merge 4 commits into from

Conversation

aooohan
Copy link
Member

@aooohan aooohan commented Jul 18, 2022

See stackoverflow and #402

9.0.x, 10.0.x all have this problem.

aooohan added 3 commits July 18, 2022 12:52
…n OS bugs to ensure connections can be closed in time.
…ing known OS bugs to ensure connections can be closed in time."

This reverts commit 8d64d9a.
@aooohan aooohan changed the title Fixes NPE and Request hang issues in AprEndpoint Fixes NPE issues in AprEndpoint Jul 18, 2022
@aooohan
Copy link
Member Author

aooohan commented Jul 18, 2022

Due to one's limited level, after my deep research, I found that it is not a simple NPE problem, but in Unix Domain Sokcet mode, there is not that OS bug, so there is no need to perform protection checks. I hope my understanding is right, hahahaha

@aooohan aooohan changed the title Fixes NPE issues in AprEndpoint No protection checks are performed in Unix Domain Socket mode Jul 18, 2022
@aooohan aooohan changed the title No protection checks are performed in Unix Domain Socket mode Don't perform protection checks in Unix Domain Socket mode Jul 18, 2022
@exabrial
Copy link

Just want to say thanks! Our initial testing shows that bypassing the TCP layer and using Unix sockets is very significant performance increase.... like a lot faster. Pretty danged cool! We'd probably terminate TCP/TLS with Haproxy then load balance to Tomcat listening on Unix sockets.

We still have a few more issues to overcome, but those are separate items or perhaps bugs, for another discussion:

  • We noticed the the socket file doesn't seem to get cleaned up, despite the documentation indicating it should. As a workaround, we have the systemd unit remove the file after Tomcat stops. We are trying to root cause this in Tomcat code and see if we can figure out whats wrong.
  • We noticed request.getRemoteAddr() and request.getRemoteHost() are broken (expected). We improvised a Valve to hardcode 127.0.0.1 for each and that seems to do the trick, then use the RemoteAddrValve to set the real values. A better option might be: Haproxy has something akin to AJP called "Proxy Protocol" that can pass a lot of information from the original request. There's a open bug about it that we might look into helping get across the finish line: https://bz.apache.org/bugzilla/show_bug.cgi?id=57830

Anyway thank you! Very much appreciated!

@aooohan
Copy link
Member Author

aooohan commented Jul 20, 2022

We noticed the the socket file doesn't seem to get cleaned up, despite the documentation indicating it should. As a workaround, we have the systemd unit remove the file after Tomcat stops. We are trying to root cause this in Tomcat code and see if we can figure out whats wrong.

@exabrial I also noticed this problem. I found that the automatic cleaning of UDS file was implemented in NioEndpoint, and I didn't find any logic about it in AprEndpoint.

@rmaucher
Copy link
Contributor

The cleanup in NIO is still limited (on purpose): it is only cleaned up if the socket was created successfully by the connector, and it anything fails really badly it will stay there.

@aooohan
Copy link
Member Author

aooohan commented Jul 20, 2022

We noticed the the socket file doesn't seem to get cleaned up, despite the documentation indicating it should. As a workaround, we have the systemd unit remove the file after Tomcat stops. We are trying to root cause this in Tomcat code and see if we can figure out whats wrong.

@exabrial I also noticed this problem. I found that the automatic cleaning of UDS file was implemented in NioEndpoint, and I didn't find any logic about it in AprEndpoint.

@exabrial Aha, i found it! The cleanup function is implemented in tc-native, you can take a look at this
https://github.com/apache/tomcat-native/blob/a3498fa0992ac37c7358e00d1555395b52762e9b/xdocs/index.xml#L180
and apache/tomcat-native@a3498fa

@aooohan
Copy link
Member Author

aooohan commented Jul 20, 2022

The cleanup in NIO is still limited (on purpose): it is only cleaned up if the socket was created successfully by the connector, and it anything fails really badly it will stay there.

Thanks for your reply. Indeed, the cleanup does have limitations.

markt-asf added a commit that referenced this pull request Aug 1, 2022
markt-asf added a commit that referenced this pull request Aug 1, 2022
@markt-asf
Copy link
Contributor

Merged manually. Thanks for the PR.

@markt-asf markt-asf closed this Aug 1, 2022
markt-asf added a commit that referenced this pull request Aug 1, 2022
@markt-asf
Copy link
Contributor

Note that 8.5.x is not affected by this issue.

@aooohan aooohan deleted the bugfix_npe_aprendpoint branch August 1, 2022 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants