HTTP Server Stops Responding to Any/All Requests #40724
Comments
cc @nodejs/platform-ibmi
We have been able to recreate on an IBM i 7.4 system. Will look into.
@DavidRusso , as you can see via the PR to libuv, we think we have this one figured out. We'll be building patches for the IBM repository shortly. I'll post here when they're published, and I would appreciate if you could do some verification
@ThePrez , that's great news, thanks! Yes, I'll be glad to test/verify.
Hi. Are the patches released yet in the IBM i repos? If so, what version(s) include the fix?
Hello @DavidRusso - apologies for the delay. There is a thread and a PR with ongoing discussion about bumping the libuv version in Node.js to 1.44.2 (the earliest libuv version that includes the fix for this issue). Once that lands, Node 18 should have the fix, but I am not sure whether the libuv version change will be backported to earlier Node versions.
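As a rough way to check whether a given Node.js binary bundles a libuv at or above the 1.44.2 mentioned above, the version string in `process.versions.uv` can be compared numerically. This is only a sketch: the helper name `hasLibuvAtLeast` is made up for illustration, and a distribution that backports the fix without bumping the libuv version (or the reverse) would not be detected this way.

```javascript
'use strict';

// Hypothetical helper: compares two dotted version strings numerically,
// looking only at major.minor.patch.
function hasLibuvAtLeast(actual, minimum) {
  const a = actual.split('.').map(Number);
  const m = minimum.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    const left = a[i] || 0;   // missing or non-numeric parts count as 0
    const right = m[i] || 0;
    if (left !== right) return left > right;
  }
  return true; // all three parts are equal
}

// process.versions.uv is the libuv version bundled with this Node.js binary.
const bundled = process.versions.uv;
const verdict = hasLibuvAtLeast(bundled, '1.44.2')
  ? 'at or above 1.44.2'
  : 'below 1.44.2';
console.log(`Bundled libuv ${bundled} is ${verdict}`);
```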
Version
14.18.1
Platform
IBM i 7.4
Subsystem
http
What steps will reproduce the bug?
The HTTP server can suddenly stop responding to any and all requests. When the server gets into this state it will remain listening and accepting client connections but will stop running the requestListener callback, so clients will suddenly stop getting responses to any request. When the server gets into this state, there is no output at stdout/stderr and the behavior persists for the duration of the process.
The problem is intermittent and it's not clear what causes it, but it seems to be triggered by certain network activity. The only way I can reproduce it on demand is by running a network vulnerability scan against the server using Nessus Essentials, and having the server running in multiple processes using cluster.
To be clear, I have seen the problem occur many times during normal use of the HTTP server without any network scans taking place. I have also seen it happen without cluster in play. This is just the only way I have found to reliably reproduce it.
To reproduce, run this simple server on IBM i:
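The server listing from the report is missing at this point. The following is a minimal sketch of the kind of server described (an HTTP server whose requestListener logs each request, run in several processes via cluster); it is not the reporter's exact code. The worker count of 4, the plain-text response body, and taking the port from the command line are all assumptions made for illustration.

```javascript
'use strict';
const cluster = require('cluster');
const http = require('http');

// Logs every request, so a stalled server shows up as missing console output.
function requestListener(req, res) {
  console.log(`${new Date().toISOString()} pid=${process.pid} ${req.method} ${req.url}`);
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('OK\n');
}

// In the primary process, fork the workers; in a worker, listen on the port.
// cluster.isMaster is used because the report is against Node 14
// (newer releases also offer cluster.isPrimary).
function start(port, workerCount) {
  if (cluster.isMaster) {
    for (let i = 0; i < workerCount; i++) cluster.fork();
    return null;
  }
  return http.createServer(requestListener).listen(port);
}

// Assumed invocation: node server.js 8080
// Nothing is started when no port argument is given. Forked workers inherit
// the same argument, so they reach start() as well.
const port = Number(process.argv[2]);
if (port) start(port, 4); // 4 workers is an arbitrary choice
```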
Then run a Basic Network Scan using Nessus Essentials:
https://www.tenable.com/downloads/nessus
I've been using Nessus 8.15.2 and running it on Windows 10. To set up the scan:
Click "New Scan"
Choose the Basic Network Scan
Leave all other settings at default
Launch scan and wait for it to complete
Once the scan is complete, the server will stop responding to any requests. The requestListener callback won't run, as shown by lack of console output, and the client will never get a response. This behavior will persist for the duration of the server process.
How often does it reproduce? Is there a required condition?
The steps above will reproduce the problem every time.
What is the expected behavior?
The server should continue responding to requests.
What do you see instead?
The server inexplicably stops responding to requests for the duration of the process.
Additional information
I have been able to reproduce this problem reliably on IBM i 7.4 and 7.2. I haven't yet tried on 7.3, but I have had users of my package report what I think is the same problem on IBM i 7.3. I have never seen this happen with Node running on other platforms.
IBM i NETSTAT reports everything normal while the server is in the bad state. For example, if I run this query while trying some requests:
I get this output:
Running a Wireshark trace on the client side while using Chrome to make a request also looks normal. The TCP three-way handshake completes normally, and the server also ACKs the HTTP request frame. Then Chrome just waits and waits for the response that never comes. Meanwhile it sends TCP keep-alive probes to the server, and the server ACKs them as expected.
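The symptom described here (the connection is accepted but no response ever arrives) can also be checked from a script rather than a browser. Below is a minimal sketch of such a probe; the function name `probe`, the state labels, and the timeout value are assumptions for illustration, not part of the original report.

```javascript
'use strict';
const http = require('http');

// Sends one GET request and reports whether the server responded,
// accepted the connection but stayed silent, or could not be reached.
function probe(url, timeoutMs) {
  return new Promise((resolve) => {
    let connected = false;

    // agent: false gives each probe its own fresh connection.
    const req = http.get(url, { agent: false }, (res) => {
      res.resume(); // discard the body
      resolve({ state: 'responded', status: res.statusCode });
    });

    // Record that the TCP connection was established.
    req.on('socket', (socket) => {
      socket.once('connect', () => { connected = true; });
    });

    // Abort if the socket stays idle for longer than timeoutMs.
    req.setTimeout(timeoutMs, () => {
      req.destroy(new Error('timeout'));
    });

    req.on('error', (err) => {
      resolve({
        state: connected ? 'connected-no-response' : 'connect-failed',
        error: err.message,
      });
    });
  });
}
```

For example, `probe('http://localhost:8080/', 5000).then(console.log)` (host and port are placeholders) would be expected to report `connected-no-response` against a server in the bad state, matching the packet trace described above.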
When the server process is in this state, the IBM i active job status and callstack look normal. The active job status is SELW (select wait) and the callstack shows the process is waiting on I/O via a poll() call. This state is identical to when the server is responding to requests normally. Also, there is nothing in the IBM i job log.
This is a serious stability issue for Node.js on IBM i. As I mentioned above, I have seen this happen without any network scans going on, and without cluster in play. Use of cluster seems to exacerbate the problem.