Gateway service tcp connection leak #57946

@AcherTT

Description

Version

v24.0.0-pre

Platform

I've only tested it on Linux and macOS.

Subsystem

No response

What steps will reproduce the bug?

step

  1. Simulate a backend service that sends 4KB of content every 100ms, for a total duration of 5 seconds.
  2. Implement a proxy in between.
  3. Simulate a client that initiates two requests, but aborts the connection after 3 seconds.

code

Below is the minimal reproduction code:

server

const http = require('http');
const crypto = require('crypto');

const SERVER_PORT = 10000;

// random data
const data = crypto.randomBytes(1024 * 4);

// Simulate a server
http.createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/plain',
    'Transfer-Encoding': 'chunked'
  });
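  // Stream one 4KB chunk every 100ms; after 5s stop the timer and end the response.
  const timer = setInterval(() => { res.write(data) }, 100);
  setTimeout(() => { clearInterval(timer); res.end('end') }, 5000);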
}).listen(SERVER_PORT);

proxy gateway

const http = require('http');
const url = require('url');

// Simulate a proxy server
const SERVER_PORT = 10000; // must match the backend server's port
const PROXY_PORT = 10001;
http.createServer((req, res) => {
  const targetUrl = req.url.startsWith('http') ? req.url : `http://${req.url}`;
  const parsedUrl = url.parse(targetUrl);

  const options = {
    hostname: "localhost",
    port: SERVER_PORT,
    path: parsedUrl.path,
    method: req.method,
    headers: req.headers,
    agent: false // Do not use keep-alive, or it will be recycled.
  };

  delete options.headers['host'];
  delete options.headers['content-length'];

  const proxyReq = http.request(options, (proxyRes) => {
    res.writeHead(proxyRes.statusCode, proxyRes.headers);
    proxyRes.pipe(res, { end: true });
  });

  proxyReq.on('error', (err) => {
    console.error('proxyReqErr:', err);
    res.writeHead(500, { 'Content-Type': 'text/plain' });
    res.end(err.message);
  });

  req.pipe(proxyReq, { end: true });

  // The request is short, so it closes automatically
  req.on('close', function () {
    console.log('client close, but response not end');
  });

  res.on('close', function () {
    proxyReq.destroy();
  });
}).listen(PROXY_PORT);

client

const net = require('net');

const HOST = 'localhost';
const PORT = 10001;

const requestLines = [
  'GET / HTTP/1.1',
  `Host: ${HOST}:${PORT}`,
  'Connection: keep-alive',
  'Keep-Alive: timeout=60, max=5',
  '',
  ''
];
const requestData = requestLines.join('\r\n');

const client = net.createConnection({ host: HOST, port: PORT }, () => {
  // send two requests
  client.write(requestData);
  client.write(requestData);

  setTimeout(() => {
    console.log('client abort');
    client.destroy();
  }, 3000);
});

let index = 0;
client.on('data', (chunk) => { console.log('-', ++index) });
client.on('end', () => { console.log('connection closed') });
client.on('error', (err) => { console.error('connection error:', err) });

How often does it reproduce? Is there a required condition?

always

What is the expected behavior? Why is that the expected behavior?

In this scenario, both proxy -> server TCP connections should be properly closed and reclaimed — not just one.

What do you see instead?

Wait about 5 seconds and leave all three processes running.
Then run netstat -ant | grep 10000 to observe the TCP connections between the proxy and the server.
One connection remains in the CLOSE_WAIT state on the proxy side, while its server side stays in the FIN_WAIT_2 state.

On Linux, connections in FIN_WAIT_2 will eventually time out and be reclaimed, but CLOSE_WAIT connections will not close automatically, leading to a potential resource leak.
On macOS, the server eventually sends an RST packet to forcibly close the connection.
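
To watch for the leak over time rather than reading netstat by hand, the states can be tallied with a small script. This is only a sketch: countTcpStates is a hypothetical helper, and it assumes the common netstat -ant layout in which the local address, foreign address, and state are the 4th, 5th, and 6th whitespace-separated columns. State names vary by platform (Linux prints FIN_WAIT2, macOS prints FIN_WAIT_2), so the keys are reported as-is.

```javascript
// Hypothetical helper: tally TCP states for connections that touch `port`,
// given the raw text printed by `netstat -ant`.
// Assumes columns: Proto Recv-Q Send-Q Local Foreign State.
function countTcpStates(netstatOutput, port) {
  const counts = {};
  // Linux prints "addr:port", macOS prints "addr.port".
  const portSuffix = new RegExp(`[:.]${port}$`);
  for (const line of netstatOutput.split('\n')) {
    const fields = line.trim().split(/\s+/);
    if (fields.length < 6 || !fields[0].startsWith('tcp')) continue;
    const [, , , local, foreign, state] = fields;
    if (!portSuffix.test(local) && !portSuffix.test(foreign)) continue;
    counts[state] = (counts[state] || 0) + 1;
  }
  return counts;
}

// Possible usage (run on the machine hosting the proxy and the server):
//   const { execSync } = require('child_process');
//   console.log(countTcpStates(execSync('netstat -ant').toString(), 10000));
```

A CLOSE_WAIT count that stays above zero after the client has gone away would point to the leak described here.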

This should hold even if the http.Agent is not configured with keep-alive.
Relying solely on TCP-level keep-alive for connection cleanup is neither reliable nor sufficient.

Additional information

What's more, most proxy projects in the npm ecosystem do not even listen for the 'close' event on res.
As a result, none of their proxy -> server TCP connections get properly closed, which increases the risk of connection leaks in long-running services.

After further investigation, I found that the res object which is still waiting in state.outgoing does not trigger the close event.

So, I wonder if a potential fix could be to explicitly clean up inside the state.onClose function.
For example (simplified logic just for demonstration purposes):

// _http_server.js
function socketOnClose(socket, state) {
  debug('server socket close');
  freeParser(socket.parser, null, socket);
  abortIncoming(state.incoming);
  abortIncoming(state.outgoing); // <-- add this line; if adopted, abortIncoming should be renamed
}

function abortIncoming(incoming) {
  while (incoming.length) {
    const req = incoming.shift();
    req.destroy(new ConnResetException('aborted'));
  }
  // Abort socket._httpMessage ?
}
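
Until something like that lands in core, a user-land stopgap for proxies might be to tie the upstream request to the client socket as well as to res. The sketch below is an assumption, not a verified fix: destroyUpstreamOnClientClose is a hypothetical helper, and it relies on req.socket being the client's net.Socket, which emits 'close' even for a response that is still queued behind an earlier pipelined one.

```javascript
// Hypothetical helper (not part of Node.js): destroy the upstream request
// when either the response or the underlying client socket closes.
function destroyUpstreamOnClientClose(req, res, upstreamReq) {
  const socket = req.socket;
  const onClientGone = () => {
    if (!upstreamReq.destroyed) upstreamReq.destroy();
  };
  res.once('close', onClientGone);
  socket.once('close', onClientGone);
  // Remove the listeners once the upstream request is done, so that a
  // long-lived keep-alive socket does not accumulate one listener per request.
  upstreamReq.once('close', () => {
    res.off('close', onClientGone);
    socket.off('close', onClientGone);
  });
}

// Possible usage inside the proxy handler above, replacing res.on('close', ...):
//   destroyUpstreamOnClientClose(req, res, proxyReq);
```

Like the original res.on('close') handler, this also destroys the upstream request after a normal completion, which is harmless with agent: false but would need revisiting for a pooled agent.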
