Performance issue when sending large responses for concurrent requests #37680

Open
xue-cai opened this issue Mar 9, 2021 · 8 comments
Labels: question (Issues that look for answers.)

xue-cai commented Mar 9, 2021

  • Version: 12.20.1
  • Platform: Linux xxx 5.4.0-1038-aws #40~18.04.1-Ubuntu SMP Sat Feb 6 01:56:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux, or Darwin xxx 19.6.0 Darwin Kernel Version 19.6.0: Tue Jan 12 22:13:05 PST 2021; root:xnu-6153.141.161/RELEASE_X86_64 x86_64
  • Subsystem:

What steps will reproduce the bug?

const express = require("express");

function makeBigObject(leaves, depth) {
  if (depth === 0) {
    return "howdy";
  } else {
    const ret = {};
    for (let i = 0; i < leaves; ++i) {
      ret[i] = makeBigObject(leaves, depth - 1);
    }
    return ret;
  }
}

function getTimeMs() {
  return Date.now() - firstRequestStartTime;
}

const app = express();

const bigObject = makeBigObject(2000, 2);
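// 2000 leaves at depth 2 -> 2000 * 2000 = 4,000,000 "howdy" leaves, roughly 55 MB once serialized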
let requestCount = 0;
let firstRequestStartTime;

async function requestHandler({ requestIndex, req, res }) {
  if (requestIndex === 1) {
    firstRequestStartTime = Date.now();
  }

  console.log(
    `[${getTimeMs()}] Serializing response for request ${requestIndex}...`
  );
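  // Note: JSON.stringify of the ~55 MB object is synchronous and blocks the event loop while it runs.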
  const serializedBigObject = JSON.stringify(bigObject);

  const flushStartTimeMs = Date.now();
  res.on("finish", () => {
    const flushDurationMs = Date.now() - flushStartTimeMs;
    console.log(
      `[${getTimeMs()}] -- Took ${flushDurationMs}ms to flush response for request ${requestIndex} --`
    );
  });

  console.log(
    `[${getTimeMs()}] Sending ${Math.round(
      serializedBigObject.length / 1024 / 1024
    )}MB response for request ${requestIndex}...`
  );
  res.send(serializedBigObject);
  // res.send("ok");

  console.log(`[${getTimeMs()}] - Handler done for request ${requestIndex} -`);
}

app.get("/", async (req, res) => {
  const requestIndex = ++requestCount;
  requestHandler({ requestIndex, req, res });
});

app.listen("/tmp/sock", () =>
  console.log(`Example app listening on Unix domain socket /tmp/sock!`)
);

How often does it reproduce? Is there a required condition?

Send multiple concurrent requests to the app. For example,

(for x in $(seq 5); do time curl --unix-socket /tmp/sock http://localhost/ > /dev/null & done)

What is the expected behavior?

I expect responses to be flushed fairly quickly, with early requests not "waiting" for later requests so that everything flushes together at the end:

[0] Processing request 1...
[23] Serializing response for request 1...
[390] Sending 55MB response for request 1...
[536] - Handler done for request 1 -
[537] Processing request 2...
[547] -- Took 158ms to flush response for request 1 --
[558] Serializing response for request 2...
[925] Sending 55MB response for request 2...
[1065] - Handler done for request 2 -
[1065] Processing request 3...
[1077] -- Took 152ms to flush response for request 2 --
[1086] Serializing response for request 3...
[1444] Sending 55MB response for request 3...
[1568] - Handler done for request 3 -
[1568] Processing request 4...
[1579] -- Took 135ms to flush response for request 3 --
[1589] Serializing response for request 4...
[1948] Sending 55MB response for request 4...
[2075] - Handler done for request 4 -
[2075] Processing request 5...
[2087] -- Took 139ms to flush response for request 4 --
[2096] Serializing response for request 5...
[2455] Sending 55MB response for request 5...
[2581] - Handler done for request 5 -
[2591] -- Took 136ms to flush response for request 5 --

What do you see instead?

Earlier requests are blocked and make no progress until the later requests finish processing/serializing. The more concurrent requests there are, the longer the delay and the worse the problem becomes.

[0] Serializing response for request 1...
[334] Sending 55MB response for request 1...
[491] - Handler done for request 1 -
[492] Serializing response for request 2...
[820] Sending 55MB response for request 2...
[957] - Handler done for request 2 -
[958] Serializing response for request 3...
[1292] Sending 55MB response for request 3...
[1430] - Handler done for request 3 -
[1432] Serializing response for request 4...
[1763] Sending 55MB response for request 4...
[1902] - Handler done for request 4 -
[1910] Serializing response for request 5...
[2229] Sending 55MB response for request 5...
[2371] - Handler done for request 5 -
[2536] -- Took 773ms to flush response for request 4 --
[2538] -- Took 309ms to flush response for request 5 --
[2539] -- Took 1719ms to flush response for request 2 --
[2540] -- Took 2206ms to flush response for request 1 --
[2540] -- Took 1248ms to flush response for request 3 --

Additional information

If res.send(serializedBigObject) is replaced with res.send("ok"), you get the expected behavior, so the response size matters.

[0] Serializing response for request 1...
[341] Sending 55MB response for request 1...
[345] - Handler done for request 1 -
[346] -- Took 5ms to flush response for request 1 --
[348] Serializing response for request 2...
[660] Sending 55MB response for request 2...
[661] - Handler done for request 2 -
[661] -- Took 1ms to flush response for request 2 --
[661] Serializing response for request 3...
[976] Sending 55MB response for request 3...
[977] - Handler done for request 3 -
[977] -- Took 1ms to flush response for request 3 --
[977] Serializing response for request 4...
[1296] Sending 55MB response for request 4...
[1297] - Handler done for request 4 -
[1297] -- Took 1ms to flush response for request 4 --
[1297] Serializing response for request 5...
[1633] Sending 55MB response for request 5...
[1633] - Handler done for request 5 -
[1634] -- Took 1ms to flush response for request 5 --

mscdex (Contributor) commented Mar 9, 2021

FWIW 0df581c introduced this change in behavior.

jasnell added the "question" label Mar 9, 2021

xue-cai (Author) commented Mar 10, 2021

Thanks @mscdex for replying! So is this behavior intended? If so, is there a recommendation on how to achieve the expected behavior?

mscdex (Contributor) commented Mar 10, 2021

It seems that @lpinca mentioned potential issues like this in his comment here in the original PR from that commit. Perhaps he has some ideas on how we can avoid this extra latency?

lpinca (Member) commented Mar 10, 2021

There might be some unneeded writes for empty chunks introduced by 0df581c, as per #27709 (comment), but the issue here is different.

Before 0df581c the 'finish' event was emitted before the actual data was written: the finish callback was called before the _write() callback.

After 0df581c the 'finish' event is emitted after all data is written (after all previous socket.write() callbacks have been called). I think this is the correct and expected behavior, and it is consistent with other writable streams.
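
For illustration, a minimal standalone sketch (not from this thread) of that contract: a Writable's 'finish' event fires only after the callback of every queued _write() call has been invoked.

const { Writable } = require('stream');

// 'finish' is emitted only after every _write() callback has run,
// matching the post-0df581c behavior described above.
const ws = new Writable({
  write(chunk, encoding, callback) {
    // Simulate a slow underlying sink (e.g. a socket flush).
    setTimeout(callback, 100);
  }
});

ws.on('finish', () => console.log('finish: all data flushed'));
ws.write('a');
ws.end('b'); // 'finish' logs only after both simulated writes complete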

xue-cai (Author) commented Mar 10, 2021

Just to clarify: is it true that this issue hasn't been reported before?

This issue happened on our production servers and was then reproduced on our dev Mac laptops.

If it helps the investigation, we largely worked around the issue by doing both of the following:

  • only process one request at a time using async.queue(); however, flushing request 1's response (res.send(serializedBigObject)) still overlaps with request 2's processing
  • introduce more async gaps (e.g. awaiting a 1 ms timer 100 times in a loop) at the beginning of request 2's processing to let request 1's response flush (see the sketch below)

With these two workarounds, we were able to achieve the "expected behavior" pasted above.

It seems to us that the event loop getting blocked is the problem here, and that it has something to do with flushing large responses.
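
A minimal sketch of that second workaround (the helper name is ours, for illustration only):

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: awaiting a 1 ms timer 100 times in a loop gives
// libuv repeated chances to run pending socket write callbacks before the
// next request's heavy synchronous work starts.
async function yieldToEventLoop() {
  for (let i = 0; i < 100; i++) {
    await sleep(1);
  }
}

// e.g. at the start of each request handler:
// await yieldToEventLoop();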

lpinca (Member) commented Mar 11, 2021

Are you using keep-alive connections? If so, it makes sense, because the client socket is the same and the second response has to wait for the data of the first response to be flushed.

Some blocking is expected because you are synchronously serializing and sending 55 MiB of data for every request.

xue-cai (Author) commented Mar 11, 2021

They are different file descriptors according to the Node.js diagnostic report (https://nodejs.org/api/report.html), so I don't think the second response needs to wait for the first response to finish flushing.
Also, as you can see in "What do you see instead?", the requests finished out of order:

[2536] -- Took 773ms to flush response for request 4 --
[2538] -- Took 309ms to flush response for request 5 --
[2539] -- Took 1719ms to flush response for request 2 --
[2540] -- Took 2206ms to flush response for request 1 --
[2540] -- Took 1248ms to flush response for request 3 --
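
For reference, a sketch of how such a report can be generated on demand (on Node.js 12 this may require the --experimental-report flag); the report's "libuv" section lists each open handle together with its file descriptor:

// Writes a diagnostic report to disk; inspect the "libuv" section to see
// the open TCP/pipe handles and their file descriptors.
process.report.writeReport('report.json');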

lpinca (Member) commented Mar 12, 2021

It is more or less consistent with

const net = require('net');

function makeBigObject(leaves, depth) {
  if (depth === 0) {
    return 'howdy';
  } else {
    const ret = {};
    for (let i = 0; i < leaves; ++i) {
      ret[i] = makeBigObject(leaves, depth - 1);
    }
    return ret;
  }
}

const bigObject = makeBigObject(2000, 2);
let connectionCount = 0;
let firstConnectionStartTime;

function getTimeMs() {
  return Date.now() - firstConnectionStartTime;
}

function handleConnection({ connectionIndex, socket }) {
  socket.resume();

  if (connectionIndex === 1) {
    firstConnectionStartTime = Date.now();
  }

  console.log(
    `[${getTimeMs()}] Serializing data for connection ${connectionIndex}...`
  );
  const serializedBigObject = JSON.stringify(bigObject);

  const flushStartTimeMs = Date.now();
  socket.on('finish', function () {
    const flushDurationMs = Date.now() - flushStartTimeMs;
    console.log(
      `[${getTimeMs()}] -- Took ${flushDurationMs}ms to flush data for connection ${connectionIndex} --`
    );
  });

  console.log(
    `[${getTimeMs()}] Sending ${Math.round(
      serializedBigObject.length / 1024 / 1024
    )}MB data for connection ${connectionIndex}...`
  );
  socket.end(serializedBigObject);

  console.log(
    `[${getTimeMs()}] - Handler done for connection ${connectionIndex} -`
  );
}

const server = net.createServer(function (socket) {
  handleConnection({ connectionIndex: ++connectionCount, socket });
});

server.listen(function () {
  const port = server.address().port;

  for (let i = 0; i < 5; i++) {
    net.createConnection({ port }).resume();
  }
});

so, if anything, it is not an HTTP issue.
