Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream.pipeline abruptly kill the node process #48406

Closed
Thammachart opened this issue Jun 9, 2023 · 14 comments · Fixed by #48435
Closed

stream.pipeline abruptly kill the node process #48406

Thammachart opened this issue Jun 9, 2023 · 14 comments · Fixed by #48435
Labels
stream Issues and PRs related to the stream subsystem.

Comments

@Thammachart
Copy link

Thammachart commented Jun 9, 2023

Version

v18.16.0

Platform

Linux 6.3.6-zen1-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Mon, 05 Jun 2023 15:12:42 +0000 x86_64 GNU/Linux

Subsystem

stream

What steps will reproduce the bug?

This is the snippet that generate Readable stream by pushing start, finished, and 1024 UUID instances in between, and then using that to pipe to file Writable stream named foobar.txt

There are two code paths in this snippet

  • Without flag: node pipeline.mjs will use stream.pipeline(rs,ws) to stream the file
    • function name: pipelineStream
  • With flag: node pipeline.mjs m will use rs.pipe(ws) + Promise to stream the file
    • function name: manuallyPipeStream
import { createWriteStream } from "fs";
import { Readable } from "stream";
import { pipeline } from "stream/promises";

const OUTPUT_FILENAME = "foobar.txt";

const generateContent = async (rs) => {
  for (let i = 0; i < 1024; i++) {
    rs.push("11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000");
    rs.push("\n");
  }
  rs.push("finished\n");
  rs.push(null);
};

const pipelineStream = async (rs) => {
  const ws = createWriteStream(OUTPUT_FILENAME);

  await pipeline(rs, ws);
  console.log("finished piping");
};

const manuallyPipeStream = async (rs) => {
  const ws = createWriteStream(OUTPUT_FILENAME);

  rs.pipe(ws);
  await new Promise((resolve, reject) => {
    ws.on("error", reject);
    ws.on("finish", resolve);
  });
  console.log("finished piping");
};

const main = async () => {
  const manualFn = process.argv[2] === "m";
  console.log("--- node version:", process.version);
  console.log(`--- using ${manualFn ? "`rs.pipe(ws)`" : "`stream.pipeline(rs,ws)`"}`)

  const rs = new Readable();
  rs.push("start\n");

  await Promise.all([
    generateContent(rs),
    manualFn ? manuallyPipeStream(rs) : pipelineStream(rs),
  ]);

  console.log("--- program finished successfully");
};

main();

How often does it reproduce? Is there a required condition?

I've tested the code against v16.20.0, v18.15.0, v18.16.0, v20.3.0, in Arch Linux and macOS 12

OSes doesn't seems to be the factor of this problem.

The problem seems to only exist when all of these conditions are met:

  • Node Versions: v18.16.0 or v20.3.0
  • Running the code without flag: node pipeline.mjs, to use stream.pipeline(rs,ws) code path

What is the expected behavior? Why is that the expected behavior?

When running the code successfully,

  • the standard output is:
--- node version: v18.15.0
--- using `stream.pipeline(rs,ws)`
finished piping
--- program finished successfully
  • File named foobar.txt with the size of 37 KiB, with proper start, finished and UUIDs in between
start
11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000
11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000
11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000
11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000
11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000
# ... 1019 other UUIDs
finished

What do you see instead?

  • The standard output is only just:
--- node version: v18.16.0
--- using `stream.pipeline(rs,ws)`
  • The node process is abnormally killed without any error/exception thrown and exit code is just 0

  • File foobar.txt become malformed:

    • Only start and 443 UUIDs, without finished
    • Only the size of 16 KiB, which suspiciously match the size of highWaterMark 16384

Additional information

Regardless of node versions and OSes, running the code using flag: node pipeline.mjs m, to use rs.pipe(ws) + Promise code path, do always produce the correct results.

Minimal snippet that has one code path, and only focus on the issue

import { createWriteStream } from "fs";
import { Readable, promises } from "stream";

const generateContent = async (rs) => {
  rs.push("start\n");
  for (let i = 0; i < 1024; i++)
    rs.push("11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000\n");
  rs.push("finished\n");
  rs.push(null);
};

const main = async () => {
  const [rs, ws] = [new Readable(), createWriteStream("foobar.txt")];
  await Promise.all([generateContent(rs), promises.pipeline(rs, ws)]);
  console.log("--- program finished successfully");
};

main().catch((e) => console.error("*** ERR:", e));
@Thammachart Thammachart changed the title stream.pipeline abruptly kill the node process without any signalling stream.pipeline abruptly kill the node process Jun 9, 2023
@rohith-commits
Copy link

You got into Async and Await hell. In function pipeLineStream() you should not use 'await'. Instead use .then() as it is returning promise. like:
`const pipelineStream = async (rs) => {
const ws = createWriteStream(OUTPUT_FILENAME);

pipeline(rs, ws).then(()=>{
console.log("finished piping");
})

};`

This will work.

@Thammachart
Copy link
Author

Thammachart commented Jun 10, 2023

@rohith-bot The standard output seems to be "half" working, but the file foobar.txt is still in malformed state.

--- node version: v18.16.0
--- using `stream.pipeline(rs,ws)`
--- program finished successfully

Notice that finished piping didn't get printed, but --- program finished successfully did

Additionally, can you explain why some earlier versions of node (v16.20.0, v18.15.0) can run this code successfully using await pipeline(rs, ws), but v18.16.0 or later can't ?

And also, why can't I await the promisified steam.pipeline ? even the example from the nodejs itself (link) do use await pipeline(...)

@debadree25 debadree25 added the stream Issues and PRs related to the stream subsystem. label Jun 10, 2023
@rohith-commits
Copy link

have you tried using a buffer

@Thammachart
Copy link
Author

Thammachart commented Jun 10, 2023

@rohith-bot This issue isn't about how to circumvent the problem, as I've already found the viable workaround without the need to use Buffer (the workaround is in the function manuallyPipeStream).

It is about pointing out the valid usecase of stream.pipeline that leads to unexpected behaviors, even in the same major versions of nodejs (v18.15.0 is ok, but v18.16.0 isn't).

@bnoordhuis
Copy link
Member

@nodejs/streams

@ronag
Copy link
Member

ronag commented Jun 11, 2023

Can you make a more minimal sample?

@Thammachart
Copy link
Author

@ronag From 50 to 18 LoC, only focus on the bug, node v18.16.0 or later fails

import { createWriteStream } from "fs";
import { Readable, promises } from "stream";

const generateContent = async (rs) => {
  rs.push("start\n");
  for (let i = 0; i < 1024; i++)
    rs.push("11bf5b37-e0b8-42e0-8dcf-dc8c4aefc000\n");
  rs.push("finished\n");
  rs.push(null);
};

const main = async () => {
  const [rs, ws] = [new Readable(), createWriteStream("foobar.txt")];
  await Promise.all([generateContent(rs), promises.pipeline(rs, ws)]);
  console.log("--- program finished successfully");
};

main().catch((e) => console.error("*** ERR:", e));

@mcollina
Copy link
Member

I think this is normal (unfortunately) and expacted. Promises do not keep the event loop alive and I think there is a race condition where there are only promises waiting to be processed.

To fix this, you should use ESM and Top-Level Await and await your main function.

@Thammachart
Copy link
Author

Thammachart commented Jun 12, 2023

@mcollina

  • It is already ESM from the beginning (notice the import statements), and it need to be run in .mjs file

  • I don't get the "Promises do not keep the event loop alive" part, if that's the case then this code snippet would exit immediately without waiting for 5 seconds, whether it's ESM or not.

const main = async () => {
  await new Promise((resolve) => setTimeout(resolve, 5000));
  console.log("finished waiting");
};
main();

But for the sake of conversation and progress, as you suggest, I did change the last line into

await main().catch((e) => console.error("*** ERR:", e));

And nothing changed. foobar.txt still malformed.

About the race conditions part, possibly, but certainly not the Promises. Otherwise it would affect all versions of node that support Promise.


I re-checked the code again, and it's clearly that I did await all Promises that need to be done, which is only 1 place.

// ...
await Promise.all([generateContent(rs), promises.pipeline(rs, ws)]);
// ...

@ronag
Copy link
Member

ronag commented Jun 12, 2023

I just ran the sample on v18.13.0 and I don't see any issue?

@ronag
Copy link
Member

ronag commented Jun 12, 2023

v18.16.0 seems to have it.

@Thammachart
Copy link
Author

Thammachart commented Jun 12, 2023

@ronag It affects v18.16.0 or later, v18.15.0 or earlier are ok. You can check the required conditions in the original post.

@Thammachart
Copy link
Author

Let me re-phrase the required conditions that this issue affects again, using minimal snippet.

  • node >= v18.16.0: affected by it
  • node <= v18.15.0: NOT affected by it

@ronag
Copy link
Member

ronag commented Jun 12, 2023

Yea, I see the problem.

ronag added a commit to nxtedition/node that referenced this issue Jun 12, 2023
ronag added a commit to nxtedition/node that referenced this issue Jun 12, 2023
ronag added a commit to nxtedition/node that referenced this issue Jun 12, 2023
nodejs-github-bot pushed a commit that referenced this issue Jun 24, 2023
Fixes: #48406
PR-URL: #48435
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Marco Ippolito <marcoippolito54@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
RafaelGSS pushed a commit that referenced this issue Jul 3, 2023
Fixes: #48406
PR-URL: #48435
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Marco Ippolito <marcoippolito54@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
Fixes: nodejs#48406
PR-URL: nodejs#48435
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Marco Ippolito <marcoippolito54@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
Fixes: nodejs#48406
PR-URL: nodejs#48435
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Marco Ippolito <marcoippolito54@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
ruyadorno pushed a commit that referenced this issue Sep 10, 2023
Fixes: #48406
PR-URL: #48435
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Marco Ippolito <marcoippolito54@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
ruyadorno pushed a commit that referenced this issue Sep 13, 2023
Fixes: #48406
PR-URL: #48435
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Marco Ippolito <marcoippolito54@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants