Make subprocess async iterable #923

ehmicky · 2024-03-22T20:33:22Z

This makes subprocesses async iterable.

for await (const line of $`npm run build`) {
  if (line.includes('ERROR')) {
    console.log(line);
  }
}

A subprocess.iterable(options) method is also available to pass options. The options are the same as subprocess.readable(): {from?: string, binary?: boolean, preserveNewlines?: boolean}.

for await (const line of $`npm run build`.iterable({from: 'stderr'})) {
  // ...
}

Advantages over `subprocess.readable()`

Since streams are async iterable, this is similar to:

for await (const line of $`npm run build`.readable(options)) {
  // ...
}

However, it is better than subprocess.readable() when the user just wants to iterate over output lines:

- It uses Symbol.asyncIterator, which results in a simpler syntax (first example above).
- It defaults to iterating over lines as strings (with newlines stripped) instead of arbitrary chunks of data. That's because we know the user only wants to iterate. We cannot know this with subprocess.readable(), since streams can be used in other ways, so the user must explicitly opt in to line iteration using subprocess.readable({binary: false, preserveNewlines: false}).
- It does not involve streams.
  - Iterating over subprocess.readable() actually adds a few layers. First, a stream is created, with some state that must be synced with the subprocess (which is not simple). Then, that stream consumes the underlying async iterable in its .read() method. Finally, the Readable iteration calls .read() repeatedly, in a different async iterable.
  - This results in: subprocess.stdout (stream) -> on(stream, 'data') (iterable) -> subprocess.readable() (stream) -> Readable.iterator() (iterable).
  - Between each of those layers, some buffering applies.
  - This PR uses the underlying async iterable directly instead. This is simpler, and therefore likely more efficient and more stable.
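
The layering described above can be sketched with Node.js built-ins only. This is not Execa's implementation: `createSource()` and `iterateChunks()` are hypothetical stand-ins for `subprocess.stdout` and for the underlying async iterable, and `Readable.from()` stands in for `subprocess.readable()`.

```javascript
import {Readable} from 'node:stream';

// Hypothetical stand-in for `subprocess.stdout`: a stream emitting two chunks.
const createSource = () => Readable.from(['one\n', 'two\n']);

// Hypothetical stand-in for the underlying async iterable built on top of the stream.
async function * iterateChunks(stream) {
  for await (const chunk of stream) {
    yield chunk;
  }
}

// Layered path: stream -> iterable -> stream -> iterable.
// `Readable.from()` wraps the iterable in a new stream, which is then iterated again.
const collectLayered = async () => {
  const wrapped = Readable.from(iterateChunks(createSource()));
  const chunks = [];
  for await (const chunk of wrapped) {
    chunks.push(chunk);
  }

  return chunks;
};

// Direct path: stream -> iterable. No intermediate stream is created.
const collectDirect = async () => {
  const chunks = [];
  for await (const chunk of iterateChunks(createSource())) {
    chunks.push(chunk);
  }

  return chunks;
};
```

Both functions resolve to the same chunks. The layered path only adds an intermediate stream, with its own buffering and state, between the producer and the consumer.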

Advantages over iterating over `subprocess.stdout`

Just like subprocess.readable(), subprocess.iterable() is better than just iterating over subprocess.stdout.

for await (const line of $`npm run build`.stdout) {
  // ...
}

Because:

- It iterates over lines, with newlines stripped (by default)
- Each chunk is a string (by default)
- It waits for the subprocess to complete
- It throws if the subprocess fails (as opposed to only catching subprocess.stdout failures, which rarely happen)
- It allows for multiple readers at once
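
To illustrate the first two points, here is a minimal sketch of the difference between arbitrary chunks and lines. `splitLines()` and `exampleChunks()` are hypothetical helpers written for this example, not Execa's implementation, and stripping `\r\n` as well as `\n` is an assumption of this sketch.

```javascript
// Hypothetical helper, for illustration only (not Execa's implementation).
// Re-chunks an async iterable of strings into lines, with newlines stripped.
async function * splitLines(chunks) {
  let buffered = '';
  for await (const chunk of chunks) {
    buffered += chunk;
    const lines = buffered.split('\n');
    // The last item is an incomplete line: keep it until more data arrives.
    buffered = lines.pop();
    for (const line of lines) {
      // Assumption of this sketch: Windows newlines (\r\n) are stripped too.
      yield line.endsWith('\r') ? line.slice(0, -1) : line;
    }
  }

  // Emit the final line when the output does not end with a newline.
  if (buffered !== '') {
    yield buffered;
  }
}

// Example input: chunk boundaries do not match line boundaries,
// which is what iterating over a raw stream can produce.
async function * exampleChunks() {
  yield 'ERROR: first\nwarn';
  yield 'ing: second\r\n';
  yield 'done';
}

const collectLines = async () => {
  const lines = [];
  for await (const line of splitLines(exampleChunks())) {
    lines.push(line);
  }

  return lines;
};
// collectLines() resolves to ['ERROR: first', 'warning: second', 'done']
```

A check like `line.includes('ERROR')` is only reliable on the re-chunked lines, since a raw chunk may contain several lines or only part of one.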

Difference with Execa transforms

Unlike Execa transforms:

- This is not intended for mapping the subprocess output
- It only applies to output, not input
- It does not modify subprocess.stdout, which can be either good or bad, depending on the use case

sindresorhus · 2024-03-23T13:18:20Z

I love it 👍

HerbCaudill · 2024-03-24T16:46:40Z

I just used execa for the first time today - great stuff. (It never ceases to amaze me how often I need something and there turns out to be a @sindresorhus package that does what I want.)

Anyway I was trying to get this async iterable API to work, and was tearing my hair out because the code I copy-pasted from the docs wasn't working. Imagine my surprise when I went to look and saw that the feature was only merged yesterday and hasn't been released! 😅

It looks like there's been a ton of work done since the last release. Any idea when a new release might be cut? And would it make sense within your workflow to cut prerelease versions for people who wanted to try out the new stuff early?

Either way, seems like you'd probably want to keep unreleased changes -- or at the very least, the docs for unreleased changes -- in another branch, so that the readme etc. in main goes with the code in the current release.

Thank you both for all your contributions to the ecosystem, we're all in your debt.

ehmicky · 2024-03-24T17:09:51Z

Hi @HerbCaudill,

Thanks a lot for your enthusiasm for this feature, I'm also pretty excited about it.

There has been a lot of back-and-forth, and there are many breaking changes, in the upcoming release. We're getting closer to release time, but we're not there yet, not even for a pre-release. However, we should be ready in a few weeks.

HerbCaudill · 2024-03-24T18:30:40Z

👍 sounds good.

> seems like you'd probably want to keep unreleased changes -- or at the very least, the docs for unreleased changes -- in another branch, so that the readme etc. in main goes with the code in the current release.

I do think a next branch or something would be a good practice in general, so the docs match the released code. I was doubting my sanity there for a good hour or so.

And just in the last few minutes I've run into other features in the docs (e.g. .pipe()) that turn out not to be available yet, and that's pretty frustrating. 🙂

Make subprocess async iterable

3c6f0c1

sindresorhus merged commit 2e8e8c9 into main Mar 23, 2024
14 checks passed

sindresorhus deleted the iterator branch March 23, 2024 13:17
