Make subprocess async iterable #923

Merged
merged 1 commit into from
Mar 23, 2024

ehmicky
Collaborator

@ehmicky ehmicky commented Mar 22, 2024

This makes subprocesses async iterable.

for await (const line of $`npm run build`) {
  if (line.includes('ERROR')) {
    console.log(line);
  }
}

A subprocess.iterable(options) method is also available to pass options. The options are the same as subprocess.readable(): {from?: string, binary?: boolean, preserveNewlines?: boolean}.

for await (const line of $`npm run build`.iterable({from: 'stderr'})) {
  // ...
}

Advantages over subprocess.readable()

Since streams are async iterable, this is similar to:

for await (const line of $`npm run build`.readable(options)) {
  // ...
}

However, it is better than subprocess.readable() when the user just wants to iterate over output lines:

  • It uses Symbol.asyncIterator, which results in a simpler syntax (first example above).
  • It defaults to iterating over line strings (with newlines stripped) instead of arbitrary chunks of data. That's because we know the user only wants to iterate. We do not know this with subprocess.readable() (since streams can be used in other ways), which means the user must explicitly opt in to line iteration using subprocess.readable({binary: false, preserveNewlines: false}).
  • It does not involve streams.
    • Iterating over subprocess.readable() actually adds a few additional layers. First, a stream is created, with some state that must be synced with the subprocess (which is not simple). Then, that stream consumes the underlying async iterable in its .read() method. Then, the Readable iteration calls .read() repeatedly, in a different async iterable.
    • This results in: subprocess.stdout (stream) -> on(stream, 'data') (iterable) -> subprocess.readable() (stream) -> Readable.iterator() (iterable).
    • Between each of those layers, some buffering applies.
    • This PR uses the underlying async iterable directly instead. This removes those intermediate layers, which is simpler, more efficient and probably more stable.
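
To illustrate the difference between iterating over arbitrary chunks and iterating over lines, here is a minimal sketch of line splitting on top of an async iterable. It is not Execa's implementation: `splitLines` and `stripNewline` are hypothetical names, and the chunks are assumed to be already-decoded strings (real subprocess output arrives as bytes and would need decoding first).

```javascript
// Hypothetical helper, not Execa's implementation: turns an async iterable of
// arbitrary string chunks into an async iterable of lines.
// Assumption: chunks are already-decoded strings, not Buffers.
async function* splitLines(chunks, {preserveNewlines = false} = {}) {
  let partial = '';

  for await (const chunk of chunks) {
    partial += chunk;

    // Emit every complete line currently buffered.
    let newlineIndex;
    while ((newlineIndex = partial.indexOf('\n')) !== -1) {
      const line = partial.slice(0, newlineIndex + 1);
      partial = partial.slice(newlineIndex + 1);
      yield preserveNewlines ? line : stripNewline(line);
    }
  }

  // The last line might not end with a newline.
  if (partial !== '') {
    yield partial;
  }
}

// Removes a trailing "\n" or "\r\n" from a line known to end with a newline.
function stripNewline(line) {
  return line.endsWith('\r\n') ? line.slice(0, -2) : line.slice(0, -1);
}
```

A chunk boundary can fall anywhere, including in the middle of a line, which is why the partial line is carried over between chunks.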

Advantages over iterating over subprocess.stdout

Just like subprocess.readable(), subprocess.iterable() is better than just iterating over subprocess.stdout.

for await (const line of $`npm run build`.stdout) {
  // ...
}

Because:

  • It iterates over lines, with newlines stripped (by default)
  • Each chunk is a string (by default)
  • It waits for the subprocess to complete
  • It throws if the subprocess fails (as opposed to only catching subprocess.stdout failures, which rarely happen)
  • It allows for multiple readers at once
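
The "waits for the subprocess to complete" and "throws if the subprocess fails" points can be sketched with Node's built-in child processes. This is only an illustration under stated assumptions, not how Execa implements it: `iterateOutput` is a hypothetical name, `subprocess` is assumed to be a `ChildProcess` returned by `spawn()` from `node:child_process`, and the consumer is assumed to start iterating right after spawning. It yields raw chunks rather than lines and does not support multiple readers.

```javascript
// Sketch only: `iterateOutput` is a hypothetical name, and `subprocess` is
// assumed to be a ChildProcess returned by `spawn()` from 'node:child_process'.
function iterateOutput(subprocess) {
  // Listen for completion immediately, so that neither a spawn error nor the
  // final exit code is missed while the output is being consumed.
  const completion = new Promise((resolve, reject) => {
    subprocess.once('error', reject);
    subprocess.once('close', (exitCode, signal) => {
      if (exitCode === 0) {
        resolve();
      } else {
        reject(new Error(`Subprocess failed: exit code ${exitCode}, signal ${signal}`));
      }
    });
  });
  // Avoid an unhandled rejection if the consumer stops iterating early.
  completion.catch(() => {});

  subprocess.stdout.setEncoding('utf8');

  return (async function* () {
    for await (const chunk of subprocess.stdout) {
      yield chunk;
    }

    // Iterating over subprocess.stdout alone would stop here.
    // Awaiting completion makes the loop throw when the subprocess failed.
    await completion;
  })();
}

// Usage (assumes `spawn` was imported from 'node:child_process'):
// for await (const chunk of iterateOutput(spawn('npm', ['run', 'build']))) {
//   console.log(chunk);
// }
```

With this shape, a non-zero exit code surfaces as an exception thrown from the `for await` loop, after all the output has been yielded.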

Difference with Execa transforms

Unlike Execa transforms:

  • This is not intended for mapping the subprocess output
  • It only applies to output, not input
  • It does not modify subprocess.stdout, which can be either good or bad, depending on the use case

@sindresorhus sindresorhus merged commit 2e8e8c9 into main Mar 23, 2024
14 checks passed
@sindresorhus sindresorhus deleted the iterator branch March 23, 2024 13:17
@sindresorhus
Owner

I love it 👍

@HerbCaudill

HerbCaudill commented Mar 24, 2024

I just used execa for the first time today - great stuff. (It never ceases to amaze me how often I need something and there turns out to be a @sindresorhus package that does what I want.)

Anyway I was trying to get this async iterable API to work, and was tearing my hair out because the code I copy-pasted from the docs wasn't working. Imagine my surprise when I went to look and saw that the feature was only merged yesterday and hasn't been released! 😅

It looks like there's been a ton of work done since the last release. Any idea when a new release might be cut? And would it make sense within your workflow to cut prerelease versions for people who wanted to try out the new stuff early?

Either way, seems like you'd probably want to keep unreleased changes -- or at the very least, the docs for unreleased changes -- in another branch, so that the readme etc. in main goes with the code in the current release.

Thank you both for all your contributions to the ecosystem, we're all in your debt.

@ehmicky
Collaborator Author

ehmicky commented Mar 24, 2024

Hi @HerbCaudill,

Thanks a lot for your enthusiasm for this feature, I'm also pretty excited about it.

There has been a lot of back-and-forth, and there are many breaking changes, in the upcoming release. We're getting closer to release time, but we're not there yet, not even for a pre-release. However, we should be ready in a few weeks.

@HerbCaudill

👍 sounds good.

seems like you'd probably want to keep unreleased changes -- or at the very least, the docs for unreleased changes -- in another branch, so that the readme etc. in main goes with the code in the current release.

I do think a next branch or something would be a good practice in general, so the docs match the released code. I was doubting my sanity there for a good hour or so.

And just in the last few minutes I've run into other features in the docs (e.g. .pipe()) that turn out not to be available yet, and that's pretty frustrating. 🙂
