Extremely high CPU usage when piping data #18702

Closed
5 tasks done
ytimenkov opened this issue Dec 1, 2022 · 2 comments
Labels
Issue-Question: ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a
Resolution-Answered: The question is answered.
Resolution-Duplicate: The issue is a duplicate.

Comments

@ytimenkov

Prerequisites

Steps to reproduce

Linux, PowerShell Core 7.2.3

When I try to pipe large amounts of data (an image) from one command to another, for example:

$ curl -L https://download.opensuse.org/distribution/leap/15.4/appliances/iso/openSUSE-Leap-15.4-CR-DVD-x86_64-Media.iso | dd of=/dev/null

(This needs a fast internet connection. Alternatively, one may try piping xzcat's output to dd.)

Expected behavior

For a similar command, bash doesn't use any CPU at all.

Actual behavior

CPU usage of the `pwsh` process is 150-160% (4 cores).

Error details

I understand that pwsh may read the input and split it into lines while bash simply does `dup2`, but maybe there could be a short-circuit for the case where both sides of the pipe operator are external processes.
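For illustration, here is a minimal sketch (in Python, not PowerShell) of the kind of direct wiring a shell like bash sets up: the first process's stdout is handed to the second process as its stdin, so the parent never reads or decodes the data. The helper name `pipe_direct` and the `head`/`wc` commands are stand-ins chosen for this example, not something taken from pwsh or from the report above.

```python
import subprocess


def pipe_direct(producer_cmd, consumer_cmd):
    """Run `producer | consumer` with the pipe connected at the OS level.

    The parent process only waits; it never touches the piped bytes.
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Drop the parent's copy of the read end so the producer
    # receives SIGPIPE if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


# Push 8 MiB of zero bytes through the pipe and count them on the far side.
size = 8 * 1024 * 1024
counted = pipe_direct(["head", "-c", str(size), "/dev/zero"], ["wc", "-c"])
print(int(counted.decode().strip()))
```

Only the small final output of the consumer is read by the parent here; the bulk data moves between the two child processes through the kernel pipe.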

Environment data

Name                           Value
----                           -----
PSVersion                      7.3.0
PSEdition                      Core
GitCommitId                    7.3.0
OS                             Linux 6.0.8-1-default #1 SMP PREEMPT_DYNAMIC Fri Nov 11 08:02:50 UTC 2022 (1579d93)
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Visuals

No response

@ytimenkov ytimenkov added the Needs-Triage The issue is new and needs to be triaged by a work group. label Dec 1, 2022
@jborean93
Collaborator

See #1908. When piping data, PowerShell currently decodes the output of the native command from bytes to a string, then encodes it back to bytes when it sends the data to the input of the next native application. This breaks scenarios like this one, because you cannot safely round-trip bytes to and from a string when they are not text in the first place. There's a WIP PR to implement this, #17857, and while it might not be as efficient as bash, it should enable such scenarios.
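As a rough illustration of why the round trip is unsafe, here is a small Python sketch. It assumes UTF-8 with replacement of invalid sequences; the encoding and error handling pwsh actually uses are not stated in this thread, and the function name `roundtrip_through_text` is made up for the example.

```python
def roundtrip_through_text(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Decode bytes to a string and encode them back, as a text-based pipe would."""
    # Assumption: invalid byte sequences are replaced rather than raising an error.
    text = raw.decode(encoding, errors="replace")
    return text.encode(encoding)


ascii_data = b"plain text survives\n"
# Arbitrary non-text bytes, like those found in an ISO image or other binary file.
binary_data = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0xFE, 0x00, 0x80])

print(roundtrip_through_text(ascii_data) == ascii_data)    # True
print(roundtrip_through_text(binary_data) == binary_data)  # False
```

Valid text comes back unchanged, but bytes that are not valid in the chosen encoding are replaced during decoding, so the original data cannot be recovered.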

The workaround for now is to run this command in a subshell, like:

/bin/bash -c 'curl -L ... | dd ...'

@iSazonov iSazonov added Resolution-Duplicate The issue is a duplicate. Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a Resolution-Answered The question is answered. labels Dec 2, 2022
@ytimenkov
Author

Ah, I did search the issues, maybe I used the wrong keywords :(

I'm aware of the bash trick. I just realized too late that the process was going slow because of PS. Also, it lacks completion.

Well, you've had a pretty long discussion already and I realize that it's not that straightforward to fix.

@ghost ghost removed the Needs-Triage The issue is new and needs to be triaged by a work group. label Dec 2, 2022