Preserve stdout byte stream for native commands #17857
Also preserve for `native > file.txt`.
Glad to see this being taken seriously! It is important for generic shells--especially ones that aspire to work cross-platform--to be able to pass data through agnostically. (I cloned and built PowerShell just to try this PR! 😃) The stdout redirection did appear to work in my tests (which require not introducing CR into LF-only streams). That is good! I was able to pipe `|` and redirect `>` successfully. However, the PowerShell I built seems to put the redirected native output wherever PowerShell was started up, not in the current directory. So if run from some directory like:
But we find the file in the original directory:
Piping commands that aren't native output, e.g. Redirecting `2>` for errors is introducing CRs when CMD.EXE does not give them, and does not have the directory issue...so I assume it is not running the new handling. (As per my post on the discussion thread, my vote is very much that stderr be covered by the same handling!)
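The CR-into-LF-only corruption described above can be sketched outside PowerShell. The snippet below is a minimal illustration in Python, not the engine's actual code path: `text_round_trip` is a hypothetical stand-in for any shell that decodes native output into lines of text and re-emits them with CRLF, while `byte_passthrough` stands in for the byte-preserving behavior this PR aims for.

```python
# Hypothetical illustration only: these helpers are assumptions for the sketch,
# not functions from PowerShell or from this PR.

payload = b"line1\nline2\n\xff\xfe binary tail\n"  # LF-only text plus invalid UTF-8 bytes


def text_round_trip(data: bytes) -> bytes:
    """Decode as text, split into lines, re-join with CRLF (lossy)."""
    text = data.decode("utf-8", errors="replace")  # invalid bytes become U+FFFD
    lines = text.splitlines()
    return ("\r\n".join(lines) + "\r\n").encode("utf-8")


def byte_passthrough(data: bytes) -> bytes:
    """Hand the bytes on untouched."""
    return data


corrupted = text_round_trip(payload)
preserved = byte_passthrough(payload)

print(corrupted != payload)  # the text round trip changed the stream
print(preserved == payload)  # the passthrough did not
```

The text path both injects CR bytes and replaces the undecodable bytes, which is why tools that emit binary data or LF-only text are sensitive to it.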
I do not know much about PowerShell, but wanted to try [building a PR to help test it](PowerShell#17857 (comment)). (The PR addresses one big reason why I *can't* use PowerShell...[pipe/redirect corruption](PowerShell#1908).) It is nice that there is a script which will automate the build process. But the default disposition of PowerShell is to reject it. And the link suggested by the warning message goes to a very long page, with no obvious prescription for the exact incantation needed to run the script. I don't know if what I did is the "right" answer: (`pwsh -ExecutionPolicy Unrestricted`). But it worked--so I'm offering it as a first draft. If there's a better way to advise first-time builders, then that should be said instead. But some sort of guidance here would be very helpful.
❤️
Ah yeah that makes sense. That'd be part of what gets fixed with my comment above about hooking up path resolution via
Yeah I can definitely understand the desire, but the only real way to do that with consistency would be to rewrite the Maybe in the future dotnet will add something to the
Can you explain in more detail why this is significantly different from the stdout case?
The biggest technical issue is around merging stderr into stdout. So the ideal way this would work (using Windows APIs as an example) is I would create a single pipe (or file for redirection) with Doing the same for This doesn't directly exclude lighting up the It's also not necessarily a simple switch I'd be flipping to enable it for stderr as well. There are a lot of places it needs to be special cased in pipeline creation similarly to how I'm doing it for stdout but subtly different. If/when this PR is merged I'll create a separate issue that is specifically for preserving bytes when redirecting stderr so the engine WG can discuss it. This PR will be for stdout specifically, but that doesn't necessarily mean another PR can't address it.
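The "single pipe for both streams" idea mentioned above can be sketched at the OS level. This is a rough Python illustration using the standard `subprocess` module, not the .NET or Windows API calls the comment refers to; the child script and its output are made up for the example.

```python
import subprocess
import sys

# Hypothetical child process: writes raw bytes to stdout, then to stderr,
# flushing after each write so ordering is deterministic.
child_code = (
    "import sys;"
    "sys.stdout.buffer.write(b'out\\n'); sys.stdout.buffer.flush();"
    "sys.stderr.buffer.write(b'err\\n'); sys.stderr.buffer.flush()"
)

# stderr=subprocess.STDOUT makes the child's stderr handle refer to the same
# pipe as its stdout, so the merge happens in the OS rather than in the parent.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

merged = result.stdout  # raw bytes, in the order the child wrote them
print(merged)
```

Because both handles share one pipe, the parent never has to interleave two separately read streams. That appears to be the property that is hard to get when each stream is drained independently.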
The stdout case probably would satisfy most people, whose concerns are more about binary preservation with things like curl/zip. My issue with cross-platform consistency on CRLF is its own crusade, which is distinct from this if one is going to actually "use" PowerShell in a bigger sense. But it happens to be measurably better with this change. I can see that interleaving the streams raises a lot of issues. I'm not clear on how it ever works (are UNIX shells at risk of doing half-codepoints in UTF-8? If not, how? Would their guarantee be accomplished by line buffering vs. understanding UTF-8 encodings specifically? If so, what happens when the line buffer size is exceeded?) 😦 Sounds like something where one would find a lot of inconvenient truths by reading up on it...
The trouble mostly comes from utilizing It's much easier when you are synchronously reading from and writing to a single pipe, as most platform specific shells with less output processing will do.
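The half-codepoint question raised above can be made concrete. The sketch below is Python using only the standard `codecs` module; the chunk size and sample string are arbitrary assumptions. It shows what goes wrong when each chunk read from a pipe is decoded on its own, and how a stateful incremental decoder avoids it. It says nothing about how any particular shell actually buffers.

```python
import codecs

original = "naïve café 😀"
data = original.encode("utf-8")

# Simulate pipe reads that return arbitrary 3-byte chunks; some boundaries
# fall in the middle of a multi-byte UTF-8 sequence.
chunks = [data[i:i + 3] for i in range(0, len(data), 3)]


def decode_each_chunk(parts):
    """Decode every chunk independently: split code points are lost."""
    return "".join(p.decode("utf-8", errors="replace") for p in parts)


def decode_incrementally(parts):
    """Carry partial sequences across chunk boundaries with a stateful decoder."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = [decoder.decode(p) for p in parts]
    out.append(decoder.decode(b"", final=True))
    return "".join(out)


naive = decode_each_chunk(chunks)
careful = decode_incrementally(chunks)

print(naive == original)    # False: replacement characters were introduced
print(careful == original)  # True
```

Passing bytes through without decoding sidesteps the problem entirely for a single stream. It is only when two streams are merged, or when text processing is applied, that chunk boundaries start to matter.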
This pull request has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 15 days. It will be closed if no further activity occurs within 10 days of this comment.
@adityapatwardhan What sort of feedback would be needed for PowerShell to prioritize reviewing this? It is a rather popular request (see #1908, and touched upon by some other issues), where these seem like things people expect to work in a shell:
Modulo the output directory issue I found when trying it, it seemed to work for that.
It's not finished yet; it's a work-in-progress PR. You can ignore the bot. If it closes the PR, I'd just reopen it.
LGTM. Left one more suggestion, mostly ready to merge now 🚀
…r.cs Co-authored-by: Dongbo Wang <dongbow@microsoft.com>
a couple of small suggestions
Being overly descriptive here isn't necessarily desirable. It increases our payload size and ingestion. Think of the string as a simple key that can be queried for in the telemetry data (terabytes of it).
It essentially turns into a column title, but I think the string as is still flows on the wire.
@SeeminglyScience Thanks for pushing this change through!
🎉 Handy links:
Will the feature eventually be enabled by default in 7.4.0 GA? Then I wouldn't have to detect the PowerShell version and set the ExperimentalFeature every time in my PS scripts.
@hez2010 It will, if we get no negative feedback.
With this change, any native command in a pipeline will be able to detect whether the downstream command is also native. In that case (and when stderr is not merged into stdout), the native command will write directly to the input stream of the downstream command.
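The effect of wiring one native command's stdout straight into the next command's stdin can be sketched as follows. This is a Python analogue using `subprocess.Popen`, not the PR's C# implementation; the producer and consumer scripts are hypothetical stand-ins for native executables.

```python
import subprocess
import sys

# Every byte value plus mixed line endings: anything a text layer might mangle.
payload = bytes(range(256)) + b"\n\r\n"

# Hypothetical "native" commands, implemented as tiny Python child processes.
producer_code = "import sys; sys.stdout.buffer.write(bytes(range(256)) + b'\\n\\r\\n')"
consumer_code = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"

producer = subprocess.Popen(
    [sys.executable, "-c", producer_code],
    stdout=subprocess.PIPE,
)
# The consumer's stdin is the producer's stdout pipe: the parent never reads,
# decodes, or re-encodes the data flowing between the two processes.
consumer = subprocess.Popen(
    [sys.executable, "-c", consumer_code],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
)
producer.stdout.close()  # the parent drops its copy of the pipe handle

received, _ = consumer.communicate()
producer.wait()

print(received == payload)
```

Since the parent process stays out of the data path, there is no opportunity for encoding or newline translation between the two commands.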
Also added in this change is the ability to pipe bytes directly to a native command, e.g.
PR Summary
PR Context
Left to do:
`PowerShell.AddCommand`)
`Stream.CopyToAsync` (thank you @daxian-dbw!)
PR Checklist
`.h`, `.cpp`, `.cs`, `.ps1` and `.psm1` files have the correct copyright header
`WIP:` or `[ WIP ]` to the beginning of the title (the `WIP` bot will keep its status check at `Pending` while the prefix is present) and remove the prefix when the PR is ready.
`native | native` or `$bytes | native` or `native > file` will no longer corrupt binary data MicrosoftDocs/PowerShell-Docs#10134
(which runs in a different PS Host).