Description
Prerequisites
- Existing Issue: Search the existing issues for this repository. If there is an issue that fits your needs do not file a new one. Subscribe, react, or comment on that issue instead.
- Descriptive Title: Write the title for this issue as a short synopsis. If possible, provide context. For example, "Document new Get-Foo cmdlet" instead of "New cmdlet."
PowerShell Version
7.5
Summary
1. Add a “How to improve and parallelize your script performance” article listing all approaches, including Start-Process. People shouldn’t need to go to https://stackoverflow.com/a/73999295 to understand that just adding -Parallel everywhere is a bad idea.
2. Update ForEach-Object and other related articles to link to the article from item 1.
Details
The function I wanted takes a folder path as input and returns an array of all files in all subfolders, including files inside archives and inside archives nested in other archives. For each .dll and .exe it also returns the File Version and Assembly Version.
This is just so I could compare the impact of PR changes on the drop we produce.
We have a non-recursive implementation of that now. I tried to improve, speed up, and parallelize it so that it would do what I want, and I ran into the fact that ForEach-Object -Parallel is very expensive due to runspaces.
• I asked much smarter people than me (Claude in VS Code) to parallelize my script. Without any prompt beyond “ensure fastest execution”, it used ForEach-Object -Parallel, which turned out to be a bad idea: the function choked on large archives.
• I asked Claude to write a script from scratch and it still used ForEach-Object -Parallel.
• I came up with an implementation that spawns 7z.exe via Start-Process to unpack the necessary archives and files, and it executes much faster.
• I looked at the docs for ForEach-Object (Microsoft.PowerShell.Core) - PowerShell | Microsoft Learn. They talk about -Parallel being slow due to runspaces, but don’t list any remedy.
• I started writing this wanting to complain about the performance and about having to use a third-party program (7-Zip), but while chasing loose ends I stumbled on the https://stackoverflow.com/a/73999295 answer, which mentions Start-ThreadJob as a lightweight alternative.
• I asked Claude to write the function using Start-ThreadJob, and it turns out that it almost matches the performance of my 7-Zip implementation.
As a layman, my first thought is “to improve PowerShell performance I can use -Parallel”. I can see that it’s the same for Claude as well. The fact that -Parallel is incredibly expensive is not obvious.
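To illustrate the kind of example such an article could include, here is a minimal sketch of the Start-ThreadJob approach mentioned above. It is not the implementation from this issue: the function name Get-BinaryVersion, its parameters, and the idea of grouping files into batches (so that per-job setup cost is paid once per batch rather than once per file) are all assumptions made for illustration. It only reads the file version and does not look inside archives.

```powershell
# Hypothetical helper (name and parameters are illustrative, not from the issue).
# Lists .dll/.exe files under $Path and reads their file versions in a few
# thread jobs, one job per batch of files.
function Get-BinaryVersion {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $BatchSize = 200,      # assumed batch size, tune for your data
        [int] $ThrottleLimit = 4     # maximum number of jobs running at once
    )

    # Collect full paths of all .dll and .exe files, recursively.
    $files = @(Get-ChildItem -LiteralPath $Path -Recurse -File |
        Where-Object Extension -in '.dll', '.exe' |
        ForEach-Object FullName)

    # Start one thread job per batch of files.
    $jobs = for ($i = 0; $i -lt $files.Count; $i += $BatchSize) {
        $last  = [Math]::Min($i + $BatchSize, $files.Count) - 1
        $batch = $files[$i..$last]

        # (,$batch) passes the whole batch as a single argument.
        Start-ThreadJob -ThrottleLimit $ThrottleLimit -ArgumentList (,$batch) -ScriptBlock {
            param($batch)
            foreach ($file in $batch) {
                $info = [System.Diagnostics.FileVersionInfo]::GetVersionInfo($file)
                [pscustomobject]@{
                    Path        = $file
                    FileVersion = $info.FileVersion
                }
            }
        }
    }

    # Wait for all jobs, emit their output, and clean the jobs up.
    if ($jobs) { $jobs | Receive-Job -Wait -AutoRemoveJob }
}
```

A call such as `Get-BinaryVersion -Path .\drop` (the folder name is a placeholder) would return one object per binary. Whether this beats ForEach-Object -Parallel for a given workload has to be measured, for example with Measure-Command.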
Proposed Content Type
Concept, About Topic
Proposed Title
No response