
Performance of Windows PowerShell is better than PowerShell Core #7112

Closed
iSazonov opened this issue Jun 19, 2018 · 59 comments

@iSazonov (Collaborator) commented Jun 19, 2018

I get very strange and disturbing performance results for the Import-Csv cmdlet.

Importing a 111 MB CSV log file:

  • On Windows PowerShell: always ~3.7 sec; memory ~10 MB.
  • On PowerShell Core: from 6 sec up to 27 sec; each subsequent launch increases the time by 1-3 seconds, and memory grows from 1 MB to 105 MB. After waiting a few minutes, memory decreases to ~1 MB and the cmdlet again takes ~6 sec.

I suspect we would see the same for any script.
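
For reference, a minimal way to reproduce this kind of measurement (my sketch: file.csv stands in for the actual log file, and the working set is only a rough proxy for the memory numbers above):

# Time one import, then report this process's current working set
Measure-Command { Import-Csv .\file.csv } | Select-Object TotalSeconds
'{0:N1} MB' -f ((Get-Process -Id $PID).WorkingSet64 / 1MB)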

@markekraus (Collaborator) commented Jun 19, 2018

@iSazonov we have seen similar memory growth with string-related activities in JSON too. I suspect .NET Core is just greedier with its GC than Framework.

But the slowdown... that is concerning.

@iSazonov (Collaborator, Author) commented Jun 19, 2018

@daxian-dbw If you are working on performance tests this may be of interest to you.

@iSazonov (Collaborator, Author) commented Jun 20, 2018

Maybe related: #6965 (memory).
/cc @alx9r

@alx9r commented Jun 20, 2018

@iSazonov Your description sounds like it might be a case of the garbage collector falling behind. Do you have a repro I can run? It's probably worth measuring any difference between GCSettings.IsServerGC true and false.

FWIW, GCSettings.IsServerGC seems to be false by default for both powershell.exe and pwsh.exe. As I understand it, that default means that there is one GC thread regardless of how many threads are simultaneously producing garbage. If I'm understanding this right, any managed code invoked by PowerShell that is both CPU-bound and well-parallelized is likely to run out of memory if you have enough cores and it runs long enough.
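
For anyone who wants to compare the two modes, a sketch (hedged: COMPlus_gcServer is the knob I know of for .NET Core, it must be set before the process starts, and powershell.exe would need the gcServer element in its app.config instead):

# Inspect the current mode from inside the session
[System.Runtime.GCSettings]::IsServerGC

# Start a child pwsh with server GC enabled and confirm the setting took
$env:COMPlus_gcServer = 1
pwsh -NoProfile -Command '[System.Runtime.GCSettings]::IsServerGC'
$env:COMPlus_gcServer = $null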

It could be that whatever Import-Csv (and the JSON activities @markekraus reported) are doing became more parallelized from Windows PowerShell to PowerShell Core.

@iSazonov (Collaborator, Author) commented Jun 20, 2018

@alx9r The repo is simple

Measure-Command { Import-Csv file.csv }

The file I used is ~110 MB.

It could be that whatever Import-Csv (and the JSON activities @markekraus reported) are doing became more parallelized from Windows PowerShell to PowerShell Core.

If that were the case, my test should be even faster than on Windows PowerShell, but it is 3-10(!) times slower from the very first run.
Although speed and memory may be separate problems.

@alx9r commented Jun 20, 2018

@iSazonov I'm not seeing the dramatic speed difference you reported. Perhaps the specifics of the .csv file matter. Here is my test code:

$PSVersionTable

"GCSettings.IsServerGC : $([System.Runtime.GCSettings]::IsServerGC)"
$filePath = "$([System.IO.Path]::GetTempPath())5abb9571859c48cea807d755cc48058b.csv"

# uncomment the following to generate a ~130MB .csv file first
<#
$numLines = (50 * [System.Math]::Pow(2,12))

1..$numLines |
    % { ($_..($_ + 100) | % {"$_"}) -join ',' } |
    Set-Content $filePath
#>

"Import-Csv $filePath"
Measure-Command { Import-Csv $filePath } | select TotalSeconds | Format-List

These are the results for Windows PowerShell and PowerShell core, respectively:

Name                           Value
----                           -----
PSVersion                      5.1.14409.1012
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.14409.1012
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
GCSettings.IsServerGC : False
Import-Csv C:\Users\un1\AppData\Local\Temp\5abb9571859c48cea807d755cc48058b.csv
TotalSeconds : 37.3680663
Name                           Value
----                           -----
PSVersion                      6.1.0-preview.688
PSEdition                      Core
GitCommitId                    v6.1.0-preview.688
OS                             Microsoft Windows 6.3.9600
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
GCSettings.IsServerGC : False
Import-Csv C:\Users\un1\AppData\Local\Temp\5abb9571859c48cea807d755cc48058b.csv
TotalSeconds : 57.3899851

Peak memory usage for the test runs was as follows:

  • Windows PowerShell: 23.8 MB
  • PowerShell Core: 35.4 MB

@iSazonov (Collaborator, Author) commented Jun 20, 2018

Perhaps the specifics of the .csv file matter.

Yes, but Windows PowerShell frees the memory so quickly that sometimes the memory consumption cannot be seen.
I also see that in your test PowerShell Core is much slower.

@rthardin commented Jun 29, 2018

I did a bit of iterative testing with a smaller CSV file (104866 bytes). On my system, the results show that Import-Csv performance does vary between back-to-back runs. The minimum runtime was 71 ms and the maximum was 192 ms. It looks like the runtime would decrease each time the GC came along. I'm running PowerShell inside a Docker container (mcr.microsoft.com/powershell:centos-7).

I understand this issue was filed based on processing much larger input files (>100 MB), so I hope these findings are not a distraction; I thought they may still be relevant.

$PSVersionTable

Write-Host "Creating source data..."
"random_numbers" | Out-File source.csv
Get-Random -SetSeed 1 | Out-Null
for ($i=0; $i -le 10000; $i++) {
	(Get-Random).ToString() | Out-File source.csv -Append
}
Write-Host "Measuring Import-Csv performance over time..."
"index,duration_ms,bytes_consumed" | Out-File 7112_results.csv
for ($i=0; $i -le 5000; $i++) {
	$millis = (Measure-Command { Import-Csv source.csv }).TotalMilliseconds
	$memory = [System.GC]::GetTotalMemory($false)
	$i.ToString() + "," + $millis.ToString() + "," + $memory | Out-File 7112_results.csv -Append
}
Write-Host "Done"
$PSVersionTable

Name                           Value
----                           -----
PSVersion                      6.0.2
PSEdition                      Core
GitCommitId                    v6.0.2
OS                             Linux 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

(screenshot, 2018-06-29: chart of Import-Csv duration and heap bytes over the test iterations)

Quartile   Runtime     Memory
Minimum    71.37 ms    10.19 MB
Q1         125.87 ms   39.88 MB
Median     138.86 ms   50.56 MB
Q3         151.39 ms   59.35 MB
Maximum    192.12 ms   65.38 MB

source.csv.txt
7112_results.csv.txt

@iSazonov (Collaborator, Author) commented Jun 30, 2018

@rthardin Thanks for the great tests! Could you please continue and test on Windows, with both PowerShell Core and Windows PowerShell, for comparison?

@rthardin commented Jul 2, 2018

@iSazonov Unfortunately I don't have the time to spare at the moment, but the test script is posted above for others to try.

@iSazonov (Collaborator, Author) commented Jul 3, 2018

Updated scripts for testing:

##############################
Write-Host "Creating source data..."
"random_numbers" | Out-File source.csv
Get-Random -SetSeed 1 | Out-Null
for ($i=0; $i -le 10000; $i++) {
    (Get-Random).ToString() | Out-File source.csv -Append
}

Write-Host "Measuring Import-Csv performance over time..."
"index,duration_ms,bytes_consumed" | Out-File psobject_results.csv
for ($i=0; $i -le 5000; $i++) {
    $millis = (Measure-Command { Import-Csv source.csv }).TotalMilliseconds
    $memory = [System.GC]::GetTotalMemory($false)
    $i.ToString() + "," + $millis.ToString() + "," + $memory | Out-File psobject_results.csv -Append
}
Write-Host "Done"

######################
Write-Host "Measuring PSObject creation performance over time..."
"index,duration_ms,bytes_consumed" | Out-File psobject_results.csv
for ($i=0; $i -le 5000; $i++) {
    $millis = (Measure-Command { [psobject]::new() }).TotalMilliseconds
    $memory = [System.GC]::GetTotalMemory($false)
    $i.ToString() + "," + $millis.ToString() + "," + $memory | Out-File psobject_results.csv -Append
}
Write-Host "Done"

######################
Write-Host "Measuring object creation performance over time..."
"index,duration_ms,bytes_consumed" | Out-File psobject_results.csv
for ($i=0; $i -le 5000; $i++) {
    $millis = (Measure-Command { [object]::new() }).TotalMilliseconds
    $memory = [System.GC]::GetTotalMemory($false)
    $i.ToString() + "," + $millis.ToString() + "," + $memory | Out-File psobject_results.csv -Append
}
Write-Host "Done"

I started with Import-Csv and temporarily disabled the output in the C# code; the results were good. So I guess the root of the problem is in creating PSObject, and the next two tests confirm this: Windows PowerShell is 30-100 percent faster.
It's unclear where the problem is - in PowerShell Core or in .NET Core.

/cc @SteveL-MSFT @daxian-dbw

@SteveL-MSFT (Member) commented Jul 10, 2018

@iSazonov to see if we slowed down creating PSObjects (looking at the Import-Csv cmdlet, it's really just adding NoteProperty members to a PSObject), I ran this test:

measure-command { 1..1000 | % { $a = [PSObject]::new(); 1..10 | % { $a | Add-Member -NotePropertyName "a$_" -NotePropertyValue $_ }; $a | Add-Member -NotePropertyName id -NotePropertyValue $_; $a } }

Windows PowerShell 5.1 consistently took around 15.7s on my machine. PSCore6.1-3 consistently took around 8.4s, so about twice as fast.

Will have to investigate more where the bottleneck is.

@iSazonov (Collaborator, Author) commented Jul 10, 2018

@SteveL-MSFT I can confirm that property adding is faster on PowerShell Core. I think [PSObject]::new() is slower.

@powercode (Collaborator) commented Jul 11, 2018

When looking at performance traces comparing 5.1 to 6.0.2 (and the preview, which is almost indistinguishable from 6.0.2), I found the following using the sample CSV from @alx9r:

It is the calls to System.Collections.Hashtable.InitHash(Object, Int32, out UInt32, out UInt32) that are twice as slow on 6.0.2 (5263 ms on 5.1 compared to 9250 ms on 6.0.2). This comes from the fact that the Windows version uses System.OrdinalComparer.GetHashCode(String), where Core uses System.Globalization.CompareInfo.GetHashCodeOfStringCore(String, CompareOptions).

This function is called from several code paths in System.Management.Automation.PSMemberInfoIntegratingCollection`1.AddToTypesXmlCache(T, Boolean), which I think is called when we add properties to our PSObject.

@SteveL-MSFT - I think you have to compare creating a PSObject with many properties. That is what seems to be expensive, not the creation of a PSObject per se.

@powercode (Collaborator) commented Jul 11, 2018

@SteveL-MSFT In general, Core is faster than the full Framework, but I think the code sample you used is running so much pipeline machinery that the general speed improvements hide the fact that adding properties to PSObjects is slower.

@powercode (Collaborator) commented Jul 11, 2018

Basically, this seems to boil down to String.GetHashCode being twice as slow.

foreach($i in 1..10000000) {$Null = "PowerShell6HasSlowerHashCode".GetHashCode()}

4.165s on Windows PowerShell.
7.167s on PowerShell 6.0.2.

@SteveL-MSFT (Member) commented Jul 11, 2018

@powercode thanks for doing the analysis. I can confirm similar results on my machine, where PSCore6 is slower than Windows PowerShell when calling GetHashCode(). Opened dotnet/corefx#30994.

@SteveL-MSFT (Member) commented Jul 12, 2018

Per the .NET Core team, this is expected: they changed GetHashCode() to be randomized (per session) for security reasons.
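
The per-session randomization is easy to observe (a sketch; the exact values differ on every machine, the point is that the two child processes disagree with each other while values within one process stay stable):

# Two separate pwsh processes hash the same string differently on .NET Core
pwsh -NoProfile -Command '"PowerShell".GetHashCode()'
pwsh -NoProfile -Command '"PowerShell".GetHashCode()'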

@markekraus (Collaborator) commented Jul 12, 2018

Hmm... I bet the issue in #6199 is related too, then. The sample JSON there created a large number of properties on a large number of PSObjects; when the same JSON was converted with Newtonsoft directly, the improvement was significant in terms of both memory usage and execution time.

@iSazonov (Collaborator, Author) commented Jul 13, 2018

Per dotnetcore team, this is expected as they made an improvement to GetHashCode() to be randomized (per session) for security reasons.

"per session" - this should not affect because we see the slowdown inside the session each time an object is created.

@powercode (Collaborator) commented Jul 13, 2018

@iSazonov It is about them having a more expensive algorithm now, not just one that differs per session.

However, it seems we have an opportunity for optimization. As it looks now, we recalculate the hash code of every property name for every line in the CSV. If we were smarter about how we create custom objects (some sort of prototype object, maybe), a lot of that work could be amortized.

I'm thinking along the lines of "create a prototype object, doing all the work needed in the type system" and then "instantiate 10 million of these objects given this array of property values".
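
A rough illustration of that prototype idea (a sketch only, not the actual design; $rows and the column names are invented, and as the next comment notes, naive cloning did not pay off in practice):

# Pay the member-resolution cost once, on a prototype object
$names = 'Name', 'Id', 'Value'
$prototype = [psobject]::new()
foreach ($n in $names) {
    $prototype.PSObject.Properties.Add([psnoteproperty]::new($n, $null))
}

# Then clone the prototype per row instead of re-adding every property
foreach ($row in $rows) {
    $obj = $prototype.PSObject.Copy()
    for ($j = 0; $j -lt $names.Count; $j++) {
        $obj.PSObject.Properties[$names[$j]].Value = $row[$j]
    }
    $obj
}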

@iSazonov (Collaborator, Author) commented Jul 13, 2018

@powercode I tried cloning PSObject but didn't get any benefit. I believe we need more general improvements. Also, Import-Csv is not the only slow cmdlet: ConvertTo-Csv and the web cmdlets are too (#6199).

@powercode (Collaborator) commented Jul 13, 2018

@iSazonov We would need new APIs.

@iSazonov (Collaborator, Author) commented Jul 19, 2018

@vancem Please take a look at the PerfView traces (for Core and Windows):
https://ufile.io/s3t7v (I don't have access to \\clrmain).

You can use the following PowerShell scripts to record new traces:

cd c:\tmp\

# For measuring creating PSObject-s
"random_numbers" | Out-File .\source1.csv
Get-Random -SetSeed 1 | Out-Null
for ($i=0; $i -le 1000; $i++) {
	(Get-Random).ToString() | Out-File .\source1.csv -Append
}

function perf_test($source, $result) {
    Write-Host "Measuring Import-Csv performance over time..."

    "index,duration_ms,bytes_consumed" | Out-File $result
    for ($i=0; $i -le 200; $i++) {
        $millis = (Measure-Command { Import-Csv $source }).TotalMilliseconds
        # Record the heap size too, in case you want to analyze the results in Excel
        $memory = [System.GC]::GetTotalMemory($false)
        $i.ToString() + "," + $millis.ToString() + "," + $memory | Out-File $result -Append
    }
    Write-Host "Done"
}

perf_test .\source1.csv .\resultsWindows1.csv




###############
# For measuring adding new properties to PSObject-s
$fields = 0..19 | ForEach-Object { "random_numbers$_" }
($fields -join ",") | Out-File .\source2.csv
Get-Random -SetSeed 1 | Out-Null
for ($i=0; $i -le 1000; $i++) {
    $values = 0..19 | ForEach-Object { (Get-Random).ToString() }
    ($values -join ",") | Out-File .\source2.csv -Append
}

perf_test .\source2.csv .\resultsWindows2.csv

@powercode (Collaborator) commented Aug 8, 2018

When I looked at the traces comparing 5.1 to 6.0, the hash code calculation showed up over and over again. So that is at least one of the root causes. PowerShell is a very heavy user of GetHashCode.

@iSazonov (Collaborator, Author) commented Aug 9, 2018

We could implement a fast GetHashCode for some (which?) internal classes.

@SteveL-MSFT (Member) commented Aug 9, 2018

I would rather push for corefx to have an opt-out option for randomized GetHashCode() (and use it in specific cases where it's not a security concern) rather than doing our own implementation if this is the perf blocker.

@lzybkr (Member) commented Aug 9, 2018

It's worth checking if we're computing the hash code too often. Can the algorithm be improved?

@iSazonov (Collaborator, Author) commented Aug 10, 2018

@lzybkr Tracking issue: dotnet/corefx#30994. There seems to be no chance of making the Marvin algorithm faster.

@iSazonov (Collaborator, Author) commented Aug 10, 2018

@SteveL-MSFT I guess that having an opt-out option for randomized GetHashCode() is a problem: CoreFX does a devirtualization optimization for the default GetHashCode(), so without that optimization an alternative GetHashCode() can be slower than the devirtualized Marvin.

@powercode (Collaborator) commented Aug 10, 2018

@lzybkr There are certainly things we do that could be improved, but I agree with @SteveL-MSFT that at the end of the day there will still be quite a lot of calls to GetHashCode, so having a way to opt out seems like a good path.

In my view, one does not exclude the other.

@powercode referenced this issue Aug 10, 2018: WIP: PSObjectBuilder design feedback #7493 (closed)

@benaadams commented Aug 10, 2018

I would rather push for corefx to have an opt-out option for randomized GetHashCode()

Already exists.

We could implement fast GetHashCode for some (what?) internal classes.

For Hashtable you can pass an IEqualityComparer to the constructor. Better would be to use the strongly typed HashSet<string> and pass an IEqualityComparer<string> that implements a non-randomized hash code to its constructor. It can be a readonly static singleton object, as it shouldn't require state.

Dictionary<string, TValue> already has an optimization for this if you don't specify a comparer for the string key, but that doesn't currently exist in HashSet<string> (and I don't think it could exist in Hashtable at the framework level, since it is not strongly typed and so couldn't apply a fast path for strings).
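
From script, one way to act on this suggestion is a custom comparer (my sketch: the framework's internal non-randomized string comparer isn't public, so this substitutes a simple deterministic shift/xor hash purely for illustration):

class DeterministicStringComparer : System.Collections.Generic.IEqualityComparer[string] {
    [bool] Equals([string]$x, [string]$y) { return [string]::Equals($x, $y) }
    [int] GetHashCode([string]$s) {
        [int]$h = 5381
        foreach ($c in $s.ToCharArray()) {
            # Shift/xor keeps the math inside Int32, so nothing overflows
            $h = (($h -shl 5) -bxor $h) -bxor [int]$c
        }
        return $h
    }
}

# The same strings now hash identically across sessions
$set = [System.Collections.Generic.HashSet[string]]::new(
    [DeterministicStringComparer]::new())
$null = $set.Add('PSVersion')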

@benaadams commented Aug 11, 2018

Sorry, I completely forgot what Hashtable did 😄

Its strongly typed equivalent is Dictionary<TKey, TValue>.

@iSazonov (Collaborator, Author) commented Aug 24, 2018

Current status:

  1. We added some optimizations to reduce allocations and increase performance. (Thanks @powercode!)

  2. I see @benaadams has started work on three optimizations in CoreCLR. Thanks!

  3. We got a new issue, #7603, about memory garbage. We see the same in the current issue too: PowerShell Core doesn't free memory as fast as Windows PowerShell.
     I found the related issue aspnet/AspNetCore#1976; it was split into several issues.

@iSazonov (Collaborator, Author) commented Aug 27, 2018

In a 6.1.0 RC1 trace I see that we do many allocations in System.Management.Automation.ReservedNameMembers.GeneratePSTypeNames(System.Object), most of them in MemberMatch..ctor(). I have no idea whether we can improve this, but it is surprising that we create links to standard methods dynamically when they could be static.

@powercode (Collaborator) commented Aug 27, 2018

@iSazonov I'd say there is something strange with your profiling session. You have a call to MemberMatch..ctor, but that is never invoked; I tried changing the definition to a static class, and it still compiled fine.

There doesn't seem to be a path from GeneratePSTypeNames to PSNoteProperty.Copy either, so I'm not sure how much I would trust that data.

When debugging, it is the same old AddToTypesXmlCache that shows up as the source of the copy. Not sure why the data is off. @vancem, can you help us here?

A breakpoint in GeneratePSTypeNames is not hit at all during debugging, and yet it shows up in the profile.

@powercode (Collaborator) commented Aug 27, 2018

@iSazonov, you must be running something other than what I am running. Can you share your command?

@iSazonov (Collaborator, Author) commented Aug 28, 2018

@powercode I am using the test published above for Import-Csv (columns = 700, rows = 5000; I start the test and stop the trace recording after a few seconds), on PowerShell Core 6.1.0 RC1 with crossgen.exe copied into the pshome directory.
The screenshot above is "GC Heap Alloc Ignore Free".
Traces: https://files.fm/u/b2pysja2
Perhaps I am doing something wrong :-(

@vancem commented Aug 28, 2018

If you have symbols for System.Management.Automation, set up your symbol path (File -> Set Symbol Path) and source path (File -> Set Source Path); then you can right-click on a node name and select Goto Source, and PerfView will take you to the line (annotated with the metric). This lets you see exactly which line is involved.

Note that inlining can often make this hard to follow, because the name in the source is really the caller (or the caller's caller ...). Having the line number (which, by the way, is approximate; it may be the line before as well) is helpful for disambiguating.

If you ever get suspicious about things, you can fall back to the debugger. Set breakpoints (probably with hit counts, every 1000 or so) for some of the locations that PerfView pointed you at, and you can see if they agree with the debugger.

I have looked at lots of traces, and only very, very rarely have I seen a case where a name could not be explained by inlining (and even then it could have been me misinterpreting the code). If, by checking with the debugger, you believe you have found such a case, I would need all the source, PDBs, and repro information to really investigate it.

@iSazonov (Collaborator, Author) commented Aug 29, 2018

@vancem We haven't published PDB files for RC1. In the PerfView log I saw that it searches for crossgen.exe, so I copied the utility into the startup directory; is that right? After that I can see entry points in System.Management.Automation.

@SteveL-MSFT (Member) commented Jun 20, 2019

@iSazonov now that we have moved to .NET Core 3.0, can you rerun your perf test?

@iSazonov (Collaborator, Author) commented Jun 20, 2019

I'm even afraid to think about it :-))))

Will do.

@iSazonov (Collaborator, Author) commented Jun 20, 2019

              Windows PowerShell   PowerShell Core
One column    6-17 s / 9-16 MB     1.8-3.4 s / 7.6-9.3 MB
700 columns   55 s / 16 MB         16 s / 10 MB

@daxian-dbw (Member) commented Jun 20, 2019

@iSazonov So the perf issue with Import-Csv is now gone after moving to .NET 3.0 preview? What do you mean by one column and 700 columns?

@iSazonov (Collaborator, Author) commented Jun 20, 2019

@daxian-dbw After @powercode's improvements for PSObject, all cmdlets work faster with the objects and make fewer allocations.
After moving to .NET Core 3.0 we get a better GC; now the GC cleans up even better than .NET Framework (I am speaking about the tests I did here). Perhaps .NET Core 3.0 adds raw performance too, but I didn't compare against previously released PowerShell Core versions.

What do you mean by one column and 700 columns?

In my tests I generate CSV files. In the first test the CSV file consists of one column; in the second, 700 columns. The first test shows a general increase in performance (I would assume most of it comes from .NET Core 3.0). The second one specifically shows the increase in performance that comes from @powercode's PR.

@SteveL-MSFT (Member) commented Jun 20, 2019

@iSazonov so can we close this?

@iSazonov (Collaborator, Author) commented Jun 21, 2019

We made great progress on the Import-Csv cmdlet, so I am closing this issue.

It makes sense to open new issues if we find other scenarios with bad performance.

(Although I would like to have something to track performance history: improvements and regressions after new commits.)

@daxian-dbw (Member) commented Jun 24, 2019

(Although I would like to have something to see performance history - improvements and regressions after new commits)

Absolutely. We need some perf tests that serve as benchmarks to help us see the trend; currently we have none. Opened #9996.
