Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Regression]: Performance regression for cold restores in .NET 5.0.x #11031

Closed
marcin-krystianc opened this issue Jul 14, 2021 · 8 comments · Fixed by NuGet/NuGet.Client#4186
Closed
Assignees
Labels
Category:Quality Week Issues that should be considered for quality week Tenet:Performance Performance issues Triage:NeedsTriageDiscussion Type:Bug Type:Learning

Comments

@marcin-krystianc
Copy link

marcin-krystianc commented Jul 14, 2021

NuGet Product Used

dotnet.exe

Product Version

5.x

Worked before?

3.1.407

Impact

Other

Repro Steps & Context

I've noticed a NuGet performance regression between versions 3.1.x and 5.0.x of .NET SDK.
It only affects cold restores (i.e. when packages need to be downloaded from remote feed). When the restore happens for a warm cache (i.e. global packages folder already contains all necessary packages) then the regression does not occur.

Tested solutions:

Test scripts - https://github.com/NuGet/NuGet.Client/tree/dev/scripts/perftests

Results:
image
image

We can see that for the "arctic" scenario (i.e. scenario where NuGet needs to download packages from remote feed) the restore time got much worse with the release of .NET 5.
No such negative effect exists though for the "force" scenario (i.e. scenario where global packages folder already contains all packages so nothing needs to be downloaded).

Raw data:

Verbose Logs

No response

@marcin-krystianc marcin-krystianc changed the title [Bug]: [Regression]: Performance regression for cold restores in .NET 5.0.x Jul 14, 2021
@heng-liu
Copy link
Contributor

Hi @marcin-krystianc , may I know if the testing is on Windows platform?
If yes, it might be related to this issue: #10251. Could you check if this comment works for you?
.NET3.1 doesn't have package signature verification during restore, while .NET5.0 and .NET6.0 will do the signature verification during restore.
FYI @dtivel

@heng-liu heng-liu added the WaitingForCustomer Applied when a NuGet triage person needs more info from the OP label Jul 15, 2021
@marcin-krystianc
Copy link
Author

may I know if the testing is on Windows platform?

Yes, my testing was done on Windows. But I don't think that it has anything to do with firewall or access to the Internet in general. I've been able to reproduce in two different machines, both of them have unrestricted (and very fast) Internet access.

@ghost ghost added WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Jul 16, 2021
@heng-liu
Copy link
Contributor

Could you help check if there is any difference of the dependencies when changing the .NET version? Thanks!
That is, clear the global package folder before running restore, and check the numbers and sizes of the files under global package folder after restore.
About the global package folder, pls refer to https://docs.microsoft.com/en-us/nuget/reference/nuget-config-file

@heng-liu heng-liu added WaitingForCustomer Applied when a NuGet triage person needs more info from the OP and removed WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. Triage:Untriaged labels Jul 16, 2021
@marcin-krystianc
Copy link
Author

Hi, I've compared the content of global packages folder for v3.1.407 and v5.0.202. There is only one ignorable difference for the OrchardCore solution. The difference is in the packages\.tools\dotnet-xunit\2.3.0\netcoreapp2.2\project.assets.json file.

v3.1.407 v5.0.202
Orleans 2,277,816,762 bytes, 16,603 Files, 11,670 Folders 2,277,816,762 bytes, 16,603 Files, 11,670 Folders
OrchardCore 1,767,854,688 bytes, 15,788 Files, 10,781 Folders 1,767,854,772 bytes, 15,788 Files, 10,781 Folders
SanitisedNet471 3,586,806,903 bytes, 8,486 Files, 5,035 Folders 3,586,806,903 bytes, 8,486 Files, 5,035 Folders

@ghost ghost added WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Jul 22, 2021
@zivkan zivkan added Tenet:Performance Performance issues and removed WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. labels Jul 22, 2021
@zivkan
Copy link
Member

zivkan commented Jul 22, 2021

As Heng mentioned in an earlier comment, a major difference between the .NET Core 3.1 SDK and the .NET 5 SDK, is that package signature verification was added.

This needs to:

  1. Take a SHA256 hash of the nupkg (with special rules, so it won't match the SHA256 hash taken by other programs) to ensure it matches the hash written to the signature file
  2. Check the certificate chain of the certificate that signed the package. Possibly making HTTP requests to check for certificate revocation.
  3. If the certificate has expired, check the timestamp server to ensure that the package was signed when the certificate was still valid. The timestamp server has its own certificate and therefore certificate chain, so it also has certificate revocation to check.

A package may or may not have an author signature, and if the package has been uploaded to nuget.org, it will have a repository signature. So commonly there's at least 1, and sometimes 2 certificate chains to check, possibly with timestamp servers, and may involve HTTP requests and there server latency. And the SHA256 hash is going to use a lot of CPU.

I'm not saying the perf impact you measured is necessarily acceptable, but this is new work that .NET Core 3.1 SDK was not doing, so no amount of optimization is going to make this get back to earlier performance. Signed packages is one tool to reduce risk of supply chain attacks, which has been in the news a lot in the last 6 months, at least in the USA.

Anyway, this issue is assigned to @aortiz-msft, so I'll leave it to him to prioritize.

@marcin-krystianc
Copy link
Author

I've done some performance testing and my early conclusion is that the signature verification is only a red herring. Removing the code responsible for signature verification doesn't make it much faster.
My current theory is that the regression was introduced by the NuGet/NuGet.Client#3524 (Although that change was expected to make it faster, not slower). After reverting that change, performance is back to "normal".
So far I've done performance testing only on Windows using the orleans solution. My plan is to look closer at the PR in question and perform more exhaustive benchmarks to confirm my findings.

@zivkan
Copy link
Member

zivkan commented Jul 27, 2021

@marcin-krystianc thanks, that's fascinating!

Since you're on Windows, when you look at Task Manager's Performance tab, or use Resource Manager, where you can see CPU and disk busy % at the same time, do you see significant differences between NuGet with and without mmap? Does it appear to be IO or CPU limited? Or is it not obvious?

Can you share a little about the hardware you tested on? Is it a VM or a physical machine? spinning disk or SSD?

I admit I didn't validate the perf myself when the PR was created a year ago, but the results shown in the PR indicates that with mmap it was twice as fast on the machine it was tested on.

@marcin-krystianc
Copy link
Author

@zivkan Hi, I've created a PR with a change to stop using mmap-based I/O for package extraction. I've repeated my testing on Windows and Linux. I was also able to reproduce the problem outside NuGet wit a test application. My conclusion is that the mmap-based I/O is slower when writing small files (up to 1MB) and has the same performance as filestreams for larger files.

@aortiz-msft aortiz-msft removed their assignment Aug 13, 2021
@zkat zkat self-assigned this Aug 16, 2021
@zkat zkat added this to the Sprint 2021-08 milestone Aug 16, 2021
@zkat zkat modified the milestones: Sprint 2021-08, Sprint 2021-09 Sep 7, 2021
@zkat zkat modified the milestones: Sprint 2021-09, Sprint 2021-10 Sep 7, 2021
@zkat zkat modified the milestones: Sprint 2021-10, Sprint 2021-11 Oct 4, 2021
@nkolev92 nkolev92 added the Category:Quality Week Issues that should be considered for quality week label Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category:Quality Week Issues that should be considered for quality week Tenet:Performance Performance issues Triage:NeedsTriageDiscussion Type:Bug Type:Learning
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants