Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The SDK resolver "NuGetSdkResolver" failed to run. The process cannot access the file 'nugetorgadd.trk' because it is being used by another process. #7503

Closed
cclauson opened this issue Nov 13, 2018 · 15 comments
Assignees
Labels
Area:Settings NuGet.Config and related issues Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. Product:MSBuildSDKResolver The NuGet powered SDK resolver. Owned by MSBuild team Tenet:Reliability Crashes, hangs etc. Triage:NeedsTriageDiscussion Type:Bug

Comments

@cclauson
Copy link

cclauson commented Nov 13, 2018

Currently trying to troubleshoot an issue which is intermittently happening on our continuous integration server. I can't consistently repro, but our build fails with the following error:
error : C:\Program Files\dotnet\sdk\2.1.401\Sdks\MSBuild.Sdk.Extras\Sdk not found. Check that a recent enough .NET Core SDK is installed and/or increase the version specified in global.json.
shortly after this warning:
The SDK resolver "NuGetSdkResolver" failed to run. The process cannot access the file 'C:\Users\cloud-user\AppData\Roaming\NuGet\nugetorgadd.trk' because it is being used by another process.

I've found what I believe to be the relevant code in the NuGet.Client repo under src/NuGet.Core/NuGet.Configuration/Settings/Settings.cs:

                    var IsEmptyConfig = !appDataSettings.GetSettingValues(ConfigurationConstants.PackageSources).Any();

                    if (IsEmptyConfig)
                    {
                        var trackFilePath = Path.Combine(Path.GetDirectoryName(defaultSettingsFilePath), NuGetConstants.AddV3TrackFile);

                        if (!File.Exists(trackFilePath))
                        {
                            File.Create(trackFilePath).Dispose();
                            var defaultPackageSource = new SettingValue(NuGetConstants.FeedName, NuGetConstants.V3FeedUrl, isMachineWide: false);
                            defaultPackageSource.AdditionalData.Add(ConfigurationConstants.ProtocolVersionAttribute, "3");
                            appDataSettings.UpdateSections(ConfigurationConstants.PackageSources, new List<SettingValue> { defaultPackageSource });
                        }
                    }

This code is a bit confusing to me, it looks like under certain conditions, it checks to see if the track file exists, and if it doesn't, creates it, but leaves it empty. Can anyone shed more light on what the track file is and what's happening here? Additional questions that would be great if anyone has comments on them:

  • Any possible cause or solution to my problem
  • Since this occurs in an else block, would it be possible to bypass this code path somehow?
  • Any recommendations on how to debug into this? Would be interested to know if this is ever being hit normally and if not, why not

I believe all of the bits that are being run ship with Visual Studio, currently on version 15.8.4

EDIT: Am wondering, would it be possible to work around this by placing a track file with the expected contents in that location? Is there any way to figure out what the expected contents would be?

@cclauson
Copy link
Author

Update: discovered original discussion associated with the checkin of the code in question:
#2445

@PatoBeltran
Copy link

hey @cclauson thanks for the issue. The trk file is just a blank file used to track if we have created an user-wide nuget.config so we don't create again if the user explicitly erased it. I'm curious about how you got this issue since we don't hold any locks for that file apart from the code you provided (where we just create it), if you could find a way to reproduce this more consistently I would love to investigate the cause of this!

@PatoBeltran PatoBeltran added Area:Settings NuGet.Config and related issues WaitingForCustomer Applied when a NuGet triage person needs more info from the OP Product:MSBuildSDKResolver The NuGet powered SDK resolver. Owned by MSBuild team labels Nov 29, 2018
@PatoBeltran PatoBeltran added this to the Backlog milestone Nov 29, 2018
@bergbria
Copy link

@PatoBeltran / @rrelyea the problem is that there is no cross-process mutex on this block of code in the LoadUserSpecificSettings class:

                if (!File.Exists(trackFilePath))
                {
                    File.Create(trackFilePath).Dispose();
                    ...
                }

This code is not concurrency-safe.

We encountered this break on build servers where many nuget processes are run concurrently. Given enough processes and enough machines, the breaks became regular enough that we had to mitigate them by adding a single 'nuget config' call before running anything else.

You can simulate the error condition consistently via this code snippet:

        const int attempts = 5;
        const int concurrency = 5;

        string nugetDir = Environment.ExpandEnvironmentVariables(@"%appdata%\nuget");
        string trkFile = Path.Combine(nugetDir, "nugetorgadd.trk");

        for (int i = 0; i < attempts; i++)
        {
            File.Delete(trkFile);

            var foo = Enumerable.Range(0, concurrency)
                .Select(j => Task.Run(() =>
                {
                    if (!File.Exists(trkFile))
                    {
                        File.Create(trkFile).Dispose();
                    }
                })).ToArray();

            Task.WaitAll(foo);
        }

@nkolev92 nkolev92 added Tenet:Reliability Crashes, hangs etc. Type:Bug and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Dec 31, 2019
@zkat zkat removed this from the Backlog milestone Apr 20, 2020
@zkat zkat added the Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. label Apr 20, 2020
@zkat
Copy link
Contributor

zkat commented Apr 20, 2020

This is some great sleuthing, thank you! We prioritize issues based on upvotes and impact on our users. Please let us know if you continue seeing this issue, and how often you do.

@dasMulli
Copy link

dasMulli commented Feb 22, 2022

Since today (2022-02-22 around 9 CET / 8 UTC) this is happening more often on Azure DevOps pipeline agents. All our build jobs use dotnet tool restore to restore nbgv and use it to set the build number for building and deploying artifacts.

Starting: Update build number based on git version
==============================================================================
Task         : Command line
Description  : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
Version      : 2.198.0
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
==============================================================================
Generating script.
========================== Starting Command Output ===========================
/usr/bin/bash --noprofile --norc /home/vsts/work/_temp/c8b02164-a211-4fd7-9f3f-ba9c28795e66.sh
/usr/share/dotnet/sdk/6.0.200/NuGet.targets(564,5): error : The process cannot access the file '/home/vsts/.nuget/NuGet/nugetorgadd.trk' because it is being used by another process. [/tmp/qaqh5kme.rla/restore.csproj]

Package "nbgv" failed to restore, due to Microsoft.DotNet.ToolPackage.ToolPackageException: The tool package could not be restored.
   at Microsoft.DotNet.Tools.Tool.Install.ProjectRestorer.Restore(FilePath project, PackageLocation packageLocation, String verbosity)
   at Microsoft.DotNet.ToolPackage.ToolPackageInstaller.InstallPackageToExternalManagedLocation(PackageLocation packageLocation, PackageId packageId, VersionRange versionRange, String targetFramework, String verbosity)
   at Microsoft.DotNet.Tools.Tool.Restore.ToolRestoreCommand.InstallPackages(ToolManifestPackage package, Nullable`1 configFile)

Tool 'dotnet-ef' (version '6.0.0') was restored. Available commands: dotnet-ef

Restore partially failed.
Run "dotnet tool restore" to make the "nbgv" command available.
Environment
Starting: Initialize job
Agent name: 'Azure Pipelines 3'
Agent machine name: 'fv-az58-522'
Current agent version: '2.198.3'
Operating System
Ubuntu
20.04.3
LTS
Virtual Environment
Environment: ubuntu-20.04
Version: 20220220.1
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu20/20220220.1/images/linux/Ubuntu2004-Readme.md
Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20220220.1
Virtual Environment Provisioner
1.0.0.0-main-20220202-1
Current image version: '20220220.1'
Agent running as: 'vsts'
Prepare build directory.
Set build variables.
Download all required tasks.
Downloading task: CmdLine (2.198.0)
Downloading task: NodeTool (0.198.0)
Downloading task: Bash (3.198.0)
Downloading task: Cache (2.198.0)
Downloading task: Npm (1.198.0)
Downloading task: Docker (2.198.0)
Checking job knob settings.
   Knob: AgentToolsDirectory = /opt/hostedtoolcache Source: ${AGENT_TOOLSDIRECTORY} 
   Knob: AgentPerflog = /home/vsts/perflog Source: ${VSTS_AGENT_PERFLOG} 
Finished checking job knob settings.
Start tracking orphan processes.
Finishing: Initialize job

@belav
Copy link

belav commented Feb 22, 2022

We are seeing the same issue today (2022-02-22 around 1am CST). It is failing pretty consistently.

Run dotnet tool restore
/usr/share/dotnet/sdk/6.0.200/NuGet.targets(564,5): error : The process cannot access the file '/home/runner/.nuget/NuGet/nugetorgadd.trk' because it is being used by another process. [/tmp/dnleixtl.pgm/restore.csproj]

Package "csharpier" failed to restore, due to Microsoft.DotNet.ToolPackage.ToolPackageException: The tool package could not be restored.
   at Microsoft.DotNet.Tools.Tool.Install.ProjectRestorer.Restore(FilePath project, PackageLocation packageLocation, String verbosity)
   at Microsoft.DotNet.ToolPackage.ToolPackageInstaller.InstallPackageToExternalManagedLocation(PackageLocation packageLocation, PackageId packageId, VersionRange versionRange, String targetFramework, String verbosity)

   at Microsoft.DotNet.Tools.Tool.Restore.ToolRestoreCommand.InstallPackages(ToolManifestPackage package, Nullable`1 configFile)
Tool 'husky' (version '0.3.0') was restored. Available commands: husky

Restore partially failed.
Error: Process completed with exit code 1.

@belav
Copy link

belav commented Feb 22, 2022

The issue I ran into seems to be related to dotnet 6.0.200, forcing 6.0.100 like this got us past the issue

    - uses: actions/setup-dotnet@v1
      with:
        dotnet-version: '6.0.100'
    - run: |
       echo "{ \"tools\": { \"dotnet\": \"6.0.100\" } }" > global.json
       dotnet tool restore

@johncrim
Copy link

johncrim commented Feb 22, 2022

We're seeing this more frequently on both Azure DevOps build agents and github build agents (both ubuntu). Running dotnet tool restore with 2 tools being restored. Build output is

/bin/bash /home/vsts/work/_temp/6b990e16-aba0-4787-8c64-d94f7b0ad805.sh
/usr/share/dotnet/sdk/6.0.200/NuGet.targets(564,5): error : The process cannot access the file '/home/vsts/.nuget/NuGet/nugetorgadd.trk' because it is being used by another process. [/tmp/2l2kqqrt.ztw/restore.csproj]

Package "gitversion.tool" failed to restore, due to Microsoft.DotNet.ToolPackage.ToolPackageException: The tool package could not be restored.
   at Microsoft.DotNet.Tools.Tool.Install.ProjectRestorer.Restore(FilePath project, PackageLocation packageLocation, String verbosity)
   at Microsoft.DotNet.ToolPackage.ToolPackageInstaller.InstallPackageToExternalManagedLocation(PackageLocation packageLocation, PackageId packageId, VersionRange versionRange, String targetFramework, String verbosity)
   at Microsoft.DotNet.Tools.Tool.Restore.ToolRestoreCommand.InstallPackages(ToolManifestPackage package, Nullable`1 configFile)

Tool 'wyam.tool' (version '2.2.9') was restored. Available commands: wyam

It appears that whichever package is installed first is failing, and the second one is succeeding. I'll try @bergbria 's trick of calling nuget config first.

@jeffkl
Copy link
Contributor

jeffkl commented Feb 22, 2022

The creation of this file was introduced in NuGet/NuGet.Client#3907 and the behavior was changed in NuGet/NuGet.Client#4338 which would have been released in 6.0.200. @zivkan do you have any insights? Perhaps we should put a try...catch around the creation of the track file?

@johncrim
Copy link

This workaround seems to work: Run dotnet nuget list source before dotnet tool restore. Eg

      # dotnet nuget list source is to fix race condition in dotnet tool restore. Remove when https://github.com/NuGet/Home/issues/7503 is fixed.
      - name: Install dotnet tools
        run: |
          dotnet nuget list source
          dotnet tool restore

I'm not yet 100% confident that this is reliable (I haven't analyzed the code paths thoroughly), but seems like it should be; and I have a few test runs supporting that. Since 6.0.200 this issue was occurring in most builds, and it hasn't occurred since this fix.

@zivkan
Copy link
Member

zivkan commented Feb 22, 2022

Yes, in this other issue/PR, I recommended using dotnet nuget list source: dotnet/sdk#23955 (comment)

The reason there is because they have tests running in parallel where each test invokes a new dotnet build or dotnet restore process. On a new machine where there is no user-profile nuget.config file yet, there's a timing issue when multiple processes try to create the default config and tracking file at the same time. NuGet is designed to restore a solution in a single process, not restore many projects in parallel processes, so this parallel test scenario isn't really representative of customer scenarios.

On the other hand, if MSBuild is invoking many instances of the NuGetSDKResolver in parallel (because multiple tools are being restored in the single dotnet tool restore command), then that's something I think we should support/fix.

@johncrim
Copy link

johncrim commented Feb 22, 2022

Thank you for the info @zivkan . I should add that this problem is now occurring very frequently with just a single shell call to dotnet tool restore on a single VM (without an initial nuget.config file) - it's not dependent on any test processes running in parallel on that VM. It looks like that has to do with the changes in 6.0.200.

@thefringeninja
Copy link

I'm experiencing the same thing on github actions at the moment.

thefringeninja added a commit to thefringeninja/EventStore-Client-Dotnet that referenced this issue Feb 24, 2022
@zivkan zivkan changed the title Help troubleshooting issue with NuGetSdkResolver The SDK resolver "NuGetSdkResolver" failed to run. The process cannot access the file 'nugetorgadd.trk' because it is being used by another process. Feb 24, 2022
@zivkan
Copy link
Member

zivkan commented Feb 25, 2022

This bug has the same root cause as #11607.

Closing this one as a duplicate, despite being older, so it's easier to show management how many recent upvotes that issue has, so please add a 👍 to that issue

@64J0
Copy link

64J0 commented Sep 23, 2022

I'm still facing this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area:Settings NuGet.Config and related issues Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. Product:MSBuildSDKResolver The NuGet powered SDK resolver. Owned by MSBuild team Tenet:Reliability Crashes, hangs etc. Triage:NeedsTriageDiscussion Type:Bug
Projects
None yet
Development

No branches or pull requests