-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question: nuget restore thread safety #7060
Comments
It might be related to issue #7020 |
@lanfeng NuGet does have some cross process locking. For example, two different processes could attempt to install to the same directory and it should work. The biggest takeaway, is that small individual operations are thread safe. The whole operation might not be. In PackageReference, consuming merely means enumerating the directory files for an already installed (in the global packages folder) package. In packages.config, this could mean copying. Please note that having multiple restores at the same time, means a lot of http connections created, which could lead to #6742. tl;dr; Should work, but there are a lot of caveats. What's the reason you would want to do that? |
we have a parallelized build process that builds ~300 projects in a non-persistent VM (no global nuget cache). It takes a long time to run nuget restore on each project in serial before spawning the task executors and I was hoping that it might be faster to just allow msbuild to do the restore when each project is built by adding /restore to the command line. |
@ticktickMOF I think it should be safe to run |
@ticktickMOF Additionally, which version of Visual Studio are you using? |
We have an issue with multiple TFS agent on the same machine. If two builds run at the same time, nuget restore may fail on locked files and will leave the cache broken. Then, all my build failed with I must clear the local cache to make it build again |
We see this same error under the same circumstances with TeamCity |
Reporting in with the same issue |
I am now adding -NoCache and -DirectDownload to my nuget restores on my build server. The "isolated mode" is described here: |
@genscape-agodfrey, yes, this can be workaround, however it does not solve issue (from my point of view). We use the same script for the local build and for TeamCity build (for consistency of course). And sometimes TeamCity compiles the same project from the different branches in parallel (our TeamCity virtual machine has multiple TeamCity agents). And it leads to build failure exact with "v3-cache" error. Gradle is able to work with caches in parallel because of the following logic: it waits shared file unlock for the several minutes. E.g. instead of code line So question for the Nuget repository owners: is it reasonable to just change code above to the following logic? |
We've had the same problem on our build servers so I've spent some time investigating this and here is what I've found out. Theoretically, running multiple restore operations should be ok because there is a proper cross-process locking mechanism in place. To acquire a lock, NuGet translates the file path that is going to be locked into a new path which is actually used for the lock file (ConcurrencyUtilities.cs#L242-L268). Then the exclusive file descriptor for a lock file is created. If the operation succeeds then it means that the lock is acquired, if it doesn't then the operation is repeated with some delay (ConcurrencyUtilities.cs#L134-L171). The problem is that lock files are not created in the location of global packages cache, but they are created in the Another problem is that this PR (merged at 17 Aug 2020) introduced backwards incompatibility into locking mechanism. Due to this change it is not possible to run safely concurrent restores when two incompatible nuget clients are used at the same time (e.g. |
I'm not confident that's a problem. We use the temp to lock across all locations, so it's really a design decision. It's a NuGet contract with NuGet, no right or wrong imo.
Yeah, that's unfortunate :( |
Yes, I partially agree with you. I think that the downside of it that it is unknown to users which leads to problems.
nothing really suggests that temp location is so important and used as a location of lock files. Therefore the contract that requires me to use the same I think if the output was e.g.:
it would be clear that there is dedicated location for locking. If you think that it is good idea to add this a can make a PR. |
I think that's exposing NuGet internals to customers that are probably not interested in that detail. |
We solved this problem with a custom MSBuildWithMutex task. |
@nkolev92 how will the locking in the TEMP folder interact with having multiple azure-devops build agents (that have their own isolated TEMP folders). We see some weird "concurrency" writing problems if 2 agents perform a restore related to the same package that eventually results in a next build to fail with an "Error NU5037: The package is missing the required nuspec file." |
@japj According to my findings if you have agents using their own TEMP folders but the same global packages folder then the package extraction into global packages folder is not thread safe. |
Not sure I follow your conclusion. The locking is happening at https://github.com/NuGet/NuGet.Client/blob/a4c9d63bf942f1df1ba9486a87bad2e4b6888488/src/NuGet.Core/NuGet.Packaging/PackageExtractor.cs#L633-L634, so if the package was extracted and the nupkg file exists, https://github.com/NuGet/NuGet.Client/blob/a4c9d63bf942f1df1ba9486a87bad2e4b6888488/src/NuGet.Core/NuGet.Packaging/PackageExtractor.cs#L639, there should be no race. |
@nkolev92 not sure which files this code refer to, but before we definitely fixed the problem with a custom MSBuild task using a mutex, the problem was exactly what @marcin-krystianc describes: located in the user packages cache ( |
@dedale I also looked at your gist, and I noticed: <MSBuild Projects="$(MSBuildProjectFullPath)" Targets="Restore" Properties="$(PackageRestoreProperties)" BuildInParallel="true" /> If you execute restore on a solution, it only runs 1 task per invocation, rather than 1 per project. |
I realize this gist is not up-to-date with the latest version of my hack. In our CI setup, every agent running on the same machine is using its own TEMP folder. |
I have updated my gist using a child class of MSBuild class (previous approach with a UsingTask and C# TaskFactory did not work well because global properties are lost). |
If you refer to #7060 (comment), NuGet depends on a shared temp folder as a locking mechanism. |
I was referring to the case where there are two processes which use different TEMP folder (so locking will not actually work) but same global packages folder.
So the consequence of two processes simultaneously installing the same package (when these processes use different TEMP folder so locking mechanism will not actually work) is that the installation is going to be corrupted and the |
Ahh, I totally misread your original statement, mb. |
I've just realized that, there was recently another change that breaks compatibility of inter-process locking mechanism on Linux (Default location of NuGetScratch folder has changed) -> NuGet/NuGet.Client@288a479 |
@marcin-krystianc
and had from time to time similar errors as here. I changed that mapping to:
and after this, from time to time, I have errors similar to this: |
Both agents not only need to share NuGet folder but also the TEMP folder (because lock files for interposes synchronisation are created in the TEMP/NuGetScratch folder ). |
are nuget.exe restore, dotnet restore, msbuild /restore thread safe across processes? Like if I have 2 msbuild /restore instances running in parallel on different projects that consume the same nuget package do I need to worry about them trying to write the same files at the same time?
The text was updated successfully, but these errors were encountered: