Restore crash due to unresponsive feed and unavailable package #3672

Closed
chlowell opened this Issue Oct 17, 2016 · 6 comments

Comments

Projects
None yet
4 participants
@chlowell

When an unresponsive feed is configured and a package can't be found on any responsive feed, restore repeatedly retries the unresponsive feed before eventually crashing with a NuGet.Protocol.Core.Types.FatalProtocolException.

For example, restoring a simple project with two dependencies begins as usual:

> dotnet restore --configfile .\nuget.config -v Verbose --no-cache --disable-parallel
trace: NuGet Command Line Version: 3.5.0.0
trace: Running non-parallel restore.
trace: Reading project file C:\restore-test\project.json.
log  : Restoring packages for C:\restore-test\project.json...
trace: Restoring packages for .NETCoreApp,Version=v1.0...
info :   GET https://api.nuget.org/v3-flatcontainer/unfindable.bogus.package/index.json
info :   GET http://localhost:5000/FindPackagesById()?id='unfindable.bogus.package'
info :   GET https://api.nuget.org/v3-flatcontainer/microsoft.netcore.app/index.json
info :   GET http://localhost:5000/FindPackagesById()?id='Microsoft.NETCore.App'
info :   OK https://api.nuget.org/v3-flatcontainer/microsoft.netcore.app/index.json 374ms
info :   GET https://api.nuget.org/v3-flatcontainer/microsoft.netcore.app/1.0.1/microsoft.netcore.app.1.0.1.nupkg
info :   NotFound https://api.nuget.org/v3-flatcontainer/unfindable.bogus.package/index.json 426ms
...

After the first timeout, NuGet requests every dependency from the unresponsive feed, even those already satisfied elsewhere, retrying until the FatalProtocolException. The log fills with messages like these:

...
info : The HTTP request to 'GET http://localhost:5000/FindPackagesById()?id='System.IO.UnmanagedMemoryStream'' has timed out after 100000ms.
info : An error was encountered when fetching 'GET http://localhost:5000/FindPackagesById()?id='System.IO.MemoryMappedFiles''. The request will now be retried.
info : The HTTP request to 'GET http://localhost:5000/FindPackagesById()?id='System.IO.MemoryMappedFiles'' has timed out after 100000ms.
info : An error was encountered when fetching 'GET http://localhost:5000/FindPackagesById()?id='System.IO.FileSystem.Watcher''. The request will now be retried.
info : The HTTP request to 'GET http://localhost:5000/FindPackagesById()?id='System.IO.FileSystem.Watcher'' has timed out after 100000ms.
info :   GET http://localhost:5000/FindPackagesById()?id='Microsoft.CodeAnalysis.CSharp'
info :   GET http://localhost:5000/FindPackagesById()?id='Microsoft.CodeAnalysis.VisualBasic'
info :   GET http://localhost:5000/FindPackagesById()?id='Libuv'
info :   GET http://localhost:5000/FindPackagesById()?id='Microsoft.CSharp'
info :   GET http://localhost:5000/FindPackagesById()?id='Microsoft.NETCore.DotNetHostPolicy'
info :   GET http://localhost:5000/FindPackagesById()?id='Microsoft.NETCore.Runtime.CoreCLR'
...

Full log

@joelverhagen

This comment has been minimized.

Show comment
Hide comment
@joelverhagen

joelverhagen Oct 18, 2016

Member

Could you also post your project.json for this restore?

Member

joelverhagen commented Oct 18, 2016

Could you also post your project.json for this restore?

@chlowell

This comment has been minimized.

Show comment
Hide comment
@chlowell

chlowell Oct 18, 2016

Sure, it's the dotnet new template with an unsatisfiable dependency:

{
  "version": "1.0.0-*",
  "buildOptions": {
    "debugType": "portable",
    "emitEntryPoint": true
  },
  "frameworks": {
    "netcoreapp1.0": {
      "dependencies": {
        "unfindable.bogus.package": "42.42.42",
        "Microsoft.NETCore.App": {
          "type": "platform",
          "version": "1.0.1"
        }
      },
      "imports": "dnxcore50"
    }
  }
}

Sure, it's the dotnet new template with an unsatisfiable dependency:

{
  "version": "1.0.0-*",
  "buildOptions": {
    "debugType": "portable",
    "emitEntryPoint": true
  },
  "frameworks": {
    "netcoreapp1.0": {
      "dependencies": {
        "unfindable.bogus.package": "42.42.42",
        "Microsoft.NETCore.App": {
          "type": "platform",
          "version": "1.0.1"
        }
      },
      "imports": "dnxcore50"
    }
  }
}
@yishaigalatzer

This comment has been minimized.

Show comment
Hide comment
@yishaigalatzer

yishaigalatzer Oct 18, 2016

This seems to be the correct behavior, restore should fail when it can't find a package.
We could however improve the cancellation of a request if it was indeed fully satisfied by other feeds.

So here is what I think it should do:

  1. If a package was found and downloaded, the unresponsive feed retries should be cancelled. This is the expected behavior.
  2. The request should fail as it is now, but perhaps a bit faster with less messages in the log.

You can run restore with --ignore-failed-sources to overcome the issue. But note that you may end up with different packages restored. This is the main reason we don't ignore failed feeds by default.

This seems to be the correct behavior, restore should fail when it can't find a package.
We could however improve the cancellation of a request if it was indeed fully satisfied by other feeds.

So here is what I think it should do:

  1. If a package was found and downloaded, the unresponsive feed retries should be cancelled. This is the expected behavior.
  2. The request should fail as it is now, but perhaps a bit faster with less messages in the log.

You can run restore with --ignore-failed-sources to overcome the issue. But note that you may end up with different packages restored. This is the main reason we don't ignore failed feeds by default.

@chlowell

This comment has been minimized.

Show comment
Hide comment
@chlowell

chlowell Oct 18, 2016

That sounds good to me. I think the problem here is the difficulty of debugging. Restore takes a long time to fail, the NotFound response gets buried under thousands of lines of timeouts and unnecessary re-requests to the unresponsive feed, and the FatalProtocolException is misleading.

--ignore-failed-sources doesn't fail fast but at least ends with a better error:

Errors in C:\restore-test\project.json
    Unable to resolve 'unfindable.bogus.package (>= 42.42.42)' for '.NETCoreApp,Version=v1.0'.

That sounds good to me. I think the problem here is the difficulty of debugging. Restore takes a long time to fail, the NotFound response gets buried under thousands of lines of timeouts and unnecessary re-requests to the unresponsive feed, and the FatalProtocolException is misleading.

--ignore-failed-sources doesn't fail fast but at least ends with a better error:

Errors in C:\restore-test\project.json
    Unable to resolve 'unfindable.bogus.package (>= 42.42.42)' for '.NETCoreApp,Version=v1.0'.
@joelverhagen

This comment has been minimized.

Show comment
Hide comment
@joelverhagen

joelverhagen Nov 9, 2016

Member

Hey @chlowell, the fix I am considering here is:

  1. Reduce retry count for this sort of failure from 9 attempts to 3 attempts.
  2. Reduce redundant logging.

Does that seems reasonable?

Member

joelverhagen commented Nov 9, 2016

Hey @chlowell, the fix I am considering here is:

  1. Reduce retry count for this sort of failure from 9 attempts to 3 attempts.
  2. Reduce redundant logging.

Does that seems reasonable?

joelverhagen added a commit to NuGet/NuGet.Client that referenced this issue Nov 9, 2016

Catch exception thrown by the RestoreTask and log at the verbose level (
#1012)

In places where we have a custom retry loop, do not use the loop in HttpRetryHandler
When we throw a FatalProtocolException, do not log an error (since the caller should log)
Fix NuGet/Home#3672
@joelverhagen

This comment has been minimized.

Show comment
Hide comment
@joelverhagen

joelverhagen Nov 9, 2016

Member

Spoke offline. I also made it clearer which package ID was not found.

Member

joelverhagen commented Nov 9, 2016

Spoke offline. I also made it clearer which package ID was not found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment