Correlation failed at signin-oidc redirect #720

Closed
rgceihk opened this Issue Jan 20, 2017 · 29 comments

@rgceihk

rgceihk commented Jan 20, 2017

I keep receiving the following error after the consent screen, when I am redirected to the client's “signin-oidc” URL (please refer to the images in the attached doc for reference):

Unhandled remote failure (Correlation failed)

This was not an issue on my development build (VS2015 w/ IIS Express) and only started after I had published the application.

From what I could find on Google, the issue is caused by missing cookies. It would be helpful if I could get a better understanding and also some tips on how to debug it.

I have attached the debug log from the client, and from Fiddler, for reference.

logs.docx

Some background on my current setup –

  • IdentityServer with ASP.Net Core Identity Integration and EntityFramework persistence layer (combined from your ASP Identity and EntityFramework Quickstarts)
  • MVC Client (created using the MvcHybrid Quickstart)

Both applications are deployed on separate domains:

  • IdentityServer is on a Windows Server 2012 R2 machine behind IIS 8.5
  • Client is on Windows Server 2008 R2 behind IIS 7.5

Appreciate the help!

@brockallen

Member

brockallen commented Jan 21, 2017

Not sure, but make sure the URL you're starting from in the client is really the same you've configured for your redirect URI (specifically http vs https).

@brockallen brockallen added the question label Jan 21, 2017

@rgceihk

rgceihk commented Jan 21, 2017

Yes, I've checked it, it's the same. I'm using http.

I also forgot to mention that both machines connect to the Internet via a proxy (if that makes any difference). I have enabled the standard http & https ports on the proxy firewall.

@rgceihk

rgceihk commented Jan 21, 2017

The good news is that I've got it working. I deployed the client to another machine running Windows Server 2012 R2 with IIS 8.5, and remapped the domain / IP to the new machine. Not sure if it was an issue with IIS...

@brockallen

Member

brockallen commented Jan 21, 2017

Good to hear. I'll close this then. Thanks for the update.

@brockallen brockallen closed this Jan 21, 2017

@aduggleby

aduggleby commented Jun 13, 2017

We're running into this issue in production on about 2% of all requests. So far I haven't been able to see a pattern. I have full verbose logging active and captured the following log directly after a restart of the web app (possibly sensitive data redacted with XXX):

2017-06-13 15:23:45.469 +00:00 [Debug] Reading data from file '"D:\home\ASP.NET\DataProtection-Keys\key-XXX.xml"'.
2017-06-13 15:23:46.413 +00:00 [Debug] Found key {XXX}.
2017-06-13 15:23:46.460 +00:00 [Debug] Considering key {XXX} with expiration date 2017-08-11 07:11:29Z as default key.
2017-06-13 15:23:46.491 +00:00 [Debug] Opening CNG algorithm '"AES"' from provider 'null' with chaining mode CBC.
2017-06-13 15:23:46.539 +00:00 [Debug] Opening CNG algorithm '"SHA256"' from provider 'null' with HMAC.
2017-06-13 15:23:46.554 +00:00 [Debug] Using key {XXX} as the default key.
2017-06-13 15:23:46.601 +00:00 [Information] HttpContext.User merged via AutomaticAuthentication from authenticationScheme: "Cookies".
2017-06-13 15:23:46.636 +00:00 [Verbose] MessageReceived: '"?code=XXX&id_token=XXX&scope=openid%20profile%20email&state=XXX&session_state=XXX"'.
2017-06-13 15:23:46.636 +00:00 [Warning] '".AspNetCore.Correlation.oidc.XXX"' cookie not found.
2017-06-13 15:23:46.636 +00:00 [Information] Error from RemoteAuthentication: "Correlation failed.".
2017-06-13 15:23:46.666 +00:00 [Information] HttpContext.User merged via AutomaticAuthentication from authenticationScheme: "Cookies".
2017-06-13 15:23:46.741 +00:00 [Information] HttpContext.User merged via AutomaticAuthentication from authenticationScheme: "Cookies".
2017-06-13 15:23:46.929 +00:00 [Error] Connection id ""0HL5IE0MV55HC"": An unhandled exception was thrown by the application.
System.AggregateException: Unhandled remote failure. (Correlation failed.) ---> System.Exception: Correlation failed.

Can you point me in a direction to further debug this or what could be the root cause?

@openidauthority

openidauthority commented Jun 26, 2017

@aduggleby: I suspect that some of your users arrive at the login screen, become distracted, and then come back and try to log in more than 15 minutes later. By then the cookie used for correlation has expired and they get this error. If you set RemoteAuthenticationTimeout in the OIDC middleware to something like 10 seconds:

app.UseOpenIdConnectAuthentication(new OpenIdConnectOptions
{
    RemoteAuthenticationTimeout = TimeSpan.FromSeconds(10),
    ...
});

...then users would only have 10 seconds to log in. If you increase this to several hours it will probably greatly reduce the frequency of the error. A better solution would be to redirect the user to some other page if someone leaves the login screen open for more than a few minutes, so they cannot use a "stale" login screen. The timeout is there for security purposes.
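
For reference, on ASP.NET Core 2.x the same option is set through AddOpenIdConnect rather than UseOpenIdConnectAuthentication. A minimal sketch, assuming a scheme named "oidc" and a two-hour timeout (both illustrative values, not taken from this thread); the other required options are left out:

```csharp
using System;
using Microsoft.AspNetCore.Authentication.OpenIdConnect;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddAuthentication()
            .AddOpenIdConnect("oidc", options => // "oidc" is an assumed scheme name
            {
                // How long the user has to complete the remote login before the
                // correlation cookie expires. The framework default is 15 minutes;
                // two hours is only an illustrative value.
                options.RemoteAuthenticationTimeout = TimeSpan.FromHours(2);

                // Authority, ClientId and the remaining settings are omitted here;
                // the handler will not work without them.
            });
    }
}
```

A longer timeout only makes the stale login screen less likely to bite; it does not help users who arrive from a bookmark or whose correlation cookie was never stored.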

@aduggleby

aduggleby commented Jun 27, 2017

Ah thanks, I'll give that a try!

@rfuhrer

rfuhrer commented Jul 19, 2017

Just posting that @openidauthority was correct with his assessment of the issue. We have an internal site that field workers use and I kept seeing "An unhandled exception has occurred: Unhandled remote failure. (Correlation failed.)" and "'.AspNetCore.Correlation.oidc.<some random string>' cookie not found." popping up in my Elmah.io logs.

The issue we had is that users would log into the site, do some work, log out (which we have set to redirect back to the login page), and then close their laptop. A day later they would pop open their laptop, try to sign in, and get an error. They'd just refresh the page, log in again, assume it was some glitch, and never tell me it happened or what they did to cause it. Thanks users :)

I solved the issue by just updating the login screen to hide the login form after idling for 15 minutes and show a button that refreshes the page.

Here's my stacktrace for those Googling the same issue:

System.AggregateException: Unhandled remote failure. (Correlation failed.) ---> System.Exception: Correlation failed.
   --- End of inner exception stack trace ---
   at Microsoft.AspNetCore.Authentication.RemoteAuthenticationHandler`1.d__6.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.AspNetCore.Authentication.RemoteAuthenticationHandler`1.d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.AspNetCore.Authentication.OpenIdConnect.OpenIdConnectHandler.d__15.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware`1.d__18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware`1.d__18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware`1.d__18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware`1.d__18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.d__6.MoveNext()
@srikrsna

srikrsna commented Nov 28, 2017

We've tried increasing the timeout and yet we are still receiving this error. There seems to be something wrong with our session setup, or with session in ASP.NET Core itself, that's causing this issue. Our diagnostics indicate the correlation cookies are not being set for some reason, because even for other external OAuth2 logins we are getting the same error of state not being set in some of the requests. We even tried using TempData for external logins. @openidauthority @aduggleby @brockallen Any idea where to go next with this?

@aduggleby

aduggleby commented Nov 29, 2017

Unfortunately for us it was not the timeout issue. There's something else going on, and I haven't found a solution yet or any idea of how to debug it.

@ChrisPritchard

ChrisPritchard commented Dec 13, 2017

I had this same issue, and we ultimately discovered it was due to certain users bookmarking the login screen to access our site. Since the site didn't allow any anonymous access, the login screen was the first place they would land. By bookmarking it to save themselves time, they ended up with a link containing expired query string parameters that would then result in this error.

@brockallen

Member

brockallen commented Dec 14, 2017

"I had this same issue, and we ultimately discovered it was due to certain users bookmarking the login screen to access our site"

Which is something OpenID Connect fundamentally doesn't address, so you need to deal with this in your client (usually by re-issuing the authorize request).

@ChrisPritchard

ChrisPritchard commented Dec 14, 2017

Yeah no worries. However it was pure luck we discovered this cause, so worth noting it here as a possibility for those that hit this exception in future.

@rfuhrer

rfuhrer commented Dec 14, 2017

Interesting. I think some of my users might have done the same thing. This error was still coming up on my site but only once or twice a month now instead of multiple times a day. Bookmarking the token sounds like it might be what's happened.

How did you end up fixing it? Was there an easy way for the server to tell that the user came to the page with an expired token? I don't really see an obvious way of doing that with my controller's action unless there's something with the returnUrl I can check?:

        [AllowAnonymous]
        [HttpGet]
        public async Task<IActionResult> Login(string returnUrl)
        {
            ViewData["ReturnUrl"] = returnUrl;
            return View();
        }
@ChrisPritchard

ChrisPritchard commented Dec 14, 2017

There's no foolproof way, but a couple of common process changes help to solve it:

  • have a landing page that allows anonymous access. This will reduce the likelihood that users will bookmark the login page
  • handle the exception with a friendly error page that contains a fresh login link and a message like 'ensure you don't stay on the login page too long or access it via a bookmark', or similar.

I suppose you could add a query string param of your own to the login redirect, that has a datestamp, and then on the identity server login page check this to catch old references. But we didn't bother; just did the above suggestions and the problem went away.
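
The datestamp idea could be sketched roughly as below. This is a hypothetical helper, not something from IdentityServer or ASP.NET Core: the class name, method names, and the choice of Unix seconds as the stamp format are all assumptions, and wiring the value into the login redirect and the login page is left out.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for a "datestamp" query string parameter.
public static class LoginLinkFreshness
{
    // Produces the value to append to the login redirect (Unix time in seconds).
    public static string Stamp(DateTimeOffset now) =>
        now.ToUnixTimeSeconds().ToString(CultureInfo.InvariantCulture);

    // Returns true when the stamp is missing, malformed, or older than maxAge.
    public static bool IsStale(string stamp, DateTimeOffset now, TimeSpan maxAge)
    {
        // Digits only; null, empty, signed, or non-numeric values count as stale.
        if (!long.TryParse(stamp, NumberStyles.None, CultureInfo.InvariantCulture, out var seconds))
        {
            return true;
        }

        DateTimeOffset issued;
        try
        {
            issued = DateTimeOffset.FromUnixTimeSeconds(seconds);
        }
        catch (ArgumentOutOfRangeException)
        {
            // The number is outside the range DateTimeOffset can represent.
            return true;
        }

        return now - issued > maxAge;
    }
}
```

On the login page, a stale stamp would be the cue to send the user back to the client so it can issue a fresh authorize request. The stamp is not signed, so this is a usability check only, not a security control.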

@coffeymatt

coffeymatt commented Feb 23, 2018

@ChrisPritchard Hi Chris, you mention handling the exception for this error, do you know where the hook is to do that, is it an event you can hook into when configuring the id server middleware?

@ChrisPritchard

ChrisPritchard commented Feb 23, 2018

We ended up just using an anonymous home page to minimise the issue. However, from what I remember, there was no obvious way in the framework to catch it. So I would have probably built a specific handler for the exception in global exception handling.

@srikrsna

srikrsna commented Feb 24, 2018

Well, the way we solved it was by catching the exception using the ExceptionHandler middleware of ASP.NET Core and checking whether the request path was /sign-in-oidc (the redirect URI). If it was, we redirect to any page that requires authentication (the home page in our case, i.e. /). If it was not, we handle the exception just like any other exception.
@coffeymatt @ChrisPritchard
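
A minimal sketch of that approach, assuming ASP.NET Core 2.0 or later (for IExceptionHandlerPathFeature) and the framework's default callback path /signin-oidc. The constant name and the fallback error text are illustrative, and the callback path should be whatever the application actually configures:

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics;
using Microsoft.AspNetCore.Http;

public class Startup
{
    // Assumption: the default OpenID Connect CallbackPath. Adjust if yours differs.
    private const string CallbackPath = "/signin-oidc";

    public void Configure(IApplicationBuilder app)
    {
        // Registered first so it sees exceptions thrown by later middleware.
        app.UseExceptionHandler(errorApp =>
        {
            errorApp.Run(async context =>
            {
                // Path of the request that threw the exception.
                var feature = context.Features.Get<IExceptionHandlerPathFeature>();
                var failedPath = feature?.Path ?? string.Empty;

                if (failedPath.StartsWith(CallbackPath, StringComparison.OrdinalIgnoreCase))
                {
                    // Restart the login by sending the user to a page that requires authentication.
                    context.Response.Redirect("/");
                    return;
                }

                // Placeholder for the application's normal error handling.
                context.Response.StatusCode = StatusCodes.Status500InternalServerError;
                await context.Response.WriteAsync("An unexpected error occurred.");
            });
        });

        // Authentication, MVC and the rest of the pipeline would follow here.
    }
}
```

If the correlation cookie can never be stored for a given user, redirecting to a protected page will simply repeat the failure, so a variant that redirects to an anonymous error page may be safer.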

@coffeymatt

coffeymatt commented Feb 26, 2018

I've found a hook on the middleware to handle this error. On the authenticating client application where the openid connect middleware is configured, I've put:

options.Events.OnRemoteFailure = RemoteAuthFail;

private Task RemoteAuthFail(RemoteFailureContext context)
{
    context.Response.Redirect("/Home/AuthError");
    context.HandleResponse();
    return Task.CompletedTask;
}

I've put a friendly message on that page prompting them to not bookmark the login (as well as on the login screen).

@MogauGeeky

MogauGeeky commented Mar 13, 2018

@coffeymatt @srikrsna Combining your suggestions works perfectly.
In the OnRemoteFailure event I check for the /signin-oidc path. If it matches, I simply redirect to a secured endpoint on the client, and the client then redirects to identity server, this time with valid request params. Since the user is already logged in on identity server, they are simply redirected back to the client without needing to re-enter their credentials.

I was having this issue with recorded selenium tests and that solution solved it.
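
The combined approach might look roughly like this on ASP.NET Core 2.x. It is a sketch only: the class name and the /Home/Secure endpoint are hypothetical, and /signin-oidc is assumed to be the configured callback path.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Authentication.OpenIdConnect;

public static class OidcFailureHandling
{
    // Call this from the AddOpenIdConnect(options => ...) callback.
    public static void Configure(OpenIdConnectOptions options)
    {
        options.Events.OnRemoteFailure = context =>
        {
            // Only intercept failures that happen on the OIDC callback itself.
            if (context.Request.Path.StartsWithSegments("/signin-oidc"))
            {
                // Hypothetical secured endpoint: hitting it triggers a fresh
                // authorize request with valid state.
                context.Response.Redirect("/Home/Secure");

                // Tell the handler the response is complete so it does not rethrow.
                context.HandleResponse();
            }

            return Task.CompletedTask;
        };
    }
}
```

This relies on the user still having a session at identity server and on the retry being able to store its correlation cookie; if the cookie is lost every time, the redirect to a secured endpoint will loop.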

@sathiathirumal

sathiathirumal commented Apr 30, 2018

I am hitting this with the following combination:

  1. Browser incognito mode
  2. Application is behind Azure gateway

I am not even using IdentityServer; it is plain old Azure AD authentication (openid).
I see that Cookies->{Mysitename} has the following:

image

HANDLING IN CODE
I tried to handle the OnRemoteFailure() event as suggested above, and redirect to "/" which is a secure endpoint for me, but it causes an infinite loop of authentication from / to /signin-oidc to / to /signin-oidc... perhaps I should redirect somewhere else? Or write a specific page that will hard clear the cookie cache and then redirect to the "/" page?

WORKAROUND

  1. Users can go to the homepage again and hit F5 until this works. It seems that each F5 gets them a step further, and once the OpenID cookies are populated, everything else works (I have more auth after OpenID finishes, via adal.js for AJAX use).
  2. Bypass the application gateway and use the direct service fabric cluster DNS name (not acceptable as it is http).

DETAILS
System.Exception: Correlation failed.
at Microsoft.AspNetCore.Authentication.RemoteAuthenticationHandler`1.d__12.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.d__6.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.AspNetCore.Builder.RouterMiddleware.d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.AspNetCore.Builder.RouterMiddleware.d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.d__7.MoveNext()


@coffeymatt

coffeymatt commented May 1, 2018

It's a bit off topic for this issue @sathiathirumal, this being about identity server. You're better off asking such things on Stack Overflow. But yes, of course if authentication fails and on failure you redirect to a secure endpoint, that's going to trigger authentication again and thus the infinite loop.

@sathiathirumal

sathiathirumal commented May 1, 2018

Fair enough @coffeymatt - I will pursue this in a more appropriate forum. My point was that this seems to happen even outside of IdentityServer, so I am not sure if the issue is further upstream and generally related to running your services behind an application gateway (somehow).

@heavenwing

heavenwing commented May 29, 2018

@sathiathirumal I am hitting the same problem. Did you solve this issue?

My environment is:

  • 4-node Service Fabric cluster (IPs: 192.168.100.16-192.168.100.19)
  • 2 nodes running NLB+ARR as the load balancer; the virtual IP is 192.168.100.60, NLB1 is 192.168.100.13, NLB2 is 192.168.100.14
  • The STS web stateless service runs on the 4 SF nodes
  • The client (a management console web app) also runs on the 4 SF nodes

My problem is:

  • When I open the mc web app in Chrome at 192.168.100.60 (through NLB+ARR) and then log in with the STS, I get the error "Correlation failed".
  • When I open the mc web app in Chrome at 192.168.100.13 (through ARR only) and then log in with the STS, I get the same error.
  • When I open the mc web app in Chrome at 192.168.100.16 (accessing an SF node directly) and then log in with the STS, it works.
    The STS always runs on the 4 SF nodes behind NLB+ARR in all 3 situations above.
@sathiathirumal

sathiathirumal commented May 31, 2018

@heavenwing I have been posting on this at aspnet/Security#1755

@KevinDockx

Contributor

KevinDockx commented Jun 1, 2018

In case anyone runs into this issue: we had the exact same problem, turned out it had to do with the data protection APIs. I detailed it on my blog (https://www.kevindockx.com/solving-correlation-failed-state-property-not-found-errors-openid-connect-middleware-asp-net-core/), but to summarize: the OIDC middleware uses the data protection APIs to encrypt/decrypt state. When decryption fails, state is null, thus resulting in a Correlation failed: state not found error. In our case, decryption failed because different keys were used for encryption/decryption, a pretty common problem when deploying behind a load balancer. The solution was to use a shared key store.
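
A shared key store can be configured along these lines. This is a minimal sketch, assuming a file share that every node can read and write; the UNC path and the application name are placeholders, and other key stores work on the same principle:

```csharp
using System.IO;
using Microsoft.AspNetCore.DataProtection;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddDataProtection()
            // Placeholder path: every node behind the load balancer must point
            // at the same location so they all load the same key ring.
            .PersistKeysToFileSystem(new DirectoryInfo(@"\\fileserver\share\dp-keys"))
            // Placeholder name: must be identical on every node, otherwise the
            // nodes are isolated from each other even with shared keys.
            .SetApplicationName("my-oidc-client");
    }
}
```

With this in place, state protected on one node can be unprotected on whichever node receives the callback. Keys persisted this way may be stored unencrypted at rest unless key encryption is configured separately, so access to the share should be restricted.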

@sathiathirumal

sathiathirumal commented Jun 1, 2018

@heavenwing see aspnet/Security#1755 (comment) on how I solved this for my issue. (hint: yes it was the DataProtection layer missing and OIDC middleware being unable to decrypt cookies that were generated/encrypted on a different node)

@DwaynesWorld

DwaynesWorld commented Oct 20, 2018

For me the cause of this issue was using the default cookie policy:

public void Configure...
{
    // This will override cookie settings for OpenIdConnect
    // Nonce and Correlation Cookies included.
    app.UseCookiePolicy(); 
}
@Daniel-iel

Daniel-iel commented Nov 6, 2018

I had the same problem and I fixed it using @coffeymatt's suggestion.

options.Events.OnRemoteFailure = context =>
{
    if (context.Failure.Message.Contains("Correlation failed"))
        context.Response.Redirect("/");
    else
        context.Response.Redirect("/Error");

    context.HandleResponse();

    return Task.CompletedTask;
};
