You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
we are using IHttpClientBuilder.AddStandardHedgingHandler() from Microsoft.Extensions.DependencyInjection and we've noticed that sometimes a request is not being cancelled (e.g., it is not respecting the HttpStandardHedgingResilienceOptions.TotalRequestTimeout.Timeout), even on situations where the hedging request returns a successful response, the logical operation still waiting on the primary requests and it never returns.
Here is a log where a hedging request returned 200 but it never cancelled the primary request (Attempt 0):
Here's a log where attempt 0 and attempt 2 were cancelled, but attempt 1 was never cancelled:
>PollyTelemetry:Polly|ExecutionAttempt|Execution attempt. Source: 'HttpClient.ServiceClient-standard-hedging//StandardHedging-Hedging', Operation Key: '', Result: 'The operation was canceled.', Handled: 'False', Attempt: '0', Execution Time: '3498.7123'|System.Threading.Tasks.TaskCanceledException: The operation was canceled.
>PollyTelemetry:Polly|ExecutionAttempt|Execution attempt. Source: 'HttpClient.ServiceClient-standard-hedging//StandardHedging-Hedging', Operation Key: '', Result: 'The operation was canceled.', Handled: 'False', Attempt: '2', Execution Time: '1484.4099'|System.Threading.Tasks.TaskCanceledException: The operation was canceled.
Looking at our telemetry, we can see the 99.99th percentile latency sometimes is higher than the total timeout configured which should not be possible. For example, the HttpClient is configured to have both attempt and total timeout at 4s, but we see latency being reported above 4s:
Please advise how we can debug this further.
Reproduction Steps
Unfortunately I am not able to reproduce this with unit tests, but this is how I am configuring the HttpClient:
public static void ConfigureStandardHedgingHttpClient(
this IServiceCollection serviceCollection,
StandardHedgingHttpClientConfiguration configuration)
{
var httpClientBuilder = serviceCollection.AddHttpClient(configuration.ClientId);
configuration.Configurator.ConfigureHeadHandlers(httpClientBuilder);
configuration.Configurator.ConfigurePrimaryHttpMessageHandler(httpClientBuilder);
var hedgingHandler = httpClientBuilder
.AddStandardHedgingHandler();
hedgingHandler = hedgingHandler
.SelectPipelineByAuthority()
.Configure(options =>
{
options.Hedging.Delay = configuration.HedgingDelay;
options.Hedging.MaxHedgedAttempts = Math.Max(1, configuration.MaxHedgedAttempts - 1); // need to subtract 1 because the original request is counted as one attempt
// if set, it overrides the default ShouldHandle
if (configuration.ShouldHandleHedgingProvider != null)
{
options.Hedging.ShouldHandle = configuration.ShouldHandleHedgingProvider;
}
else
{
options.Hedging.ShouldHandle = args => HandleConnectTimeoutsAndTransientFailures(args);
}
options.Endpoint.Timeout.Timeout = configuration.AttemptTimeout;
options.Endpoint.Timeout.TimeoutGenerator = configuration.AttemptTimeoutGenerator;
// if set, it takes precedence over HedgingDelay
options.Hedging.DelayGenerator = configuration.HedgingDelayGenerator;
// circuit breaker configuration
options.Endpoint.CircuitBreaker.SamplingDuration = configuration.SamplingDuration;
options.Endpoint.CircuitBreaker.BreakDuration = configuration.BreakDuration;
options.Endpoint.CircuitBreaker.FailureRatio = configuration.FailureThreshold;
options.Endpoint.CircuitBreaker.MinimumThroughput = configuration.MinimumThroughput;
if (configuration.ShouldHandleCircuitBreakerProvider != null)
{
options.Endpoint.CircuitBreaker.ShouldHandle = configuration.ShouldHandleCircuitBreakerProvider;
}
if (!configuration.IsCircuitBreakerEnabled)
{
options.Endpoint.CircuitBreaker.ShouldHandle = (outcome) => new ValueTask<bool>(false);
}
options.TotalRequestTimeout.Timeout = configuration.TotalRequestTimeout;
options.TotalRequestTimeout.TimeoutGenerator = configuration.TotalTimeoutGenerator;
});
configuration.Configurator.ConfigureOrderedRoutingStrategy(hedgingHandler.RoutingStrategyBuilder);
configuration.Configurator.ConfigureTailHandlers(httpClientBuilder);
AddChaosInjectors(httpClientBuilder, configuration.ChaosSettings);
#pragma warning disable CS0618 // Type or member is obsolete
RedactionExtensions.AddRedaction(httpClientBuilder
.Services
.AddResilienceEnricher());
#pragma warning restore CS0618 // Type or member is obsolete
}
Expected behavior
All requests should be canceled at the configured TotalRequestTimeout
HttpClient should cancel remaining requests when a request returns successfully.
Actual behavior
Some requests are cancelled with a higher timeout than the one configured.
Some requests are never cancelled when timeout reached.
Pending requests are never cancelled when one of them returns successfully.
Regression?
No, this is the only build we have used, so probably it was always like that.
Known Workarounds
No known workarounds
Configuration
Microsoft.Extensions version is 8.0.0
Server is running on Windows Server containers 2019
.NET version is 8.0
Other information
No response
The text was updated successfully, but these errors were encountered:
Description
we are using
IHttpClientBuilder.AddStandardHedgingHandler()
fromMicrosoft.Extensions.DependencyInjection
and we've noticed that sometimes a request is not being cancelled (e.g., it is not respecting the HttpStandardHedgingResilienceOptions.TotalRequestTimeout.Timeout), even on situations where the hedging request returns a successful response, the logical operation still waiting on the primary requests and it never returns.Here is a log where a hedging request returned 200 but it never cancelled the primary request (Attempt 0):
Here's a log where attempt 0 and attempt 2 were cancelled, but attempt 1 was never cancelled:
Looking at our telemetry, we can see the 99.99th percentile latency sometimes is higher than the total timeout configured which should not be possible. For example, the HttpClient is configured to have both attempt and total timeout at 4s, but we see latency being reported above 4s:
Please advise how we can debug this further.
Reproduction Steps
Unfortunately I am not able to reproduce this with unit tests, but this is how I am configuring the HttpClient:
Expected behavior
Actual behavior
Regression?
No, this is the only build we have used, so probably it was always like that.
Known Workarounds
No known workarounds
Configuration
Other information
No response
The text was updated successfully, but these errors were encountered: