[2.2.x] Port Drop messages in azure loggers if queue is full #592
Conversation
Approved for 2.2.2
-    await IntervalAsync(_interval, _cancellationTokenSource.Token);
+    else
+    {
+        await IntervalAsync(_interval, _cancellationTokenSource.Token);
I'm not sure moving this into an else is worth it in a patch. It doesn't seem necessary for dropping messages, and it has the potential to make ProcessLogQueue more CPU hungry. Unless we have some benchmark or stress scenario that shows a major benefit.
It allows dropping fewer messages by unloading them faster. Without this change, and with no backpressure, a lot of messages might be lost even if the app is not overloaded.
Imagine a scenario where you have an app that consistently logs exactly one message every 1ms. Also imagine that WriteMessagesAsync consistently takes 2 ms. For simplicity's sake, assume that the size of the batch has a negligible impact on WriteMessagesAsync's execution time.
Before this change, 1000 messages would be logged per batch once every second. After this change, 2 messages would be logged per batch 500 times every second.
Depending on the implementation of WriteMessagesAsync, calling it 500 times as often as before could be pretty bad, and I'm not sure that's something we'd want to push in a patch release. Not without measuring the performance in a few simple scenarios at least.
I understand wanting to err on the side of dropping fewer messages, but this change will always risk dropping messages where messages weren't dropped before. If your app ever logs more than 1000 messages a second, you'll need to increase your BackgroundQueueSize either way, because a burst of logs could come right at the start of await IntervalAsync(...).
If we're worried that WriteMessagesAsync itself might take a significant portion of the _interval, we could subtract the time spent in WriteMessagesAsync from the next Task.Delay.
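A rough sketch of that idea, assuming the existing _interval, _messageQueue, _currentBatch, WriteMessagesAsync, and _cancellationTokenSource members of the batching provider plus System.Diagnostics.Stopwatch (illustrative only, not the code in this PR):

    // Sketch: subtract the time already spent flushing from the next delay so the
    // effective cadence stays close to _interval without spinning on every message.
    private async Task ProcessLogQueue()
    {
        while (!_cancellationTokenSource.IsCancellationRequested)
        {
            var flushTimer = Stopwatch.StartNew();

            // Drain whatever is currently queued into the batch.
            while (_messageQueue.TryTake(out var message))
            {
                _currentBatch.Add(message);
            }

            if (_currentBatch.Count > 0)
            {
                await WriteMessagesAsync(_currentBatch, _cancellationTokenSource.Token);
                _currentBatch.Clear();
            }

            // Only sleep for whatever is left of the interval after writing.
            var remaining = _interval - flushTimer.Elapsed;
            if (remaining > TimeSpan.Zero)
            {
                await Task.Delay(remaining, _cancellationTokenSource.Token);
            }
        }
    }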
Please wait until https://github.com/aspnet/AspNetCore-Internal/issues/1610 is resolved before merging these changes.
This PR is mostly ready to go (modulo @halter73's comments). Before merging, please update the 2.2.2 section of eng/PatchConfig.props: https://github.com/aspnet/Extensions/blob/d8407116d183debaadb801c0cbc41d65927999d6/eng/PatchConfig.props#L7-L10
66a1caa to 414a72f
{
    _currentBatch.Add(new LogMessage()
    {
        Message = $"{messagesDropped} message(s) dropped because of queue size limit. Increase the queue size or decrease logging verbosity to avoid this.{Environment.NewLine}",
What do you think about referencing BatchingLoggerOptions in this message or including a URL for the docs?
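For example, the notice could name the option directly (hypothetical wording, not the message in the PR):

    // Hypothetical drop notice that points at the relevant option.
    Message = $"{messagesDropped} message(s) dropped because of the queue size limit. " +
              $"Increase BatchingLoggerOptions.BackgroundQueueSize or decrease logging verbosity to avoid this.{Environment.NewLine}",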
try
{
-    _messageQueue.Add(new LogMessage { Message = message, Timestamp = timestamp }, _cancellationTokenSource.Token);
+    if (!_messageQueue.TryAdd(new LogMessage { Message = message, Timestamp = timestamp }, millisecondsTimeout: 0, cancellationToken: _cancellationTokenSource.Token))
Is the millisecondsTimeout: 0 needed for this change?
It's not an optional parameter in this overload, and I need to pass _cancellationTokenSource.
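Put together, the non-blocking enqueue path ends up looking roughly like this (a sketch; field names such as _messagesDropped are illustrative, only the TryAdd call above is from the diff):

    // Sketch of the enqueue path. TryAdd with millisecondsTimeout: 0 never blocks;
    // that overload is used because it is the one that also accepts a CancellationToken.
    internal void AddMessage(DateTimeOffset timestamp, string message)
    {
        if (!_messageQueue.IsAddingCompleted)
        {
            try
            {
                if (!_messageQueue.TryAdd(
                        new LogMessage { Message = message, Timestamp = timestamp },
                        millisecondsTimeout: 0,
                        cancellationToken: _cancellationTokenSource.Token))
                {
                    // Queue is full: count the drop instead of blocking the caller.
                    Interlocked.Increment(ref _messagesDropped);
                }
            }
            catch
            {
                // Cancellation during shutdown: the message is simply dropped.
            }
        }
    }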
</PropertyGroup>
<PropertyGroup Condition=" '$(VersionPrefix)' == '2.2.2' ">
  <PackagesInPatch>
    Microsoft.Extensions.Logging.AzureAppServices
👍
Port of #494 to 2.2
cc @muratg
Description:
Today, when the Azure AppServices Log Stream logger queue becomes full, we start blocking in calls to logger.Log that often happen on thread pool threads, causing thread pool starvation. Making things worse, the process of writing messages uses HttpClient and queues thread pool work items itself, making starvation almost non-recoverable.
Customer Impact:
If customers are using the Azure AppServices logger and get a spike of logging messages on multiple ThreadPool threads, their app might effectively "hang" and stop serving requests.
Regression?
No
Risk:
Low. The fix is limited to the case where the queue is full, and the logic is just to drop messages.
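As a usage note, apps that expect logging bursts can give the background queue more headroom when enabling the logger; a minimal sketch assuming the 2.2-era AddAzureWebAppDiagnostics and AzureFileLoggerOptions APIs (verify option names against the package version you use):

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddLogging(logging => logging.AddAzureWebAppDiagnostics());

        // BackgroundQueueSize comes from BatchingLoggerOptions; a larger queue means
        // bursts are buffered longer before messages start getting dropped.
        services.Configure<AzureFileLoggerOptions>(options =>
        {
            options.BackgroundQueueSize = 10000;
        });
    }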