
Conversation

@pakrym pakrym commented Dec 6, 2018

Port of #494 to 2.2

cc @muratg

Description:

Today, when the Azure AppServices Log Stream logger queue becomes full, we start blocking in calls to logger.Log, which often happen on thread pool threads, causing thread pool starvation. Making things worse, the process of writing messages uses HttpClient and queues thread pool work items itself, making the starvation almost non-recoverable.
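For context, a minimal sketch of the behavior being changed, using a stand-alone BlockingCollection rather than the actual BatchingLoggerProvider source; the class and field names below are illustrative:

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Sketch only: a bounded BlockingCollection.Add blocks the caller once the queue is
// full (the thread pool starvation described above), while TryAdd with a zero
// timeout returns immediately so the message can be dropped and counted instead.
class DropOnFullQueueSketch
{
    private readonly BlockingCollection<string> _messageQueue =
        new BlockingCollection<string>(boundedCapacity: 1000);
    private int _messagesDropped;

    public void Enqueue(string message)
    {
        // Before the fix: blocks the calling thread when the queue is full.
        // _messageQueue.Add(message);

        // After the fix: never blocks; drop the message and keep a count.
        if (!_messageQueue.TryAdd(message, millisecondsTimeout: 0))
        {
            Interlocked.Increment(ref _messagesDropped);
        }
    }
}
```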

Customer Impact

If customers are using the Azure AppServices logger and get a spike of logging messages on multiple ThreadPool threads, their app might effectively "hang" and stop serving requests.

Regression?

No

Risk:

Low. The fix is limited to the case where the queue is full, and the logic is just to drop messages.

@pakrym added the servicing-consider (Shiproom approval is required for the issue) and area-logging labels Dec 6, 2018
@muratg muratg added this to the 2.2.2 milestone Dec 6, 2018
@vivmishra added the servicing-approved label and removed the servicing-consider (Shiproom approval is required for the issue) label Dec 13, 2018
@vivmishra

Approved for 2.2.2

@pakrym pakrym requested a review from halter73 December 17, 2018 17:18
-    await IntervalAsync(_interval, _cancellationTokenSource.Token);
+    else
+    {
+        await IntervalAsync(_interval, _cancellationTokenSource.Token);
Member

I'm not sure moving this into an else is worth it in a patch. It doesn't seem necessary for dropping messages, and it has the potential to make ProcessLogQueue more CPU-hungry, unless we have some benchmark or stress scenario that shows a major benefit.

Author

It allows dropping fewer messages by unloading them faster. Without this, and with no backpressure, a lot of messages might be lost even if the app is not overloaded.

Member

Imagine a scenario where you have an app that consistently logs exactly one message every 1ms. Also imagine that WriteMessagesAsync consistently takes 2 ms. For simplicity's sake, assume that the size of the batch has a negligible impact on WriteMessagesAsync's execution time.

Before this change, 1000 messages would be logged per batch once every second. After this change, 2 messages would be logged per batch 500 times every second.

Depending on the implementation of WriteMessagesAsync, calling it 500 times more often than before could be pretty bad, and not something I'd want to push in a patch release without at least measuring the performance in a few simple scenarios.

I understand wanting to err on the side of dropping fewer messages, but this change will always risk dropping messages where messages weren't dropped before. If your app ever logs more than 1000 messages a second, you'll need to increase your BackgroundQueueSize, because either way a burst of logs could come at the start of await IntervalAsync(....

If we're worried that WriteMessagesAsync itself might take a significant portion of the _interval, we could subtract the time spent in WriteMessagesAsync from the next Task.Delay.
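A rough sketch of that suggestion, assuming the general shape of the ProcessLogQueue loop; the delegate parameters stand in for the provider's real WriteMessagesAsync and queue-draining logic, so this is illustrative rather than the merged code:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical loop that charges the time spent writing a batch against the next
// delay, so a slow write does not stretch the effective interval.
static async Task ProcessLogQueueSketchAsync(
    Func<IReadOnlyList<string>, CancellationToken, Task> writeMessagesAsync,
    Func<List<string>> drainBatch,
    TimeSpan interval,
    CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();
    while (!cancellationToken.IsCancellationRequested)
    {
        var currentBatch = drainBatch(); // stand-in for draining _messageQueue

        stopwatch.Restart();
        await writeMessagesAsync(currentBatch, cancellationToken);
        stopwatch.Stop();

        // Wait only for whatever is left of the interval after the write.
        var remaining = interval - stopwatch.Elapsed;
        if (remaining > TimeSpan.Zero)
        {
            await Task.Delay(remaining, cancellationToken);
        }
    }
}
```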

@natemcmaster

Please wait until https://github.com/aspnet/AspNetCore-Internal/issues/1610 is resolved to merge changes.

@dougbu dougbu commented Jan 11, 2019

This PR is mostly ready to go (modulo @halter73's comments). Before merging, please update the 2.2.2 section of eng/PatchConfig.props: https://github.com/aspnet/Extensions/blob/d8407116d183debaadb801c0cbc41d65927999d6/eng/PatchConfig.props#L7-L10

@pakrym force-pushed the pakrym/22-drop-messages branch from 66a1caa to 414a72f on January 14, 2019 21:19
@pakrym pakrym (Author) commented Jan 14, 2019

🆙📅

@halter73 - I've reverted the changes in question
@dougbu - I've added PackagesInPatch

{
    _currentBatch.Add(new LogMessage()
    {
        Message = $"{messagesDropped} message(s) dropped because of queue size limit. Increase the queue size or decrease logging verbosity to avoid this.{Environment.NewLine}",
Member

What do you think about referencing BatchingLoggerOptions in this message or including a URL for the docs?
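For example, one possible wording, purely illustrative (the BackgroundQueueSize option comes up elsewhere in this thread, but this exact text and the stub options type below are not what was merged):

```csharp
using System;

// Hypothetical rewording of the dropped-messages notice that names the option;
// a stub BatchingLoggerOptions is declared only so the snippet compiles on its own.
int messagesDropped = 5;
string notice =
    $"{messagesDropped} message(s) dropped because of queue size limit. " +
    $"Increase {nameof(BatchingLoggerOptions)}.{nameof(BatchingLoggerOptions.BackgroundQueueSize)} " +
    $"or decrease logging verbosity to avoid this.{Environment.NewLine}";
Console.WriteLine(notice);

class BatchingLoggerOptions
{
    // Stand-in for the real option that controls the queue's bounded capacity.
    public int? BackgroundQueueSize { get; set; }
}
```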

try
{
-    _messageQueue.Add(new LogMessage { Message = message, Timestamp = timestamp }, _cancellationTokenSource.Token);
+    if (!_messageQueue.TryAdd(new LogMessage { Message = message, Timestamp = timestamp }, millisecondsTimeout: 0, cancellationToken: _cancellationTokenSource.Token))
Member

Is the millisecondsTimeout: 0 needed for this change?

@pakrym pakrym (Author) Jan 14, 2019

  </PropertyGroup>
  <PropertyGroup Condition=" '$(VersionPrefix)' == '2.2.2' ">
    <PackagesInPatch>
      Microsoft.Extensions.Logging.AzureAppServices

👍

@pakrym pakrym merged commit 9b0bb18 into release/2.2 Jan 15, 2019
@JunTaoLuo JunTaoLuo deleted the pakrym/22-drop-messages branch February 2, 2019 02:36
@ghost ghost locked as resolved and limited conversation to collaborators May 29, 2023