Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitizing stack traces to improve readability #1051

Merged
merged 1 commit into from Mar 9, 2017

Conversation

fabiocav
Copy link
Member

@fabiocav fabiocav commented Mar 9, 2017

There are two goals with this PR:

Improve readability of exception messages/stack traces

This is particularly problematic for Azure Functions users. With this changes, we are sanitizing the exception stack trace prior to logging to:

  1. Remove async state machine frames (this causes a significant amount of noise)
  2. Convert the compiler generated state machine code to the appropriate async method name and marking it with async
  3. Show async method arguments when possible
  4. Properly formatting AggregateExceptions and the associated inner exceptions

Here's an example of the expected output with the same exception:
Before formatting:

System.AggregateException: One or more errors occurred. ---> System.Exception: Crash!
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.<CrashAsync>d__4.MoveNext() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 117
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.<Run2Async>d__3.MoveNext() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 111
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.<Run1Async>d__2.MoveNext() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 106
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.Run(String arg) in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 96
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.Run() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 89
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.FormatException_RemovesAsyncFrames() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 20
---> (Inner Exception #0) System.Exception: Crash!
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.<CrashAsync>d__4.MoveNext() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 117
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.<Run2Async>d__3.MoveNext() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 111
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.<Run1Async>d__2.MoveNext() in D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs:line 106<---

After formatting:

System.AggregateException : One or more errors occurred. ---> Crash!
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout,CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.Run(String arg) at D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs : 96
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.Run() at D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs : 89
   at Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.FormatException_RemovesAsyncFrames() at D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs : 20
---> (Inner Exception #0) System.Exception : Crash!
   at async Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.CrashAsync() at D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs : 117
   at async Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.Run2Async() at D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs : 111
   at async Microsoft.Azure.WebJobs.Host.UnitTests.ExceptionFormatterTests.TestClass.Run1Async() at D:\src\gh.fabiocav\awj\azure-webjobs-sdk\test\Microsoft.Azure.WebJobs.Host.UnitTests\ExceptionFormatterTests.cs : 106<---

Avoid losing important stack information in logs

Changed the log truncation strategy to preserve the top of the stack as, in most customer cases, the top frames contain the relevant information and exception details. This issue is also mitigated by the change to sanitize messages as much less information is lost due to truncation.

@@ -1,70 +0,0 @@
// Copyright (c) .NET Foundation. All rights reserved.
// Licensed under the MIT License. See License.txt in the project root for license information.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consolidated this into the other ExceptionExtensions in the Extensions folder.


if (additionalDetails == null || additionalDetails.Count == 0)
{
return ExceptionFormatter.GetFormattedException(exception);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now uses the GetFormattedException method as opposed to just calling ToString

else
{
StringBuilder builder = new StringBuilder();
builder.AppendLine(ExceptionFormatter.GetFormattedException(exception));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@@ -124,7 +124,7 @@ public Task SaveAndCloseAsync(FunctionInstanceLogEntry item, CancellationToken c
{
// 0123456789
// abcdefghij
str = "..." + str.Substring(str.Length - (FunctionInstanceLogEntry.MaxLogOutputLength - 3));
str = str.Substring(0, FunctionInstanceLogEntry.MaxLogOutputLength - 1) + "…";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor change, but moved from using ... to using

return GetStackForAggregateException(exception, aggregate);
}

return GetStackForException(exception, includeMessageOnly: false);
Copy link
Member

@mathewc mathewc Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all the grovelling through stack frames, etc. accessing members, etc. I'm wondering if in strange cases (e.g. generated/emittd code, etc.) what the possibilities are for some info to not be there (e.g. null refs) causing things to blow up. Wondering if we shouldn't have a try/catch surrounding all this that defaults to the full stack trace if for some reason our sanitization logic blows up?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

{
string formattedException = ExceptionFormatter.GetFormattedException(exc);

Assert.False(Regex.Match(formattedException, "d__.\\.MoveNext()").Success);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would help see what is actually going on if these tests tested a full string literal, so we can see the true formatting, and it would also help catch errors better. I realize though that line numbers might change over time, making this hard. Perhaps if the test class were in its own file, so line numbers don't jump around often? or perhaps if a regex is used to match any line numbers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have some additional tests validating against literal strings for expected traces.

{
if (string.Equals(arg, "Test1"))
{
Run1Async().Wait();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have some non async tests too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@mamaso
Copy link
Contributor

mamaso commented Mar 9, 2017

Massive improvement, great work!

@dsyme
Copy link
Member

dsyme commented Mar 10, 2017

@fabiocav Nice work! Does it applies to F# too? Might be good to add a smoke test for that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants