Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Exception Messages #4402

Closed
BrunoJuchli opened this issue Jul 30, 2015 · 33 comments
Closed

Improve Exception Messages #4402

BrunoJuchli opened this issue Jul 30, 2015 · 33 comments
Labels
area-Diagnostics-coreclr design-discussion Ongoing discussion about design without consensus enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@BrunoJuchli
Copy link
Contributor

There's several places in the .net framework where exception messages can be vastly improved by providing more or the correct information. Vastly improved? Yes. It's that cases where we need to fire up a debugger to find out what exactly has gone wrong. Now imagine the exception doesn't occur in your code but a 3rd party library. Or even worse, it occurs when you execute some executable you aren't even developing. It requires more and more work to acquire the necessary source, build tools etc. to actually create a debug build and then debug it.
The good news is, that the need for doing so could often be eliminated by providing a better exception message.

Let me pick an example

System.Security.Util.StringExpressionSet.CanonicalizePath() NotSupportedException with generic message The given path's format is not supported.. The actual trigger for the exception is a path containing a colon (:) at any index > 1 (so C: is ok, :: is okay, but C: is not ok).

Now let's pick a tool everyone knows. Nuget.exe.
When you run nuget.exe pack foo.nuspec and the nuspec file contains an invalid path - what will happen?
The console will show

Argument_PathFormatNotSupported

Let's configure nuget to be more verbose by adding -Verbosity detailed. The console will now show:

System.NotSupportedException: The given path's format is not supported.
at System.Security.Util.StringExpressionSet.CanonicalizePath(String path, Boolean needFullPath)
(...)
at NuGet.Program.Main(String[] args)

(yes i've removed part of the stack trace ;-) ).
This is still not very helpful. Of course, at this point i don't know whether the actual issue was caused by me writing an invalid .nuspec or by a bug in nuget. If the exception message would state the offending path i could check if it's one that i wrote like this in my *.nuspec. In most cases it'll just be that the user made a simple mistake. In all of these cases, an improved exception message would improve the user experience, as it makes it easier to find the actual mistake (there could be a lot of paths in the nuspec file).

Now that i made my case, i have a few questions for the community:

How should exception messages be improved?

Alternatives (combinations thereof are valid, too)

  • make the exception message more verbose
    • means there would be a lot more exception messages
    • if these are localized, this could mean substantial work
  • create custom exceptions which contain additional properties
    • should inherit from standard ones in order to not alter the interface too much
    • still a breaking change for checks like exception.GetType() == typeof(NotSupportedException (granted, checks as these should not occur too often as usually there exist better alternatives)
    • doesn't require localization as no localizable values are involved.
    • however, would require a "consumer" of the exception to specifically evalute the property. This would diminish the benefit immensely.

Which exception messages should be improved?

I'd suggest one should start with the ones which are easy to improve. Example for an exception which isn't easy to improve: NullReferenceException - see #3858

What work is involved?

  • Since exception messages (resources) are currently shared across multiple usages (example Argument_PathFormatNotSupported, when creating new, more specific messages, the old inspecific ones might become superfluous. So we'd need to track if we can remove any of these.
  • Specifically in regards to localization: Is there any? What's necessary to change an exception message?

Will Pull-Requests be accepted?

As this kind of issue is not on the priority list, the question is, will any work put into this bear fruits?

@BrunoJuchli
Copy link
Contributor Author

Follow up: Also, the question is if strategically these issues should be pursued.
A file-path string is a good(*) example of something which should be encapsulated in a domain entity type and not passed around as string. Doing so would limit the necessity to check for validity of the path to the point where we convert a string to a Path. Subsequent usages of it don't need to check for it's validity.
However i would also like to note, that the unsupported/faulty paths are still possible, so these exceptions could still occur on conversion. And, these messages should still be verbose.

(*) actually i'm not so sure if it's a perfect example, since on unix and windows we have different path formats. It might be somewhat difficult to determine what kind of constraints to apply to both (all) and when to test/enforce them.


List of exception messages to improve:

@shahid-pk
Copy link
Contributor

cc @stephentoub

@DickvdBrink
Copy link
Contributor

Not really sure if this is a valid concern but when an exception leaks through to the end user (because of missing catch; program error) you might give away more info which might not be desired.

For example you might give away a local path on a webserver. Again, might not be the best example but sometimes I'm a bit paranoid ;)

That being said, your suggestion would definitely help with detecting where the error lies :)

@ayende
Copy link
Contributor

ayende commented Jul 30, 2015

Same issue, KeyNotFoundException.
When we are looking for a foo[key] in a dictionary, and it isn't there,
this is thrown.
But because the key value isn't specific, we don't know which key that is.
In many cases, just knowing the key is enough to resolve the issue.

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Thu, Jul 30, 2015 at 10:03 AM, Bruno Juchli notifications@github.com
wrote:

There's several places in the .net framework where exception messages can
be vastly improved by providing more or the correct information. Vastly
improved? Yes. It's that cases where we need to fire up a debugger to find
out what exactly has gone wrong. Now imagine the exception doesn't occur in
your code but a 3rd party library. Or even worse, it occurs when you
execute some executable you aren't even developing. It requires more and
more work to acquire the necessary source, build tools etc. to actually
create a debug build and then debug it.
The good news is, that the need for doing so could often be eliminated by
providing a better exception message.
Let me pick an example

System.Security.Util.StringExpressionSet.CanonicalizePath()
https://github.com/dotnet/coreclr/blob/4cf8a6b082d9bb1789facd996d8265d3908757b2/src/mscorlib/src/System/Security/Util/StringExpressionSet.cs#L751
NotSupportedException with generic message The given path's format is not
supported.
https://github.com/dotnet/coreclr/blob/9501a4fac546d188c9459e79d1b77c3ca0442e49/src/mscorlib/src/mscorlib.txt#L505.
The actual trigger
https://github.com/dotnet/coreclr/blob/4cf8a6b082d9bb1789facd996d8265d3908757b2/src/mscorlib/src/System/Security/Util/StringExpressionSet.cs#L750
for the exception is a path containing a colon (:) at any index > 1 (so C:
is ok, :: is okay, but C: is not ok).

Now let's pick a tool everyone knows. Nuget.exe.
When you run nuget.exe pack foo.nuspec and the nuspec file contains an
invalid path - what will happen?
The console will show

Argument_PathFormatNotSupported

Let's configure nuget to be more verbose by adding -Verbosity detailed.
The console will now show:

System.NotSupportedException: The given path's format is not supported.
at System.Security.Util.StringExpressionSet.CanonicalizePath(String path,
Boolean needFullPath)
(...)
at NuGet.Program.Main(String[] args)

(yes i've removed part of the stack trace ;-) ).
This is still not very helpful. Of course, at this point i don't know
whether the actual issue was caused by me writing an invalid .nuspec or by
a bug in nuget. If the exception message would state the offending path i
could check if it's one that i wrote like this in my *.nuspec. In most
cases it'll just be that the user made a simple mistake. In all of these
cases, an improved exception message would improve the user experience, as
it makes it easier to find the actual mistake (there could be a lot of
paths in the nuspec file).

Now that i made my case, i have a few questions for the community:
How should exception messages be improved?

Alternatives (combinations thereof are valid, too)

  • make the exception message more verbose
    • means there would be a lot more exception messages
    • if these are localized, this could mean substantial work
      • create custom exceptions which contain additional properties
    • should inherit from standard ones in order to not alter the
      interface too much
    • still a breaking change for checks like exception.GetType() ==
      typeof(NotSupportedException (granted, checks as these should not
      occur too often as usually there exist better alternatives)
    • doesn't require localization as no localizable values are
      involved.
    • however, would require a "consumer" of the exception to
      specifically evalute the property. This would diminish the benefit
      immensely.

Which exception messages should be improved?

I'd suggest one should start with the ones which are easy to improve.
Example for something which isn't an easy fix: NullReferenceException
https://github.com/dotnet/coreclr/issues/25.
What work is involved?

  • Since exception messages (resources) are currently shared across
    multiple usages (example Argument_PathFormatNotSupported
    https://github.com/dotnet/coreclr/search?utf8=%E2%9C%93&q=Argument_PathFormatNotSupported,
    when creating new, more specific messages, the old inspecific ones might
    become superfluous. So we'd need to track if we can remove any of these.
  • Specifically in regards to localization: Is there any? What's
    necessary to change an exception message?

Will Pull-Requests be accepted?

As this kind of issue is not on the priority list, the question is, will
any work put into this bear fruits?


Reply to this email directly or view it on GitHub
https://github.com/dotnet/coreclr/issues/1311.

@mariusmg
Copy link

Shouldn't the developer be able to infer the problem from the stacktrace ? I mean the last thing you want is bubbling up very explicit details to the user interface/user.

@ayende
Copy link
Contributor

ayende commented Jul 30, 2015

It is often not possible to do that from the stack trace.
Imagine a config file with key/values - which is held as dictionary.
And an error during app startup because of missing configuration value.
Even if you have the exact line (often in prod you don't have PDBs), you
sometimes can't tell what the value is from the stack trace.

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Thu, Jul 30, 2015 at 1:07 PM, Marius Gheorghe notifications@github.com
wrote:

Shouldn't the developer be able to infer the problem from the stacktrace ?
I mean the last thing you want is bubbling up very explicit details to the
user interface/user.


Reply to this email directly or view it on GitHub
https://github.com/dotnet/coreclr/issues/1311#issuecomment-126259487.

@ayende
Copy link
Contributor

ayende commented Jul 30, 2015

And I think that the security concerns are not relevant. If this worries
you, you don't show that to the user.

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Thu, Jul 30, 2015 at 1:12 PM, Oren Eini (Ayende Rahien) <
ayende@ayende.com> wrote:

It is often not possible to do that from the stack trace.
Imagine a config file with key/values - which is held as dictionary.
And an error during app startup because of missing configuration value.
Even if you have the exact line (often in prod you don't have PDBs), you
sometimes can't tell what the value is from the stack trace.

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Thu, Jul 30, 2015 at 1:07 PM, Marius Gheorghe <notifications@github.com

wrote:

Shouldn't the developer be able to infer the problem from the stacktrace
? I mean the last thing you want is bubbling up very explicit details to
the user interface/user.


Reply to this email directly or view it on GitHub
https://github.com/dotnet/coreclr/issues/1311#issuecomment-126259487.

@OtherCrashOverride
Copy link

I think the feature desired is actually core dump support for CoreCLR.
https://en.wikipedia.org/wiki/Core_dump

With it you can get back the core dump from your remote/production system and load it into your debugger and see the exact state of things when it crashed.

@ayende
Copy link
Contributor

ayende commented Jul 30, 2015

That would be cool, sure, but it isn't valid.
you can't do that all the time, only if you crashed. And by the time you
actually crashed, the stack has unwinded, etc.
And adding exception details would be better than whole new feature.
Note that it is actually pretty easy to implement this feature by forcing a
minidump. But that isn't something that you can do easily or nicely in
production.

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Thu, Jul 30, 2015 at 2:00 PM, OtherCrashOverride <
notifications@github.com> wrote:

I think the feature desired is actually core dump support for CoreCLR.
https://en.wikipedia.org/wiki/Core_dump

With it you can get back the core dump from your remote/production system
and load it into your debugger and see the exact state of things when it
crashed.


Reply to this email directly or view it on GitHub
https://github.com/dotnet/coreclr/issues/1311#issuecomment-126273635.

@OtherCrashOverride
Copy link

I am saying there should be an option for the unhandled exception handler to generate a core dump and support in the debugging tools to use it. At unhandled exception time, the problem state should still be on the stack.

[Edit]
Additionally, the object tracking the GC does should possibly be able to selectively pull data rather than having to dump the entire heap.

@akoeplinger
Copy link
Member

@ayende I fixed a similar thing a while back: dotnet/corefx#185. The security point came up there as well, but the conclusion was it doesn't apply to .NET Core.

@BrunoJuchli
Copy link
Contributor Author

@akoeplinger
Thanks. Link to specific post about the security concern not appyling to .NET Core


My personal take: I think that in case leaking information is a security concern one should not leak the exception or stack trace at all. So this is a higher-level problem and concern than the design of exception / exception messages. Adding this kind of security through obscurity to such a low layer in a fixed manner robs us of any chance of re-configuring it to function more "debug friendly".

Also, while core-dumps are a great thing (especially because they're kind of an universal solution), they also require more tools/work. Especially for novice programmers this means an unnecessary high threshold. Also for expert programmers this is often problematic (debugging remote machines w/o internet access, not having the necessary debugging tools with you,...).

to conclude, i think that @OtherCrashOverride 's suggestion should be a separate issue / feature request ;-)

@OtherCrashOverride
Copy link

I think so too. In fact, I wanted to open it as one because I think the functionality would be really beneficial for many CoreCLR scenarios.

@BrunoJuchli
Copy link
Contributor Author

ok so i'll leave this open for a while to see if there's some more feedback. As dotnet/coreclr#185 shows similar PRs have been accepted.

Can we also discuss some guidelines for exception messages? Any ideas?
Some suggestions / topics for guidelines:

  • use CultureInfo.CurrentCulture for String.Format (as has been done here)
  • add a unit test
    • it's not necessary to add unit tests for behavior in case .ToString() returns null, string.Empty, or throws (as here.
  • Where should new strings be placed? Some are put in mscorlib.txt and some are put into resx files Example

How about this:
After a while I'll close this issue and open a new one referencing this. I'll add a task-list with "all" of the exception messages which could be improved. I'll also put in the guidelines / pointers. From that point on I'll start improving the exception messages one by one (separate PR each). Of course, at that point anyone else could pitch in just as well :)

@OtherCrashOverride
Copy link

I would suggest creating a feature tag for the new enhanced message. The changes can then be #ifdef FEATURE_XYZ wrapped so that the old behavior can be preserved for those with concerns.

@BrunoJuchli
Copy link
Contributor Author

That would mean quite a lot of upfront extra and some future work.
As discussed here security does not seem to be an issue here (or just a minor one), so i don't think it would be worth the added complexity.

@masonwheeler
Copy link
Contributor

Out of curiosity, why do you say that NREs are not easy to fix? I generally think of them as the easiest error to fix, because once you reproduce it under the debugger, about 95% of the time it's immediately obvious what's going wrong just by looking at the code and the locals.

@ayende
Copy link
Contributor

ayende commented Jul 31, 2015

@masonwheeler

Assume that you can't reproduce this in the debugger.
Let us assume that you even have a line number, but the code is:

foo.Bar.Baz.ToString().ToLower()

Any of which might be null.

That can cause it to be pretty hard to use.

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Fri, Jul 31, 2015 at 5:43 PM, masonwheeler notifications@github.com
wrote:

Out of curiosity, why do you say that NREs are not easy to fix? I
generally think of them as the easiest error to fix, because once you
reproduce it under the debugger, about 95% of the time it's immediately
obvious what's going wrong just by looking at the code and the locals.


Reply to this email directly or view it on GitHub
https://github.com/dotnet/coreclr/issues/1311#issuecomment-126711594.

@OtherCrashOverride
Copy link

That would mean quite a lot of upfront extra and some future work.

It should not really be that much extra effort:

#ifdef FEATURE_ENHANCED_EXCEPTION_MESSAGES  
    // New code written here
    throw new ArgumentException(String.Format(Strings.DuplicateKey, key));
#else
   // This code already exists
   throw new ArgumentException(Strings.DuplicateKey);
#endif

@RichiCoder1
Copy link

That would mean quite a lot of upfront extra and some future work.

It should not really be that much extra effort:

#ifdef FEATURE_ENHANCED_EXCEPTION_MESSAGES  
    // New code written here
    throw new ArgumentException(String.Format(Strings.DuplicateKey, key));
#else
   // This code already exists
   throw new ArgumentException(Strings.DuplicateKey);
#endif

I'd be against doing this as it'd require anyone interested in using the "old" verison to recompile coreclr themselves. Even putting these behind an AppContext-esque feature switch would result in a lot of code bloat and duplicate effort for the Core developers. For a subtle example of this, your code would actually be more like:

#ifdef FEATURE_ENHANCED_EXCEPTION_MESSAGES  
    // New code written here
    throw new ArgumentException(String.Format(Strings.DuplicateKeyEx, key));
#else
   // This code already exists
   throw new ArgumentException(Strings.DuplicateKey);
#endif

@masonwheeler
Copy link
Contributor

@ayende: If for some reason it wouldn't reproduce under the debugger--which usually implies the sort of memory corruption that managed code is theoretically supposed to make impossible, but anyway--it would still be trivial to fix if I could get a reproducible bug report and stack trace with a line number.

I'd rewrite that one statement with 4 operations as 4 statements with 1 operation, and before each line, I'd add code to print/log the full name of the class of the variable I'm about to perform an operation on. Then rebuild, reproduce the bug, and check the line number and the output, and I know exactly where to look.

@BrunoJuchli
Copy link
Contributor Author

@masonwheeler actually i was trying to say that it's hard to make a good NRE which states the exact problem. Not fixing the problem which caused the NRE.
#3858 discusses why it's not easy to improve the NRE.

@masonwheeler
Copy link
Contributor

Ah, I see. That makes more sense.

@BrunoJuchli
Copy link
Contributor Author

@OtherCrashOverride @RichiCoder1
Well.. yes.. but

  • requires double the additional unit tests (most often just two instead of one, but sometimes maybe 4 instead of 2)
  • requires running the test suite *2 (once compile with old messages + run tests, once compile with new messages + test) => must be set up, must be maintained, and we must pay the price (for tests/integration taking longer) all the time. I'd say the goal is to reduce compilation switches to the bare minimum, not vice versa.
  • requires maintaining two exception messages
    • if they ever should be translated, requires maintaining more more texts
  • there's probably a few dozens, maybe even hundreds of exceptions which have potential for improvement by including more details
  • we'd have to make sure - all the time - that any new changes to (relevant) exception messages adhere to the same standard (2x exception messages with compilation switch)

So all in all I'd say we'd be talking about dozens if not hundreds of additional hours. I'd rather invest them in something more useful.

@OtherCrashOverride
Copy link

There are many that do not want information leaked in exceptions. I don't accept the "its not a security risk" argument at all. How long until a key called "; DROP TABLES *;" ends up in an exception that gets written to your SQL log?

@ayende
Copy link
Contributor

ayende commented Jul 31, 2015

If you do that. It is your own fault. There is a limit to how much you can
bubblewrap everything
On Jul 31, 2015 20:26, "OtherCrashOverride" notifications@github.com
wrote:

There are many that do not want information leaked in exceptions. I don't
accept the "its not a security risk" argument at all. How long until a key
called "; DROP TABLES *;" ends up in an exception that gets written to your
SQL log?


Reply to this email directly or view it on GitHub
https://github.com/dotnet/coreclr/issues/1311#issuecomment-126759424.

@BrunoJuchli
Copy link
Contributor Author

@OtherCrashOverride
I'm not sure I'm getting your argument.
So if there's an exception containing a message "; DROP TABLES *;" (for example a KeyNotFoundException with a message of "Key '; DROP TABLES *;' was not found in dictionary"). And this exception message gets written to a log file - how would it become relevant? It's not like "DROP TABLES" is a secret. If you were to design some piece of software which scans log files for SQL statements which it would automatically execute.. well.. yes then you'd have a problem.
But i don't think that's what you mean. Because that would be similar to arguing that there shouldn't be a Console.Write because of "How long would it be until some dev programs Console.Write(password).

If it's more an argument that a secret (password, private key...) might be written to a log file, i'd agree that this is a security concern/risk. Incidentally i don't think that i ever denied that - i just think it's not that relevant here. Such concerns are about tradeoff as well. People still drive cars even though they have potential to kill (accidentially or on purpose) - even though both happens a lot of times every year.

@OtherCrashOverride
Copy link

I'm not sure I'm getting your argument.

Its not uncommon for someone to write an exception to log that happens to be contained in a database rather than a file. string sql = "INSERT INTO [blah blah blah]' " + exception.Message + "';".

People still drive cars even though they have potential to kill (accidentially or on purpose)

Yes, but they don't drive cars on sidewalks which is why I feel this should be #Ifdef as a feature. So the 'sidewalks' are still safe for us pedestrians. An you can operate your exception enhanced cars in the street and we can all live to see tomorrow.

Exceptions are not a replacement for a debugger. Not everyone is using DNX or ASP.Net. Its far to easy to get lost in the mindset that CoreCLR = ASP.Net. For the rest of us, its enough to know that a key was not found. Does it matter what that key is? Not at all. All that matters is that its not found.

This point of this conversation was to be informative. It should not dissuade anyone from creating the enhanced exception messages. I will ask that if they are not #ifdef marked, then at least the commit messages should be keyworded to make it easier to search for so that i can remove them from my own build.

@masonwheeler
Copy link
Contributor

@OtherCrashOverride If you're writing anything into SQL that way, instead of using Parameters like you know you should, whatever ends up happening to your database is your own fault. Willful stupidity is not a security hole.

@OtherCrashOverride
Copy link

That was just one example. In the case where the exception output is displayed as part of a web page, it opens up the possibility for cross-site scripting (XSS) exploits. This does not happen with a static set of exception messages that are trusted.

@BrunoJuchli
Copy link
Contributor Author

@OtherCrashOverride
regarding XSS: just another example which can easily be circumvented.

By the way I'm not a web developer. The last time I've developed some web stuff was with classic asp / vbscript. I mostly develop desktop apps or client/server apps. And tooling for the development. And i care greatly for better exception messages because it would make my work much more enjoyable (and somewhat more efficient, too).


ok regarding the security stuff. I think we're turning in circles. My personal conclusion is that dotnet/coreclr#185 has discussed this already and there have not been brought new arguments to the table (yet). As such i will take dotnet/coreclr#185 as an indicator which means its not necessary to have a conditional compilation symbol.
Tagging the commits may be an option, however, in that case it should be put into a guideline by someone "official". After all it's also not enough if "just me" makes sure i tag my commits accordingly and not everyone else is not doing so. @OtherCrashOverride I'd suggest if you're concerned in that regard you should raise a separate Issue to discuss this.


Having said that i would like to have some discussion / statements regarding what i wrote here.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the Future milestone Jan 30, 2020
@stephentoub
Copy link
Member

This issue hasn't progressed in four years. And lots of exception message-related changes have been made in the interim. @danmosemsft, is there anything left to do here?

@danmoseley
Copy link
Member

Yes, I'm passionate about this and we made key improvements here including (1) including the key in key not found exceptions (2) including the path in many IO related exceptions. I know myself and others have improved some other messages to include more data as well. This is all on by default in 3.1 or earlier.

I don't see anything else actionable here -- if there is, please open distinct issues. We hear you and we do care a lot about the actionability of exception messages. Specific feedback will help motivate future improvements.

@ghost ghost locked as resolved and limited conversation to collaborators Jan 5, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Diagnostics-coreclr design-discussion Ongoing discussion about design without consensus enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet
Development

No branches or pull requests