Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nicer output when an exception bubbles through main #4968

Merged
merged 12 commits into from Sep 14, 2017
Merged

Nicer output when an exception bubbles through main #4968

merged 12 commits into from Sep 14, 2017

Conversation

asterite
Copy link
Member

@asterite asterite commented Sep 13, 2017

When an exception happens in a program and isn't catched, what's printed is a bit hard to digest. For example, for this program:

def foo
  bar
end

def bar
  a = [1]
  a[2]
end

foo

The output is:

Index out of bounds (IndexError)
0x1025cc715: *CallStack::unwind:Array(Pointer(Void)) at ??
0x1025cc6b1: *CallStack#initialize:Array(Pointer(Void)) at ??
0x1025cc688: *CallStack::new:CallStack at ??
0x1025c62a5: *raise<IndexError>:NoReturn at ??
0x1025e553e: *Array(Int32)@Indexable(T)#at<Int32>:Int32 at ??
0x1025e54d9: *Array(Int32)@Indexable(T)#[]<Int32>:Int32 at ??
0x1025cba51: *bar:Int32 at ??
0x1025cb9d9: *foo:Int32 at ??
0x1025bb807: __crystal_main at ??
0x1025cb8b8: main at ??

That's at least on Mac without running dsymutil. On linux, or on Mac when running dsymutil, this is the output:

Index out of bounds (IndexError)
0x104c6d53e: at at /usr/local/Cellar/crystal-lang/0.23.1/src/indexable.cr 0:17
0x104c6d4d9: [] at /usr/local/Cellar/crystal-lang/0.23.1/src/indexable.cr 74:5
0x104c53a51: bar at /Users/asterite/Projects/crystal/foo.cr 8:3
0x104c539d9: foo at /Users/asterite/Projects/crystal/foo.cr 3:3
0x104c43807: __crystal_main at /Users/asterite/Projects/crystal/foo.cr 10:1
0x104c538b8: main at /usr/local/Cellar/crystal-lang/0.23.1/src/main.cr 12:15

That's a bit better: we can see raise and CallStack are removed from the trace.

So I see a few things to improve:

  1. The numbers at the left are super hardcore. They are memory addresses of the functions inside the executable (I think). Not very useful unless you will debug at a lower level (but maybe for that one can use gdb or lldb)
  2. For the trace where I didn't run dsymutil on Mac, CallStack and raise are still there, and also all names have an annoying * prefix (this is part of the function mangling in Crystal)
  3. For the trace where I didn't run dsymutil on Mac, filenames are shown as "??" which clutters the output
  4. Filenames are always absolute
  5. main is always in the trace but adds little info because it's always the same and adds no info

This PR improves all of that:

  1. The cryptic numbers are removed
  2. CallStack and raise are always removed from the trace
  3. "??" filenames are not shown anymore
  4. Files are shown relative to the current directory if possible
  5. We keep __crystal_main but we show it as main, because that will have the line number in the "main" file, and we remove the actual main from the trace.

I also formatted the output a bit more like how Ruby shows it.

The result, without running dsymutil on Mac, is:

Index out of bounds (IndexError)
        from Array(Int32)@Indexable(T)#at<Int32>:Int32
        from Array(Int32)@Indexable(T)#[]<Int32>:Int32
        from bar:Int32
        from foo:Int32
        from main

With dsymutil, or in linux, the output is:

Index out of bounds (IndexError)
        from /usr/local/Cellar/crystal-lang/0.23.1/src/indexable.cr 0:17 in 'at'
        from /usr/local/Cellar/crystal-lang/0.23.1/src/indexable.cr 74:5 in '[]'
        from foo.cr 8:3 in 'bar'
        from foo.cr 3:3 in 'foo'
        from foo.cr 10:1 in 'main'

I think it's a bit better than before.

As a comparison, this code in Ruby:

def foo
  bar
end

def bar
  raise "oh no"
end

foo

Gives this output:

foo.cr:6:in `bar': oh no (RuntimeError)
	from foo.cr:2:in `foo'
	from foo.cr:9:in `<main>'

I'll send a separate PR to always try to execute dsymutil on Mac so we always get files and line numbers (tracked in #4186).

We should also investigate why line numbers are incorrect, but probably in a different issue.

Copy link
Contributor

@RX14 RX14 left a comment

I'd love an env var or other option to get a "low-level" raw stacktrace which shows pretty much the same as what existed before this PR, minus the @@skip check. It'd be useful for debugging on occations.

This is much more necessary given how often the debug info is blatantly wrong (see #4538 for the true scale of this). The symbol info is pretty much impossible to be wrong.

src/callstack.cr Outdated
next if @@skip.includes?(file)

# Turn to relative to the current dir, if possible
Copy link
Contributor

@RX14 RX14 Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably would get confusing if the executable changes it's directory mid-program. We could have multiple stacktraces from the same program have different representations for the same file. I think it'd be best to record Dir.current very early in the init.

Copy link
Contributor

@bew bew Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have ˋProcess::INITIAL_PWDˋ in process/executable_path.cr maybe we can use that? (As a constant its set very early)

Copy link
Member Author

@asterite asterite Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! I think we can actually expose that constant as it might be useful for others.

Copy link
Member Author

@asterite asterite Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used it, but I'm not sure now we should expose it. Or maybe find another way/name and document it. So out of the scope of this PR.

src/callstack.cr Outdated

# Turn to relative to the current dir, if possible
file = file.lchop(current_dir)

file_line_column = "#{file} #{line}:#{column}"
Copy link
Member

@oprypin oprypin Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to have "#{file}:#{line}:#{column}" because my text editor understands when you tell it to open foo.cr:7:4.

Copy link
Member Author

@asterite asterite Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought about changing that to be like that but thought it would be harder on the eyes, and easier to copy just the filename. But yes, if it's easier to open than on an editor, let's change it (Ruby also uses that format)

asterite added 2 commits Sep 13, 2017
Some editors understand this format so it's easier to jump to that location.
@asterite
Copy link
Member Author

@asterite asterite commented Sep 13, 2017

@RX14 Maybe we can keep it all the time. I was thinking maybe like this:

Index out of bounds (IndexError)
0x10026084e |    from src/indexable.cr:0:17 in 'at'
0x1002607e9 |    from src/indexable.cr:74:5 in '[]'
0x10023df61 |    from foo.cr:8:3 in 'bar'
0x10023dee9 |    from foo.cr:3:3 in 'foo'
0x10023035f |    from foo.cr:11:1 in 'main'

What do you think?

src/callstack.cr Outdated
if function.starts_with?("*raise<") ||
function.starts_with?("*CallStack::") ||
function.starts_with?("*CallStack#") ||
function == "main"
Copy link
Member

@straight-shoota straight-shoota Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have a custom method called main it would get hidden from the stacktrace? Just rename foo to main in your example to see what I mean.

Copy link
Member Author

@asterite asterite Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I'll just keep those entries then.

Copy link
Member Author

@asterite asterite Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it because there's only one "main", I don't know why when there's debug info Foo#main just appears as main. But well, it's not a big deal not removing those entries.

Copy link
Member

@straight-shoota straight-shoota Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you just remove the last entry? This would require to count the size, but I don't think this should be a big concern.

Copy link
Member

@straight-shoota straight-shoota Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or simply break from the loop after __crystal_main is processed.

Copy link
Member Author

@asterite asterite Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because en release some traces might not be there. I think it's simpler to just keep things as they are now regarding this.

Copy link
Member

@straight-shoota straight-shoota Sep 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless if it might not show up in some scenarios, anything after __crystal_main shouldn't be of interest so I think it should be possible to break when this function is processed.
Though it's not a big deal having a line more at the end.

@RX14
Copy link
Contributor

@RX14 RX14 commented Sep 13, 2017

@asterite I've never actually needed to use the function hex locations, they're practically random with ASLR anyway. What I have done before is to use --no-debug to remove the line numbers so I can see the symbol names when the debug info has been particularly wrong. I'd like an option to show the raw symbols too, at least until we fix debug info.

@asterite
Copy link
Member Author

@asterite asterite commented Sep 13, 2017

@RX14 Sounds good. The main issue now is deciding how to trigger this behaviour. We can either use an ENV var, configure something in CallStack itself, or use a compile-time flag.

Also, what do you think about removing the --no-dsymutil option? It could be implied by --no-debug as @oprypin suggested.

@asterite
Copy link
Member Author

@asterite asterite commented Sep 13, 2017

@RX14 Actually, I think we can show the IP address when there's no file/line/column information. That happens when you compile with --no-debug. I'll make that change.

@RX14
Copy link
Contributor

@RX14 RX14 commented Sep 13, 2017

Wouldn't "symbol location" be a much less confusing name than "ip address"...

Regardless, my entire point with the request was to request a way to switch between showing debug info and plain symbols without recompiling. Sorry if I didn't make it clear.

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

I suggested a few minor tweaks, otherwise: good job! That really cleans things up.

The symbol address should always be left out as they don't help much the ordinary developer (that includes me), even when the debug info couldn't be found. It's not uncommon (at least on Linux) to have some lines be found in the DWARF sections, while some aren't... we'd have a weirdly mixed backtrace output when that happens for example.

Now, as @RX14 suggested, maybe we could have a flag to disable the pretty output to get the raw backtrace. Maybe just react --no-debug which can be found out with flag?(:debug).

src/callstack.cr Outdated

# Crystal methods (their mangled name) start with `*`, so
# we remove that to have less clutter in the output.
function = function.lchop('*')
Copy link
Contributor

@ysbaddaden ysbaddaden Sep 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the unmangling of the function name could happen in the elseif above (line 168-170), when we failed to decode the function name from DWARF and got the mangled symbol instead.

# CallStack tries to make files relative to the current dir,
# so we do the same for tests
current_dir = Dir.current
current_dir += '/' unless current_dir.ends_with?('/')
Copy link
Contributor

@ysbaddaden ysbaddaden Sep 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use File::SEPARATOR for future-proofing on Windows?

src/callstack.cr Outdated
@@ -23,6 +23,10 @@ end

# :nodoc:
struct CallStack
# Compute current directory at the beginning so filenames
# are always shown relative to the *starting* working directory.
CURRENT_DIR = Process::INITIAL_PWD + '/'
Copy link
Contributor

@ysbaddaden ysbaddaden Sep 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: File::SEPARATOR.

Copy link
Member

@oprypin oprypin Sep 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should have a function to get a path relative to another path
https://docs.python.org/3/library/os.path.html#os.path.relpath

Copy link
Member Author

@asterite asterite Sep 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oprypin I agree, though I don't know if I'd like ../../ to appear in the output. I checked Go for example and they always use absolute paths (though Ruby uses relative paths if the path is in the current directory)

@asterite
Copy link
Member Author

@asterite asterite commented Sep 14, 2017

@ysbaddaden @RX14 I applied your suggestions.

I also made it so that if the env variable CRYSTAL_CALLSTACK_FULL_INFO is 1, the ip address (or symbol name, however we want to call it) will be shown, without having to recompile the program. We can later document that, or not, if we really need to, depending on whether we fix all bugs relative to debug locations.

Going back to the original example, we have:

Without dsymutil:

Index out of bounds (IndexError)
        from Array(Int32)@Indexable(T)#at<Int32>:Int32
        from Array(Int32)@Indexable(T)#[]<Int32>:Int32
        from bar:Int32
        from foo:Int32
        from __crystal_main
        from main

With dsymutil:

Index out of bounds (IndexError)
        from src/indexable.cr:0:17 in 'at'
        from src/indexable.cr:74:5 in '[]'
        from foo.cr:8:3 in 'bar'
        from foo.cr:3:3 in 'foo'
        from foo.cr:10:1 in '__crystal_main'
        from src/main.cr:12:15 in 'main'

Without dsymutil, with CRYSTAL_CALLSTACK_FULL_INFO:

Index out of bounds (IndexError)
        from Array(Int32)@Indexable(T)#at<Int32>:Int32 at 0x10365a4ee
        from Array(Int32)@Indexable(T)#[]<Int32>:Int32 at 0x10365a489
        from bar:Int32 at 0x103637da1
        from foo:Int32 at 0x103637d29
        from __crystal_main at 0x103627934
        from main at 0x103637c08

With dsymutil, with CRYSTAL_CALLSTACK_FULL_INFO:

Index out of bounds (IndexError)
        from src/indexable.cr:0:17 in 'at' at 0x10d02e4ee
        from src/indexable.cr:74:5 in '[]' at 0x10d02e489
        from foo.cr:8:3 in 'bar' at 0x10d00bda1
        from foo.cr:3:3 in 'foo' at 0x10d00bd29
        from foo.cr:10:1 in '__crystal_main' at 0x10cffb934
        from src/main.cr:12:15 in 'main' at 0x10d00bc08

@RX14
Copy link
Contributor

@RX14 RX14 commented Sep 14, 2017

@asterite I used --no-debug to get at the symbol info, not at the instruction pointer address (which changes between process runs because of aslr and is therefore useless in most situations). CRYSTAL_CALLSTACK_FULL_INFO should show both the hex address and the corresponding symbol name (perhaps with the old +delta as well?).

@asterite
Copy link
Member Author

@asterite asterite commented Sep 14, 2017

@RX14 Updated!

Now without dsymutil when full info is requested:

Index out of bounds (IndexError)
        from Array(Int32)@Indexable(T)#at<Int32>:Int32 at 0x106bed4ee
        from Array(Int32)@Indexable(T)#[]<Int32>:Int32 at 0x106bed489
        from bar:Int32 at 0x106bcac31
        from foo:Int32 at 0x106bcabb9
        from __crystal_main at 0x106bba7c4
        from main at 0x106bcaa98

With dsymutil when full info is requested:

Index out of bounds (IndexError)
        from src/indexable.cr:0:17 in '*Array(Int32)@Indexable(T)#at<Int32>:Int32' at 0x10e00d4ee
        from src/indexable.cr:74:5 in '*Array(Int32)@Indexable(T)#[]<Int32>:Int32' at 0x10e00d489
        from foo.cr:8:3 in '*bar:Int32' at 0x10dfeac31
        from foo.cr:3:3 in '*foo:Int32' at 0x10dfeabb9
        from foo.cr:10:1 in '__crystal_main' at 0x10dfda7c4
        from src/main.cr:12:15 in 'main' at 0x10dfeaa98

That way you can see the real symbol name plus the address.

Is it good now? :-)

RX14
RX14 approved these changes Sep 14, 2017
Copy link
Contributor

@RX14 RX14 left a comment

The implementation looks good! Would be nice to use --no-debug and setting ENV["CRYSTAL_CALLSTACK_FULL_INFO"] = "1" to spec the output in all possible cases though.

@luislavena
Copy link
Contributor

@luislavena luislavena commented Sep 14, 2017

Is it good now? :-)

I have only a small nitpick-kinda-question actually. 😁

Why 8 spaces of indentation? Been using some shards with some nested path structure and ended with really long lines in the backtrace.

Don't see it as a real problem, but always wondered about it and perhaps avoid wrapping at 120 columns like happens under some scenarios.

Anyway, thank you for the contribution! ❤️ ❤️ ❤️

@asterite
Copy link
Member Author

@asterite asterite commented Sep 14, 2017

@luislavena I copied Ruby here. I find that those 8 spaces immediately signals an exception was thrown, but only because I'm used to that. I have no problem changing that to 2 spaces, if @RX14 agrees.

I'll also add specs for CRYSTAL_CALLSTACK_FULL_INFO

@luislavena
Copy link
Contributor

@luislavena luislavena commented Sep 14, 2017

I have no problem changing that to 2 spaces

I have no problems with the 8 spaces either, was wondering if there was a particular reason that I was missing. Thanks for the details.

And thank you for the improvements!

@asterite
Copy link
Member Author

@asterite asterite commented Sep 14, 2017

I changed it to use 2 spaces instead of 8. I think that's better because it gives more spaces to show the trace. Thanks for the suggestion, @luislavena !

@asterite asterite merged commit c23ad1f into crystal-lang:master Sep 14, 2017
2 checks passed
@RX14
Copy link
Contributor

@RX14 RX14 commented Sep 15, 2017

@asterite what about the spec changes?

@asterite
Copy link
Member Author

@asterite asterite commented Sep 15, 2017

I'd like CRYSTAL_CALLSTACK_FULL_INFO to remain internal knowledge for now. It's also not critical to deserve specs, and it's hard to test (just a regex with numbers?). But if you want to add it, I'll approve it :-)

@mverzilli mverzilli added this to the Next milestone Sep 15, 2017
@bew bew mentioned this pull request Nov 18, 2017
6 tasks
@hanyuone
Copy link

@hanyuone hanyuone commented Nov 18, 2017

An update to this issue - would it be better to add colours in the form of ANSI colour formatting to these errors? I personally think it's a little bit of effort that goes a long way to making errors more readable, and I'm willing to do it - just not sure if everyone supports this idea.

@hanyuone
Copy link

@hanyuone hanyuone commented Nov 24, 2017

CC @asterite ^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants