Print a backtrace when a signal is caught #8

Closed
jclark opened this issue Jun 7, 2021 · 8 comments
@jclark
Contributor

jclark commented Jun 7, 2021

Ballerina requires that the / operator result in a panic if its right operand is 0. We could handle this by explicitly testing for 0 before performing the division, but it would be much more efficient if we instead relied on the CPU's ability to generate an exception in this case. In POSIX terms, the CPU exception turns into a signal (SIGFPE in this case), which the kernel delivers to the program. Using sigaction we can catch this signal and get the address (in siginfo.si_addr) of the instruction that caused the CPU exception. We should then be able to use this to print a backtrace.
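
A minimal C sketch of the handler described above, under several assumptions: `backtrace` and `backtrace_symbols_fd` come from glibc's `<execinfo.h>` (not POSIX), and the names `on_sigfpe`, `run_guarded`, `raise_fpe` and `no_fault` are hypothetical. The demo uses `raise(SIGFPE)` as a portable stand-in for a trapping division, so `si_addr` is only recorded when `si_code` says the kernel reported an integer divide fault. Jumping out of the handler with `siglongjmp` is only there so the demo can continue; a real runtime would print the backtrace and terminate.

```c
#define _GNU_SOURCE
#include <execinfo.h>  /* backtrace, backtrace_symbols_fd: glibc extension (assumption) */
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static sigjmp_buf recover_point;   /* where the demo resumes after the signal */
static void *volatile fault_addr;  /* faulting instruction address, if known */

/* SA_SIGINFO-style handler: records si_addr and prints a raw backtrace. */
static void on_sigfpe(int signo, siginfo_t *info, void *ucontext) {
    (void)signo;
    (void)ucontext;
    /* si_addr is only meaningful for a CPU-generated fault, not for raise(). */
    fault_addr = (info->si_code == FPE_INTDIV) ? info->si_addr : NULL;

    static const char msg[] = "panic: SIGFPE caught, backtrace:\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    /* backtrace() is not formally async-signal-safe; acceptable for a sketch. */
    void *frames[64];
    int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);

    siglongjmp(recover_point, 1);  /* demo only: a real runtime would exit here */
}

/* Runs fn with the handler installed.
   Returns 1 if SIGFPE was caught, 0 if fn returned normally, -1 on setup failure. */
int run_guarded(void (*fn)(void)) {
    struct sigaction sa, old;
    int result;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old) != 0) return -1;

    if (sigsetjmp(recover_point, 1) == 0) {
        fn();
        result = 0;
    } else {
        result = 1;  /* reached via siglongjmp from the handler */
    }
    sigaction(SIGFPE, &old, NULL);  /* restore the previous disposition */
    return result;
}

void raise_fpe(void) { raise(SIGFPE); }  /* stand-in for a trapping division */
void no_fault(void) {}
```

Calling `run_guarded(raise_fpe)` prints the raw frame addresses to stderr. Mapping them to source lines would still need debug info and a symbolizer, which this sketch leaves out.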

@KavinduZoysa
Contributor

https://github.com/KavinduZoysa/test-GCs/tree/div

@jclark, could you please review my draft code and give your opinion on this?

I took the Ballerina source code and LLVM IR from @ruvi-d's test cases. To get the line of the division printed in the backtrace, I attached debug info (e.g. !dbg !4) to the sdiv instruction, so I did not need to use siginfo.si_addr.

Please correct me if this approach is wrong.

@jclark
Contributor Author

jclark commented Jun 9, 2021

I don’t understand why you are getting a SIGFPE since the .ll code is explicitly checking for a divisor of 0. I must be missing something obvious.

@KavinduZoysa
Contributor

> I don’t understand why you are getting a SIGFPE since the .ll code is explicitly checking for a divisor of 0. I must be missing something obvious.

As per our offline discussion: I call panic() immediately before unreachable, and because panic() is declared as a noreturn function, we expected the code after it never to execute. But the assembly code shows that panic() does return. Execution therefore continues into the rest of the code, ignoring the unreachable.
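
To illustrate the point in C terms (a sketch only, not the code from the linked repository): a function declared noreturn has to actually terminate the process, for example via `abort()`. If it returned, the compiler would still be entitled to assume that the code after the call is dead. The names `panic` and `div_or_panic` are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

/* Declared noreturn AND genuinely never returns, because abort() does not. */
noreturn void panic(const char *msg) {
    fprintf(stderr, "panic: %s\n", msg);
    abort();
}

/* The explicit-check variant: the division is only reached when b != 0. */
long div_or_panic(long a, long b) {
    if (b == 0) {
        panic("division by zero");
        /* Nothing here is reachable; the compiler may rely on that. */
    }
    return a / b;
}
```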

@KavinduZoysa
Contributor

KavinduZoysa commented Jun 9, 2021

@jclark, as per your suggestion I have written a simple LLVM IR example with the checks around the division removed. Since there is no branching now, there is no place to call panic(). We can therefore get the backtrace when the SIGFPE signal is caught by attaching debug info to the sdiv instruction.

Please check the new changes and give your opinion.
Edited: https://github.com/KavinduZoysa/test-GCs/tree/d0a3326651286790f611d1a01e8a834b4e8f747b

@jclark
Contributor Author

jclark commented Jun 10, 2021

@KavinduZoysa The problem with that is that LLVM specifies that division by zero is undefined behavior for "sdiv". Unless we can find a documented, correct way to avoid this, we cannot rely on it. I believe Rust specifies the same behavior for / as we do, and they will probably have done this in the best way that LLVM allows. Can you look at how they deal with it?

We can also potentially call llvm.trap when we detect an overflow (which will, I believe, cause a signal) and then catch that signal and print a backtrace. The processor can probably run that faster (which is why UBSan does it that way).
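
A rough C sketch of the explicit-check-plus-trap idea, assuming a GCC or Clang compiler: `__builtin_trap()` is a compiler builtin (Clang lowers it to llvm.trap), and which signal it raises depends on the target. The guard covers both cases in which a signed division is undefined: a zero divisor and `LONG_MIN / -1`. The names `trapping_div` and `dies_by_signal` are hypothetical, and this does not claim to reproduce what Rust or UBSan actually emit.

```c
#include <limits.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Division that traps instead of invoking undefined behavior. */
long trapping_div(long a, long b) {
    if (b == 0 || (a == LONG_MIN && b == -1)) {
        __builtin_trap();  /* GCC/Clang builtin; does not return */
    }
    return a / b;
}

/* Test helper: runs the division in a child process.
   Returns 1 if the child was killed by a signal, 0 if it exited normally,
   -1 if fork or waitpid failed. */
int dies_by_signal(long a, long b) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        volatile long r = trapping_div(a, b);
        (void)r;
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

A signal handler installed with sigaction for the signal that the trap raises on the target could then print the backtrace.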

@KavinduZoysa
Contributor

Yes @jclark, I will check how Rust handles this. Currently, I am trying to build an example using llvm.ubsantrap. (Please note that this is a new intrinsic introduced in LLVM 12 and is not available in LLVM 11, which we currently use.)

@jclark
Contributor Author

jclark commented Jun 12, 2021

This approach isn't going to work because of #38.

jclark closed this as completed Jun 12, 2021