Conversation
|
|
||
// Init | ||
|
||
shared static this() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is odd.
It means that this mechanism is activated by importing the module.
Instead it should be a function like installHandler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fine to me. It can definitively be changed.
I think your point here is very weak. No argument whatsoever here. Let me state why I think it is a good idea.
Additionally, it opens some doors :
Information have been asked to linux hackers, ucontext_t is supported, even if not very documented. You can ask feep if you doubt my claim. About existing handler, I think an user playing with signal handler is big enough to know what he/she is doing and be sure to not collide with this. If the segfault occur in non-D thread (or even in non D code) things will work as well if you recover. If you throw this will go wrong, as D's Exception are not compatible with C/C++. Anyway, without the Exception, things goes wrong as well. A global thread local flag can be set to ensure this behavior doesn't trigger in non D threads. |
No you cannot because you corrupt your complete program state at the point http://www.google.com/search?q=bug%20signal%20handler%20deadlock
If you want to improve the current behavior then try to print |
I think you totally misunderstood the piece of code you are commenting. The whole point of the code in the handler is to provide a way for another piece of code OUTSIDE the handler, to handle the segfault. The state of the program isn't corrupted, so it is safe to throw and even to recover from it. I even have a piece of code where I use mprotect to trigger segfault, and totally recover without throwing, using the exact same method. It is recoverable, just try it if you don't believe me. |
Yeah, you only reset the instruction pointer but that isn't much different You still interrupt functions at arbitrary points, and exception handling Logger logger;
void setLogger()
{
logger = new Logger;
scope (failure) logger = null;
logger.init(getSomething());
} If getSomething caused a segfault your program is corrupted. Likewise if a segfault happened in malloc you may not call it again. |
No, this isn't like longjmp ! In that case, you know that the whole program is back in the right state (longjmp doesn't guarantee that). And the only registers altered are trash registers, so it is safe in regard of try catch blocks. For you example, it currently works (tested as well) with the code generated by dmd. I feel like you are making up problems that doesn't even exists here, and you don't even consider the advantages that such a mechanism provide. |
This is already implemented on Windows as the default behaviour. This needs more thought; with this patch it will even throw different exceptions on Linux and Windows, which is a completely unnecessary inconsistency. |
Yes, this is inconsistent. And yes, this needs to be made consistent with windows. The point is that the exception thrown on windows is system specific. I would rather think that both should throw a NullPointerError on null deference, so we have a system agnostic way of catching this. This definitively need to converge. I didn't wanted to change existing behavior on windows, so this approach can be tested without breaking anything. Ultimately, this is linked to the problem of @safe that allow unsafe things to be done when passed large objects/pointers that are null. What must be done to solve that problem is something we have to decide before changing the existing behavior, or we risk to change it twice, which is something we want to avoid. |
For reference, I cite ##kernel on freenode: http://pastebin.com/VEeZYPRJ . So it seems to be "officially" supported. The main advantage over longjmp is that it definitely ends up with the handler function called from a state that is "safe", since it reuses the kernel's existing return-to-previous-context mechanism. So you don't have to emulate whatever cleanup the kernel does at signal handler exit. |
Undecided on what to do about this. Should we discuss the matter further in the newsgroup? I know @WalterBright has a dim view on converting null pointer accesses into exceptions. |
(it ain't rebased either :o)) |
Indeed, this pull request involve a language design decision, and may be discussed in the newsgroup. The problem to be fixed is in fact much larger than what this pull request does, it is about the whole null deference handling in D. This pull request open the door for unified behavior between linux and windows. It is also easy to provide a callback to do whatever is wanted (HALT if one think is better, or throw). It is even possible to recover in some situation (concurrent GC is a great use case of that capability for instance). Note that I use this in all code I produce in D until then, and it have been of great help. |
I rebased the pull request. git wasn't able to merge the stuff by itself. |
There are still tons of issues. Your signal handler is global.
This applies to all D executables and all D libraries.
You'll get deadlocks which are even worse than crashing.
You cannot recover without unwinding/destruction support.
This has absolutely no place in production code because it is unpredictable and unreliable. |
I'm sorry, but I don't think most of this are problems. First, yes, it throw D exception, but not in any language, because IT IS IN D RUNTIME. So it throw D exception in D. If you interface D with other languages, then you have to make sure that the interfacing make sense. This is no news and have nothing to do with that pull request specifically. As of 3rd party lib stealing the signal handler, it is a possibility. However, the signal mechanism is already used in druntime. So either we consider this is a problem, or we don't. But this is rather stupid to state this is a problem when this is done all over the place. Many other remarks show that you don't understand what is going on here. The whole mechanism is here to set up a function call on top of the instruction that caused the fault. So everything happen in userland, not in the signal handler. Signal usafe function will work properly, and no deadlock will occur. I'm sorry, but I see nothing but FUD in your comment. You should come with actual fact here. |
I have to agree with this. See my earlier paste from ##kernel - it's standard, albeit uncommon. The technique works on any operating system that uses the basic x86 stackframe layout, which, I think, is more systems than D runs on - and it's not like "but this only works on Linux" stopped people when it came to supporting SEH. And the entire point of this is to sidestep the issue of signal safety. Let's not get into flames - but please make sure you have read the entire thing before objecting. |
@dawgfoto please excuse me for the rudeness of my previous message. I let it here to keep the discussion understandable, but It was way too aggressive. I'm sorry. To restate thing in a more neutral way : I have nothing against this not being included, but it have to be discussed and the decision must be taken based on actual hard facts. Please feel free to ask any question about the technical aspect of things, because your post contained inaccurate informations. As I'm pretty sure it was not done on purpose, I guess we simply have a misunderstanding. So let's start again on good basis, and please forgive my tone. |
@deadalnix Thanks very much. All - let's keep the good spirits going! I'll ask Walter what he thinks about the idea. Is this implementable on all major OSs? |
On windows, the system already throw an Exception when this happen. A similar mechanism can be implement on windows without much problems. I'm not qualified enough on FreeBSD or macOS to answer that question. |
void* p = void; // uninitialized
core.stdc.free(p); // malloc.c
void free(void* p)
{
// ...
rlock_acquire(&malloc_mtx);
size_t len = *cast(size_t)(p-1); // SIGSEGV
} Now your preemptively exited a signal-unsafe function and you have no way of repairing it. A set of functions that you may call is listed in signal(7) - Async-signal-safe functions.
If your talking about recovering you need a mechanism to unwind the stack and restore state.
|
It's common that sigreturn restores the CPU context from the signal's ucontext_t as far as security allows. |
I understand your example with malloc/free. You have to understand that in this case, whatever happen, you are doomed. Either the program crash either it is in an inconsistent state. It is a situation where catching the NullPointerError make no sense t all, because you can't recover from it. Not having this behavior will not solve the problem, because you'll also be in an inconsistent state or you'll crash. This isn't any better and I don't see how this pull request is making things worse. A invalid call to a system function have been made, whatever comes out from that is either a crash or a inconsistent state. Note that those function are signal unsafe. And this pull request don't change in any way if signal are send or received, it just change how they are handled. At the moment the code present in this pull request start to execute, the arm is already done, and the program is already in a beyond repair state. It is in that state because of the signal, not because of the code present in the pull request, so I don't see how it is an argument for or against it. Another fact here is that the exception is thrown from C code and C don't handle exception. This problem isn't specific to this pull request, this is a problem that can occur every time an exception is thrown throw C code. And this problem is unsolvable, as C don't support exceptions. It is up to the programmer to ensure that exception are not throw throw C code. This is exactly why I inherited NullPointerError from Error and not from Exception. They are not always recoverable. But they always are in @safe code. |
On an automated system deadlocks are worse than failures. By the way could you rephrase the purpose of translating signals to errors? immutable pid = fork();
if (pid)
{
char[11] buf=void;
format(buf, pid);
const char* args[3];
args[0] = argv[0]; // C argv (this has issues with chdir, deleted images, changed rights...)
args[1] = "--druntime-report-error";
args[2] = buf.ptr;
execve(args[0], args.ptr, environ);
}
else
{
ptrace(PT_TRACE_ME);
abort();
} |
Personally: exceptions/errors have backtraces. Backtraces are immensely useful (no, gdb is not the answer). It'd also add consistency with Windows. Look. Of course we can get backtraces for segfaults under linux with effort. But this patch allows us to get backtraces without effort, and that level of trivial convenience has a quality of its own. About your |
The point is that creating a backtrace might be enough to do way more harm. How about a library solution, for example? |
I also think that should evolve to provide a custom handler for advanced user. As of now, it is in opt-in, you have to include this module somewhere to « activate » it. |
Why doesn't this simply restore BP and call _d_throwc from within the signal handler? |
Throwing from the signal handler isn't safe. You aren't in a standard execution flow, and the kernel knows it. Yes the code can be extended to manage stack overflows, and should be IMO. |
For the same reason that this mechanism is unsafe or am I missing something.
What do you mean by that? |
@@ -0,0 +1,277 @@ | |||
/** | |||
* Handle page protection error using Errors. NullPointerError is throw when deferencing null. A system dependant error is throw in other cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some corrections:
errors
thrown (twice)
dereferencing
system-dependent
It's also a good idea to use $(D symbol) when referencing symbols.
@deadalnix can you address @JakobOvrum's points? |
import core.sys.posix.ucontext; | ||
|
||
// Register and unregister memory error handler. | ||
private shared sigaction_t old_sigaction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to use __gshared
here to avoid casting.
Please stick to camel casing for function names. |
… in other SIGSEGV cases.
What is the hard limit for line length ? What is the policy to split ternary operator on multiple line ? |
There is a soft character limit of 80 characters per line and a hard limit of 120. So, most lines should be within 80 characters but no lines can exceed 120.
There is none. For the most part, we don't have much in the way of style rules about formatting. They boil down primarily to using 4 spaces per indent (no tabs) and putting braces on their own line. For most of the rest, it's primarily a matter of making sure that the style in the file is reasonably consistent, and there probably aren't enough ternary operators being used to even set a precedent within most files. Personally, if the line with a ternary operator is too long, I do
but I expect that there's code in druntime and/or Phobos which formats that differently. At minimum, if Andrei were doing it, he'd only indent the 2nd and 3rd lines by 4 spaces rather than lining them up with the previous line. It all depends on who wrote the code and what file it's in (as the formatting style varies from file to file). |
historical moment! |
This one have been epic :D |
When core.nullpointererror is imported in a project, it transform null deference into NullPointerError and other segfault in SignalError on linux x86 and x86_64 .
If the idea is successful, and implemented on other systems (windows, macOS, freeBSD) it could become the standard behavior. For now, it only behave that way when the module is explicitly imported.
This is realted to : #181 that should be included before.