Exception Handling In the Mono Runtime
--------------------------------------
* Introduction
--------------
There are many types of exceptions which the runtime needs to
handle. These are:
- exceptions thrown from managed code using the 'throw' or 'rethrow' CIL
instructions.
- exceptions thrown by some IL instructions like InvalidCastException thrown
by the 'castclass' CIL instruction.
- exceptions thrown by runtime code
- synchronous signals received while in managed code
- synchronous signals received while in native code
- asynchronous signals
Since exception handling is very arch dependent, parts of the
exception handling code reside in the arch specific
exceptions-<ARCH>.c files. The architecture independent parts
are in mini-exceptions.c. The different exception types listed
above are generated in different parts of the runtime, but
ultimately, they all end up in the mono_handle_exception ()
function in mini-exceptions.c.
* Exceptions thrown programmatically from managed code
------------------------------------------------------
These exceptions are thrown from managed code using 'throw' or
'rethrow' CIL instructions. The JIT compiler will translate
them to a call to a helper function called
'mono_arch_throw/rethrow_exception'.
These helper functions do not exist at compile time; they are
created dynamically at run time by the code in the
exceptions-<ARCH>.c files.
They perform various stack manipulation magic, then call a
helper function usually named throw_exception (), which does
further processing in C code, then calls
mono_handle_exception() to do the rest.
* Exceptions thrown implicitly from managed code
------------------------------------------------
These exceptions are thrown by some IL instructions when
something goes wrong. When the JIT needs to throw such an
exception, it emits a forward conditional branch and remembers
its position, along with the exception which needs to be
emitted. This is usually done in macros named
EMIT_COND_SYSTEM_EXCEPTION in the mini-<ARCH>.c files.
After the machine code for the method is emitted, the JIT
calls the arch dependent mono_arch_emit_exceptions () function
which will add the exception throwing code to the end of the
method, and patches up the previous forward branches so they
will point to this code.
This has the advantage that the rarely-executed exception
throwing code is kept separate from the method body, leading
to better icache performance.
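The record-then-patch scheme described above can be sketched in a few
lines of C. This is only a toy model under stated assumptions: the
"code buffer" is an array of integers, a branch is encoded as the
index of its target, and every name (ToyCodeBuf, toy_emit_cond_exception,
toy_emit_exceptions) is hypothetical and not part of the Mono sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not Mono code: one array slot stands for one instruction,
 * and a branch is encoded as the index of its target. */
enum { TOY_MAX_CODE = 64, TOY_MAX_PATCHES = 16 };

typedef struct {
    size_t branch_pos;     /* slot of the forward branch to patch later */
    const char *exc_name;  /* exception the out-of-line code will throw */
} ToyExcPatch;

typedef struct {
    int32_t code[TOY_MAX_CODE];
    size_t len;
    ToyExcPatch patches[TOY_MAX_PATCHES];
    size_t npatches;
} ToyCodeBuf;

/* Emit an ordinary instruction of the method body. */
static void toy_emit_insn(ToyCodeBuf *cb, int32_t insn)
{
    assert(cb->len < TOY_MAX_CODE);
    cb->code[cb->len++] = insn;
}

/* Emit a conditional branch whose target is not known yet, and
 * remember its position together with the exception to throw. */
static void toy_emit_cond_exception(ToyCodeBuf *cb, const char *exc_name)
{
    assert(cb->len < TOY_MAX_CODE && cb->npatches < TOY_MAX_PATCHES);
    cb->patches[cb->npatches].branch_pos = cb->len;
    cb->patches[cb->npatches].exc_name = exc_name;
    cb->npatches++;
    cb->code[cb->len++] = -1;  /* placeholder target */
}

/* After the body is emitted: append one throw stub per recorded
 * branch and patch the branch so it points at its stub. */
static void toy_emit_exceptions(ToyCodeBuf *cb)
{
    for (size_t i = 0; i < cb->npatches; i++) {
        assert(cb->len < TOY_MAX_CODE);
        size_t stub = cb->len;
        cb->code[cb->len++] = 1000 + (int32_t)i;  /* stand-in for throw code */
        cb->code[cb->patches[i].branch_pos] = (int32_t)stub;
    }
}
```

In this model all throw stubs end up after the last body instruction,
which is the layout property the icache argument relies on.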
The exception throwing code branches to the dynamically
generated mono_arch_throw_corlib_exception helper function,
which creates the proper exception object, does some stack
manipulation, then calls throw_exception ().
* Exceptions thrown by runtime code
-----------------------------------
These exceptions are usually thrown by the implementations of
InternalCalls (icalls). First an appropriate exception object
is created with the help of various helper functions in
metadata/exception.c, which has a separate helper function for
allocating each kind of exception object used by the runtime
code. Then the mono_raise_exception () function is called to
actually throw the exception. That function never returns.
An example:
    if (something_is_wrong)
        mono_raise_exception (mono_get_exception_index_out_of_range ());
mono_raise_exception () simply passes the exception to the JIT
side through an API, where it will be received by the helper
created by mono_arch_throw_exception (). From now on, it is
treated as an exception thrown from managed code.
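The "never returns" contract can be illustrated with a small
self-contained model. It is a sketch only: it uses setjmp/longjmp as a
stand-in for the real transfer of control into the JIT side, and all
names (ToyException, toy_raise_exception, toy_call_icall) are
hypothetical rather than Mono APIs.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy model, not Mono code. */
typedef struct { const char *name; } ToyException;

static jmp_buf toy_catch_point;           /* stand-in for the JIT side */
static const ToyException *toy_current;   /* exception being thrown */

/* Like mono_raise_exception () in the text, this never returns to
 * its caller: control is transferred to the catch point instead. */
static void toy_raise_exception(const ToyException *exc)
{
    assert(exc != NULL);
    toy_current = exc;
    longjmp(toy_catch_point, 1);
}

/* A toy icall: validates its argument and raises on failure. */
static int toy_icall_get_element(const int *arr, size_t len, size_t idx)
{
    static const ToyException out_of_range = { "IndexOutOfRangeException" };
    if (idx >= len)
        toy_raise_exception(&out_of_range);
    return arr[idx];
}

/* Calls the icall. Returns NULL and stores the result on success,
 * or returns the name of the exception that was raised. */
static const char *toy_call_icall(const int *arr, size_t len, size_t idx,
                                  int *out)
{
    toy_current = NULL;
    if (setjmp(toy_catch_point) != 0)
        return toy_current->name;
    *out = toy_icall_get_element(arr, len, idx);
    return NULL;
}
```

Note that nothing between the raise and the catch point gets a chance
to run cleanup code, which is why such a raise is only safe close to
managed code.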
* Synchronous signals
---------------------
For performance reasons, the runtime does not do some checks
required by the CLI spec. Instead, it relies on the CPU to do
them. The two main checks which are omitted are null-pointer
checks, and arithmetic checks. When a null pointer is
dereferenced by JITted code, the CPU will notify the kernel
through an interrupt, and the kernel will send a SIGSEGV
signal to the process. The runtime installs a signal handler
for SIGSEGV, which is sigsegv_signal_handler () in mini.c. The
signal handler creates the appropriate exception object and
calls mono_handle_exception () with it. Arithmetic exceptions
like division by zero are handled similarly.
* Synchronous signals in native code
------------------------------------
Receiving a signal such as SIGSEGV while in native code means
something very bad has happened. Because of this, the runtime
will abort after trying to print a managed plus a native stack
trace. The logic is in the mono_handle_native_sigsegv ()
function.
Note that there are two kinds of native code which can be the
source of the signal:
- code inside the runtime
- code inside a native library loaded by an application, e.g. libgtk+
* Stack overflow checking
-------------------------
Stack overflow exceptions need special handling. When a thread
overflows its stack, the kernel sends it a normal SIGSEGV
signal, but the signal handler tries to execute on the same stack
as the thread, leading to a further SIGSEGV which will terminate
the thread. A solution is to use an alternative signal stack
supported by UNIX operating systems through the sigaltstack
(2) system call. When a thread starts up, the runtime will
install an altstack using the mono_setup_altstack () function
in mini-exceptions.c. When a SIGSEGV is received, the signal
handler checks whether the fault address is near the bottom
of the thread's normal stack. If it is, a
StackOverflowException is created instead of a
NullReferenceException. This exception is handled like any other
exception, with some minor differences.
sigaltstack is disabled by default for two reasons:
- The stack employed by sigaltstack () is not visible to the
GC, so it is possible that the GC will fail to scan it.
- Working sigaltstack support is very much os/kernel/libc
dependent.
* Asynchronous signals
----------------------
Async signals are used by the runtime to notify a thread that
it needs to change its state somehow. Currently, it is used
for implementing thread abort/suspend/resume.
Handling async signals correctly is a very hard problem,
since the receiving thread can be in basically any state upon
receipt of the signal. It can execute managed code, native
code, it can hold various managed/native locks, or it can be
in a process of acquiring them, it can be starting up,
shutting down etc. Most of the C APIs used by the runtime are
not async-signal safe, meaning it is not safe to call them
from an async signal handler. In particular, the pthread
locking functions are not async-safe, so if a signal handler
interrupted code which was in the process of acquiring a lock,
and the signal handler tries to acquire a lock, the thread
will deadlock. Unfortunately, the current signal handling
code does acquire locks, so sometimes it does deadlock.
When receiving an async signal, the signal handler first tries
to determine whether the thread was executing managed code
when it was interrupted. If it was, then it is safe to
interrupt it, so a ThreadAbortException is constructed and
thrown. If the thread was executing native code, then it is
generally not safe to interrupt it. In this case, the runtime
sets a flag then returns from the signal handler. That flag is
checked every time the runtime returns from native code to
managed code, and the exception is thrown then. Also, a
platform specific mechanism is used to cause the thread to
interrupt any blocking operation it might be doing.
The async signal handler is in sigusr1_signal_handler () in
mini.c, while the logic which determines whether an exception
is safe to be thrown is in mono_thread_request_interruption ().
* Stack unwinding during exception handling
-------------------------------------------
The execution state of a thread during exception handling is
stored in an arch-specific structure called MonoContext. This
structure contains the values of all the CPU registers
relevant during exception handling, which usually means:
- IP (instruction pointer)
- SP (stack pointer)
- FP (frame pointer)
- callee saved registers
Callee saved registers are the registers which are required by
any procedure to be saved/restored before/after using
them. They are usually defined by each platform's ABI
(Application Binary Interface). For example, on x86, they are
EBX, ESI and EDI.
The code which calls mono_handle_exception () is required to
construct the initial MonoContext. How this is done depends on
the caller. For exceptions thrown from managed code, the
mono_arch_throw_exception helper function saves the values of
the required registers and passes them to throw_exception (),
which will save them in the MonoContext structure. For
exceptions thrown from signal handlers, the MonoContext
structure is initialized from the signal info received from the
kernel.
During exception handling, the runtime needs to 'unwind' the
stack, i.e. given the state of the thread at a stack frame,
construct the state at its caller. Since this is platform
specific, it is done by a platform specific function called
mono_arch_find_jit_info ().
Two kinds of stack frames need handling:
- Managed frames are easier. The JIT will store some
information about each managed method, like which
callee-saved registers it uses. Based on this information,
mono_arch_find_jit_info () can find the values of the
registers on the thread stack, and restore them.
- Native frames are problematic, since we have no information
about how to unwind through them. Some compilers generate
unwind information for code, some don't. Also, there is no
general purpose library to obtain and decode this unwind
information. So the runtime uses a different solution. When
managed code needs to call into native code, it does so through
a managed->native wrapper function, which is generated by
the JIT. This function is responsible for saving the machine
state into a per-thread structure called MonoLMF (Last
Managed Frame). These LMF structures are stored on the
thread's stack, and are linked together using one of their
fields. When the unwinder encounters a native frame, it
simply pops one entry off the LMF 'stack', and uses it to
restore the frame state to the moment before control passed
to native code. In effect, all successive native frames are
skipped together.
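The LMF chain can be pictured as an intrusive linked list whose nodes
live in the stack frames of the wrappers. The sketch below is a toy
model: ToyContext holds only three registers, the list head is a plain
global instead of per-thread data, and none of the names or field
layouts are taken from the real MonoContext or MonoLMF definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not Mono code. */
typedef struct {
    uintptr_t ip;  /* instruction pointer */
    uintptr_t sp;  /* stack pointer */
    uintptr_t fp;  /* frame pointer */
} ToyContext;

typedef struct ToyLMF {
    struct ToyLMF *previous;  /* link to the next older entry */
    ToyContext saved;         /* managed state at the transition */
} ToyLMF;

static ToyLMF *toy_lmf_top;   /* per-thread in a real runtime */

/* Done by the managed->native wrapper before calling native code.
 * 'lmf' normally lives in the wrapper's own stack frame. */
static void toy_push_lmf(ToyLMF *lmf, ToyContext managed_state)
{
    assert(lmf != NULL);
    lmf->saved = managed_state;
    lmf->previous = toy_lmf_top;
    toy_lmf_top = lmf;
}

/* Done by the wrapper when the native call returns normally. */
static void toy_pop_lmf(void)
{
    assert(toy_lmf_top != NULL);
    toy_lmf_top = toy_lmf_top->previous;
}

/* Unwinder step for a native frame: pop one LMF entry and restore
 * the context saved in it, skipping every native frame in between.
 * Returns 1 on success, 0 if there is no entry left. */
static int toy_unwind_native(ToyContext *ctx)
{
    if (toy_lmf_top == NULL)
        return 0;
    *ctx = toy_lmf_top->saved;
    toy_lmf_top = toy_lmf_top->previous;
    return 1;
}
```

One call to toy_unwind_native replaces an unknown number of native
frames with a single known managed state, which is why no unwind
information for the native code is needed.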
* Problems/future work
----------------------
1. Async signal safety
----------------------
The current async signal handling code is not async safe, so
it can and does deadlock in practice. It needs to be rewritten
to avoid taking locks at least until it can determine that it
was interrupting managed code.
Another problem is the managed stack frame unwinding code. It
blindly assumes that if the IP points into a managed frame,
then all the callee saved registers + the stack pointer are
saved on the stack. This is not true if the thread was
interrupted while executing the method prolog/epilog.
2. Raising exceptions from native code
--------------------------------------
Currently, exceptions are raised by calling
mono_raise_exception () in the middle of runtime code. This
has two problems:
- No cleanup is done, i.e. if the caller of the function which
throws an exception has taken locks, or allocated memory,
that is not cleaned up. For this reason, it is only safe to
call mono_raise_exception () 'very close' to managed code,
i.e. in the icall functions themselves.
- To allow mono_raise_exception () to unwind through native
code, we need to save the LMF structures which can add a lot
of overhead even in the common case when no exception is
thrown. So this is not zero-cost exception handling.
An alternative might be to use a JNI style
set-pending-exception API. Runtime code could call
mono_set_pending_exception (), then return to its caller with
an error indication allowing the caller to clean up. When
execution returns to managed code, the managed->native
wrapper could check whether there is a pending exception and
throw it if necessary. Since we already check for pending
thread interruption, this would have no overhead, allowing us
to drop the LMF saving/restoring code, or significant parts of
it.
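A set-pending-exception scheme of this kind might look roughly as
follows. This is a sketch of the proposal only, not of any existing
runtime code: the names, the error-return convention (-1 on failure)
and the lock counter used to show cleanup are all assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not Mono code. */
typedef struct { const char *name; } ToyException;

static const ToyException *toy_pending;  /* per-thread in a real runtime */
static int toy_lock_depth;               /* stand-in for held locks */

/* Record the exception instead of unwinding. The first pending
 * exception wins; later ones are dropped in this model. */
static void toy_set_pending_exception(const ToyException *exc)
{
    assert(exc != NULL);
    if (toy_pending == NULL)
        toy_pending = exc;
}

/* Runtime function: reports failure through its return value. */
static int toy_native_lookup(const int *table, size_t len, size_t idx,
                             int *out)
{
    static const ToyException out_of_range = { "IndexOutOfRangeException" };
    if (idx >= len) {
        toy_set_pending_exception(&out_of_range);
        return -1;
    }
    *out = table[idx];
    return 0;
}

/* Caller that holds a "lock": because the callee returns normally,
 * the cleanup runs on the error path as well. */
static int toy_icall(const int *table, size_t len, size_t idx)
{
    int value = 0;
    toy_lock_depth++;
    int rc = toy_native_lookup(table, len, idx, &value);
    toy_lock_depth--;
    return rc == 0 ? value : 0;
}

/* Check made by the managed->native wrapper after the native call
 * returns. Returns the exception to throw, or NULL if none. */
static const ToyException *toy_wrapper_check_pending(void)
{
    const ToyException *exc = toy_pending;
    toy_pending = NULL;
    return exc;
}
```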
3. libunwind
------------
There is an OSS project called libunwind which is a standalone
stack unwinding library. It is currently in development, but
it is used by default by gcc on ia64 for its stack
unwinding. The mono runtime also uses it on ia64. It has
several advantages in relation to our current unwinding code:
- it has a platform independent API, i.e. the same unwinding
code can be used on multiple platforms.
- it can generate unwind tables which are correct at every
instruction, i.e. can be used for unwinding from async
signals.
- given sufficient unwind info generated by a C compiler, it
can unwind through C code.
- most of its API is async-safe
- it implements the gcc C++ exception handling API, so in
theory it can be used to implement mixed-language exception
handling (i.e. C++ exception caught in mono, mono exception
caught in C++).
- it is MIT licensed
The biggest problem with libunwind is its platform support. ia64 support is
complete/well tested, while support for other platforms is missing/incomplete.
http://www.hpl.hp.com/research/linux/libunwind/