Implement proper exception handling #44

TerrorJack · 2018-12-11T02:52:19Z

Status quo of exceptions: When a Haskell exception is thrown, the scheduler returns with an error code, which is later checked by rts_checkSchedStatus and re-thrown as a JavaScript exception. We don't really know what is thrown; and when rts_checkSchedStatus finds out the Haskell thread hasn't exited gracefully, there's no chance to get back to the crime scene and fix stuff. Besides Haskell exceptions, there are also errors signaled in the runtime itself, and the same problems apply as well.

We definitely need to improve the exception story. Types of exceptions I can come up with at the moment:

Asynchronous exceptions: there's really no need to consider async exceptions at the moment.
Synchronous exceptions:
- Explicit throw/throwIO calls in Haskell
- Heap/stack check failures
Runtime errors which aren't re-wrapped as a Haskell exception:
- Failure to grow linear memory and allocate blocks
- Encountered unimplemented stub rts interfaces
- JavaScript exceptions thrown from foreign import javascript code, or rts.js code
- Fatal errors signaled by barf() in rts cmm code

All possible exceptions can be roughly grouped into three kinds:

Can be caught and handled in Haskell.
Can't be handled in Haskell, but when signaled, can be handled by JavaScript so that Haskell execution carries on.
Fatal enough to either log & crash, or wipe the current runtime instance & restart.
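As a rough illustration of this grouping, the three kinds could be modeled as a small lookup table on the JavaScript side. This is only a sketch: `Kind`, `RtsEvent`, `kindOf` and `classify` are hypothetical names, and the bucket chosen for each event (in particular treating a failed memory grow as fatal) is an assumption rather than something settled in this issue.

```typescript
// Hypothetical names throughout; this is not the actual asterius runtime API.
type Kind = "catchable-in-haskell" | "recoverable-in-js" | "fatal";

type RtsEvent =
  | { tag: "HaskellThrow" }           // explicit throw/throwIO in Haskell
  | { tag: "HeapCheckFailed" }        // heap check failed, more blocks needed
  | { tag: "MemoryGrowFailed" }       // linear memory could not be grown
  | { tag: "Barf"; message: string }; // barf() in rts cmm code

// Assumed mapping from event to kind; adjust once real semantics are decided.
const kindOf: Record<RtsEvent["tag"], Kind> = {
  HaskellThrow: "catchable-in-haskell",
  HeapCheckFailed: "recoverable-in-js",
  MemoryGrowFailed: "fatal",
  Barf: "fatal",
};

function classify(ev: RtsEvent): Kind {
  return kindOf[ev.tag];
}
```

A scheduler could branch on `classify(ev)` to decide whether to unwind to a Haskell handler, fix things up and resume, or tear the instance down.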

The key to handling all three kinds of exceptions is improving the current one-shot scheduler. When an StgRun loop yields back to scheduleWaitThread, we need some (potentially async) JavaScript logic to check and fix stuff, then re-enter the loop and get back to Haskell execution.
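A minimal sketch of such a re-entrant loop, under stated assumptions: the `Machine` interface, `growNursery`, and the `maxResumes` guard are hypothetical, the return code names are only modeled loosely on the ones mentioned in this thread, and the fix-up step is kept synchronous for brevity even though the real logic may be async.

```typescript
// Hypothetical sketch; names mirror the issue text but this is not asterius code.
type RetCode = "HeapOverflow" | "ThreadYielding" | "ThreadFinished" | "ThreadKilled";

interface Machine {
  stgRun(): RetCode;   // stands in for one StgRun loop
  growNursery(): void; // stands in for the JavaScript-side fix-up
}

function scheduleWaitThread(m: Machine, maxResumes = 1000): RetCode {
  for (let i = 0; i < maxResumes; i++) {
    const ret = m.stgRun();
    if (ret === "HeapOverflow") {
      m.growNursery(); // fix the problem, then re-enter the loop
      continue;
    }
    if (ret === "ThreadYielding") {
      continue; // nothing to fix, just resume
    }
    return ret; // finished or killed: hand the result back to the caller
  }
  // Guard against a thread that never makes progress (an assumption of this sketch).
  throw new Error("scheduleWaitThread: resume limit exceeded");
}
```

The point of the design is that the loop, not the caller, decides whether a return code is recoverable, so simple cases never surface as JavaScript exceptions.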

A rough roadmap for this issue:

1. Improve scheduleWaitThread, so Haskell execution may be automatically resumed for simple cases like heap/stack check failures. The storage manager won't need to allocate a large fixed-size nursery/object pool; it can take advantage of the recently implemented fast block allocator and only request as many blocks as needed. Also, the createThread wrapper functions perform extra bookkeeping, so when the thread is created from a static closure and a fatal error is signaled, it's possible to simply re-initiate a new instance and restart execution from there. This one will be delivered this week.
2. Refactor the throw/catch primitives in PrimOps.cmm/Exception.cmm to fit the new scheduler, enabling exceptions to be caught and handled in Haskell.
3. Well, let's first see if 1 & 2 work as expected.

TerrorJack · 2018-12-13T13:19:45Z

Preliminary progress on 1:

Pulled quite a few weeds in the runtime (removing ~500 LOC) without breaking a single test.
We're gradually getting rid of a lot of internal data structures/interfaces of ghc's original runtime (e.g. Task/InCalls, run queues in Capability, etc). We're fine as long as we don't break the absolutely essential interfaces used in ghc's emitted cmm code and rts cmm code (e.g. allocate* functions, StgTSO structs, etc).

TerrorJack · 2018-12-14T02:47:04Z

As a result of implementing #44, almost all RTS APIs have different type signatures from the original versions:

Except for allocate/createThread, RTS APIs no longer take a Capability as a parameter.
The rts_eval* functions used to return nothing; now they return an i32 thread id.
The rts_getSchedStatus/rts_checkSchedStatus functions now take a thread id as a parameter; they no longer inspect a global data structure reused by the scheduler, only thread-local storage. This is critical to prevent thread state leakage (a likely cause of scheduler-related bugs in todomvc).
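To make the shape of this change concrete, here is a sketch of status tracking keyed by thread id instead of a shared global slot. Everything in it is an assumption for illustration: the camel-cased function names only echo rts_eval*/rts_getSchedStatus/rts_checkSchedStatus, the `SchedStatus` values are simplified, and a JavaScript `Map` stands in for whatever thread-local storage the runtime actually uses.

```typescript
// Hypothetical sketch of per-thread scheduler status; not the real RTS API.
type SchedStatus = "Running" | "Success" | "Killed";

const threadStatus = new Map<number, SchedStatus>();
let nextTid = 1;

// Analogue of rts_eval*: runs an action and returns a fresh thread id.
function rtsEval(action: () => void): number {
  const tid = nextTid++;
  threadStatus.set(tid, "Running");
  try {
    action();
    threadStatus.set(tid, "Success");
  } catch {
    threadStatus.set(tid, "Killed");
  }
  return tid;
}

// Analogue of rts_getSchedStatus: looks only at the given thread's own slot.
function rtsGetSchedStatus(tid: number): SchedStatus {
  const status = threadStatus.get(tid);
  if (status === undefined) throw new Error(`unknown thread id ${tid}`);
  return status;
}

// Analogue of rts_checkSchedStatus: throws unless that thread exited gracefully.
function rtsCheckSchedStatus(tid: number): void {
  const status = rtsGetSchedStatus(tid);
  if (status !== "Success") {
    throw new Error(`thread ${tid} exited with status ${status}`);
  }
}
```

Because each evaluation gets its own slot, a failed thread cannot overwrite the status that a later or earlier evaluation will read.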

Before next merge to master, we should also update the docs on rts api, along with the second blog post.

TerrorJack · 2018-12-14T07:43:20Z

For a more uniformly styled RTS API, we should avoid passing Capability in allocate and createThread* as well. However, quite a few rts cmm files call those functions. What should we do?

Patch rts cmm files to avoid passing the extra parameter
Implement a hack in the codegen so when encountering such an unsafe ccall, simply wipe the first parameter
Implement a layer of wrapper in Asterius.Builtins so the exported interfaces in js don't pass it, but the internal functions do

I'll try the last approach first.
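The wrapper-layer option could look roughly like the following. This is a sketch only, not code from Asterius.Builtins: `MAIN_CAPABILITY`, `allocateInternal`, the bump pointer, and the 8-byte word size are all assumptions made for the example.

```typescript
// Hypothetical sketch of the wrapper-layer option; names and sizes are illustrative.
const MAIN_CAPABILITY = 0; // placeholder for the single Capability
const WORD_BYTES = 8;      // assumed word size

let heapPointer = 0;

// Internal function: keeps the Capability parameter that rts cmm code passes.
function allocateInternal(cap: number, words: number): number {
  if (cap !== MAIN_CAPABILITY) throw new Error("unexpected Capability");
  const addr = heapPointer;
  heapPointer += words * WORD_BYTES;
  return addr;
}

// Exported wrapper: JavaScript callers never see the Capability.
function allocate(words: number): number {
  return allocateInternal(MAIN_CAPABILITY, words);
}
```

With this split, the rts cmm call sites stay untouched while the exported surface matches the rest of the simplified API.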

TerrorJack · 2018-12-14T09:12:46Z

Simplified runtime & RTS API is merged to master via 497a7e5. Rest of today's work:

Handle stg_gc*/stg_raisezh runtime primitives, so when a gc occurs or a Haskell exception is thrown, it's possible to get detailed error information, instead of a ThreadKilled status code with no attached context.

TerrorJack · 2018-12-16T05:35:16Z

Handling stack overflow requires dealing with chunked stacks and underflow frames, which adds extra difficulty to this issue. For now we'll just allocate a large stack upon createThread and panic on stack overflow (hoping it doesn't happen). Heap overflow is simpler to deal with.

TerrorJack · 2018-12-16T09:13:57Z

Finally! We no longer need to allocate a huge heap upon startup; both the nursery and the object pool can grow on demand, and the scheduler gracefully handles the HeapOverflow return code. I'd say we're also not far from a real gc, but I'm leaving that to a future issue.
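For readers unfamiliar with the idea, growing a nursery on demand can be sketched as a bump allocator that requests a fresh block only when the current one runs out. The `Nursery` class, `BLOCK_WORDS`, and `allocateWithGrowth` below are hypothetical and much simpler than the real block allocator (for example, large objects are not handled).

```typescript
// Hypothetical sketch of an on-demand nursery; sizes and names are illustrative.
const BLOCK_WORDS = 512;

class Nursery {
  blocks = 0;    // number of blocks requested so far
  freeWords = 0; // words still free in the current block

  // Bump-allocate; returning false plays the role of a HeapOverflow signal.
  tryAllocate(words: number): boolean {
    if (words > this.freeWords) return false;
    this.freeWords -= words;
    return true;
  }

  // Request one more block instead of reserving a huge heap up front.
  grow(): void {
    this.blocks++;
    this.freeWords = BLOCK_WORDS;
  }
}

function allocateWithGrowth(nursery: Nursery, words: number): void {
  if (words > BLOCK_WORDS) {
    throw new Error("large objects are not handled in this sketch");
  }
  if (!nursery.tryAllocate(words)) {
    nursery.grow(); // the fix-up a scheduler would do on HeapOverflow
    if (!nursery.tryAllocate(words)) {
      throw new Error("allocation failed after growing");
    }
  }
}
```

Starting from an empty nursery, memory use then tracks what the program actually allocates rather than a fixed startup size.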

…ocate* invocations #44 (+5 squashed commit) Squashed commit: [927e97b] no message [4224024] no message [a68642c] Handle ThreadYielding ret code for stg_yieldzh #44 [17012c7] Cleanup redundant constants in ghc-toolkit [d5d32cc] Add no-op threadPaused #44

TerrorJack · 2019-09-09T16:59:03Z

It's been a long time. We have gc and exceptions now.

TerrorJack added a commit that referenced this issue Dec 13, 2018

Pulling weeds in runtime in preparation for #44 (+9 squashed commit)

ac7d488

TerrorJack added a commit that referenced this issue Dec 14, 2018

Eliminate reused Task/InCall structs across rts_eval* invocations #44

6642b82

TerrorJack added a commit that referenced this issue Dec 14, 2018

Eliminate Capability parameter in all exported RTS API #44

ddfcf70

TerrorJack added a commit that referenced this issue Dec 14, 2018

Simplify RTS API & runtime implementation, support thread-local stora…

497a7e5

…ge & prep for #44 (+8 squashed commit)

TerrorJack added a commit that referenced this issue Dec 14, 2018

Remove previous maskings on EagerBlackholeInfo/GCEnter1/GCFun #44

3567b9d

TerrorJack added a commit that referenced this issue Dec 15, 2018

Add no-op threadPaused #44

d5d32cc

TerrorJack added a commit that referenced this issue Dec 16, 2018

Handle ThreadYielding ret code for stg_yieldzh #44

a68642c

TerrorJack mentioned this issue Dec 16, 2018

Fix Cmm narrowing operators for Int8#/Word8# #47

Closed

TerrorJack added a commit that referenced this issue Dec 16, 2018

Implement fully growable Haskell heap for both regular closures & all…

38d341c

…ocate* invocations #44

TerrorJack added a commit that referenced this issue Dec 17, 2018

Fix vault get/set in an asterius instance #44

e745d93

TerrorJack closed this as completed Sep 9, 2019
