This repository has been archived by the owner on Nov 24, 2022. It is now read-only.

Implement proper exception handling #44

Closed
TerrorJack opened this issue Dec 11, 2018 · 7 comments

Comments

@TerrorJack
Member

Status quo of exceptions: When a Haskell exception is thrown, the scheduler returns with an error code, which is later checked by rts_checkSchedStatus and re-thrown as a JavaScript exception. By then we don't really know what was thrown, and once rts_checkSchedStatus finds out the Haskell thread hasn't exited gracefully, there's no chance to get back to the crime scene and fix stuff. Besides Haskell exceptions, there are also errors signaled in the runtime itself, and the same problems apply to them.

We definitely need to improve the exception story. Types of exceptions I can come up with at the moment:

  • Asynchronous exceptions: there's really no need to consider async exceptions at the moment.
  • Synchronous exceptions:
    • Explicit throw/throwIO calls in Haskell
    • Heap/stack check failures
  • Runtime errors which aren't re-wrapped as a Haskell exception:
    • Failure to grow linear memory and allocate blocks
    • Calls to unimplemented stub rts interfaces
    • JavaScript exceptions thrown from foreign import javascript code, or rts.js code
    • Fatal errors signaled by barf() in rts cmm code

All possible exceptions can be roughly grouped into three kinds:

  • Can be caught and handled in Haskell.
  • Can't be handled in Haskell, but when signaled, can be handled by JavaScript, after which Haskell execution carries on.
  • Fatal enough to either log & crash, or wipe the current runtime instance & restart.
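
A minimal TypeScript sketch of this three-way grouping. All type and function names here are hypothetical, and which bucket each error source lands in is a guess made for illustration, not something the runtime defines:

```typescript
// Hypothetical names; none of these types exist in the Asterius runtime.
type Severity = "haskell-catchable" | "js-recoverable" | "fatal";

// One variant per error source listed above.
type RtsEvent =
  | { tag: "HaskellThrow" } // explicit throw/throwIO
  | { tag: "HeapCheckFail" }
  | { tag: "StackCheckFail" }
  | { tag: "MemoryGrowFail" }
  | { tag: "UnimplementedStub"; name: string }
  | { tag: "JSException"; error: unknown }
  | { tag: "Barf"; message: string };

// ASSUMPTION: this mapping is one plausible assignment, chosen for the sketch.
function classify(ev: RtsEvent): Severity {
  switch (ev.tag) {
    case "HaskellThrow":
      return "haskell-catchable";
    case "HeapCheckFail":
    case "StackCheckFail":
      return "js-recoverable";
    case "MemoryGrowFail":
    case "UnimplementedStub":
    case "JSException":
    case "Barf":
      return "fatal";
  }
}
```

A scheduler could dispatch on the result: hand the first kind to the Haskell handler, fix up and resume for the second, and log & crash (or restart) for the third.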

The key to handling all three kinds of exceptions is improving the current one-shot scheduler. When an StgRun loop yields back to scheduleWaitThread, we need some (potentially async) JavaScript logic to check and fix stuff, then re-enter the loop and get back to Haskell execution.
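
A rough sketch of what a re-entrant scheduler loop could look like. Only the name scheduleWaitThread comes from this issue; the return codes, the hooks interface and the resume limit are assumptions, and the fix-up step is kept synchronous for brevity even though the real one may be async:

```typescript
// Hypothetical return codes; the real scheduler's codes may differ.
type RetCode = "Finished" | "HeapOverflow" | "Killed";

// Hypothetical hooks standing in for the wasm side and the JS fix-up logic.
interface SchedulerHooks {
  run(tid: number): RetCode; // one StgRun loop, until the thread yields back
  fixup(tid: number): void; // "check and fix stuff", e.g. grow the heap
}

// Re-enter the run loop after each recoverable yield instead of bailing out
// on the first non-success code, as a one-shot scheduler would.
function scheduleWaitThread(
  tid: number,
  hooks: SchedulerHooks,
  maxResumes: number = 1000
): RetCode {
  for (let i = 0; i < maxResumes; i++) {
    const ret = hooks.run(tid);
    if (ret !== "HeapOverflow") {
      return ret; // finished or unrecoverable: let the caller decide
    }
    hooks.fixup(tid); // recoverable: fix the cause, then resume
  }
  throw new Error(`thread ${tid} exceeded ${maxResumes} resumes`);
}
```

The resume limit is only there so a fix-up that never helps cannot spin forever.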

A rough roadmap for this issue:

  1. Improve scheduleWaitThread, so Haskell execution may be automatically resumed for simple cases like heap/stack check failures. The storage manager won't need to allocate a large fixed-size nursery/object pool; it can take advantage of the recently implemented fast block allocator and only request as many blocks as needed. Also, the createThread wrapper functions perform extra bookkeeping, so when the thread is created from a static closure and a fatal error is signaled, it's possible to simply re-initiate a new instance and restart execution from there. This one will be delivered this week.

  2. Refactor the throw/catch primitives in PrimOps.cmm/Exception.cmm to fit the new scheduler, enabling exceptions to be caught and handled in Haskell.

  3. Well, let's first see if 1 & 2 work as expected.

@TerrorJack
Member Author

Preliminary progress on 1:

  • Pulled quite a few weeds in the runtime (removing ~500 loc) without breaking a single test.
  • We're gradually getting rid of a lot of the internal data structures/interfaces of ghc's original runtime (e.g. Task/InCalls, run queues in Capability, etc). We're fine as long as we don't break the absolutely essential interfaces used in ghc's emitted cmm code and rts cmm code (e.g. allocate* functions, StgTSO structs, etc).

@TerrorJack
Member Author

As a result of implementing #44, almost all RTS APIs have different type signatures from the original versions:

  • Except for allocate/createThread, RTS APIs no longer take a Capability as a parameter
  • The rts_eval* functions used to return nothing; now they return an i32 thread id
  • The rts_getSchedStatus/rts_checkSchedStatus functions now take a thread id as a parameter; they no longer inspect a global data structure reused by the scheduler, only thread-local storage. This is critical to prevent thread state leakage (a likely cause of scheduler-related bugs in todomvc)
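
To illustrate why per-thread status avoids state leakage, here is a small sketch. The method names echo the ones above, but the class, the status values and the use of a plain JS thunk in place of a Haskell closure are all assumptions:

```typescript
// Hypothetical status values; not the runtime's actual codes.
type SchedStatus = "Success" | "Killed";

class Runtime {
  private nextTid = 1;
  // One status slot per thread id, instead of a single global slot that
  // every evaluation overwrites.
  private status = new Map<number, SchedStatus>();

  // Stand-in for the rts_eval* family: run the thunk, return its thread id.
  rts_eval(closure: () => void): number {
    const tid = this.nextTid++;
    try {
      closure();
      this.status.set(tid, "Success");
    } catch {
      this.status.set(tid, "Killed");
    }
    return tid;
  }

  rts_getSchedStatus(tid: number): SchedStatus {
    const s = this.status.get(tid);
    if (s === undefined) {
      throw new Error(`unknown thread id ${tid}`);
    }
    return s;
  }

  // Throws if the given thread did not exit gracefully.
  rts_checkSchedStatus(tid: number): void {
    const s = this.rts_getSchedStatus(tid);
    if (s !== "Success") {
      throw new Error(`thread ${tid} exited with status ${s}`);
    }
  }
}
```

With a single global slot, a later failing evaluation would overwrite the status of an earlier successful one; keyed by thread id, each result stays attached to the thread that produced it.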

Before the next merge to master, we should also update the docs on the rts api, along with the second blog post.

@TerrorJack
Member Author

For a more uniformly styled RTS API, we should avoid passing Capability in allocate and createThread* as well. However, quite a few rts cmm files call those functions. What should we do?

  • Patch rts cmm files to avoid passing the extra parameter
  • Implement a hack in the codegen so when encountering such an unsafe ccall, simply wipe the first parameter
  • Implement a layer of wrapper in Asterius.Builtins so the exported interfaces in js don't pass it, but the internal functions do

I'll try the last approach first.
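
The wrapper idea, sketched in TypeScript rather than in Asterius.Builtins. Everything here is hypothetical (the capability address, the bump allocator, the helper name); it only shows the shape: internal functions keep the Capability-first signature that cmm callers expect, while the exported ones drop it:

```typescript
// ASSUMPTION: a fixed address standing in for the single capability.
const MAIN_CAPABILITY = 0x10;
const WORD_SIZE = 8;

let heapPointer = 0;

// Internal version keeps the original signature: capability first.
function allocateInternal(cap: number, words: number): number {
  if (cap !== MAIN_CAPABILITY) {
    throw new Error(`unexpected capability ${cap}`);
  }
  const address = heapPointer;
  heapPointer += words * WORD_SIZE; // toy bump allocation
  return address;
}

// Generic wrapper: pin the first argument so exported callers never pass it.
function withCapability<A extends unknown[], R>(
  f: (cap: number, ...args: A) => R
): (...args: A) => R {
  return (...args: A) => f(MAIN_CAPABILITY, ...args);
}

// Exported interface: no Capability parameter.
const allocate = withCapability(allocateInternal);
```

Calling allocate(2) then forwards to allocateInternal(MAIN_CAPABILITY, 2), so cmm code that still passes the capability and JS code that doesn't can share one implementation.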

TerrorJack added a commit that referenced this issue Dec 14, 2018
@TerrorJack
Member Author

Simplified runtime & RTS API is merged to master via 497a7e5. Rest of today's work:

  • Handle stg_gc*/stg_raisezh runtime primitives, so when a gc occurs or a Haskell exception is thrown, it's possible to get detailed error information, instead of a ThreadKilled status code with no attached context.

@TerrorJack
Member Author

Handling stack overflow requires dealing with chunked stacks and underflow frames, adding extra difficulty for this issue. We'll just allocate a large stack upon createThread and panic on stack overflow (hoping it doesn't happen). Heap overflow is simpler to deal with.

@TerrorJack
Member Author

Finally! We no longer need to allocate a huge heap upon startup; both the nursery and the object pool can grow on demand, and the scheduler gracefully handles the HeapOverflow ret code. I'd say we're also not far from a real gc, but I'm leaving that to a future issue.
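
Growing on demand could look roughly like the sketch below. The Nursery class, its method names and the 4096-byte block size are assumptions made for the example, not the actual storage manager:

```typescript
// ASSUMPTION: block size chosen for illustration.
const BLOCK_SIZE = 4096;

class Nursery {
  private used = 0;
  private blocks: number;

  constructor(initialBlocks: number) {
    this.blocks = initialBlocks;
  }

  get capacity(): number {
    return this.blocks * BLOCK_SIZE;
  }

  // Returns the offset of the new allocation, or null to signal overflow.
  tryAllocate(bytes: number): number | null {
    if (this.used + bytes > this.capacity) {
      return null;
    }
    const offset = this.used;
    this.used += bytes;
    return offset;
  }

  // Request only as many extra blocks as the pending allocation needs.
  growFor(bytes: number): number {
    const missing = this.used + bytes - this.capacity;
    const extra = Math.max(0, Math.ceil(missing / BLOCK_SIZE));
    this.blocks += extra;
    return extra;
  }
}

// On overflow, grow just enough and retry once.
function allocateBytes(nursery: Nursery, bytes: number): number {
  let offset = nursery.tryAllocate(bytes);
  if (offset === null) {
    nursery.growFor(bytes);
    offset = nursery.tryAllocate(bytes);
  }
  if (offset === null) {
    throw new Error("allocation failed after growing");
  }
  return offset;
}
```

Starting from a single block keeps startup cheap; the nursery only reaches a large size if the program actually allocates that much.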

TerrorJack added a commit that referenced this issue Dec 16, 2018
…ocate* invocations #44 (+5 squashed commit)

Squashed commit:

[927e97b] no message

[4224024] no message

[a68642c] Handle ThreadYielding ret code for stg_yieldzh #44

[17012c7] Cleanup redundant constants in ghc-toolkit

[d5d32cc] Add no-op threadPaused #44
@TerrorJack
Member Author

It's been a long time. We have gc and exceptions now.
