Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement bail: %meme #182

Closed
wants to merge 10 commits into from
Closed

Implement bail: %meme #182

wants to merge 10 commits into from

Conversation

ashelkovnykov
Copy link
Collaborator

@ashelkovnykov ashelkovnykov commented Dec 14, 2023

Resolves #172

  • Adds OOM checks on NockStack allocations
  • Implements a mote system

Current issues:

  • Allocation issues; see comment 1 and comment 2 below
  • Need to redo some name changes in arguments to interpret which were made as a result of the way in which the hw_exception::catch closure can use local arguments (it can't use any &mut due to extending UnwindSafe)

rust/ares/src/serf.rs Show resolved Hide resolved
@@ -389,537 +390,568 @@ pub fn interpret(context: &mut Context, mut subject: Noun, formula: Noun) -> Res
//
// (See https://docs.rs/assert_no_alloc/latest/assert_no_alloc/#advanced-use)
let nock = assert_no_alloc(|| unsafe {
push_formula(&mut context.stack, formula, true)?;
hw_exception::catch(|| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it safe to nest these invocations? What happens if a SIGSEGV results inside +mink or a parser jet?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe nesting invocations is safe. It doesn't explicitly say so in the docs, but my understanding is that catch closures form a stack of setjmp buffers, and throw will longjmp to the one on the top of the stack. I've tested this experimentally and my results line up with this interpretation.

A SIGSEGV in +mink or one of the parser jets will be caught by the top-most catch closure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "topmost" you mean "most inner" so the most recently-called +mink would be the one to receive the error, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; "top-most catch" == "inner-most +mink"

unwrap_or is eagerly evaluated; we want unwrap_or_else which is lazily
evaluated.
@ashelkovnykov
Copy link
Collaborator Author

Note that we now have two different crates registering signal hooks. Both SIGINT and SIGSEGV are being handled correctly, but it's something to keep in mind.

@ashelkovnykov
Copy link
Collaborator Author

@eamsden throw appears to allocate internally, but I would assume it does so in a way that's acceptable to us, since it uses longjmp and no unwinding occurs. Might this be a rare case where we add a comment, wrap in permit_alloc, and carry on?

@ashelkovnykov
Copy link
Collaborator Author

@eamsden On the back of the previous comment, I'm going to play around with using hw_exception for SIGINT as well, as discussed in #152

@eamsden
Copy link
Collaborator

eamsden commented Dec 17, 2023

@ashelkovnykov the point of #152 is not to catch SIGINT with hw_exception, as SIGINT is asynchronous. Rather it is to turn SIGINT into SIGSEGV by mprotecting the entire NockStack, and then using hw_exception to catch the resulting SIGSEGV.

@eamsden
Copy link
Collaborator

eamsden commented Dec 17, 2023

@ashelkovnykov is this PR ready to merge? Or are you planning to also address #152 in the same PR?

@ashelkovnykov ashelkovnykov marked this pull request as draft December 17, 2023 22:50
@ashelkovnykov
Copy link
Collaborator Author

ashelkovnykov commented Dec 17, 2023

@eamsden I want to quickly see if I can improve the PR a little and unify how we handle interrupts + OOM

@eamsden
Copy link
Collaborator

eamsden commented Dec 18, 2023

@ashelkovnykov as discussed can you mark this ready-for-review and work on #152 in a separate PR?

@ashelkovnykov ashelkovnykov marked this pull request as ready for review December 19, 2023 03:49
@ashelkovnykov
Copy link
Collaborator Author

ashelkovnykov commented Dec 19, 2023

The nested hw_exception::catch statements are creating an issue due to hw_exception::run_hooks allocating internally (this function gets called when an exception is thrown to find the signal hook handler we register in serf.rs). Making this assert_no_alloc error go away requires nesting permit_alloc calls around each hw_exception::catch, which is miserable.

But it also doesn't fully resolve the issue: the longjmp inside of throw is messing up the internal state management of assert_no_alloc, which uses thread-local counters to count nested assert_no_alloc calls. Therefore, when you longjmp out, the code assumes that you're still inside of assert_no_alloc and triggers an error on Newt allocations.

To work around this, @eamsden is hacking together a Frankenstein's monster crate from both hw_exception and assert_no_alloc. While vendoring two crates into one isn't ideal, they're luckily both very small and assert_no_alloc in particular is basically frozen.

@eamsden
Copy link
Collaborator

eamsden commented Dec 19, 2023

To work around this, @eamsden is hacking together a Frankenstein's monster crate from both hw_exception and assert_no_alloc. While vendoring two crates into one isn't ideal, they're luckily both very small and assert_no_alloc in particular is basically frozen.

Thinking further this isn't necessary. If we make hw_exception not allocate, and nest catch inside every assert_no_alloc (we really will only have 2, one in interpret and one in cg_interpret when codegen merges) then no longjmp from hw_exception::catch should ever jump outside of an assert_no_alloc call.

@eamsden
Copy link
Collaborator

eamsden commented Dec 19, 2023

looking even further: hw_exception is overengineered for purpose and unavoidably allocates when catch-ing an exception. Not sure what to do from here.

@eamsden
Copy link
Collaborator

eamsden commented Dec 19, 2023

I've come to the conclusion that we need a custom crate that provides unified functionality of both assert_no_alloc and hw_exception, but with the latter specialized to SIGSEGV handling. In particular:

  • assert-no-alloc uses thread-local counters to track nested assert_no_alloc and permit_alloc invocations. These of course are not reset by hw_exception.
  • hw_exception allocates information about the signal on the heap. This of course is the only reasonable place to put it, since information in the signal handler is pushed onto the (C/Rust) stack, which will be reset by the longjmp call. However this trips assert_no_alloc. This cannot be wrapped in a permit_alloc call from outside the library, since a SIGSEGV could be caught anywhere inside a catch closure.

So we really need these two implementations to be aware of each other. @ashelkovnykov I think work on this either ought to be suspended for now, or directed toward a hw_exceptions_no_alloc crate which provides a catch_assert_no_alloc function, and saves the value of the thread locals in the stack frame of catch prior to the setjmp, restoring them after the longjmp. This would be a pretty tricky lift, but the payoff would be something we could also use to remove TERMINATOR checks in both interpreters.

@ashelkovnykov
Copy link
Collaborator Author

@eamsden That sounds right - and luckily there's plenty of prior art

@matthew-levan matthew-levan linked an issue Jan 16, 2024 that may be closed by this pull request
@ashelkovnykov
Copy link
Collaborator Author

Archived this PR branch here; going to submit a new PR that just adds motes. Guard page work will now be done in collaboration w/ @matthew-levan in #202.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mprotect() SIGINT OOM / bail:meme check in NockStack allocation routines
2 participants