Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSan should handle Effect.Unhandled correctly #12593

Merged
merged 1 commit into from
Sep 26, 2023

Conversation

fabbing
Copy link
Contributor

@fabbing fabbing commented Sep 21, 2023

TSan wasn't handling correctly the case where an effect was performed without a matching effect handler, as noted in #12535 (comment).

There were two possible behaviours:

  • If an effect was performed from a computation, the last caml_reperform would reconnect the current_stack to its parent stack (the main fiber) too early, causing caml_tsan_entry_on_resume to call __tsan_func_entry on too many functions.
    The proposed fix introduces a struct stack_info const* limit argument to caml_tsan_entry_on_resume, which is set to the previous fiber by caml_perform, preventing the unnecessary calls to __tsan_func_entry.
  • If an effect was performed from the main fiber, caml_tsan_entry_on_resume was called unconditionally, although caml_tsan_exit_on_perform was bypassed because the fiber has no parent stack.
    In the proposed fix, the limit set to Caml_state(current_stack) by caml_perform prevents again the unnecessary calls to __tsan_func_entry.

When a continuation is resumed, the limit for caml_tsan_entry_on_resume is set to NULL meaning that we expect to unwind the stack of all the fibers from the continuation.

We have also added a test to the TSan testsuite, to validate that the backtrace given by TSan is correct in the 2 cases described above.

@fabbing fabbing force-pushed the tsan_effect_unhandled branch 2 times, most recently from 14e1d96 to 19b34b7 Compare September 21, 2023 16:04
@gasche
Copy link
Member

gasche commented Sep 21, 2023

I don't understand the details, but the code does not look very nice: 100% of the additions to the C code are new weird corner cases. Is there a simpler way to do this?

If an effect was performed from the main fiber, caml_tsan_entry_on_resume was called unconditionally, although caml_tsan_exit_on_perform was bypassed because the fiber has no parent stack.

What if we did not bypass caml_tsan_exit_on_perform, by calling it before the parent is NULL? check? Could you implement a simpler fix in this scenario?

@gasche gasche added the submitter-action-needed This PR is waiting for an action of its submitter. label Sep 26, 2023
@OlivierNicole
Copy link
Contributor

We were dubious initially, but it turns out that yes, we found a smaller fix by doing caml_tsan_exit_on_perform prior to testing for the parent fiber.

Note: for this to work, we modify caml_exit_on_perform to perform one less TSan-exit, which we compensate by an explicit exit where needed from the assembly.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I indeed agree that the new version is nicer. I don't 100% follow the details of how the show stack winding/unwinding work in the new model vs. the older one, but I trust the combination of you and the testsuite to get it correct. I reviewed that this PR does not break anything outside TSan mode. Approved.

runtime/amd64.S Outdated
@@ -735,6 +735,7 @@ LBL(caml_c_call):
/* Save non-callee-saved registers %rax and %xmm0 before C call */
pushq %rax; CFI_ADJUST(8);
subq $16, %rsp; CFI_ADJUST(16);
/* Save %xmm0 before C call */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line looks a bit redundant with the comment just above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest version (I also squashed the history)

Co-authored-by: Olivier Nicole <olivier@chnik.fr>
@gasche gasche merged commit 2ec9ac7 into ocaml:trunk Sep 26, 2023
9 checks passed
gasche added a commit that referenced this pull request Sep 26, 2023
@OlivierNicole OlivierNicole deleted the tsan_effect_unhandled branch September 26, 2023 20:24
sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
submitter-action-needed This PR is waiting for an action of its submitter.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants