Introduce run-time checks that the domain lock is held #11506

gadmm · 2022-08-23T19:46:12Z

Following #11485, this PR makes it simpler to find out why the user's program crashes when the domain lock is incorrectly acquired from C code.

- Add an assertion that Caml_state is not NULL (in debug mode)
- Add a check in public entry points of the C API (always)

For instance, when calling CAMLparam() without the domain lock, instead of a segfault, one gets a fatal error with the message "f: no domain lock held", where f is the name of the function.
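The behaviour described above can be modelled outside the runtime. The following is a minimal sketch, not the runtime's actual code: `demo_domain_state`, `demo_state_opt`, `demo_lock_error` and `demo_check_lock` are hypothetical names, and the "Fatal error:" prefix is only illustrative. It assumes a per-thread pointer that is NULL when the thread does not hold the domain lock.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the domain state (not the runtime's type). */
typedef struct { int unused; } demo_domain_state;

/* Hypothetical per-thread pointer: NULL means "domain lock not held". */
static _Thread_local demo_domain_state *demo_state_opt = NULL;

/* Returns 1 and writes "<fn>: no domain lock held" into buf when the
   lock is not held; returns 0 and leaves buf empty otherwise. */
int demo_lock_error(const char *fn, char *buf, size_t len)
{
  if (len > 0) buf[0] = '\0';
  if (demo_state_opt != NULL) return 0;
  snprintf(buf, len, "%s: no domain lock held", fn);
  return 1;
}

/* What a checked entry point would do: report the error and abort
   instead of dereferencing a NULL pointer later on. */
void demo_check_lock(const char *fn)
{
  char msg[128];
  if (demo_lock_error(fn, msg, sizeof msg)) {
    fprintf(stderr, "Fatal error: %s\n", msg);
    abort();
  }
}
```

A checked entry point would call `demo_check_lock("name_of_entry_point")` first, so a misuse stops with a named diagnostic rather than a segfault further down.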

cc @gasche

gadmm · 2022-08-23T19:47:57Z

(The changes to the Changes.md file assume that it is merged in 5.0; indeed this changes the API: Caml_state is no longer synonymous with the new Caml_state_opt.)

dra27 · 2022-08-24T09:01:45Z

I'm not (yet) sold on the idea that correct code should be penalised with a mandatory check. Assuming that the segfault is reproducible, isn't it enough then to be able to re-run with the debug runtime?

gadmm · 2022-08-24T14:39:12Z

What is the reasoning for thinking that this is going to have a cost? If you take one of the perf-sensitive functions (e.g. caml_modify and caml_initialize) and look at the generated code, you'll see that the load of Caml_state gets CSE'd and you are only left with a correctly-predicted branch.
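To make the CSE argument concrete, here is a minimal sketch with hypothetical names (`DEMO_LIKELY`, `DEMO_CHECK_STATE`, `demo_state_opt`, `demo_bump`); it is not the runtime's code. The check and the function body both read the same per-thread pointer, with nothing in between on the fast path that could change it, so an optimizing compiler can typically reuse a single load and leave only a branch hinted as likely. Whether this happens in practice depends on the compiler and flags, and is best confirmed by looking at the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Branch hint, assuming a GCC-compatible compiler; a no-op elsewhere. */
#if defined(__GNUC__)
#define DEMO_LIKELY(e) __builtin_expect(!!(e), 1)
#else
#define DEMO_LIKELY(e) (e)
#endif

/* Hypothetical stand-in for the domain state. */
typedef struct { long counter; } demo_state;
static _Thread_local demo_state *demo_state_opt = NULL;

/* Cold path, only taken when the lock is not held. */
static void demo_bad_state(void) { abort(); }

#define DEMO_CHECK_STATE() \
  (DEMO_LIKELY(demo_state_opt != NULL) ? (void)0 : demo_bad_state())

long demo_bump(void)
{
  DEMO_CHECK_STATE();                /* first source-level read */
  return ++demo_state_opt->counter;  /* second read of the same pointer */
}
```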

I do not think that recompiling their app with OCaml's debug mode is what users will think about for locating such bugs (and it is clearly not made for debugging users' programming errors).

gadmm · 2022-08-29T16:43:26Z

There is a sandmark benchmark running. I added a commit that removes many of the checks via CAMLparam inside "internal" code. The sandmark micro-benchmarks are very sensitive to code layout changes, so this change will help better interpret the benchmark results.

gasche

On re-reading (I think your new documentation helped, thanks!) I understand the idea better now, and I think it makes sense. Let's also wait for those sandmark results.

Changes

manual/src/cmds/intf-c.etex

gadmm · 2022-08-30T19:06:13Z

Re. benchmarks. Here is a result for the original PR:

The shape suggests that there could be a slight slowdown, yet by running a few individual benchmarks on my machine, the microarchitectural effects (relevant or irrelevant) seemed dominant (e.g. some would speed up on my machine instead of slowing down, etc). Hence I wanted to see the result of the new commit "Do not check Caml_state inside CAMLparam* when internal" below. The new commit reduces the number of check points from ~120 to ~40 in the OCaml runtime.

This shuffles things around, but the overall shape is similar; note, however, the change of scale. This is consistent with microarchitectural effects being mostly at play. For instance, I could reproduce the slowdown of quicksort by chance. This benchmark is heavy with caml_modify, so you would think this is the likely culprit. However the slowdown did not go away when removing the check inside caml_modify... but it did so when removing a check never called by the benchmark. I could not observe a cost of having a check inside caml_modify.

We are not going to get definitive answers with the Sandmark benchmarks. The suite contains many micro-benchmarks, and those are usually very difficult to exploit and interpret. As it stands, the suite is not well-suited to evaluating performance impacts at this level.

(In addition, it is good to have in mind that they disable ASLR for reproducibility, as I have just learned, see https://github.com/ocaml-bench/notes/blob/master/apr19.md. This means one always observes the same "random" point in the space of possible code layouts. So when the results are identical from one day to another you cannot conclude that the actual variability between two runs is low. It would be better to make the code layout more random (à la Berger's Stabilizer) rather than reduce randomness.)

Nevertheless this is not incompatible with this PR having a tiny overhead (much smaller than the variations seen on the graphs above, explaining that it is slightly skewed to the right).

Since we cannot count on Sandmark, I propose a more analytical approach: the checks are placed either where the load of Caml_state_opt is redundant (eliminated by CSE), or where the function is obviously not performance-sensitive. This means it is good to keep the check inside CAMLparam, but not inside CAMLdrop.

As for caml_modify, I have mixed feelings. There is CSE, but this still anticipates a load which might not always occur otherwise, even if the CPU can parallelize this load with the rest of the function. On the other hand, if this difference mattered for caml_modify then there are other optimizations which could be done to it (e.g. some redundant loads that the compiler is not eliminating currently). I do not want to spend more time with benchmarks so I propose to err on the cautious side.

gadmm · 2022-08-31T05:33:47Z

I am happy with this version if you are. With the last commits it "looks" better (with no statistical meaning whatsoever):

gasche · 2022-08-31T15:22:46Z

I would like someone with more runtime expertise than myself to give the greenlight to the extra checks. @stedolan, @xavierleroy, @damiendoligez?

gadmm · 2022-08-31T19:39:52Z

Ok. To sum up:

- Checks are placed in such a way that no extra load happens in anything remotely perf-sensitive (confirmed via disassembler).
- This is best reviewed by looking at the global diff of the PR (for once). I'll clean up the history later.

damiendoligez

The documentation needs to be reworded, but the implementation looks good, and I'm OK with the (lack of) measurable overhead.

damiendoligez · 2022-09-07T14:44:28Z

manual/src/cmds/intf-c.etex

+The domain state variable "Caml_state" checks that the domain lock is
+held, either in debug mode or at key entry points of the C API.


This reads wrong: a variable doesn't do anything. In any case, Caml_state only checks in debug mode, and you have another function to check in non-debug mode.

gadmm · 2022-09-07T15:47:04Z

Thanks, I have improved the wording, rebased, and cleaned up the history.

gadmm · 2022-09-07T16:05:31Z

@Octachron Do you agree to backport to 5.0? Otherwise this is an API breakage for 5.1 and it might be better to call the PR off.

Changes

Octachron · 2022-09-07T17:02:30Z

Rereading the history, this is a breaking change only compared to the alpha1 release, isn't it? (since previous versions could not use Caml_state to test if the current thread held the domain lock.)

Splitting the invariant (NULL <=> domain lock not held) to the more specific Caml_state_opt seems quite forward-compatible.
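The invariant mentioned here (NULL <=> domain lock not held) can be illustrated with a small self-contained model. This is only a sketch under assumptions: `demo_state_opt`, `demo_acquire`, `demo_release` and `demo_holds_domain_lock` are hypothetical names that mimic the role of Caml_state_opt and of acquiring or releasing the domain lock; they are not the runtime's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the domain state. */
typedef struct { int id; } demo_domain_state;
static demo_domain_state demo_domain = { 0 };

/* Hypothetical per-thread pointer: non-NULL exactly while the
   current thread holds the (modelled) domain lock. */
static _Thread_local demo_domain_state *demo_state_opt = NULL;

/* Modelled lock operations: they only maintain the invariant. */
void demo_acquire(void) { demo_state_opt = &demo_domain; }
void demo_release(void) { demo_state_opt = NULL; }

/* Safe to call with or without the lock: it never dereferences
   the pointer, it only compares it against NULL. */
bool demo_holds_domain_lock(void) { return demo_state_opt != NULL; }
```

The design point is that the nullable pointer is the one to test, while the non-optional accessor is reserved for code that already holds the lock.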

In terms of API, I think it is reasonable to integrate this change in 5.0.

gadmm · 2022-09-08T10:07:24Z

This is correct

gadmm

(manipulation error)

Changes

Implement suggestions from Gabriel Scherer and Damien Doligez

gadmm · 2022-09-09T14:03:11Z

I further cleaned up the history. Let's merge this so that I can make the necessary update to Boxroot in time.

gasche · 2022-09-21T18:59:32Z

Merged, and cherry-picked to 5.0 in 26b9861.

gadmm mentioned this pull request Aug 23, 2022

OCaml 5.0: segfault between alpha0 and alpha1 #11485

Closed

gadmm force-pushed the caml_state_assertion branch 2 times, most recently from 7b13257 to d2df7ef on August 29, 2022 13:54

gasche reviewed Aug 29, 2022

gadmm force-pushed the caml_state_assertion branch from b0a1974 to 408f5bc on August 30, 2022 20:56

damiendoligez approved these changes Sep 7, 2022

gadmm added 3 commits September 7, 2022 17:44

Assert Caml_state != NULL

690dc99

Make CAML(un)likely non-internal and use it inside CAMLassert

69c1bc7

Introduce run-time checks that the domain lock is held in the C API

dc70299

gadmm force-pushed the caml_state_assertion branch from 01df151 to a05dc1b on September 7, 2022 15:44

Octachron reviewed Sep 7, 2022

kayceesrk mentioned this pull request Sep 8, 2022

Add webpage with perfstat output from Sandmark ocaml-bench/sandmark-nightly#81

Merged

gadmm commented Sep 8, 2022

Document Caml_state_opt

2375b5c

Implement suggestions from Gabriel Scherer and Damien Doligez

gadmm force-pushed the caml_state_assertion branch from 30aa645 to 2375b5c on September 9, 2022 14:00

gasche merged commit f57bbc6 into ocaml:trunk Sep 21, 2022

gasche added a commit that referenced this pull request Sep 21, 2022

Merge pull request #11506 from gadmm/caml_state_assertion

26b9861

Introduce run-time checks that the domain lock is held (cherry picked from commit f57bbc6)

gadmm mentioned this pull request Oct 11, 2022

Implement quality treatment for asynchronous actions in multicore (3/3?) #11307

Merged

Kakadu mentioned this pull request Jan 21, 2023

fix: add OCaml 5 support Kakadu/lablqml#67

Open

metanivek mentioned this pull request Feb 22, 2023

Update bindgen requirement from 0.63 to 0.64 mirage/irmin-rs#10

Closed
