Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation comments of TSan instrumentation choices #12694

Merged
merged 1 commit into from
Nov 3, 2023

Conversation

OlivierNicole
Copy link
Contributor

Add a comment in tsan.c explaining the way OCaml accesses are signaled to TSan, and the consequences in terms of correct detection of data races. I also added a section explaining the choices made for volatile accesses. A case of TSan false positive in major_gc.c is commented with a pointer to these explanations.

runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated
account.
- A field is accessed from C with `Field`, or more generally using a
`volatile value *` or a relaxed atomic access, and that field is modified
concurrently by OCaml code. Because caml_modify as a plain write for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is a plain write?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is not so easy to understand for me. Maybe you could mention (with slightly more details) the translation / TSan effect of caml_modify and caml_initialize somewhere above, for example right before section 3.1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is a plain write?

This was a typo, thanks for spotting it. The right phrasing is: is instrumented as a plain write.

This sentence is not so easy to understand for me. Maybe you could mention (with slightly more details) the translation / TSan effect of caml_modify and caml_initialize somewhere above, for example right before section 3.1?

I added a paragraph about this. There was also a missing row in the table that I added.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what is documented here and I appreciate the extra documentation. Thanks!

I believe that this needs a Changes entry.

Copy link
Contributor

@gadmm gadmm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the nice documentation. Minor comments below.

runtime/tsan.c Outdated
the FFI rules on the condition that the GC does not run between the
allocation and the end of initialization) and a conflicting access is made
from OCaml after publication to other threads. There is no data race due
to data dependencies, but TSan does not take data dependencies into
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of data dependency is still an open problem for FFI usage, so maybe one could write less affirmatively, “There should be no data race due to data dependency (see [MMMOC] in memory.c)”.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runtime/tsan.c Outdated
Our current make-do solution is that __tsan_volatile_readN performs a dummy
call to __tsan_atomic64_load, which is sufficient for correctness; and
__tsan_volatile_writeN simply calls __tsan_write8, i.e., volatile writes are
treated as plain writes. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand yet this part, could you clarify why having __tsan_atomic64_load vs. __tsan_write8 does what you want? (Is it related to the false positives mentioned in another PR? If so this might also be worth mentioning.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 903fc77. Note that #12681 may remove this limitation, if it’s correct and is accepted.

Copy link
Contributor

@fabbing fabbing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice addition to the comments for TSan!
I have suggested minor changes directly from the Github UI, but they most likely broke the 80 column limit...

runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated Show resolved Hide resolved
runtime/tsan.c Outdated Show resolved Hide resolved
runtime/major_gc.c Outdated Show resolved Hide resolved
Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that documentation PRs should not linger for too long -- we want to motivate people to write more documentation! Approved.

@fabbing
Copy link
Contributor

fabbing commented Oct 30, 2023

Piggybacking on this PR to correct an outdated comment:

diff --git a/runtime/tsan.c b/runtime/tsan.c
index 0a81f74ce1..8800eacdda 100644
--- a/runtime/tsan.c
+++ b/runtime/tsan.c
@@ -74,10 +74,10 @@
 
      1.1 From OCaml
 
-     Both `caml_raise_exn` and `caml_tsan_raise_notrace_exn` need to call
-     `caml_tsan_exit_on_raise` to issue calls to `__tsan_func_exit` for each
-     OCaml function exited by the exception. The process should be repeated
-     when re-raising until the appropriate exception handler is found.
+     `caml_raise_exn` need to call `caml_tsan_exit_on_raise` to issue calls to
+     `__tsan_func_exit` for each OCaml function exited by the exception. The
+     process should be repeated when re-raising until the appropriate exception
+     handler is found.
 
      1.2 From C
 

We have a removed caml_tsan_raise_notrace_exn with #12634

@OlivierNicole
Copy link
Contributor Author

Let me know if I should integrate this before merging or another PR should be opened.

@gasche
Copy link
Member

gasche commented Oct 30, 2023

My plan is to wait an extra day before merging to let people make drive-by comments or last-minute changes.
(I will forget to merge this, you can ping me or someone else whenever.)

@fabbing
Copy link
Contributor

fabbing commented Oct 30, 2023

My plan is to wait an extra day before merging to let people make drive-by comments or last-minute changes. (I will forget to merge this, you can ping me or someone else whenever.)

😅 Sorry, I noticed the mistake (or rather the outdated comment) when reviewing the PR, but I couldn't suggest the change at that moment. And I forgot to comment this right away, so it arrived lately!

@gasche
Copy link
Member

gasche commented Oct 31, 2023

A check-typo failure needs to be fixed before merging anyway:

./runtime/major_gc.c:995.81: [long-line] line is over 80 columns

@OlivierNicole
Copy link
Contributor Author

Fixed. Should I squash the history?

@gasche
Copy link
Member

gasche commented Oct 31, 2023

Yes please.

Add a comment in tsan.c explaining the way OCaml accesses are signaled
to TSan, and the consequences in terms of correct detection of data
races. I also added a section explaining the choices made for volatile
accesses. A case of TSan false positive in major_gc.c is commented with
a pointer to these explanations. A not-completely-related outdated
comment about exception raising and TSan was updated in the process.
@gasche gasche merged commit 9f4142f into ocaml:trunk Nov 3, 2023
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants