-
Notifications
You must be signed in to change notification settings - Fork 1.2k
manual: update runtime tracing chapter for custom events (ex #12335) #12840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I also replaced the term “probe” with the more self-explanatory “event source” as suggested by @xavierleroy, but I will revert it if it is deemed unnecessary. |
dra27
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking this on - it's looking very good. I didn't review the original PR, so only did a brief diff of it against the original. I liked the text on "vanilla" reading of the diff, and I then liked the re-structuring (and the new summary clarification), having compared it to the original.
Most of my comments are language or other nits - the only major part for me is to rework the explanation in the summary when talking about add_user_event to relate it to Runtime_events.Callbacks.create and that that is then passed with a cursor to consume the events. At the moment, the text read to me as though the callback is attached to the cursor, but the types told me otherwise 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some general impressions and questions from reading this document:
- Ring buffer per domain, Is there always a domain present aka root domain? What is the correct name for it?
There is a unique ring buffer for every running domain and, on domain termination, ring buffers may be re-used for newly spawned domains.
Maybe some more context on what domain is present at startup, or a pointer to the domain behaviour section in the manual? I expect there is a default domain started initially but the manual section Chapter 9 Parallel programming and Domain module documentation don't have much to say about this topic.
- Can I connect to the events API after an OCaml process has started? Perhaps enabling runtime events via signals so it can be turned on dynamically? Future work?
To receive events that are only available in the instrumented runtime, the OCaml program needs to be compiled and linked against the instrumented runtime.
Does this apply for user defined events?
General question about spans and timestamps, and their properties. Are timestamps strictly monotonic or regular monotonic? Taking this definition of monotonic:
Strict monotonic counters or clocks are guaranteed to return always increasing values (or always decreasing values). The sequence 1, 2, 3, 4, 5 is strictly monotonic.
Regular (non-strict) monotonic counters otherwise only require to return non-decreasing values (or non-increasing values). The sequence 1, 2, 2, 2, 3, 4 is monotonic, but not strictly monotonic.
definition from https://learnyousomeerlang.com/time
What is the relationship between domains and their respective timestamps? If I have the same timestamp value for two different Domains, did the event happen at the same time? Can I infer that one timestamp from Domain 1 is before or after a second timestamp on Domain 2.
How do I observe the events for a particular domain vs all domains? This is implied in the signature for span_event_handler and int_event_handler but might be worth mentioning explicitly.
manual/src/cmds/runtime-tracing.etex
Outdated
| To summarize, to emit and consume custom events of a custom type (for example), | ||
| the user needs to: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be simplified as:
| To summarize, to emit and consume custom events of a custom type (for example), | |
| the user needs to: | |
| For a user to emit and consume custom events using a custom type they need to: |
or
To emit and consume custom events using a custom type:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I integrated your suggestion in 9d1768b. However I kept “To summarize” because I feel that it informs clearly that the list of steps is merely a more condensed recap of previously given information.
|
It would be good to get a review from a current user of custom events. @talex5, would you have time to review this? |
talex5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to get a review from a current user of custom events. @talex5, would you have time to review this?
I think my earlier comment still applies: something should be said about thread-safety. e.g.
- Is it OK if my encoder function allocates (possibly triggering a GC event)?
- What happens if two sys-threads write an event at the same time?
- Does my decoder need to be robust against the event being overwritten as it's being decoded?
Regarding the API itself:
- It would be good to have a fast (noalloc) function to check whether tracing is on. That would allow us to avoid some work in the common case of no tracing.
- A lot of the complexity in the API and in the documentation is due to the module providing a marshalling system for user types. I think it would be simpler just to let users attach arbitrary bytes to a custom event.
|
Thanks @talex5 |
|
Thanks @talex5 for your review.
Yes, calls to serializers/deserializers are ordinary callback into OCaml with properly registered GC roots.
This should be fine I think, writing to the ring buffer seems designed to be thread-safe, but maybe @sadiqj should fact-check me.
No, the byte string given to the decoder is a copy. Should these points be mentioned in the manual? I would of the opinion to stick to the implicit convention that if the manual is silent about these issues (thread safety, authorization to allocate, etc.), it means that the API is safe (thread-safe, GC-safe, etc.). |
The ring-buffers are all single-writer multiple-reader so you must be holding the runtime lock in order to write events to the ring buffer. It would be bad if two systhreads on the same domain wrote simultaneously (without holding the runtime lock). |
There is indeed a main domain created at startup. I attempt to clarify the situation in 858fc6a.
Yes, via
No. Clarification attempt: 8a29ebb.
The clock is
I agree. Clarified in e847dfe. |
I see, so no problem can arise from OCaml code (the runtime lock is always held). I propose to document this in the C API (08d3b4a). |
This is a good idea, thanks.
This I actually don't know. |
|
(And thanks to @tmcgilchrist as well for the reviewing work) |
manual/src/cmds/runtime-tracing.etex
Outdated
| \end{itemize} | ||
|
|
||
| Additional events can be declared and | ||
| consumed, providing higher-level dynamic introspection capabilities to OCaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I would say "monitoring" (or maybe "observability") rather than "introspection", and I am not sure what "dynamic" means and brings in this context. (Also there is a weird space at the beginning of line 18?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 4538ac4.
manual/src/cmds/runtime-tracing.etex
Outdated
| \texttt{int}, \texttt{span}) and user-defined types. To understand the | ||
| manipulation of custom events, it is useful to know how they are transported and | ||
| stored: their representation consists of a name string (in fact, an index into | ||
| an array of all custom names) and an arbitrary array of bytes. Custom event |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"array of bytes": does this means a byte array? If it is just bytes, then "an arbitrary byte sequence" would feel more natural to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in 4538ac4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that there are other occurrences of "array of bytes" in the documentation that probably need to be fixed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. 155849b
gasche
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with the current state of the manual: I did a reading pass and I did not get any major unresolved question (except the one about shared/duplicated declaration of tags and events, which has been clarified in words and in code), which suggests that to me it is reasonable documentation. The other reviewers (thanks!) also seemed to have some form comments but no blocker.
It would be nice if we could articulate the value of tags (I think that there is some on the programming side: unlike string literals, tags will warn you if you make a typo, etc.), and maybe make some of the design choices more regular or clearly justified. But I think that this is fine for a first iteration of a manual for the feature as-is.
I propose to wait until @OlivierNicole considers that he is done integrating all the feedback, and then merge.
The commit history is not great with a lot of minor back-and-forth commits. I think that @OlivierNicole could squash together the commits of each author -- so we would have three commits, one for each different person who took this PR for a ride. Or maybe some finer split of commits if you think they make sense. Could you take care of this and force-push?
Thanks!
| perform steps 1 to 3 to register custom events and custom event types (if any). | ||
| Note that the tag values need not be the same in both programs; the only value | ||
| that should be identical the same is the name. | ||
| that should match are the names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only values that should match (apologies for not catching this earlier)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. 773fbc4
Co-authored-by: Olivier Nicole <olivier.github.com@chnik.fr>
2522c46 to
6170eba
Compare
6170eba to
7180530
Compare
|
Thank you for your review. I squashed as suggested and added a Changes entry. |
|
I don’t really understand what the Hygiene failure message means. |
|
The CI failure message says: This appears to be a bug in the CI script deciding if a PR needs the |
|
(The bug appears to come from the branch-point / most-recent-common-ancestor computation.) |
|
The churn on trunk wasn't in any of the hygiene jobs! Regardless, this was already failing from as soon as the PR was opened, which is before I started messing around. I'd noticed it from when the PR was first opened. I will try to have a look at it tomorrow. |
|
(the bug was two-fold - the parsetree-change check is failing, but the no-change-entry-needed was passing even when there was no Changes entry, but that's unsurprising because they share the same analysis of the commits) |
|
Thank you all! |
|
As a documentation update, don't forget to cherry-pick it on 5.2 . |
|
I pushed it to 5.1 as well, as it allows the possibility for the manual on ocaml.org to be recompiled with the updated chapter. |
This PR intends to be the new #12335, taking over @TheLortex and @sadiqj to address reviewers’ requests.
It contains the same commits as the original PR, with the following changes in the last two commits:
(Since there was discussion about possible future improvements of the API: having done my best to clarify it, I have the impression that there may be room for simplification regarding tags: I am not sure what is the interest of exposing them, rather than the event names directly.)