feat: otel thread ctx FFI#1915
Conversation
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: 35f80d0 | Docs | Datadog PR Page | Give us feedback! |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 05868a50b9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
e98d81f to
75add55
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1915 +/- ##
==========================================
- Coverage 72.63% 72.61% -0.02%
==========================================
Files 448 451 +3
Lines 73582 74060 +478
==========================================
+ Hits 53444 53778 +334
- Misses 20138 20282 +144
🚀 New features to boost your workflow:
|
ivoanjo
left a comment
There was a problem hiding this comment.
This looks quite reasonable. Some notes:
-
I would strongly recommend getting one of the dotnet folks (which I believe are intended first users of this API) to give a pass on the PR
-
I think we're missing an end-to-end example in C or C++. In particular, one that sets both the process and thread context, and thus can be picked up correctly by the external reader.
Right now we're kinda assuming that callers will "just know" how to put this together, and suspect that will cause a bunch of confusion. Thus, it would be very helpful to provide some docs + a fully working example that could be used as a model.
-
As I mentioned, I'm not convinced about the whole "expose the full structure", see my comment on why.
| // struct has the right total size. | ||
| #[repr(C)] | ||
| struct ThreadContextRecord { | ||
| pub struct ThreadContextRecord { |
There was a problem hiding this comment.
I'm not quite sure about this part. In particular, I don't think we're giving callers the tools that they need to directly use this without getting "burned" right now: no docs, no validation. ![]()
My "gut feeling" is: if callers haven't asked for this yet, let's not give it to them. 🤔
If they want to bang at the bits directly, I think in that situation they might as well prefer to reimplement it from scratch, e.g. why half-go through libdatadog if you want maximum control anyway?
There was a problem hiding this comment.
I'm not quite sure about this part. In particular, I don't think we're giving callers the tools that they need to directly use this without getting "burned" right now: no docs, no validation.
I would say that there is some documentation (and warnings!) in the documentation of attach. When talking with dotnet people, we thought it might make sense for performance to try to do everything on the SDK side for context switching, without going through native. I agree that then using libdatadog is questionable, but I think you'll still have to manage the whole TLSDESC/elf business. It might be reasonable to use libdatadog to attach the first context and then update it in place.
Also the structure is defined per the spec anyway, so it's not going to change and is "public" (as is, defined somewhere on the internet).
All of that being said, I agree that making it visible here is stronger, as we guarantee our FFI functions always return pointer to contexts as per the spec. Additionally, the in-place-update-from-the-sdk was really mentioned as "we can try that later in the future in mandated", so I would be totally fine reverting the changes to otel-thread-ctx, removing the bit about raw update, making the record type an entirely opaque pointer, and only discuss this again after there's a need for it, or at least some benchmarks.
Why do you dotnet guys think @gleocadie ?
There was a problem hiding this comment.
I think you'll still have to manage the whole TLSDESC/elf business.
I was thinking that if we're building a .so from C/C++, it's even easier to not use libdatadog for this part -- no need to fight linker, just declare a public thread-local symbol.
But yeah, as I mentioned above -- my suggestion is "gut feeling we're not gonna need it" -- if y'all want to try this path, we can, but I would suggest avoiding any "let's expose this now just-in-case-for-later" things ;)
There was a problem hiding this comment.
I was thinking that if we're building a .so from C/C++, it's even easier to not use libdatadog for this part -- no need to fight linker, just declare a public thread-local symbol.
That's true... although you still need the process context part, and re-implementing this in C/C++ starts to be a tad less trivial. But I agree that if you only need to swap the TLS slot, you're probably better off with a thin C/C++ shared lib than messing around with the Rust build 👍
There was a problem hiding this comment.
Yeap, I agree you'd still probably want to use the libdatadog process context impl, assuming whatever downstream consumer we're talking about is pulling libdatadog already as a dependency anyway.
There was a problem hiding this comment.
I guess you meant would be impossible?
I suspect it would be impossible; but I may be missing some detail.
Since dotnet is the first motivation here, I guess it's @chrisnas' question to answer: even in the case of doing the update directly from dotnet, do you still see value in libdatadog for setting up the process context and attach the first thread context?
To be clear I think process context is very worth using the libdatadog one. The thread context is the one where I think the options are 100% libdatadog or 0% libdatadog -- I don't think 10% libdatadog is going to be very useful and I suspect it'll be more confusing/error-prone than not ;)
I would like to move forward with an FFI anyway, because I have some small benchmarks that rely on its existence. However, if it's [edit: NOT so] useful for dotnet in the end, I can trim the PR to make the context private again, which would be minimal first version.
Idk, I'm still in the "I don't think it's very useful to add stuff if we don't think we'll need them", but also we can and should evolve the API with what we learn from adding this to dotnet, so from me not a blocker at all.
There was a problem hiding this comment.
I don't think 10% libdatadog is going to be very useful and I suspect it'll be more confusing/error-prone than not ;)
I'm not sure I get why. Once again you need to do some stuff in native anyway, so if you can reuse/offload and pull libdatadog's parts for the process context, it's strictly better to re-use what you can for thread context than re-implementing it yourself in unsafe C/C++ on the side...?
At any rate, I want to insist that the whole debate here is purely on the contract/visibility part of the API. There' absolutely no code or logic involved in making the thread context updatable purely on the .NET runtime side. It's mostly about making some C struct exported or not in the header, but this struct's layout is available in the spec anyway. My personal view is that it doesn't cost much to expose the possibility as long as we have that big CAUTION in the code comments (like: you can do it, but at your own risk). I think @ivoanjo is wary of making this part of the official public API because it's full of traps, which I can also understand.
So here is a proposal: for now, let's get the record struct out of the API and make returned values opaque pointers. Still, the code and the memory layout will be exactly the same as if we didn't. So it's not exposed in the API, but if @chrisnas wants to experiment with in-place update, they can and have everything they need to do it (I don't see any reason why we would change the layout or introduce indirection on the libdatadog side in the forseeable future). Once we see how it turns out, and if .NET think it's worth keeping libdatadog in that setting, we can update the FFI and make it more "official", or do nothing otherwise and keep the opaque version. That would unblock this PR and be forward-compatible with changing our mind.
What do you guys think?
There was a problem hiding this comment.
it's strictly better to re-use what you can for thread context than re-implementing it yourself in unsafe C/C++ on the side...?
If you go 100% libdatadog thread context, then yes, you get all the benefits of the rust safety.
But to me this argument falls apart once you need to implement the logic to bang on the bytes directly. In that situation, you might as well bang on the bytes always, not "half the time is libdatadog thread context, half the time is my own code" -- I think that'll be more error prone and hard to follow. It's even harder to test ;)
I think @ivoanjo is wary of making this part of the official public API because it's full of traps, which I can also understand.
My argument is that if you're going to do things directly, you need to really understand the spec (it's the "play in hard mode choice").. So at that point, you can look at libdatadog thread context as a reference for code, but I'm not sure having the struct definition or some other small tidbits helps very much -- you really do need to get into the spec to understand how to use it, no shortcuts anymmore.
Or, you can play in easy mode and use libdatadog thread context -- to me that's the choice here. Easy or hard, 100% libdatadog thread context or 0%.
So here is a proposal
Ack from my side!
There was a problem hiding this comment.
But to me this argument falls apart once you need to implement the logic to bang on the bytes directly. In that situation, you might as well bang on the bytes always, not "half the time is libdatadog thread context, half the time is my own code" -- I think that'll be more error prone and hard to follow. It's even harder to test ;)
The big difference to me is that the goal here is to bang the bytes from the .DOTNET runtime. Since the point is to not pay for the FFI. So this byte banging part is done in C# or whatnot. However, to install the thread context, it's very likely that you need a native shared library anyway, because of the whole TLSDESC and linker business. Hence, unless like Java you want to do custom/specific stuff in the native byte banging part, the trade-off is using ready-built libdatadog for installing the context versus creating a separate C/C++/Rust project on the side just for that, figure out the TLS dialect, linking business, etc. That is, redo mostly what's already done in otel-thread-ctx. Of course if you're going to bang the byte from native, then sure.
There was a problem hiding this comment.
[...] because of the whole TLSDESC and linker business [...]
Right, but the tlsdesc is very easy to do from C/C++ and really hard to do from rust, so again, I'm not seeing a lot of benefits :P
The dotnet runtime isn't directly calling into rust, there's already a bunch of C++ glue code ;)
dcf5548 to
e1af933
Compare
| // struct has the right total size. | ||
| #[repr(C)] | ||
| struct ThreadContextRecord { | ||
| pub struct ThreadContextRecord { |
There was a problem hiding this comment.
I would be totally fine reverting the changes to otel-thread-ctx, removing the bit about raw update, making the record type an entirely opaque pointer, and only discuss this again after there's a need for it, or at least some benchmarks.
We would definitively want to use in-place value change for .NET for performance reason due to the high number of span that can be created over a trace.
This is EXACTLY how it works today: the tracer asked the profiler (equivalent of libdatadog call) ONCE per thread where to set the span id value and then, directly update the value in memory when the span changes for the thread.
| /// hundred bytes per thread, but it guarantees that we can modify the context in-place | ||
| /// without (re)allocation in the hot path. Having a hybrid scheme (starting smaller and | ||
| /// resizing up a few times) is not out of the question. | ||
| attrs_data: [u8; MAX_ATTRS_DATA_SIZE], |
There was a problem hiding this comment.
Is this the place where to store the endpoint?
If this is the case, how to know the string ID corresponding the EndPoint attribute. Since the value seems to also be an interned string ID, it means that each different end must be interned in the libdatadog string table; growing over time since the endpoints could be very different
Don't forget that the endpoint is usually know close to the end of the request. It means that any client would have to "reconciliate" the real endpoint for the samples already created for this thread. More important, there is no notification for the full host to know WHEN a thread gets its endpoint so it looks like impossible to be sure of the right endpoint to use if not already set; i.e. the same thread could have switch request since the last time samples where created.
Also, how to reset the current end point? by setting an "empty string" ID?
There was a problem hiding this comment.
What interning API should we use ?
There was a problem hiding this comment.
. Since the value seems to also be an interned string ID
They are not, only the keys are interned string IDs. I think that what we drafted in our first discussion, that is having a small set of known keys with fixed IDs as a convention (root_span_id is 0, endpoint or whatever is the end point attribute is 1, etc.) sounds like the way to go, ideally? So there would be no need for dynamic string interning business at all.
For the endpoint question, I'm not sure to be honest. Even beyond the current spec, given the setup and the constraints of an external eBPF profiler, I don't see an easy way out of having the profiler do some additional work and book-keeping to correlate past samples once the endpoint become known. Maybe @ivoanjo has some thoughts on this.
There was a problem hiding this comment.
OK so there's two versions of this answer for the "endpoint shows up late" problem.
The in-process datadog version of this answer: While we want to adopt this mechanism, we don't have to adopt it exclusively. E.g. like we did for Java, if it's the datadog tracer talking to the datadog in-process profiler, we can have additional channels beyond just this one to provide extra information if needed. So for sure we don't want to regress in terms of feature-set, and the Java PR is a great example of how we both implement the otel spec and still keep the entire feature-set.
The out-of-process and pure-otel version of this answer to me is: I think we'll have to live with best-effort for now. That is -- we set the endpoint name in the context when we get it, and if the eBPF Profiler misses it, such is life. If an otel backend wants an accurate picture, the otel backend will probably need to look at traces/spans for this information.
The whole basis of the thread context sharing mechanism is to be a "loose coordination" kinda thing; I'm hoping to avoid expanding the scope for now because it opens a lot of hard questions that would make the implementation a lot more complex IMO.
There was a problem hiding this comment.
So there would be no need for dynamic string interning business at all
Good news :^)
So, how are we supposed to set the string corresponding to the value of the interned attribute corresponding to the endpoint?
@ivoanjo: are there any other attributes that would make sense to "attach"
There was a problem hiding this comment.
I've been keeping a list here.
I'd say definitely the datadog.local_root_span_id and the (best-effort, with limitations) datadog.trace_endpoint, as well as the datadog.thread_name and datadog.thread_id (in cases where thread id doesn't match what the OS can see).
Beyond that I see we in prof-dotnet pprofs that we have appdomain name, appdomain process id -- if you think those are useful, maybe throw them in there too? If not, don't ;) (If you do -- I suggest calling them e.g. datadog.appdomain_name and updating the doc I linked so we can try to keep everyone in-sync and avoid the backend needing to know 3 datadog names for the same thing)
| /// `ddog_otel_thread_ctx_detach`, and must not be used after this call. In particular, `ctx` | ||
| /// must not be currently attached to a thread. | ||
| #[no_mangle] | ||
| pub unsafe extern "C" fn ddog_otel_thread_ctx_free(ctx: *mut ThreadContextRecord) { |
There was a problem hiding this comment.
Just to confirm the expected usage:
the first time a thread needs to set the context:
pCurrentThreadContext = ddog_otel_thread_ctx_new(...);
ddog_otel_thread_ctx_attach(pCurrentThreadContext)
when a span is processed by a thread:
// get current thread's context
...
// directly set the value of span id/trace id (in case of reset)
...
However, on shutdown, I don't see a way to "detach" all contextes but only for the current thread. Not sure it would be safe to call ddog_otel_thread_ctx_free() on each context without detaching first
There was a problem hiding this comment.
One possible answer is that you don't do anything, if you think this can be acceptable: if we're shutting down just before the process exit for example, the OS will reclaim the memory anyway, so not sure it's worth doing it manually just before.
Otherwise I admit the thread clean-up story might be non-trivial. If you need to free contexts either all at once, or during the app lifecycle as threads are created and freed, I can see two ways:
- either do it as some kind of exit callback/cleanup phase on the thread before it exits (if doable, easiest)
- or store aliases to the contexts in a separate datastructure (says a map threadid -> contexts, or a queue of to-be-cleaned, or even a pool of to-be-reused contexts), and free or reuse after the corresponding thread exited. Once the original thread is dead, the TLS has been de-allocated anyway so the profiler can't reach it anymore. It's then safe to call
ddog_otel_thread_ctx_free()from another thread or to repurpose the context.
There was a problem hiding this comment.
The only risk I see comes from "short lived" threads for which it would be better to be able to release the memory. In .NET, we might be notified, from the exiting thread, that it will dies soon and call the free API but I don't know for the other runtimes
There was a problem hiding this comment.
Ok, good for .NET then! For other runtimes, as I said there are other possibilities (overall map of active contexts/queue/pool). If you can test if a thread is still active, you can maybe do some kind of cleanup in the background or at regular intervals.
Rust's cdylib linker emits a version script with `local: *` that hides all non-Rust symbols, preventing `custom_labels_current_set_v2` from appearing in the dynamic symbol table. Without a dynsym entry, external readers (e.g. the eBPF profiler) cannot locate the thread-local slot. Add a supplementary version script with an explicit `global:` entry for the symbol, which takes precedence over the `local: *` wildcard. Also force lld explicitly, since merging multiple version scripts is not supported by GNU ld. Also adds a temporary dummy FFI wrapper around `ThreadContext::attach` to keep the TLSDESC access live during verification. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mbol Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This reverts commit abf827e.
The FFI now exposes an opaque OtelThreadCtx handle instead of ThreadContextRecord. The core crate is unchanged — the FFI casts between NonNull<ThreadContextRecord> and NonNull<OtelThreadCtx> at the boundary. The in-place update documentation has been removed since direct memory access from outside libdatadog is not officially supported. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Ivo Anjo <ivo.anjo@datadoghq.com>
Environments with an old system lld (e.g. lld 7 from llvm-toolset-7.0 on CentOS) fail to link the cdylib because they cannot handle R_X86_64_GOTPC32_TLSDESC relocations in shared libraries. GNU ld is not an option either, as it cannot merge multiple version scripts. Discover the toolchain's bundled rust-lld (LLD 19+ since the MSRV) via `rustc --print sysroot` and pass its gcc-ld shim directory with `-B` so the C compiler driver finds it before any system-wide lld. Fall back to the system lld when rust-lld is not available. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
13128d5 to
0464997
Compare
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
What does this PR do?
This PR adds a basic FFI for the OTel thread-level context feature: create a new context, attach, detach, and update in place.
We also make
ThreadContextRecordpublic, or at least exposed in the FFI. The rationale is that:ThreadContextRecordis a way to document its expected memory layout.Generated C header
Motivation
OTel thread-level context has been implemented in #1791 in order to provide better interop with the OTel eBPF profiler. The first user is supposed to be dd-trace-rs, but it turns out the dotnet SDK people are interested in using it as well (and eventually other non-Rust SDKs will use it and thus require an FFI).
Additional Notes
N/A
How to test the change?
There's a test to check that the TLS symbol is properly handled. For real usage, we plan to check when integrating in dotnet (or whichever is the first SDK to use it).