-
Notifications
You must be signed in to change notification settings - Fork 395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add runtime id tag support for traces & profiles #1379
Conversation
_RUNTIME_ID = _generate_runtime_id() | ||
|
||
|
||
if hasattr(os, "register_at_fork"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm most definetely not a Python developer, so feel free to ignore.
Fee things you may want to consider here:
- Using 'double check' or any sort of optimistic locking strategy may help to avoid global lock.
- There is https://github.com/google/python-atfork that seems to work before 3.7 - in fact I guess on any reasonably python and POSIX OS.
- Both 'register_at_fork' and 'atfork' mentioned above seems to only work if forked from python code - not sure how much this is a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also https://github.com/vfreex/pyatfork - which actually seems to be using OS functionality intended for this sort of thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's always a burden to add external dependencies so I took the road of having two implementations — which is anyway mandatory if we want to support non-POSIX OS.
We might also be able to write a little C/Cython wrapper if we want to expose pthread_atfork
. That's something on my plate already as we use it in the profiler and I'd love to it work for Python < 3.7.
|
||
def get_runtime_id(): | ||
with _RUNTIME_LOCK: | ||
pid = os.getpid() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there maybe a race condition here:
os.getpid
is called by thread Afork
is called by thread Bos.getpid
get called again from inside_set_runtime_id
And I think results will not be correct in this case.
Would it make sense to store first result of getpid
in a variable and use it later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, let me try to follow your scenario if I understand correctly.
If thread B forks, os.getpid()
is called from the child process in _set_runtime_id
which modifies the child memory and set a new PID and UUID. If that same child calls get_runtime_id()
, it'll get its new runtime id.
Nothing changes in the parent process.
Feel free to correct my scenario if I misread yours!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I assumed that fork may happen between _RUNTIME_ID = _generate_runtime_id()
and _RUNTIME_PID = os.getpid()
- and if that's not the case, I guess it is fine.
(this PR doesn't, but want to make sure if someone comes back to this they see this warning) |
ed75d82
to
b877bd2
Compare
This adds support for a runtime id, a unique identifier that can identifies a running process during his lifecycle.
This actually fixes the collision that occurred when a process was forking and its runtime id was the same.
This adds a tag named `runtime-id` to the root span that matches the generated runtime id. As profiles are also tagged with that same runtime-id, it makes it easy to link traces and profiles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 👍
@@ -289,8 +289,8 @@ def test_send_many_traces(self): | |||
# register a single trace with a span and send them to the trace agent | |||
self.tracer.trace("client.testing").finish() | |||
trace = self.tracer.writer.pop() | |||
# 30k is a right number to have both json and msgpack send 2 payload :) | |||
traces = [trace] * 30000 | |||
# 20k is a right number to have both json and msgpack send 2 payload :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious about this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the runtime id tag slightly increased the size of a serialized span so now we need less span to have a split while uploading — that's why I had to decrease this number here to make the test still pass.
feat(runtime): implement runtime id
This adds support for a runtime id, a unique identifier that can identifies a
running process during his lifecycle.
feat(profiling): use ddtrace.runtime to get runtime id
This actually fixes the collision that occurred when a process was forking and
its runtime id was the same.
feat(tracer): tag spans with runtime id
This adds a tag named
runtime-id
to the root span that matches the generatedruntime id.
As profiles are also tagged with that same runtime-id, it makes it easy to link
traces and profiles.