Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add runtime id tag support for traces & profiles #1379

Merged
merged 5 commits into from
May 6, 2020
Merged

Conversation

jd
Copy link
Contributor

@jd jd commented Apr 23, 2020

feat(runtime): implement runtime id

This adds support for a runtime id, a unique identifier that can identifies a
running process during his lifecycle.

feat(profiling): use ddtrace.runtime to get runtime id

This actually fixes the collision that occurred when a process was forking and
its runtime id was the same.

feat(tracer): tag spans with runtime id

This adds a tag named runtime-id to the root span that matches the generated
runtime id.
As profiles are also tagged with that same runtime-id, it makes it easy to link
traces and profiles.

@jd jd requested a review from a team as a code owner April 23, 2020 11:37
_RUNTIME_ID = _generate_runtime_id()


if hasattr(os, "register_at_fork"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm most definetely not a Python developer, so feel free to ignore.
Fee things you may want to consider here:

  • Using 'double check' or any sort of optimistic locking strategy may help to avoid global lock.
  • There is https://github.com/google/python-atfork that seems to work before 3.7 - in fact I guess on any reasonably python and POSIX OS.
  • Both 'register_at_fork' and 'atfork' mentioned above seems to only work if forked from python code - not sure how much this is a concern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also https://github.com/vfreex/pyatfork - which actually seems to be using OS functionality intended for this sort of thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's always a burden to add external dependencies so I took the road of having two implementations — which is anyway mandatory if we want to support non-POSIX OS.

We might also be able to write a little C/Cython wrapper if we want to expose pthread_atfork. That's something on my plate already as we use it in the profiler and I'd love to it work for Python < 3.7.


def get_runtime_id():
with _RUNTIME_LOCK:
pid = os.getpid()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there maybe a race condition here:

  • os.getpid is called by thread A
  • fork is called by thread B
  • os.getpid get called again from inside _set_runtime_id

And I think results will not be correct in this case.
Would it make sense to store first result of getpid in a variable and use it later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, let me try to follow your scenario if I understand correctly.

If thread B forks, os.getpid() is called from the child process in _set_runtime_id which modifies the child memory and set a new PID and UUID. If that same child calls get_runtime_id(), it'll get its new runtime id.

Nothing changes in the parent process.

Feel free to correct my scenario if I misread yours!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I assumed that fork may happen between _RUNTIME_ID = _generate_runtime_id() and _RUNTIME_PID = os.getpid() - and if that's not the case, I guess it is fine.

@brettlangdon
Copy link
Member

runtime-id tag must not be added to statsd metrics

(this PR doesn't, but want to make sure if someone comes back to this they see this warning)

@jd jd force-pushed the runtime-id branch 3 times, most recently from ed75d82 to b877bd2 Compare April 24, 2020 11:53
This adds support for a runtime id, a unique identifier that can identifies a
running process during his lifecycle.
jd added 2 commits April 30, 2020 17:08
This actually fixes the collision that occurred when a process was forking and
its runtime id was the same.
This adds a tag named `runtime-id` to the root span that matches the generated
runtime id.
As profiles are also tagged with that same runtime-id, it makes it easy to link
traces and profiles.
Copy link
Member

@Kyle-Verhoog Kyle-Verhoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍

@@ -289,8 +289,8 @@ def test_send_many_traces(self):
# register a single trace with a span and send them to the trace agent
self.tracer.trace("client.testing").finish()
trace = self.tracer.writer.pop()
# 30k is a right number to have both json and msgpack send 2 payload :)
traces = [trace] * 30000
# 20k is a right number to have both json and msgpack send 2 payload :)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious about this change.

Copy link
Contributor Author

@jd jd May 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the runtime id tag slightly increased the size of a serialized span so now we need less span to have a split while uploading — that's why I had to decrease this number here to make the test still pass.

@jd jd merged commit e2af3f4 into DataDog:master May 6, 2020
@jd jd deleted the runtime-id branch May 6, 2020 11:36
@Kyle-Verhoog Kyle-Verhoog added this to the 0.38.0 milestone May 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants