Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK: Increment child_count in Span when creating child Spans #150

Closed
fbogsany opened this issue Nov 12, 2019 · 3 comments · Fixed by #400
Closed

SDK: Increment child_count in Span when creating child Spans #150

fbogsany opened this issue Nov 12, 2019 · 3 comments · Fixed by #400
Labels
bug Something isn't working
Milestone

Comments

@fbogsany
Copy link
Contributor

child_count is currently initialized to 0 in Span and never incremented. The OTLP spec marks child_count as optional, but if it is provided, it should represent the count of child spans created for a span. We're in violation of the spec by providing it, but always setting to 0.

child_count is problematic in general, because we may not know the count of (immediate) child spans by the time we finish a span, and because we can create a child span without having a direct reference to its parent. Examples:

  1. Call tracer.start_span('child', with_parent_context: context).
  2. Enqueue a job in Resque. Optimistically assume it will be run exactly once, so increment the child_count of the producer span. Job runs, fails, produces a child span, is retried and succeeds, producing a second child span.

Regardless, we can try to count correctly when we have a reference to the unfinished parent. Firstly, we need a way to notify the parent span, e.g. parent_span.increment_child_count. Secondly, we need to increment at the appropriate places. We only want to do this when span.recording? and parent_span.recording?, or when we can reasonably assume a child span will be created in a remote process:

  1. In Tracer#start_span we need to propagate with_parent to internal_create_span, and in internal_create_span in the branch creating a SDK Span, we need to parent_span.increment_child_count if parent_span&.recording?.
  2. In each adapter that assumes creation of a remote child span, when calling e.g. formatter.inject(span.context, request.headers) also call span.increment_child_count. This one is 💩 from an API perspective, since it requires adding #increment_child_count to the public API for Span, and it's kinda hard to use and explain. It does mean we can avoid the if parent_span.recording? check in the previous case. It might get more complicated if we want to assume remote processes respect our sampling decision.
@fbogsany fbogsany added the bug Something isn't working label Nov 12, 2019
@fbogsany
Copy link
Contributor Author

Relevant spec issue: open-telemetry/opentelemetry-specification#355

@benedictfischer09
Copy link

Related conversations are now indicating that only local (which I take to mean in process) spans should count towards this metric open-telemetry/opentelemetry-proto#72 maybe someone else is able to confirm if that is the case, I don't see a change to the spec itself to reflect that

@mwear
Copy link
Member

mwear commented Feb 27, 2020

It looks like we don't need this and can probably remove it altogether: open-telemetry/opentelemetry-proto#107

@fbogsany fbogsany added this to the Beta v0.7 milestone Sep 18, 2020
@fbogsany fbogsany linked a pull request Sep 18, 2020 that will close this issue
@fbogsany fbogsany modified the milestones: Beta v0.8, Beta v0.7 Oct 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants