Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conditionally Creating Server Spans if No Span Found in Current Context #544

Conversation

andresbeckruiz
Copy link

@andresbeckruiz andresbeckruiz commented Jun 18, 2021

Description

This PR fixes issue #445. I check if if an active span is present in the current context by calling opentelemetry.trace.get_current_span(). If one is found, an internal span is created and the span found is used as the parent. If one is not found, a server span is created and the remote span context from the incoming request is used as a parent.

I used either this logic for all of the server instrumentations that create server spans: Falcon, Flask, Django, Pyramid, Tornado, WSGI, and ASGI. There are references to the issue that mention that FastAPI and Starlette also create server spans, but I could not find where this occurs in the opentelemetry-python-contrib repo. If someone could point out where they are created, I can add the logic I used for the other instrumentations.

Type of change

Please delete options that are not relevant.

  • [ ✅ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    -This feature causes server spans to be only created conditionally, so at times existing functionality will not work as expected.

How Has This Been Tested?

All the instrumentation tests were run and passed successfully.

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • [ ✅ ] No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • [ ✅ ] Followed the style guidelines of this project
  • [ ✅ ] Changelogs have been updated
  • [ ✅ ] Unit tests have been added
  • [ ✅ ] Documentation has been updated

@andresbeckruiz andresbeckruiz requested a review from a team as a code owner June 18, 2021 04:31
@andresbeckruiz andresbeckruiz requested review from aabmass and ocelotl and removed request for a team June 18, 2021 04:31
@ocelotl
Copy link
Contributor

ocelotl commented Jun 24, 2021

@andresbeckruiz please make sure that all tests pass.

@ocelotl ocelotl requested a review from owais June 24, 2021 16:42
@srikanthccv
Copy link
Member

Two things on top of my mind that we probably should do

  1. We want to continue the trace from a remote context. I think we should check that the trace id from the current span context is the same as the trace id from the incoming request before starting the INTERNAL span. Otherwise, this would lead to broken traces.
  2. A check that parent span is SERVER span.

@andresbeckruiz andresbeckruiz force-pushed the checking-parent-spans-current-context branch from e40c9c3 to c46c880 Compare June 26, 2021 00:48
@owais
Copy link
Contributor

owais commented Jun 28, 2021

1.0 We want to continue the trace from a remote context. I think we should check that the trace id from the current span context is the same as the trace id from the incoming request before starting the INTERNAL span. Otherwise, this would lead to broken traces.

I still cannot think of any good reason why there would be locally active span that wouldn't really be part of the same trace/request so not sure how much benefit this check will provide. It's certainly not harmful but may be we can avoid extra checks on potentially hotpaths if we can't think of a edge-case to catch with this.

2.0 A check that parent span is SERVER span.

I see how this makes sense but can there be situations where the real server span has a child span and then this logic is triggered? Let's say if WSGI is being used with Django and then a Django middleware embeds yet another router/server library that is also instrumented. The 3rd span would create another SERVER span in that case. May be we can keep it simple and just use the local span if present without any other conditions? We can easily add more checks later if we discover any cases where this doesn't work.

@srikanthccv
Copy link
Member

Ah I see how this can get complicated quickly if we consider about all cases. Yes, maybe we can just keep it simple and revisit later if people run into these edge case scenarios.

@andresbeckruiz
Copy link
Author

@owais I have updated each instrumentation file and addressed your comments.

@andresbeckruiz
Copy link
Author

andresbeckruiz commented Jul 1, 2021

@ocelotl all tests are passing now for instrumentations that I have edited. For some reason, the lint and docker tests are failing still, but I don't believe it's because of my edits.

@andresbeckruiz andresbeckruiz requested a review from owais July 1, 2021 07:30
@andresbeckruiz andresbeckruiz force-pushed the checking-parent-spans-current-context branch from 7efecdf to 0bc5fd5 Compare July 1, 2021 10:09
@andresbeckruiz
Copy link
Author

@owais I have addressed your most recent comments.

@owais
Copy link
Contributor

owais commented Jul 12, 2021

This contains commits from unrelated branches. Can you please rebase and perhaps squash the commits (just this time) so it is easier to review?

@andresbeckruiz andresbeckruiz force-pushed the checking-parent-spans-current-context branch 2 times, most recently from e4281be to 51851b1 Compare July 13, 2021 00:19
@andresbeckruiz
Copy link
Author

@owais sorry about that, I created a new local branch to rebase and clean everything up and pushed it onto this branch. I also addressed your most recent comments.

@@ -244,4 +249,5 @@ async def wrapped_send(message):

await self.app(scope, wrapped_receive, wrapped_send)
finally:
context.detach(token)
if ctx is not None:
context.detach(token)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works now but it looks like something that could easily break in future with the value of ctx determining whether token is defined or not. Why not have a none-check for token directly instead?

Copy link
Contributor

@owais owais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. Left some minor comments.

Major remaining blocker are tests. We need to ensure all known cases work as expected for all instrumentation that were touched. Can you please add tests for such cases?

@andresbeckruiz andresbeckruiz force-pushed the checking-parent-spans-current-context branch from 51851b1 to 1ce04d8 Compare July 22, 2021 05:23
@andresbeckruiz andresbeckruiz force-pushed the checking-parent-spans-current-context branch from 1ce04d8 to cc6c06a Compare July 27, 2021 06:09
@andresbeckruiz
Copy link
Author

andresbeckruiz commented Jul 27, 2021

Hi @owais, I addressed the minor comments you left. I am stuck figuring out how to implement tests for the logic I have implemented because I am unsure how to access the spans being created. For example, in the Django middleware tests I cannot find any that call the process_request method that I have touched. I see tests accessing spans using a get_finished_spans method but I am not sure if that method uses the logic I updated. Furthermore, the process_request method does not return the span, context, or token, which I have all updated logic for. Therefore, I am not sure how to check for instance if the span created is of SpanKind.INTERNAL. Would you be able to give an example or point me in the right direction on how I could implement tests for the logic I have updated?

@owais
Copy link
Contributor

owais commented Jul 27, 2021

I think existing tests should continue to work with the changes you made. You need to add new tests where a span is already present in the current context before the framework receives an HTTP request. You can do this manually but it'll probably differ across frameworks. It might be simpler to have WSGI instrumentation active as well in addition to the framework instrumentation for such tests as that'd automatically reproduce the scenario. Once that is done, your test for checking spans would look exactly the same as any other test.

@owais
Copy link
Contributor

owais commented Aug 17, 2021

@andresbeckruiz Thanks for contributing this to the project. It's a very valuable feature and I really appreciate all the effort you put into this. Could you please clarify if you are planning to come back to this or if we should re-assign it to someone else so they can build on top of your work?

@andresbeckruiz
Copy link
Author

Hi @owais sorry I did not clarify; I am pretty busy right now so I will not be coming back to this issue. I would appreciate if you could re-assign the issue to someone else so that they can build on top of my work, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants