Conversation

@willgittoes-dd (Contributor)

This adds distributed tracing to Falcon.

It's not enabled by default, but it can be enabled either with an argument to the middleware or, when using autopatching, by setting an environment variable.

It works by looking for the distributed tracing HTTP headers using HTTPPropagator.
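For reference, a sketch of how the request hook consumes those headers (the extract/activate tail is inferred from the HTTPPropagator API of this era, not quoted from the diff):

from ddtrace.propagation.http import HTTPPropagator

def process_request(self, req, resp):
    if self._distributed_tracing:
        # Falcon uppercases all header names, so normalize before extracting
        headers = dict((k.lower(), v) for k, v in req.headers.items())
        propagator = HTTPPropagator()
        context = propagator.extract(headers)
        # adopt the upstream context only if propagation headers were present
        if context.trace_id:
            self.tracer.context_provider.activate(context)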

@willgittoes-dd requested a review from clutchski March 8, 2018 22:40
if self._distributed_tracing:
    # Falcon uppercases all header names.
    headers = dict((k.lower(), v) for k, v in req.headers.items())
    propagator = HTTPPropagator()
Contributor Author:

Should I make propagator a member variable? Its methods may as well be static (they hold no state), but aren't for some reason. Just in case state is added in the future, I thought it would be safer to make a new one each time.

Reviewer:

Good question. Honestly I think they should be static methods, since no state is preserved between executions, but there may be something we need to change in the future for OpenTracing compatibility.

I would say create a new one until we make a decision, just to keep a consistent approach across integrations. Then we'll update all of them at once.
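For illustration, a minimal sketch of the member-variable alternative being weighed here (hypothetical; it is not what the PR does):

from ddtrace.propagation.http import HTTPPropagator

class TraceMiddleware(object):
    def __init__(self, tracer, distributed_tracing=False):
        self.tracer = tracer
        self._distributed_tracing = distributed_tracing
        # built once and reused; fine as long as HTTPPropagator stays stateless
        self._propagator = HTTPPropagator()

    def process_request(self, req, resp):
        if self._distributed_tracing:
            headers = dict((k.lower(), v) for k, v in req.headers.items())
            context = self._propagator.extract(headers)

The per-request instantiation the PR keeps costs one small allocation per request, but remains correct even if the propagator later grows per-use state.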

def process_request(self, req, resp):
    if self._distributed_tracing:
        # Falcon uppercases all header names.
        headers = dict((k.lower(), v) for k, v in req.headers.items())
Contributor Author:

I use .items() here, which is cheap in Python 3 (it returns a view) but slower in Python 2 (it builds a full list of tuples).

Reviewer:

Yeah, because in Python 2 it returns a list of tuples containing copied key-value pairs of the dictionary. If we want to be smart, we can implement the same compatibility layer available in six: https://github.com/benjaminp/six/blob/master/six.py#L579-L608

So we'd have something like:

# compat.py
if PY3:
    def iteritems(d, **kw):
        return iter(d.items(**kw))
else:
    def iteritems(d, **kw):
        return d.iteritems(**kw)

# this module
from .compat import iteritems
# ...
headers = dict((k.lower(), v) for k, v in iteritems(req.headers))

I think it should work, and:

  • we get a cheap iterator in Python 3
  • it's the best approach in Python 2
  • it's the same call everywhere

Does it make sense to provide this for better performance and compatibility?

Contributor Author:

Cool, glad there's already this wrapper so I don't need to clutter the code :)

mw = kwargs.pop('middleware', [])
service = os.environ.get('DATADOG_SERVICE_NAME') or 'falcon'
distributed_tracing = asbool(os.environ.get(
    'DATADOG_FALCON_DISTRIBUTED_TRACING')) or False
Contributor Author:

Copied this pattern from somewhere else, but is there a smarter thing I should do?

Reviewer:

I'm rewriting this part because we're polluting the system with too many env vars. We can be smart here.

Contributor Author:

Seems like we could all just use the same env var name, e.g. DATADOG_DISTRIBUTED_TRACING. If a user runs multiple distributed services on a single host, they can still set the env vars per executable themselves. But is a more structured config process in the pipeline?

Reviewer:

Actually, yes. For instance, you may have an intake application that doesn't use distributed tracing for incoming calls (you don't accept it from clients) but does want to activate it for outgoing calls to other services. So it's still better to provide a way to select exactly which distributed tracing you want to enable.

The configuration system will make this easier, but until then I prefer that all integrations stay consistent.
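For reference, the two ways the PR exposes the toggle, as a hypothetical usage sketch (the distributed_tracing keyword and the TraceMiddleware constructor shape are assumptions based on the diff context):

import falcon
from ddtrace import tracer
from ddtrace.contrib.falcon import TraceMiddleware

# explicit, per application:
mw = TraceMiddleware(tracer, service='falcon', distributed_tracing=True)
app = falcon.API(middleware=[mw])

# or, when autopatching, via the environment variable:
#   DATADOG_FALCON_DISTRIBUTED_TRACING=true ddtrace-run gunicorn myapp:app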

flask{010,011,012}: nosetests {posargs} tests/contrib/flask
flask-autopatch{010,011,012}: ddtrace-run nosetests {posargs} tests/contrib/flask_autopatch
falcon{10,11,12}: nosetests {posargs} tests/contrib/falcon/test_middleware.py
falcon{10,11,12}: nosetests {posargs} tests/contrib/falcon/test_middleware.py tests/contrib/falcon/test_distributed_tracing.py
Contributor Author:

Also not sure about this. It seemed silly to add a new test directive, but test_distributed doesn't actually depend on the method of patching either.

Reviewer:

Can't you simply rewrite this directive as:

falcon{10,11,12}: nosetests {posargs} tests/contrib/falcon

@willgittoes-dd (Contributor Author), Mar 19, 2018:

That tries to execute all the tests, including the autopatch tests (which it seems are usually kept separate by convention?), and it also misinterprets the test-case mixin test_suite.py as actual tests to run.

I think if I rename test_suite.py and we're OK with running the autopatch/middleware tests under the same directive, then it is possible to use falcon{10,11,12}: nosetests {posargs} tests/contrib/falcon

Reviewer:

Sorry, I forgot that we also have an autopatch test for Falcon. It's fine to keep it separate like we do in other frameworks, so all good here.



root_tracer.set_service_info('root', 'root', AppTypes.web)
with root_tracer.trace('root') as root:
    propagator = HTTPPropagator()
    propagator.inject(root.context, headers)
Reviewer:

Can't you just send plain headers here? Such as:

@unittest_run_loop
@asyncio.coroutine
def test_distributed_tracing(self):
    # activate distributed tracing
    self.app['datadog_trace']['distributed_tracing_enabled'] = True
    tracing_headers = {
        'x-datadog-trace-id': '100',
        'x-datadog-parent-id': '42',
    }
    request = yield from self.client.request('GET', '/', headers=tracing_headers)
    eq_(200, request.status)
    text = yield from request.text()
    eq_("What's tracing?", text)
    # the trace is created
    traces = self.tracer.writer.pop_traces()
    eq_(1, len(traces))
    eq_(1, len(traces[0]))
    span = traces[0][0]
    # with the right trace_id and parent_id
    eq_(span.trace_id, 100)
    eq_(span.parent_id, 42)
    eq_(span.get_metric(SAMPLING_PRIORITY_KEY), None)

@unittest_run_loop
@asyncio.coroutine
def test_distributed_tracing_with_sampling_true(self):
    self.tracer.priority_sampler = RateSampler(0.1)
    # activate distributed tracing
    self.app['datadog_trace']['distributed_tracing_enabled'] = True
    tracing_headers = {
        'x-datadog-trace-id': '100',
        'x-datadog-parent-id': '42',
        'x-datadog-sampling-priority': '1',
    }
    request = yield from self.client.request('GET', '/', headers=tracing_headers)
    eq_(200, request.status)
    text = yield from request.text()
    eq_("What's tracing?", text)
    # the trace is created
    traces = self.tracer.writer.pop_traces()
    eq_(1, len(traces))
    eq_(1, len(traces[0]))
    span = traces[0][0]
    # with the right trace_id and parent_id
    eq_(100, span.trace_id)
    eq_(42, span.parent_id)
    eq_(1, span.get_metric(SAMPLING_PRIORITY_KEY))

@unittest_run_loop
@asyncio.coroutine
def test_distributed_tracing_with_sampling_false(self):
    self.tracer.priority_sampler = RateSampler(0.9)
    # activate distributed tracing
    self.app['datadog_trace']['distributed_tracing_enabled'] = True
    tracing_headers = {
        'x-datadog-trace-id': '100',
        'x-datadog-parent-id': '42',
        'x-datadog-sampling-priority': '0',
    }
    request = yield from self.client.request('GET', '/', headers=tracing_headers)
    eq_(200, request.status)
    text = yield from request.text()
    eq_("What's tracing?", text)
    # the trace is created
    traces = self.tracer.writer.pop_traces()
    eq_(1, len(traces))
    eq_(1, len(traces[0]))
    span = traces[0][0]
    # with the right trace_id and parent_id
    eq_(100, span.trace_id)
    eq_(42, span.parent_id)
    eq_(0, span.get_metric(SAMPLING_PRIORITY_KEY))

@unittest_run_loop
@asyncio.coroutine
def test_distributed_tracing_disabled_default(self):
    # pass headers for distributed tracing
    tracing_headers = {
        'x-datadog-trace-id': '100',
        'x-datadog-parent-id': '42',
    }
    request = yield from self.client.request('GET', '/', headers=tracing_headers)
    eq_(200, request.status)
    text = yield from request.text()
    eq_("What's tracing?", text)
    # the trace is created
    traces = self.tracer.writer.pop_traces()
    eq_(1, len(traces))
    eq_(1, len(traces[0]))
    span = traces[0][0]
    # distributed tracing must be ignored by default
    ok_(span.trace_id != 100)
    ok_(span.parent_id != 42)

@unittest_run_loop
@asyncio.coroutine
def test_distributed_tracing_sub_span(self):
    self.tracer.priority_sampler = RateSampler(1.0)
    # activate distributed tracing
    self.app['datadog_trace']['distributed_tracing_enabled'] = True
    tracing_headers = {
        'x-datadog-trace-id': '100',
        'x-datadog-parent-id': '42',
        'x-datadog-sampling-priority': '0',
    }
    request = yield from self.client.request('GET', '/sub_span', headers=tracing_headers)
    eq_(200, request.status)
    text = yield from request.text()
    eq_("OK", text)
    # the trace is created
    traces = self.tracer.writer.pop_traces()
    eq_(1, len(traces))
    eq_(2, len(traces[0]))
    span, sub_span = traces[0][0], traces[0][1]
    # with the right trace_id and parent_id
    eq_(100, span.trace_id)
    eq_(42, span.parent_id)
    eq_(0, span.get_metric(SAMPLING_PRIORITY_KEY))
    # check parenting is OK with custom sub-span created within server code
    eq_(100, sub_span.trace_id)
    eq_(span.span_id, sub_span.parent_id)
    eq_(None, sub_span.get_metric(SAMPLING_PRIORITY_KEY))

I would prefer to decouple the inject and extract methods in the test. If inject fails or does the wrong thing, we may wrongly conclude that extract is working (or broken).
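A minimal sketch of the decoupled version for the Falcon suite, hard-coding the propagation headers instead of calling inject() (the route and the falcon.testing fixture details are assumptions):

from falcon import testing
from nose.tools import eq_

class DistributedTracingTestCase(testing.TestCase):
    def test_distributed_tracing_enabled(self):
        # hard-coded headers mean a broken inject() can no longer mask extract()
        headers = {
            'x-datadog-trace-id': '100',
            'x-datadog-parent-id': '42',
        }
        self.simulate_get('/200', headers=headers)
        traces = self.tracer.writer.pop_traces()
        span = traces[0][0]
        eq_(span.trace_id, 100)
        eq_(span.parent_id, 42)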

Reviewer:

I would also replicate the same test for the disabled state. I think checking the tests in my previous link could help improve the covered cases (some of them may be mergeable, since they were added across multiple iterations).

Contributor Author:

Agreed on hard-coding the headers.

I'm not sure what you mean about 'disabled states'. I added a test to ensure headers aren't turned into a parent when distributed tracing is disabled.

Are there other useful cases I can take from the aiohttp tests you linked? I looked at the sampling tests, but I'm not sure they apply, since none of my code looks at sampling.

palazzem previously approved these changes Mar 19, 2018

@palazzem left a comment:

Thanks for porting distributed tracing to Falcon! All good to me!


        eq_(traces[0][0].parent_id, 42)
        eq_(traces[0][0].trace_id, 100)

    def test_distributed_tracing_disabled(self):

Reviewer:

👍


@palazzem dismissed their stale review March 19, 2018 09:45

flake8 doesn't pass

@palazzem left a comment:

I think we should just fix flake8 and we're good:

ddtrace/contrib/falcon/middleware.py:3:1: F401 'ddtrace.tracer' imported but unused
ddtrace/contrib/falcon/middleware.py:12:5: F811 redefinition of unused 'tracer' from line 3
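For context, a toy reproduction of the two error classes flagged here (invented shape; not the actual contents of middleware.py):

# module top
from ddtrace import tracer   # an import that nothing in the module uses -> F401

def tracer():                # a later definition reusing the same name -> F811
    pass

# fix: delete the unused module-level import so the name is bound only once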

@willgittoes-dd (Contributor Author):
ahaha, had linting enabled in my editor but didn't actually have flake8 installed 🙃

@palazzem left a comment:

Cool! good to go!

@palazzem changed the base branch from master to 0.12-dev March 20, 2018 10:59
@palazzem added this to the 0.12.0 milestone Mar 20, 2018
@willgittoes-dd merged commit 5aa0cbf into 0.12-dev Mar 20, 2018
@willgittoes-dd deleted the willgittoes-dd/falcon-distributed-tracing branch March 20, 2018 22:26