Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Squash nested sinatra/rack request span #2217

Merged
merged 17 commits into from
Nov 7, 2022
Merged

Conversation

TonyCTHsu
Copy link
Contributor

@TonyCTHsu TonyCTHsu commented Aug 12, 2022

Motivation

For Sinatra/Rack instrumentation, the middleware stack architecture renders nested flamegraph for span. Although it is technically correct, the user experience with such UI is suboptimal.

Screenshot 2022-08-22 at 15 09 54


What does this PR do?

Instead of instrumenting nested middleware stack of Sinatra or Rack, the implementation changes to only instrument the top of the stack. This means the UI would display only one rack.request and sinatra.request.

Some changes followed after skipping the subsequent middlewares:

  1. sinatra.app.name tag is removed from the sinatra.request span, but still present for sinatra.route span. Since the app name only make sense when a route is matched by its individual app.
  2. When no route matched, sinatra.request span would no longer tagged with sinatra.route.path, since none of the route matched(sinatra.route span is absent as well).

Other improvement: Configuration is all it needs to instrument Sinatra, no longer need to register for each Sinatra app.


For Classic Sinatra app

Screenshot 2022-08-12 at 14 39 08


For Modular Sinatra app with Sinatra::Router

class Acme < Sinatra::Base
  # # Use Sinatra::Router to mount different modular Sinatra applications
  use Sinatra::Router do
    mount ::Health
    mount ::Basic
  end

modular-router


For Modular Sinatra app chaining with use

class Acme < Sinatra::Base
  # # Use Sinatra App as middleware
  use Health
  use Basic

modular-use

@TonyCTHsu TonyCTHsu self-assigned this Aug 12, 2022
@TonyCTHsu TonyCTHsu marked this pull request as ready for review August 12, 2022 12:46
@TonyCTHsu TonyCTHsu requested a review from a team August 12, 2022 12:46
Comment on lines -14 to -15
RACK_ENV_MIDDLEWARE_START_TIME = 'datadog.sinatra_middleware_start_time'.freeze
RACK_ENV_MIDDLEWARE_TRACED = 'datadog.sinatra_middleware_traced'.freeze
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be careful renaming/removing constants like this. This may constitute a breaking change. Might be safer to flag this as deprecated and keep the existing constants for now.

request_span = env[Ext::RACK_ENV_REQUEST_SPAN]
request_span && request_span[app]
def datadog_span(env)
env[Ext::RACK_ENV_SINATRA_REQUEST_SPAN]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer stored multiple spans, but one.

@@ -52,18 +52,6 @@
erb :msg, locals: { msg: 'hello' }
end

get '/erb_manual_injection' do
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used

ivoanjo added a commit that referenced this pull request Aug 22, 2022
While reviewing #2217 where we fixed the sinatra documentation not to
mention the outdated `use`, I decided to make a quick check if we had
any more such leftovers, and it turns out we did.

This PR fixes the two leftover mentions of `use` that I found that
should be replaced by `instrument`. I did not change the sinatra docs
to avoid a conflict with #2217.
Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few notes from my first pass at the PR

integration/apps/sinatra2-modular/app/basic.rb Outdated Show resolved Hide resolved
lib/datadog/tracing/contrib/rack/middlewares.rb Outdated Show resolved Hide resolved
integration/apps/sinatra2-modular/app/parent.rb Outdated Show resolved Hide resolved
Comment on lines 45 to 46
if use_script_names
env[::Rack::SCRIPT_NAME].to_s + path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about this change -- we previously used request.script_name but now we directly parse it from the env. Could you add some context on why this option is needed?

Copy link
Contributor Author

@TonyCTHsu TonyCTHsu Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not change the behaviour of our instrumentation.

Sinatra::Request is a children of Rack::Request, while #script_name is still delegate to rack env with SCRIPT_NAME. I changed the implementation depends on what kind of object I could access within the current scope.

This option is introduced here.

Personally, I think this option should move to rack instrumentation and enabled(true) by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think this option should move to rack instrumentation

Actually, would it impact rack? I thought pure-rack spans get named based on the verb + status code (e.g. "GET 200") so how would we represent the script name there?

and enabled(true) by default.

Fully agree on this. Perhaps worth looking into for 1.4.0 or 1.5.0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change(resource change), so probably plan for the next major release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I wouldn't call it a breaking change. As the famous xkcd comic states:

...every change has the potential to break someone. So if we applied it in the extreme every change is a potential breaking change. New integration? There's more spans now -- it's different. Fixed a typo in a resource? Someone relied on the typo -- it's different.

So I wouldn't consider that a breaking change, or at least one that would need waiting until the next major version.

But... this is just a suggestion, e.g. you're welcome to disagree and since tracing is your domain, you have more say in it than I do ;)

Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nuances to this change.

Biggest concern is it may break AppSec (which had some special use for Rack). Let's verify with @lloeki.

2nd concern is: should we always short-circuit Rack instrumentation if a Rack span is already present? This seems unduly restrictive, possibly undesirable, even if its good for Sinatra. Let's clarify this is what we want.

@@ -54,16 +54,18 @@ def call(env)
# Find out if this is rack within rack
previous_request_span = env[Ext::RACK_ENV_REQUEST_SPAN]

return @app.call(env) if previous_request_span
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work okay with AppSec? We're now short circuiting Rack instrumentation whenever there was a Rack span before. My understanding was AppSec produces its own Rack span as well... would it get skipped? CC @lloeki

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my I think this is going to be a problem. In fact I think the whole PR might be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion, it should be fine with a passing test suite.

lib/datadog/tracing/contrib/sinatra/env.rb Show resolved Hide resolved
Comment on lines -14 to -15
RACK_ENV_MIDDLEWARE_START_TIME = 'datadog.sinatra_middleware_start_time'.freeze
RACK_ENV_MIDDLEWARE_TRACED = 'datadog.sinatra_middleware_traced'.freeze
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be careful renaming/removing constants like this. This may constitute a breaking change. Might be safer to flag this as deprecated and keep the existing constants for now.

lib/datadog/tracing/contrib/sinatra/tracer.rb Show resolved Hide resolved
@github-actions github-actions bot added the integrations Involves tracing integrations label Oct 20, 2022
@marcotc
Copy link
Member

marcotc commented Oct 24, 2022

This PR returns the ddtrace Sinatra instrumentation to a similar visibility granularity to ddtrace 0.x series.

Sinatra traces will look like this after the changes, as explained by @TonyCTHsu:

modular-router

This is the best representation we are able to create of a Sinatra request, given all the possible internal complex interactions between Sinatra and Rack, and it looks as close to the user application as possible.

This looks very good to me.

As far as I understand, this seems to cause issues to AppSec, is that right @lloeki? If so, what are the issue we are worried about here? We still have Rack and Sinatra spans, plus a nested sinatra.route span. I'm not sure I see any missing information here that would prevent us from correctly implementing AppSec here.

@TonyCTHsu TonyCTHsu added this to the 1.6.0 milestone Nov 7, 2022
@@ -1,6 +1,7 @@
require 'sinatra/base'
require 'sinatra/router'
require 'ddtrace'
require 'pry'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a few pry references left in this PR. Should be be kept?

Comment on lines -48 to -65
# TODO: `route` should *only* be populated if @datadog_route is defined.
# TODO: If @datadog_route is not defined, then this Sinatra app is not responsible
# TODO: for handling this request.
# TODO:
# TODO: This change would be BREAKING for any Sinatra app (classic or modular),
# TODO: as it affects the `resource` value for requests not handled by the Sinatra app.
# TODO: Currently we use "#{method} #{path}" in such aces, but `path` is the raw,
# TODO: high-cardinality HTTP path, and can contain PII.
# TODO:
# TODO: The value we should use as the `resource` when the Sinatra app is not
# TODO: responsible for the request is a tricky subject.
# TODO: The best option is a value that clearly communicates that this app did not
# TODO: handle this request. It's important to keep in mind that an unhandled request
# TODO: by this Sinatra app might still be handled by another Rack middleware (which can
# TODO: be a Sinatra app itself) or it might just 404 if not handled at all.
# TODO:
# TODO: A possible value for `resource` could set a high level description, e.g.
# TODO: `request.request_method`, given we don't have the response object available yet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Nice removal!

Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrations Involves tracing integrations tracing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants