Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add http.route tag to rack #3345

Merged
merged 39 commits into from
Feb 8, 2024
Merged

Conversation

lloeki
Copy link
Contributor

@lloeki lloeki commented Dec 21, 2023

2.0 Upgrade Guide notes

What does this PR do?

  • Gathers routing information from key integrations (rails, grape, and sinatra)
  • Sets http.route tag on the root rack span

Motivation:

API Catalog + API Security need it.

Additional Notes:

We gather the route information from the child spans with various integration specific techniques before propagating the information upwards via trace tags. The root span (rack) is the last one to call trace.set_tag meaning that the child information should get propagated, extracted, then overwritten.

How to test the change?

The rake task routetest encompasses all testing done by the http_route_spec.rb file

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

@lloeki lloeki requested a review from a team as a code owner December 21, 2023 14:24
@github-actions github-actions bot added integrations Involves tracing integrations tracing labels Dec 21, 2023
@lloeki lloeki requested a review from a team December 21, 2023 14:24
@lloeki
Copy link
Contributor Author

lloeki commented Dec 21, 2023

Things that should be made working (if they don't work already):

  • get '/foo/:id' => 'mycontroller#myaction' (rails.route should == /whatevs/:id/foo)
  • resource 'whatevs' do get 'foo' => 'mycontroller#myaction' end and stuff like that (rails.route should == /whatevs/:id/foo)
  • mount '/foo' => Rack app || Sinatra app || Rails engine (rails.route should == /foo)
  • root 'mycontroller#myaction' (rails.route should == /)
  • whatever happens when no route matches
  • this is for Rails 7, I think it works the same for some earlier major versions but will probably fail on Rails 3 and 4 and maybe fail on Rails 5 (depending on when they started to use Journey / if Journey changed, which I can't recall)

@marcotc
Copy link
Member

marcotc commented Jan 5, 2024

I read the requirements and I think we should have the tag http.route only on the Rack span, for Rack-based web frameworks. The http.route is always scoped by service, so routes-by-service should be canonical with the top-most instrumented web span (Rack in this case).

This way there's no ambiguity to which route is effective for a service.

When Rack is nested with another Rack stack, the last one (top-most Rack span) will win and count towards the Rack route for the service. We'd still set the route tag on all Rack spans that we create, but the API Catalog would only read the top-level one. This is for a trace with a service and multiple Rack stacks.
If there are multiple services, then each top-most Rack span would be read by the API Catalog naturally, so things would still work as expected.

@lloeki
Copy link
Contributor Author

lloeki commented Jan 10, 2024

I read the requirements and I think we should have the tag http.route only on the Rack span, for Rack-based web frameworks. The http.route is always scoped by service, so routes-by-service should be canonical with the top-most instrumented web span (Rack in this case).

That is my understanding as well. It makes no sense to have the global route on the deeper spans, although I must stress how much it does make sense to tag the {sinatra,grape,rails}.route intermediate routes as resolved by each resolver,e specially when nesting things.

This way there's no ambiguity to which route is effective for a service.

When Rack is nested with another Rack stack, the last one (top-most Rack span) will win and count towards the Rack route for the service.

That is my implementation, modulo that we need to leverage SCRIPT_NAME (according to SCRIPT_NAME and PATH_INFO as specified by the Rack spec) to recompose the entire composite route pattern from the resolved route pattern which comes from the nested PATH_INFO.

We'd still set the route tag on all Rack spans that we create, but the API Catalog would only read the top-level one.

Correct, although given all the possible ways one can nest applications again I'd like to stress how important the intermediate router specific tags matter, but only on their specific router spans.

@lloeki lloeki requested a review from a team as a code owner January 10, 2024 13:04
@zarirhamza
Copy link
Contributor

zarirhamza commented Jan 10, 2024

Most TODOs are addressed but things that are left:

  • Need to confirm that Thread.current is the best way to proceed with propagating values
  • Determine if it's even feasible to change the resource name for the spans without breaking dashboards
  • Consult with project owners if we want to strip the format from the route (.:format, .json) to make things similar to other languages and Sinatra
  • TESTING

Good news is that even with the most complicated rack->rails->sinatra setups we have the route correctly available
image

@zarirhamza
Copy link
Contributor

zarirhamza commented Jan 11, 2024

Update: The main functionality is now thoroughly tested for all kinds of configurations and we now strip the formatting/json string at the end of routes.

I added some barebones testing but we need to discuss how we will actually test this. Also note that a lot of current tests are failing for unexpected reasons that might need looking into. Some versions of rails system tests keep failing and older versions of rails keep failing. I presume it might be because JourneyRouter may not be used by those older versions of rails but I'm not experienced enough to know for sure

Copy link
Contributor

@TonyCTHsu TonyCTHsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are quite some tests missing.

  1. Add tests for the changes on grape, sinatra, rack
  2. Add test case for path containing path parameter
  3. Add test case for integrated behaviour, for example: mounting Rails engine or Sinatra app on Rails app

Consider those scenarios written in test suite would help me understand the outcome.

@@ -107,6 +107,7 @@
gemfile(true) do
source 'https://rubygems.org'
gem 'spring', '>= 2.0.2'
gem 'concurrent-ruby', '#{Gem.loaded_specs['concurrent-ruby'].version}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask how is this change relevant to routing instrumentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm..I found the commit 82aaa02

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently this fixes a system test failure according to @marcotc

@@ -41,6 +52,9 @@ def before_initialize(app)
# Sometimes we don't want to activate middleware e.g. OpenTracing, etc.
add_middleware(app) if Datadog.configuration.tracing[:rails][:middleware]

# ActionDispatch::Journey is not available or incompatible in Rails < 4.2.
ActionDispatch::Journey::Router.prepend(JourneyRouterPatch) if Integration.version >= Gem::Version.new('4.2')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like this belongs to ActionDispatch/ActionPack instead of Rails. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds sensible. This is the kind of decision I cannot take, not being a stakeholder in Tracing.

module JourneyRouterPatch
def find_routes(*args)
result = super
integration_route = result.first[2].path.spec.to_s.gsub(/\(\.:format\)/, '') if result.any?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If result.any? returns false, is it still required to push into Thread.current[:datadog_http_routing]?

Copy link
Contributor Author

@lloeki lloeki Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on whether you want to record the trip to the router even though it failed to match, which can come in really useful when it comes to debugging.

IIRC I added the condition solely to make it not crash.

lib/datadog/tracing/contrib/rails/ext.rb Outdated Show resolved Hide resolved
@TonyCTHsu TonyCTHsu self-requested a review January 17, 2024 12:37
@lloeki
Copy link
Contributor Author

lloeki commented Jan 17, 2024

I believe there are quite some tests missing.

Certainly. Reminder that my involvement here is limited to "make it work", as I had a clear path towards an implementation based on prior experience doing the exact same thing.

Overall, steps are:

  • Make it work (on me, done)
  • Make it right (on Zarir, with APM Tracing help)
  • Make it fast (if needed)

Add test case for integrated behaviour, for example: mounting Rails engine or Sinatra app on Rails app

Using https://github.com/lloeki/minimal-rack/tree/datadog as a base, such tests should cover at a minimum:

curl -vvv 127.0.0.1:3000/hello/world                                   # rails route (get) => rails controller
curl -vvv 127.0.0.1:3000/rack/hello/world                              # rails route (mount) => rack app
curl -vvv 127.0.0.1:3000/sinatra/hello/world                           # rails route (mount) => sinatra app
curl -vvv 127.0.0.1:3000/grape/hello/world                             # rails route (mount) => grape app
curl -vvv 127.0.0.1:3000/rack/sinatra/hello/world                      # rails route (mount) => rack app (map) => sinatra app
curl -vvv 127.0.0.1:3000/rack/grape/hello/world                        # rails route (mount) => rack app (map) => grape app
curl -vvv 127.0.0.1:3000/engine/simple/hello/world                     # rails route (mount) => rails engine (route) => rails engine controller
curl -vvv 127.0.0.1:3000/engine/endpoint/hello/world                   # rails route (mount) => rails engine (endpoint) => rack app
curl -vvv 127.0.0.1:3000/engine/endpoint/grape/hello/world             # rails route (mount) => rails engine (endpoint) => rack app (map) => grape app
curl -vvv 127.0.0.1:3000/engine/endpoint/sinatra/hello/world           # rails route (mount) => rails engine (endpoint) => rack app (map) => sinatra app
curl -vvv 127.0.0.1:3000/rack/rack/rack/rack/rack/hello/world          # deep rack calls
curl -vvv 127.0.0.1:3000/rack/engine/endpoint/sinatra/hello/world      # rails mounts rack app at `/rack`, rack app maps `engine/endpoint` to rails engine (which is just a rack app itself) that wraps rack app, which maps `/sinatra` to sinatra app

... as well as some 404 ones (because some parts may route, thus generate a legitimate and expected routing span, and a downstream router return 404)

I would also recommend looking at existing resource names and naming logic, because having resource names ending up being such as HelloController#world is only correct for the ActionPack spans, or GrapeAPI GET //:id is outright buggy. The backpropagating logic of the resource name up to the Rack span is very odd and breaks in many occasions.

Copy link

@hestonhoffman hestonhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No review necessary from the docs team

Comment on lines 108 to 110
if Thread.current.key?(:datadog_http_routing) && Thread.current[:datadog_http_routing].is_a?(Array)
Thread.current[:datadog_http_routing] << [:grape, endpoint.env['SCRIPT_NAME'], integration_route]
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if Thread.current.key?(:datadog_http_routing) && Thread.current[:datadog_http_routing].is_a?(Array)
Thread.current[:datadog_http_routing] << [:grape, endpoint.env['SCRIPT_NAME'], integration_route]
end
trace.set_tag(Tracing::Metadata::Ext::HTTP::TAG_ROUTE, endpoint.env['SCRIPT_NAME'] + integration_route)

Would this work instead?

@marcotc marcotc merged commit 84738dc into master Feb 8, 2024
219 checks passed
@marcotc marcotc deleted the add-rails-routing-instrumentation branch February 8, 2024 21:13
@github-actions github-actions bot added this to the 1.21.0 milestone Feb 8, 2024
Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update this PR title and description? Looking at the contents, it seems we ended up changing a lot more than just the Rails routing (e.g. this change is relevant for customers not using Rails as well!)

_, path = env['sinatra.route'].split(' ', 2)
trace.set_tag(Tracing::Metadata::Ext::HTTP::TAG_ROUTE, path)
trace.set_tag(Tracing::Metadata::Ext::HTTP::TAG_ROUTE_PATH, env['SCRIPT_NAME'])
# binding.pry
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover pry call >_> 😉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3456

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrations Involves tracing integrations tracing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants