Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use full URL in http.url tag #2265

Merged
merged 9 commits into from
Sep 23, 2022
Merged

Use full URL in http.url tag #2265

merged 9 commits into from
Sep 23, 2022

Conversation

lloeki
Copy link
Contributor

@lloeki lloeki commented Sep 9, 2022

What does this PR do?

Adjust HTTP URL tags to comply with span tag unification RFC

  • align http.url with tag naming RFC (and Rack's #url) to include the
    complete request URL
  • drop http.base_url, now unneeded
  • preserve REQUEST_URI logic that differs from Rack's logic
  • add SCRIPT_NAME which was previously unhandled for PATH_INFO
  • append QUERY_STRING when using PATH_INFO to match REQUEST_URI

Enable query string quantization by default

This change is mandated by the unifying tag RFC.

Motivation

Compliance with internal RFC document.

@lloeki lloeki force-pushed the use-full-url-in-http-url-tag branch from 353e405 to 952447d Compare September 9, 2022 15:18
delner
delner previously requested changes Sep 9, 2022
Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wait on this pending internal discussion. Thanks!

- add a { base: :show } quantization option to enable the behaviour
- align http.url with tag naming RFC (and Rack's #url) to include the
  complete request URL
- drop http.base_url when not unneeded
- preserve REQUEST_URI logic that differs from Rack's logic
- add SCRIPT_NAME which was previously unhandled for PATH_INFO
- append QUERY_STRING when using PATH_INFO to match REQUEST_URI
@lloeki
Copy link
Contributor Author

lloeki commented Sep 13, 2022

Adjust HTTP URL tags to comply with span tag unification RFC

This is now made optional via a quantization option.

drop http.base_url, now unneeded

This is now only dropped when http.url includes the URL base.

append QUERY_STRING when using PATH_INFO to match REQUEST_URI

This fixes a discrepancy between the happy path and the fallback.

Enable query string quantization by default

This is now removed from the PR.

@lloeki
Copy link
Contributor Author

lloeki commented Sep 13, 2022

Save for one test case which was adjusted for an inconsistency that is now resolved, the behaviour does not change, thus rake spec:rack tests pass right away.

I shall now add a couple of test cases to validate the new behaviour when the option is set.

Note: there seems to be leakage between spec cases, which made this test
case flaky, with quantization option being set to show foo (example
seed: 63620). The test case was previously unaffected by this leakage
because it could not handle query strings.
Rack http.url tag now properly takes SCRIPT_NAME into account in the
PATH_INFO case, consistently with REQUEST_URI. These expectations did
not account for that.
@lloeki lloeki marked this pull request as ready for review September 15, 2022 10:41
@lloeki lloeki requested a review from a team September 15, 2022 10:41
@@ -74,7 +74,7 @@
spans.each do |span|
if span.name == Datadog::Tracing::Contrib::Rack::Ext::SPAN_REQUEST
expect(span.resource).to eq('GET /endpoint')
expect(span.get_tag(Datadog::Tracing::Metadata::Ext::HTTP::TAG_URL)).to eq('/endpoint')
expect(span.get_tag(Datadog::Tracing::Metadata::Ext::HTTP::TAG_URL)).to eq('/one/endpoint')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why these Sinatra tests have changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tags actually come from Rack, not Sinatra. I believe they're here because at some point Sinatra may have been backfilling this value by itself. In any case they act as an integration test checking for side effects.

Sadly this test would previously pass but not represent what happens when real-world REQUEST_URI is set, indeed Rack::Test does not set REQUEST_URI.

The change here is mainly due to a bug that this PR addresses, that previously made the fallback to PATH_INFO be inconsistent with REQUEST_URI (i.e it did not include SCRIPT_NAME nor QUERY_STRING), which created all kinds of special cases from the lack of query string to adding base_url tag + url tag not resulting in the same value depending on whether REQUEST_URI or PATH_INFO was used, not the least in the latter case base_url tag + url tag is entirely nonsensical for virtualhosted or nested apps since it misses the intermediate SCRIPT_NAME.

**Configuring URL quantization behavior**

```ruby
Datadog.configure do |c|
# Default behavior: all values are quantized, fragment is removed.
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by
# http://example.com/path?categories[]=1&categories[]=2 --> http://example.com/path?categories[]
Copy link
Contributor Author

@lloeki lloeki Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironically, the doc here was consistent with the internal Quantization::HTTP.url but not with the user-facing Rack integration http.url tagging behaviour!

@@ -23,6 +23,8 @@ module Rack
# in the Rack environment so that it can be retrieved by the underlying
# application. If request tags are not set by the app, they will be set using
# information available at the Rack level.
#
# rubocop:disable Metrics/ClassLength
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get the argument for method length (with exceptions), but I'm wondering if there's really such a thing as a class with "too much lines".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I've ever decided to break down a class when Metrics/ClassLength is triggered, only ever added the cop exclusion tag.

If you want to remove the cop, I'd ✅ it.

Copy link
Contributor Author

@lloeki lloeki Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there are not too many of them, so it can reasonably fit into this PR. Will do.

Comment on lines +172 to +175
options = configuration[:quantize] || {}

# Quantization::HTTP.url base defaults to :show, but we are transitioning
options[:base] ||= :exclude
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be done here, as doing it as a config default would be silently overridden when :quantize is manually set, which would be most surprising to the user.

Probably a general problem with such nested configuration keys.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, this will be removed in the next major release, is that right?

If so, I suggest writing them down in the comment, so we can find it later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, can you create a new file named docs/NextReleaseUpgradeGuide.md (name suggestion only), just add one section:

# Breaking changes

To aid with the next release process, please tag future breaking changes the inline code comment `# DEV-2.0:` prefix. 

## List of breaking changes

* `http.url` will change its default...

We've been meaning to do it, and it seems like your PR wants to have a breaking change in the next release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this, I suggest adding the comment in this line with the prefix # DEV-2.0:.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points! Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll do that in another PR, along with this suggestion: #2283 (comment)

@@ -47,12 +47,24 @@
it { is_expected.to eq('http://example.com/path') }
end

context 'with show: :all' do
context 'with fragment: :show' do
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seemed to be a typo/eager copy-pasting.

- restore the previous default behaviour of `Quantization::HTTP.url`.
- also, make an effort to keep the Rack integration default as not
  preserving the URL base, restricting the transition code for the
  upcoming breaking change to that integration.
@lloeki
Copy link
Contributor Author

lloeki commented Sep 16, 2022

Hmm there's flakiness still:

# pass
bundle exec appraisal ruby-3.1.1-contrib rake spec:rack['--seed 39851']
# fail
bundle exec appraisal ruby-3.1.1-contrib rake spec:rack['--seed 17324']

Had one like that before, seems like quantize: config is leaking from one spec example to the next.

EDIT: confirmed leakage!

$ bundle exec appraisal ruby-3.1.1-contrib rake spec:rack['--seed 17324']
[1] pry(#<RSpec::ExampleGroups::RackIntegrationTests::ForAnApplication::WithABasicRoute::POSTRequest::WithoutParameters>)> Datadog.configuration.tracing[:rack][:quantize]
=> {:base=>:show}
[2] pry(#<RSpec::ExampleGroups::RackIntegrationTests::ForAnApplication::WithABasicRoute::POSTRequest::WithoutParameters>)> rack_options
=> {}

@codecov-commenter
Copy link

Codecov Report

Merging #2265 (a5ff117) into master (0e92706) will increase coverage by 0.00%.
The diff coverage is 99.49%.

@@           Coverage Diff            @@
##           master    #2265    +/-   ##
========================================
  Coverage   97.56%   97.57%            
========================================
  Files        1087     1093     +6     
  Lines       56965    57328   +363     
========================================
+ Hits        55580    55939   +359     
- Misses       1385     1389     +4     
Impacted Files Coverage Δ
.../datadog/tracing/contrib/rack/header_collection.rb 93.75% <93.75%> (ø)
lib/datadog/tracing/contrib/rack/middlewares.rb 99.20% <97.67%> (+0.16%) ⬆️
lib/datadog/core/configuration/settings.rb 98.94% <100.00%> (+0.04%) ⬆️
lib/datadog/core/header_collection.rb 100.00% <100.00%> (ø)
lib/datadog/tracing/client_ip.rb 100.00% <100.00%> (ø)
lib/datadog/tracing/configuration/ext.rb 100.00% <100.00%> (ø)
...datadog/tracing/contrib/utils/quantization/http.rb 92.06% <100.00%> (+1.49%) ⬆️
lib/datadog/tracing/metadata/ext.rb 100.00% <100.00%> (ø)
spec/datadog/core/header_collection_spec.rb 100.00% <100.00%> (ø)
spec/datadog/tracing/client_ip_spec.rb 100.00% <100.00%> (ø)
... and 8 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks 👍 after the latest changes.

In practice this cop seems to be disabled as soon as we reach the limit.
@lloeki
Copy link
Contributor Author

lloeki commented Sep 23, 2022

Alright, all comments addressed (thanks @marcotc!), waiting for CI.

@delner I'll proceed by dismissing your review since the concern has been discussed internally and addressed.

@lloeki lloeki dismissed delner’s stale review September 23, 2022 11:25

Discussed internally, adressed.

@lloeki
Copy link
Contributor Author

lloeki commented Sep 23, 2022

All good, merging 🎉

@lloeki lloeki merged commit 6970088 into master Sep 23, 2022
@lloeki lloeki deleted the use-full-url-in-http-url-tag branch September 23, 2022 11:47
@github-actions github-actions bot added this to the 1.5.0 milestone Sep 23, 2022
@gingerlime
Copy link
Contributor

Hi @lloeki. Thanks for the fix. I'm testing dd-trace 1.5.1 and still see the same issue unfortunately.

root@host:/var/local/app/app/dependabot/bundler/ddtrace-1.5.1 %)# curl "http://localhost:3000/|"

Puma caught this error: bad URI(is not URI?): "/|" (URI::InvalidURIError)
/usr/local/lib/ruby/3.0.0/uri/rfc3986_parser.rb:67:in `split'
/usr/local/lib/ruby/3.0.0/uri/rfc3986_parser.rb:72:in `parse'
/usr/local/lib/ruby/3.0.0/uri/rfc3986_parser.rb:106:in `convert_to_uri'
/usr/local/lib/ruby/3.0.0/uri/generic.rb:1100:in `merge'
/usr/local/lib/ruby/3.0.0/uri/rfc3986_parser.rb:78:in `inject'
/usr/local/lib/ruby/3.0.0/uri/rfc3986_parser.rb:78:in `join'
/usr/local/lib/ruby/3.0.0/uri/common.rb:208:in `join'
/usr/local/bundle/gems/ddtrace-1.5.1/lib/datadog/tracing/contrib/rack/middlewares.rb:274:in `parse_url'
/usr/local/bundle/gems/ddtrace-1.5.1/lib/datadog/tracing/contrib/rack/middlewares.rb:167:in `set_request_tags!'
/usr/local/bundle/gems/ddtrace-1.5.1/lib/datadog/tracing/contrib/rack/middlewares.rb:110:in `call'
/usr/local/bundle/gems/rack-mini-profiler-3.0.0/lib/mini_profiler/profiler.rb:249:in `call'
/usr/local/bundle/gems/railties-7.0.4/lib/rails/engine.rb:530:in `call'
/usr/local/bundle/gems/puma-5.6.5/lib/puma/configuration.rb:252:in `call'
/usr/local/bundle/gems/puma-5.6.5/lib/puma/request.rb:77:in `block in handle_request'
/usr/local/bundle/gems/puma-5.6.5/lib/puma/thread_pool.rb:340:in `with_force_shutdown'
/usr/local/bundle/gems/puma-5.6.5/lib/puma/request.rb:76:in `handle_request'
/usr/local/bundle/gems/puma-5.6.5/lib/puma/server.rb:443:in `process_client'
/usr/local/bundle/gems/puma-5.6.5/lib/puma/thread_pool.rb:147:in `block in spawn_thread'

@lloeki
Copy link
Contributor Author

lloeki commented Oct 20, 2022

Thanks @gingerlime, it seems the line triggering this comes from the other PR #2310

# /usr/local/bundle/gems/ddtrace-1.5.1/lib/datadog/tracing/contrib/rack/middlewares.rb:274:in `parse_url'
::URI.join(base_url, fullpath).to_s

There are specs supposed to cover the triggering case and prevent a regression but it seems they did not trigger (maybe they don't mimic puma+rack enough). I'll look into it.

@gingerlime
Copy link
Contributor

@lloeki yeah, sorry, I wasn't sure where to post this. I already linked to this comment on the other PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants