Handle URLs with invalid characters #2311

Merged (4 commits) on Oct 17, 2022

@lloeki (Contributor) commented Oct 12, 2022

What does this PR do?

Handle additional URL corner cases.

Motivation

Fixes #2309

Additional Notes

Not entirely sure about the expected placeholder behaviour but it's an undocumented feature.

How to test the change?

Specs.

@lloeki requested a review from a team October 12, 2022 15:06
@github-actions bot added the integrations (Involves tracing integrations) and tracing labels Oct 12, 2022

@codecov-commenter commented Oct 12, 2022

Codecov Report

Merging #2311 (46c2fc0) into master (f5984e8) will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2311      +/-   ##
==========================================
- Coverage   97.58%   97.54%   -0.05%     
==========================================
  Files        1094     1076      -18     
  Lines       57400    56765     -635     
==========================================
- Hits        56015    55369     -646     
- Misses       1385     1396      +11     
Impacted Files                                          Coverage Δ
...iling_native_extension/native_extension_helpers.rb   96.15% <ø> (ø)
...ontrib/sidekiq/server_internal_tracer/heartbeat.rb   36.00% <0.00%> (-28.00%) ⬇️
...datadog/tracing/contrib/utils/quantization/http.rb   83.11% <0.00%> (-4.56%) ⬇️
lib/ddtrace/transport/http/env.rb                       92.85% <0.00%> (-3.58%) ⬇️
...adog/core/configuration/agent_settings_resolver.rb   96.75% <0.00%> (-2.60%) ⬇️
lib/datadog/core/diagnostics/environment_logger.rb      98.41% <0.00%> (-1.59%) ⬇️
lib/datadog/core/configuration/settings.rb              98.41% <0.00%> (-0.53%) ⬇️
spec/support/container_helpers.rb                       99.55% <0.00%> (-0.45%) ⬇️
...ng/contrib/active_support/cache/instrumentation.rb   87.58% <0.00%> (-0.09%) ⬇️
lib/datadog/core.rb                                     87.50% <0.00%> (ø)
... and 27 more


@@ -13,21 +13,24 @@
include Kernel # Ensure that kernel methods are always available (https://sorbet.org/docs/error-reference#7003)

PLACEHOLDER = '?'.freeze
RFC3986_URL_BASE = /\A(?<URI>(?<scheme>[A-Za-z][+\-.0-9A-Za-z]*):(?<hier-part>\/\/(?<authority>(?:(?<userinfo>(?:%\h\h|[!$&-.0-;=A-Z_a-z~])*)@)?(?<host>(?<IP-literal>\[(?:(?<IPv6address>(?:\h{1,4}:){6}(?<ls32>\h{1,4}:\h{1,4}|(?<IPv4address>(?<dec-octet>[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]|\d)\.\g<dec-octet>\.\g<dec-octet>\.\g<dec-octet>))|::(?:\h{1,4}:){5}\g<ls32>|\h{1,4}?::(?:\h{1,4}:){4}\g<ls32>|(?:(?:\h{1,4}:)?\h{1,4})?::(?:\h{1,4}:){3}\g<ls32>|(?:(?:\h{1,4}:){,2}\h{1,4})?::(?:\h{1,4}:){2}\g<ls32>|(?:(?:\h{1,4}:){,3}\h{1,4})?::\h{1,4}:\g<ls32>|(?:(?:\h{1,4}:){,4}\h{1,4})?::\g<ls32>|(?:(?:\h{1,4}:){,5}\h{1,4})?::\h{1,4}|(?:(?:\h{1,4}:){,6}\h{1,4})?::)|(?<IPvFuture>v\h+\.[!$&-.0-;=A-Z_a-z~]+))\])|\g<IPv4address>|(?<reg-name>(?:%\h\h|[!$&-.0-9;=A-Z_a-z~])*))(?::(?<port>\d*))?)))(?:\/|\z)/

Check warning (Code scanning / CodeQL): Overly permissive regular expression range

Suspicious character range that is equivalent to [0-9:;].
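To see why CodeQL describes the range this way, a small Ruby check (not part of the PR) can enumerate which ASCII characters the range 0-; actually matches. The variable name `matched` is only illustrative.

```ruby
# Illustrative check, not code from the PR: list every ASCII character
# that the character class [0-;] accepts.
matched = (0..127).map(&:chr).select { |c| c.match?(/[0-;]/) }

puts matched.join # → 0123456789:;
```

In RFC 3986, userinfo may contain both ":" and ";", so the wide range looks deliberate in the userinfo class. The reg-name class in the same pattern spells out 0-9; instead and so leaves ":" out. This suggests the warning may be a false positive here, though the thread does not say so explicitly.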

Contributor Author

Taken from Ruby 3.1.2's lib/uri/rfc3986_parser.rb and adjusted to stop at the path.
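As a rough illustration of what "stop at the path" buys, the sketch below uses a much simpler, hypothetical pattern (`SIMPLE_URL_BASE`, not the PR's `RFC3986_URL_BASE`) that only looks at the scheme and authority. The example URL and variable names are assumptions made for the demonstration. The point is that a prefix-only match can still recover the base of a URL whose path contains characters that make `URI.parse` raise.

```ruby
require 'uri'

# Hypothetical, simplified stand-in for RFC3986_URL_BASE: a scheme, "://",
# then an authority made of anything up to the first "/", "?" or "#".
# The match ends at the start of the path (or at the end of the string).
SIMPLE_URL_BASE = %r{\A(?<scheme>[A-Za-z][+\-.0-9A-Za-z]*)://(?<authority>[^/?#]*)(?:/|\z)}

# Assumed example input: the space in the path is not a valid URI character.
url = 'http://example.com/some path'

# Strict parsing of the whole URL fails.
parse_error =
  begin
    URI.parse(url)
    nil
  rescue URI::InvalidURIError => e
    e
  end

# Matching only the prefix still yields a usable base.
match = SIMPLE_URL_BASE.match(url)
base = match && "#{match[:scheme]}://#{match[:authority]}"

puts parse_error.class # → URI::InvalidURIError
puts base              # → http://example.com
```

The full pattern in the diff validates the authority far more strictly (userinfo, IP literals, port). This sketch only shows the general idea of anchoring at the start and ending the match before the path.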

@ivoanjo (Member) left a comment

This seems reasonable; I've left a couple of comments. I'm especially curious about the behavior change around exclude.

@lloeki merged commit a30d495 into master Oct 17, 2022
@lloeki deleted the fix-invalid-url-exception branch October 17, 2022 13:06
@github-actions bot added this to the 1.6.0 milestone Oct 17, 2022
Development

Successfully merging this pull request may close these issues.

dd-trace causes bad URI(is not URI?) with invalid URIs
4 participants