Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for regular expression matching and sanitizing of headers in ASGI. #1333

Merged
merged 3 commits into from
Oct 24, 2022

Conversation

rogersd
Copy link
Contributor

@rogersd rogersd commented Sep 14, 2022

Next part of #1184

Description

Add test cases for regular expression matching.
Add test cases for "all" keyword.
Add test cases for header sanitizing.
Add documentation for regular expression matching and header sanitation.
Various documentation cleanups and standardization.
Fix keys() in class ASGIGetter so it returns the HTTP header keys instead of a list of available request data. This makes it consistent with the WSGIGetter keys() method.
Make ASGIGetter.get() compare all keys in a case insensitive manner.

Fixes #1184

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

  • Unit tests added.

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@rogersd rogersd requested a review from a team as a code owner September 14, 2022 16:26
custom_headers.strip()
for custom_headers in custom_headers.split(",")
]
if custom_headers.strip().lower() == "all":
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires regular expression matching of headers to work. I'll be adding this to one instrumentation at a time (ASGI in this one).

In theory, if all the instrumentations don't get updated in the same release, this could cause problems if someone is capturing a header named All, but I think the odds of that are low.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the odds are low, I suggest not having this all option. Since the regex already does the job, I would add a note on how to capture all the headers in the doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@rogersd
Copy link
Contributor Author

rogersd commented Sep 14, 2022

@srikanthccv Could you please take a look at this?

]
if not decoded:
return None
return decoded

def keys(self, carrier: dict) -> typing.List[str]:
return list(carrier.keys())
return [
_key.decode("utf8") for (_key, _value) in carrier.get("headers")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The carrier here is expected to be headers dict so the carrier.keys() should be fine. If the caller is incorrectly using then I think fix should be implemented there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I left the decode() in but I took the other part out. Let me know if you want that part removed too.

custom_headers.strip()
for custom_headers in custom_headers.split(",")
]
if custom_headers.strip().lower() == "all":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the odds are low, I suggest not having this all option. Since the regex already does the job, I would add a note on how to capture all the headers in the doc.

Comment on lines 353 to 371
custom_request_headers_regex_compiled = re.compile(
"|".join("^" + i + "$" for i in custom_request_headers_name),
re.IGNORECASE,
)

for header_name in list(
filter(
custom_request_headers_regex_compiled.match,
asgi_getter.keys(scope),
)
):
header_values = asgi_getter.get(scope, header_name.lower())
if header_values:
key = normalise_request_header_name(header_name.lower())
attributes[key] = [
sanitize.sanitize_header_value(
header=header_name, value=header_values[0]
)
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this to util-http package so that it will be easier to use in other instrumentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@rogersd rogersd force-pushed the drogers-wildcard-headers branch 3 times, most recently from 95b881d to 29ed503 Compare October 17, 2022 21:53
… in ASGI.

Next part of open-telemetry#1184

Details:

Add test cases for regular expression matching.
Add test cases for header sanitizing.
Add documentation for regular expression matching and header sanitation.
Various documentation cleanups and standardization.
Fix keys() in class ASGIGetter so it returns the HTTP header keys instead of a list of available request data. This makes it consistent with the WSGIGetter keys() method.
Make ASGIGetter.get() compare all keys in a case insensitive manner.
Move TestCustomHeaders to a separate file to avoid module length error.
@rogersd
Copy link
Contributor Author

rogersd commented Oct 18, 2022

@srikanthccv I think this should be ready to go, sorry for the delay!

@rogersd
Copy link
Contributor Author

rogersd commented Oct 24, 2022

@srikanthccv Thanks for the review! May you please merge this when you have a minute so that I can open the equivalent PR for WGSI?

Thanks!

@srikanthccv srikanthccv merged commit 9d6ba63 into open-telemetry:main Oct 24, 2022
saartochner-lumigo pushed a commit to lumigo-io/opentelemetry-python-contrib that referenced this pull request Nov 13, 2022
saartochner-lumigo pushed a commit to lumigo-io/opentelemetry-python-contrib that referenced this pull request Nov 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add more features for adding HTTP request / response headers to spans.
2 participants