-
Notifications
You must be signed in to change notification settings - Fork 401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support space separated DD_TAGS value #2196
Conversation
5a05650
to
d1f5fac
Compare
2eec0f2
to
f6256de
Compare
ddtrace/utils/formats.py
Outdated
|
||
if not sep: | ||
log.error( | ||
"Malformed tag string: '%s'.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this have more helpful error message?
Should this raise? We only parse during startup right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more descriptive error message could be good but this hard to guess at when we don't know for sure which separator the user is intending. I could change it to: "Failed to find separator for tag string: '%s'."
.
Being that it is internal, I think we can continue as is and allow it to catch the errors and log appropriately. If we are to raise, I think we should take that up in a follow-up since that'll require a bit of work elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By more descriptive, we could also do something like:
Failed to find separate for tag string: %r.
Tag string must be comma or space separated
one:value,two:value,three:value
one:value two:value three:value
Saying comma or space separated with an example of how it is suppose to look (even if we don't use their string/values as an example) could be helpful, otherwise that error message means you have to go look up in the docs (if we wanted to add an "errors" section to the docs and have a link to what this error message means in long form... that could be fine too... ? maybe?)
ddtrace/utils/formats.py
Outdated
for tag in tags_str.split(","): | ||
# Identify separator based on which successfully identifies the correct | ||
# number of valid tags | ||
numtags = tags_str.count(":") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT of
n = tags.count(":")
for sep in ", ":
if n == sum(":" in _ for _ in tags.split(sep)):
break
else:
log.error(...)
return parsed_tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I'm not sure this is even correct. Perhaps the better thing to do is
for sep in ", ":
if all(_.count(":") == 1 for _ in tags.split(sep)):
break
else:
log.error(...)
return parsed_tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only (minor) issue with the all(...)
solution is that it would consider key1,key2,key3
an error having to do with determining the separator when it's that the tags themselves are not well-formed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, forget what I said, the original approach is indeed fine! Sorry for all the confusion! 😞
3ad1b5f
to
374f04e
Compare
if sum(1 for _ in tags_str.split(sep) if _.count(":") == 1) == numtagseps: | ||
break | ||
else: | ||
log.error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From looking at internal usage, I am still team raise an error:
$ git grep parse_tags_str
ddtrace/bootstrap/sitecustomize.py:25:from ddtrace.utils.formats import parse_tags_str
ddtrace/bootstrap/sitecustomize.py:63: modules = parse_tags_str(modules_to_patch)
ddtrace/bootstrap/sitecustomize.py:113: tracer.set_tags(parse_tags_str(env_tags))
ddtrace/profiling/exporter/http.py:18:from ddtrace.utils.formats import parse_tags_str
ddtrace/profiling/exporter/http.py:65: parse_tags_str(os.environ.get("DD_TAGS")).items(),
ddtrace/profiling/exporter/http.py:66: parse_tags_str(os.environ.get("DD_PROFILING_TAGS")).items(),
ddtrace/settings/config.py:9:from ..utils.formats import parse_tags_str
ddtrace/settings/config.py:56: self.tags = parse_tags_str(os.getenv("DD_TAGS") or "")
ddtrace/settings/config.py:65: self.service_mapping = parse_tags_str(get_env("service", "mapping", default=""))
But we can handle as a follow-up.
ddtrace/utils/formats.py
Outdated
# number of valid tags | ||
numtagseps = tags_str.count(":") | ||
for sep in [",", " "]: | ||
if sum(1 for _ in tags_str.split(sep) if _.count(":") == 1) == numtagseps: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so totally my fault, you had the logic right before my last rushed comments! I'm making this final suggestion, but feel free to ignore.
if sum(1 for _ in tags_str.split(sep) if _.count(":") == 1) == numtagseps: | |
if sum(":" in _ for _ in tags_str.split(sep)) == numtagseps: |
8991391
to
cf3ab12
Compare
@@ -97,7 +101,29 @@ def parse_tags_str(tags_str): | |||
if not tags_str: | |||
return parsed_tags | |||
|
|||
for tag in tags_str.split(","): | |||
if _TAGS_NOT_SEPARATED.search(tags_str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about "key:value"
?
This regex will cause that use case to not work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex is for two colons without a valid separator in between. There is a test case for the simple case you gave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wait... I think I completely misread this... this is looking to see if there is more than 1 tag without a space or comma?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, like "key2:val1:"
test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this check? does it help short circuit our parsing algorithm so we don't waste time parsing something that shouldn't, or does it catch an edge case that our parsing algorithm doesn't catch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's an edge case that would otherwise be treated as a problem with separators, which it might not be the case, e.g., key:val key2:val_with_a_colon_:
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question but otherwise lgtm
for sep in [",", " "]: | ||
if sum(":" in _ for _ in tags_str.split(sep)) == numtagseps: | ||
break | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this else
block indented properly? What condition is this opposing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The else is there to catch the case that no valid separator has been identified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I wasn't familiar with the for | else
clause until now, thanks for clarifying 😄
|
||
from .deprecation import deprecation | ||
|
||
|
||
_TAGS_NOT_SEPARATED = re.compile(r":[^,\s]+:") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth having a comment for this IMO for those who can't read regex (me)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to provide an example or two of what we are trying to test for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is documented already in the parse_tags_str
whereas this comment is meant to be a gloss on the regex to make it easy to read.
--- | ||
features: | ||
- | | ||
Support space-separated `DD_TAGS` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Support space-separated `DD_TAGS` | |
Add support for space-separated `DD_TAGS`. |
Codecov Report
@@ Coverage Diff @@
## master #2196 +/- ##
==========================================
- Coverage 91.60% 91.59% -0.01%
==========================================
Files 569 569
Lines 40803 40791 -12
==========================================
- Hits 37378 37364 -14
- Misses 3425 3427 +2
Continue to review full report at Codecov.
|
7d1a974
to
8f174dd
Compare
8f174dd
to
15dd506
Compare
* fix typo in filter debug line (#2213) Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * feat: support space separated DD_TAGS value (#2196) Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * Add sync mode to AgentWriter (#2206) * feat(agent): support aws lambda extension * chore: fix flake8 issue with test * chore: add release note for datadog lambda extension * chore: fix import order * chore: fix flake8 issue * [serverless] fix broken test * remove unused import * separate out sync writer changes * add integration tests for sync writer Co-authored-by: DarcyRaynerDD <darcy.rayner@datadoghq.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * ci: run django test suite with ddtrace enabled (#2203) * Run django test suite with ddtrace enabled * Added pylibmc install to fix build error * Added django/ddtrace install commands to separate blocks * Fixed annotation * Installed generic version of ddtrace instead of checking out * Added documentation * Changed working directory, removed inherently failing test * Reverted working directory change * Reverted skipping specific test, added pythonpath * Added more docs Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * ci: run flask test suite with ddtrace enabled (#2190) * ci: run flask test suite with ddtrace enabled * Update .github/workflows/flask.yml Co-authored-by: Brett Langdon <me@brett.is> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com> Co-authored-by: Brett Langdon <me@brett.is> Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> * docs: add note about pip version requirement (#2220) * handle logwriter configure (#2219) Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * test: move __init__ to utils (#2217) Having code in tests/__init__ makes ddtrace being imported before we have a chance to do anything, such as calling gevent.monkey.patch_all. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * refactor: distributed_tracing_enabled usages to trace_utils helper (#2216) * Modified trace util helper to account for inconsistencies in distributed_tracing config key, refactored usages * Refactored traceutils helper * Made int_config optional in the trace_util helper, updated affected integration files * Removed erroneous code addition to pyramid/trace.py * Make tracer tests less noisy (#2221) monkeypatch.setenv warns if the value given is not an instance of the base string type. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * refactor(worker): replace regular thread by PeriodicThread (#2205) Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com> * Disable known failing Django tests (#2224) * Disable known failing Django tests * fix yaml multiline string Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * test(botocore): add test case for firehose args (#2227) This change adds a test case for #2204 to ensure that the params.Records argument is excluded from the span tags. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * refactor: move ddtrace.profiling._attr to ddtrace.utils (#2230) Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * Adding urllib3 integration (#1148) * draft of urllib3 integration * black formatting * set request tags on the span before the call is made in case of errors * rename span name and support retry tags based on CR feedback * add tests and fix a few small bugs * fix flake8 * add tests and circleci config * flake8 fixes * Update ddtrace/contrib/urllib3/__init__.py Co-Authored-By: Tahir H. Butt <tahir.butt@datadoghq.com> * use dd compat for urlparse and unparse * use SpanType constants for type of span use SpanTypes.HTTP * fix tests * fix flake8 from merge conflict resolution * adapt to latest changes on master * adopt latest standards for integrations Use Pin to add patch information to urllib3 requests. Also use riot for configuring the test environment * apply checklist points * wip: make tests pass * set tags on exception also fixed the tests to override HTTP configuration where required * docs(urllib3): add urllib3 integration docs * fix service name handling * fix copy-pasta errors and improve documentation * Update ddtrace/contrib/urllib3/__init__.py Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> * Update riotfile.py Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> * remove unused imports from test sources * use the new get_argument_value utility * Update tests/contrib/urllib3/test_urllib3.py Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> * Update ddtrace/contrib/urllib3/patch.py Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> * commit Tahir's suggestions * fix TracerTestCase import Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com> Co-authored-by: Gabriele N. Tornetta <gabriele.tornetta@datadoghq.com> Co-authored-by: Gabriele N. Tornetta <P403n1x87@users.noreply.github.com> Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com> * Revert writer.write argument change (#2232) This is a breaking change which we're going to make at a later date. * Make processor private on Tracer (#2233) This isn't something we want to make public yet. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * fix(profiling): make sure PeriodicCollector start withing service lock (#2214) This makes sure the periodic collectors are started while the service lock is held to avoid race conditions while starting the services. This has no impact in practice since the Profiler class was made thread safe with #2179. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * refactor(forksafe): simplify code (#2222) This simplifies the code to use a banal decorator approach rather than a weird lambda-passed-as-argument-based 😉 It also adds typing and an `unregister` function to allow for… unregistering a function! Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * chore: correct typing hints for span.meta (#2237) Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * Lambda extension support (#2092) * feat(agent): support aws lambda extension * chore: fix flake8 issue with test * chore: add release note for datadog lambda extension * chore: fix import order * chore: fix flake8 issue * [serverless] fix broken test * remove unused import * Add docs, type hints Co-authored-by: Kyle Verhoog <kyle.verhoog@datadoghq.com> Co-authored-by: Kyle Verhoog <kyle@verhoog.ca> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com> * fix(span): ensure ID args have the correct type (#2243) Among all the arguments of the Span constructor, the ones bearing IDs are the most important type-wise. This change adds checks to ensure that span_id, trace_id and parent_id all have the expected type. The rationale for adding the check at construction time is so that this error could be caught as soon as possible, rather than defer it to later processing stages, thus avoiding invalid entries in the processing queue. Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * fix: only warn if using uwsgi with ddtrace-run (#2202) * fix: only warn if using uwsgi with ddtrace-run * remove DeprecationWarning prefix * update message Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com> * Update ISSUE_TEMPLATE.md (#2248) * Use multiproject pipeline to deploy to reliability env (#2249) * Use multiproject pipeline for reliability env deploy * Use default branch Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> * style(pymongo): format pymongo (#2250) * fix: handle text types in _set_str_tag (#2218) * chore: correct typing hints for span.meta * fix: handle text types in _set_str_tag * chore: correct typing hints for span.meta * fix: handle text types in _set_str_tag * fix: handle text types in _set_str_tag * flake8 * add doc string, fix test * fix grpc traceback tag Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com> Co-authored-by: Gabriele N. Tornetta <P403n1x87@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com> Co-authored-by: DarcyRaynerDD <darcy.rayner@datadoghq.com> Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com> Co-authored-by: Julien Danjou <julien.danjou@datadoghq.com> Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com> Co-authored-by: Brett Langdon <me@brett.is> Co-authored-by: Gabriele N. Tornetta <P403n1x87@users.noreply.github.com> Co-authored-by: Justin <justin.james.iso@gmail.com> Co-authored-by: Gabriele N. Tornetta <gabriele.tornetta@datadoghq.com> Co-authored-by: Darcy Rayner <50632605+DarcyRaynerDD@users.noreply.github.com> Co-authored-by: Laplie Anderson <randomanderson@users.noreply.github.com>
Description
Closes #2195
Checklist