Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support space separated DD_TAGS value #2196

Merged
merged 3 commits into from
Mar 23, 2021

Conversation

majorgreys
Copy link
Contributor

@majorgreys majorgreys commented Mar 18, 2021

Description

Closes #2195

Checklist

  • Added to the correct milestone.
  • Tests provided or description of manual testing performed is included in the code or PR.
  • Library documentation is updated.
  • Corp site documentation is updated (link to the PR).

@majorgreys majorgreys added this to the 0.48.0 milestone Mar 18, 2021
@majorgreys majorgreys requested a review from a team as a code owner March 18, 2021 17:58
@majorgreys majorgreys force-pushed the ddtags-space branch 3 times, most recently from 5a05650 to d1f5fac Compare March 19, 2021 14:06
ddtrace/utils/formats.py Outdated Show resolved Hide resolved
@majorgreys majorgreys force-pushed the ddtags-space branch 3 times, most recently from 2eec0f2 to f6256de Compare March 22, 2021 17:45
ddtrace/utils/formats.py Outdated Show resolved Hide resolved

if not sep:
log.error(
"Malformed tag string: '%s'.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have more helpful error message?

Should this raise? We only parse during startup right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more descriptive error message could be good but this hard to guess at when we don't know for sure which separator the user is intending. I could change it to: "Failed to find separator for tag string: '%s'.".

Being that it is internal, I think we can continue as is and allow it to catch the errors and log appropriately. If we are to raise, I think we should take that up in a follow-up since that'll require a bit of work elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By more descriptive, we could also do something like:

Failed to find separate for tag string: %r.
Tag string must be comma or space separated
one:value,two:value,three:value
one:value two:value three:value

Saying comma or space separated with an example of how it is suppose to look (even if we don't use their string/values as an example) could be helpful, otherwise that error message means you have to go look up in the docs (if we wanted to add an "errors" section to the docs and have a link to what this error message means in long form... that could be fine too... ? maybe?)

for tag in tags_str.split(","):
# Identify separator based on which successfully identifies the correct
# number of valid tags
numtags = tags_str.count(":")
Copy link
Contributor

@P403n1x87 P403n1x87 Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT of

    n = tags.count(":")
    for sep in ", ":
        if n == sum(":" in _ for _ in tags.split(sep)):
            break
    else:
        log.error(...)
        return parsed_tags

Copy link
Contributor

@P403n1x87 P403n1x87 Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I'm not sure this is even correct. Perhaps the better thing to do is

    for sep in ", ":
        if all(_.count(":") == 1 for _ in tags.split(sep)):
            break
    else:
        log.error(...)
        return parsed_tags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only (minor) issue with the all(...) solution is that it would consider key1,key2,key3 an error having to do with determining the separator when it's that the tags themselves are not well-formed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, forget what I said, the original approach is indeed fine! Sorry for all the confusion! 😞

@majorgreys majorgreys force-pushed the ddtags-space branch 3 times, most recently from 3ad1b5f to 374f04e Compare March 22, 2021 18:36
if sum(1 for _ in tags_str.split(sep) if _.count(":") == 1) == numtagseps:
break
else:
log.error(
Copy link
Member

@brettlangdon brettlangdon Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From looking at internal usage, I am still team raise an error:

$ git grep parse_tags_str
ddtrace/bootstrap/sitecustomize.py:25:from ddtrace.utils.formats import parse_tags_str
ddtrace/bootstrap/sitecustomize.py:63:    modules = parse_tags_str(modules_to_patch)
ddtrace/bootstrap/sitecustomize.py:113:        tracer.set_tags(parse_tags_str(env_tags))
ddtrace/profiling/exporter/http.py:18:from ddtrace.utils.formats import parse_tags_str
ddtrace/profiling/exporter/http.py:65:                parse_tags_str(os.environ.get("DD_TAGS")).items(),
ddtrace/profiling/exporter/http.py:66:                parse_tags_str(os.environ.get("DD_PROFILING_TAGS")).items(),
ddtrace/settings/config.py:9:from ..utils.formats import parse_tags_str
ddtrace/settings/config.py:56:        self.tags = parse_tags_str(os.getenv("DD_TAGS") or "")
ddtrace/settings/config.py:65:        self.service_mapping = parse_tags_str(get_env("service", "mapping", default=""))

But we can handle as a follow-up.

# number of valid tags
numtagseps = tags_str.count(":")
for sep in [",", " "]:
if sum(1 for _ in tags_str.split(sep) if _.count(":") == 1) == numtagseps:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so totally my fault, you had the logic right before my last rushed comments! I'm making this final suggestion, but feel free to ignore.

Suggested change
if sum(1 for _ in tags_str.split(sep) if _.count(":") == 1) == numtagseps:
if sum(":" in _ for _ in tags_str.split(sep)) == numtagseps:

P403n1x87
P403n1x87 previously approved these changes Mar 22, 2021
P403n1x87
P403n1x87 previously approved these changes Mar 22, 2021
@@ -97,7 +101,29 @@ def parse_tags_str(tags_str):
if not tags_str:
return parsed_tags

for tag in tags_str.split(","):
if _TAGS_NOT_SEPARATED.search(tags_str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about "key:value" ?

This regex will cause that use case to not work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex is for two colons without a valid separator in between. There is a test case for the simple case you gave.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wait... I think I completely misread this... this is looking to see if there is more than 1 tag without a space or comma?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, like "key2:val1:" test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this check? does it help short circuit our parsing algorithm so we don't waste time parsing something that shouldn't, or does it catch an edge case that our parsing algorithm doesn't catch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's an edge case that would otherwise be treated as a problem with separators, which it might not be the case, e.g., key:val key2:val_with_a_colon_:.

Copy link
Contributor

@Yun-Kim Yun-Kim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question but otherwise lgtm

for sep in [",", " "]:
if sum(":" in _ for _ in tags_str.split(sep)) == numtagseps:
break
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this else block indented properly? What condition is this opposing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else is there to catch the case that no valid separator has been identified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I wasn't familiar with the for | else clause until now, thanks for clarifying 😄


from .deprecation import deprecation


_TAGS_NOT_SEPARATED = re.compile(r":[^,\s]+:")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth having a comment for this IMO for those who can't read regex (me)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to provide an example or two of what we are trying to test for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is documented already in the parse_tags_str whereas this comment is meant to be a gloss on the regex to make it easy to read.

---
features:
- |
Support space-separated `DD_TAGS`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Support space-separated `DD_TAGS`
Add support for space-separated `DD_TAGS`.

@codecov-io
Copy link

Codecov Report

Merging #2196 (1d49dda) into master (2c4e4ce) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2196      +/-   ##
==========================================
- Coverage   91.60%   91.59%   -0.01%     
==========================================
  Files         569      569              
  Lines       40803    40791      -12     
==========================================
- Hits        37378    37364      -14     
- Misses       3425     3427       +2     
Impacted Files Coverage Δ
ddtrace/utils/formats.py 96.29% <100.00%> (+0.94%) ⬆️
tests/tracer/test_utils.py 100.00% <100.00%> (ø)
ddtrace/contrib/sanic/patch.py 95.23% <0.00%> (-1.20%) ⬇️
ddtrace/profiling/exporter/http.py 98.91% <0.00%> (-1.09%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c4e4ce...1d49dda. Read the comment docs.

P403n1x87
P403n1x87 previously approved these changes Mar 23, 2021
Kyle-Verhoog
Kyle-Verhoog previously approved these changes Mar 23, 2021
@Yun-Kim Yun-Kim merged commit 9982931 into DataDog:master Mar 23, 2021
Kyle-Verhoog added a commit that referenced this pull request Apr 1, 2021
* fix typo in filter debug line (#2213)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* feat: support space separated DD_TAGS value (#2196)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* Add sync mode to AgentWriter (#2206)

* feat(agent): support aws lambda extension

* chore: fix flake8 issue with test

* chore: add release note for datadog lambda extension

* chore: fix import order

* chore: fix flake8 issue

* [serverless] fix broken test

* remove unused import

* separate out sync writer changes

* add integration tests for sync writer

Co-authored-by: DarcyRaynerDD <darcy.rayner@datadoghq.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* ci: run django test suite with ddtrace enabled (#2203)

* Run django test suite with ddtrace enabled

* Added pylibmc install to fix build error

* Added django/ddtrace install commands to separate blocks

* Fixed annotation

* Installed generic version of ddtrace instead of checking out

* Added documentation

* Changed working directory, removed inherently failing test

* Reverted working directory change

* Reverted skipping specific test, added pythonpath

* Added more docs

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* ci: run flask test suite with ddtrace enabled (#2190)

* ci: run flask test suite with ddtrace enabled

* Update .github/workflows/flask.yml

Co-authored-by: Brett Langdon <me@brett.is>

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Brett Langdon <me@brett.is>
Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* docs: add note about pip version requirement (#2220)

* handle logwriter configure (#2219)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* test: move __init__ to utils (#2217)

Having code in tests/__init__ makes ddtrace being imported before we have a
chance to do anything, such as calling gevent.monkey.patch_all.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* refactor: distributed_tracing_enabled usages to trace_utils helper (#2216)

* Modified trace util helper to account for inconsistencies in distributed_tracing config key, refactored usages

* Refactored traceutils helper

* Made int_config optional in the trace_util helper, updated affected integration files

* Removed erroneous code addition to pyramid/trace.py

* Make tracer tests less noisy (#2221)

monkeypatch.setenv warns if the value given is not an instance of the
base string type.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* refactor(worker): replace regular thread by PeriodicThread (#2205)

Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>

* Disable known failing Django tests (#2224)

* Disable known failing Django tests

* fix yaml multiline string

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* test(botocore): add test case for firehose args (#2227)

This change adds a test case for #2204 to ensure that the params.Records
argument is excluded from the span tags.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* refactor: move ddtrace.profiling._attr to ddtrace.utils (#2230)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* Adding urllib3 integration (#1148)

* draft of urllib3 integration

* black formatting

* set request tags on the span before the call is made in case of errors

* rename span name and support retry tags based on CR feedback

* add tests and fix a few small bugs

* fix flake8

* add tests and circleci config

* flake8 fixes

* Update ddtrace/contrib/urllib3/__init__.py

Co-Authored-By: Tahir H. Butt <tahir.butt@datadoghq.com>

* use dd compat for urlparse and unparse

* use SpanType constants for type of span

use SpanTypes.HTTP

* fix tests

* fix flake8 from merge conflict resolution

* adapt to latest changes on master

* adopt latest standards for integrations

Use Pin to add patch information to urllib3 requests. Also use riot for configuring the test environment

* apply checklist points

* wip: make tests pass

* set tags on exception

also fixed the tests to override HTTP configuration where required

* docs(urllib3): add urllib3 integration docs

* fix service name handling

* fix copy-pasta errors and improve documentation

* Update ddtrace/contrib/urllib3/__init__.py

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* Update riotfile.py

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* remove unused imports from test sources

* use the new get_argument_value utility

* Update tests/contrib/urllib3/test_urllib3.py

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* Update ddtrace/contrib/urllib3/patch.py

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* commit Tahir's suggestions

* fix TracerTestCase import

Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Co-authored-by: Gabriele N. Tornetta <gabriele.tornetta@datadoghq.com>
Co-authored-by: Gabriele N. Tornetta <P403n1x87@users.noreply.github.com>
Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>

* Revert writer.write argument change (#2232)

This is a breaking change which we're going to make at a later date.

* Make processor private on Tracer (#2233)

This isn't something we want to make public yet.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* fix(profiling): make sure PeriodicCollector start withing service lock (#2214)

This makes sure the periodic collectors are started while the service lock is
held to avoid race conditions while starting the services.

This has no impact in practice since the Profiler class was made thread safe
with #2179.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* refactor(forksafe): simplify code (#2222)

This simplifies the code to use a banal decorator approach rather than a weird
lambda-passed-as-argument-based 😉

It also adds typing and an `unregister` function to allow for… unregistering a function!

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* chore: correct typing hints for span.meta (#2237)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* Lambda extension support (#2092)

* feat(agent): support aws lambda extension

* chore: fix flake8 issue with test

* chore: add release note for datadog lambda extension

* chore: fix import order

* chore: fix flake8 issue

* [serverless] fix broken test

* remove unused import

* Add docs, type hints

Co-authored-by: Kyle Verhoog <kyle.verhoog@datadoghq.com>
Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>

* fix(span): ensure ID args have the correct type (#2243)

Among all the arguments of the Span constructor, the ones bearing IDs
are the most important type-wise. This change adds checks to ensure that
span_id, trace_id and parent_id all have the expected type.

The rationale for adding the check at construction time is so that this
error could be caught as soon as possible, rather than defer it to later
processing stages, thus avoiding invalid entries in the processing
queue.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* fix: only warn if using uwsgi with ddtrace-run (#2202)

* fix: only warn if using uwsgi with ddtrace-run

* remove DeprecationWarning prefix

* update message

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>

* Update ISSUE_TEMPLATE.md (#2248)

* Use multiproject pipeline to deploy to reliability env (#2249)

* Use multiproject pipeline for reliability env deploy

* Use default branch

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* style(pymongo): format pymongo (#2250)

* fix: handle text types in _set_str_tag (#2218)

* chore: correct typing hints for span.meta

* fix: handle text types in _set_str_tag

* chore: correct typing hints for span.meta

* fix: handle text types in _set_str_tag

* fix: handle text types in _set_str_tag

* flake8

* add doc string, fix test

* fix grpc traceback tag

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Gabriele N. Tornetta <P403n1x87@users.noreply.github.com>

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Co-authored-by: DarcyRaynerDD <darcy.rayner@datadoghq.com>
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
Co-authored-by: Julien Danjou <julien.danjou@datadoghq.com>
Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Brett Langdon <me@brett.is>
Co-authored-by: Gabriele N. Tornetta <P403n1x87@users.noreply.github.com>
Co-authored-by: Justin <justin.james.iso@gmail.com>
Co-authored-by: Gabriele N. Tornetta <gabriele.tornetta@datadoghq.com>
Co-authored-by: Darcy Rayner <50632605+DarcyRaynerDD@users.noreply.github.com>
Co-authored-by: Laplie Anderson <randomanderson@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DD_TAGS separator inconsistent with heroku-buildpack-datadog
6 participants