
Getting failed to send traces to Datadog Agent when DD_TRACE_ENABLED=false #4179

Closed
fredrikaverpil opened this issue Sep 8, 2022 · 37 comments


fredrikaverpil commented Sep 8, 2022

Which version of dd-trace-py are you using?

1.4.4

Which version of pip are you using?

22.2.2

Which version of the libraries are you using?

N/A

How can we reproduce your problem?

N/A

What is the result that you get?

Even with DD_TRACE_ENABLED=false I see this error repeatedly when running code locally:

ddtrace.internal.logger:handle:158 - failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces

My code calls e.g. span.set_tags(...) and span.set_exc_info(...) (roughly as sketched below), but I wouldn't expect that to cause traces to be sent to Datadog.

Do you have any idea what could be triggering a trace to be sent, even if DD_TRACE_ENABLED=false?
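
For reference, the manual span usage is along these lines (a minimal sketch; the span name, service and tag values are made up, not the actual application code):

import sys

from ddtrace import tracer

def run_job():
    # Illustrative only: span name, service and tag values are placeholders.
    with tracer.trace("worker.job", service="example-service") as span:
        span.set_tags({"job.kind": "example"})
        try:
            ...  # the real work would go here
        except Exception:
            span.set_exc_info(*sys.exc_info())
            raise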

What is the result that you expected?

I would not expect an attempt to send traces to Datadog when DD_TRACE_ENABLED=false.

More info:

ERROR    ddtrace.internal.writer:writer.py:567 failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces
Traceback (most recent call last):
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 446, in _send_payload
    response = self._put(payload, headers)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 398, in _put
    self._conn.request("PUT", self._endpoint, data, headers)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 941, in connect
    self.sock = self._create_connection(
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 559, in flush_queue
    self._retry_upload(self._send_payload, encoded, n_traces)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 361, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x110f12470 state=finished raised ConnectionRefusedError>]
Yun-Kim (Contributor) commented Sep 8, 2022

Hi @fredrikaverpil, I'm trying to reproduce this on my end but I'm not seeing the first error log ddtrace.internal.logger:handle:158 - failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces that you mentioned. Can you clarify where exactly you're seeing that error and, if possible, add that traceback to this issue? Thanks!

fredrikaverpil (Author) commented Sep 8, 2022

OK, I'm pretty sure I found what is causing this traceback to appear.

I have a custom logger, which is based on loguru. When setting up this logger, I also intercept all standard library logging:

import logging
from loguru import logger as loguru_logger  # aliased; the handler below refers to loguru_logger

class LoguruInterceptHandler(logging.Handler):
    """Intercept standard logging messages toward Loguru sinks.

    Notes:
        This will e.g. make it possible to see the SQL debug logs from
        ORMs such as Peewee.
    """

    def emit(self, record):
        # Get corresponding Loguru level if it exists
        try:
            level = loguru_logger.level(record.levelname).name
        except ValueError:
            level = record.levelno

        # Find caller from where originated the logged message
        frame, depth = logging.currentframe(), 2
        while frame.f_code.co_filename == logging.__file__:
            frame = frame.f_back
            depth += 1

        loguru_logger.opt(depth=depth, exception=record.exc_info).bind(
            metadata={"intercepted_from_std_logging": True}
        ).log(
            level,
            str(record.getMessage()),
        )

def setup_interceptor():
    # intercept everything at the root logger
    logging.root.handlers = [LoguruInterceptHandler()]
    logging.root.setLevel("INFO")

    # remove every other logger's handlers
    # and propagate to root logger
    for name in logging.root.manager.loggerDict.keys():
        logging.getLogger(name).handlers = []
        logging.getLogger(name).propagate = True


setup_interceptor()
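
Once setup_interceptor() has run, any standard-library log record is re-emitted through the Loguru sinks, e.g. (illustrative, assuming the setup above):

import logging

logging.getLogger("some.library").error("this now ends up in the Loguru sinks")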

This code executes late in my service, i.e. after most libraries have already created their loggers.

When I leave this out, the traceback no longer appears in the terminal when e.g. running pytest in my project. But I believe the ddtrace error log is actually just silenced and still fires.

fredrikaverpil (Author) commented Sep 8, 2022

Could it be that the ddtrace library silently logs this message (here), and my interceptor simply found it and also picked up the raised exception traceback?

If I add a print("hello") next to this log.error statement, I see it being printed when e.g. running pytest in my project. And I have DD_TRACE_ENABLED=false.

fredrikaverpil (Author) commented Sep 8, 2022

If the above assumption is correct, I don't think ddtrace should even attempt to send traces when DD_TRACE_ENABLED=false (and then it wouldn't need to log that error either).

Yun-Kim (Contributor) commented Sep 9, 2022

Hi @fredrikaverpil, just so I can clarify what the problem might be, could you add a few debug statements next to the log.error statement you mentioned? Thanks!

  • The values of n_traces and encoded; ideally they should be 0 and empty, but I'm curious to see what the tracer is trying to send.
  • The value of ddtrace.tracer.enabled; it should be False since you've set DD_TRACE_ENABLED=false, but the only way in our design you can reach that code block is with an enabled tracer. If the value is True, could you let us know if you ever configure the tracer (tracer.configure(...)) in your codebase and pass enabled as an argument? That could re-enable the tracer.

fredrikaverpil (Author) commented Sep 12, 2022

@Yun-Kim Sure!

I added some prints:

    def flush_queue(self, raise_exc=False):
        print("In flush_queue")
        # type: (bool) -> None
        try:
            ...
                else:
                    print("Debug logging for Datadog", flush=True)
                    print(f"{n_traces=}", flush=True)
                    print(f"{encoded=}", flush=True)
                    print(f"{os.getenv('DD_TRACE_ENABLED')=}", flush=True)

                    log.error("failed to send traces to Datadog Agent at %s", self._agent_endpoint, exc_info=True)

The results:

n_traces=1
encoded=b'\x91\x91\x89\xa8trace_id\xcfM\x08\xf2\xdcVq\xbeG\xa7span_id\xcf6\x1d\x8aA\xc9\x9c\x99\xce\xa7service\xb8conversations-migrations\xa8resource\xadinitContainer\xa4name\xd9 conversations.run_migrations.run\xa5start\xcf\x17\x14\n\xa3\xdc\x7f\x1a`\xa8duration\xce\x1ae\xc4H\xa4meta\x82\xaaruntime-id\xd9 e1b588fb6af148f9a628ebe93351bc40\xa8_dd.p.dm\xa2-0\xa7metrics\x85\xad_dd.agent_psr\xcb?\xf0\x00\x00\x00\x00\x00\x00\xaasystem.pid\xcd\xbbe\xad_dd.top_level\x01\xb5_sampling_priority_v1\x01\xad_dd.tracer_kr\xcb?\xf0\x00\x00\x00\x00\x00\x00'
os.getenv('DD_TRACE_ENABLED')='false'

I'm not using tracer.configure() and I'm only relying on environment variables. However, I also tried printing out enabled, but it doesn't get printed. It seems importing enabled somehow stops print from printing anything:

                else:
                    from ddtrace.tracer import enabled
                    print(f"{enabled=}", flush=True)

                    log.error("failed to send traces to Datadog Agent at %s", self._agent_endpoint, exc_info=True)

At first, I noticed that I did not have my db container running, and so I received a very large value full of tracebacks in encoded. But after starting up that container, I got the above value.

I ran pytest to trigger all of this. The full output is below:

❯ pytest --count=100 -s -k test_s3
====================================================== test session starts =======================================================
platform darwin -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
Using --randomly-seed=107146139
rootdir: /Users/fredrik/code/repos/<REDACTED>, configfile: pyproject.toml, testpaths: tests
plugins: anyio-3.6.1, Faker-14.1.0, freezegun-0.4.2, ddtrace-1.4.4, randomly-3.12.0, mock-3.8.2, dotenv-0.5.2, repeat-0.9.1, cov-3.0.0
collected 22693 items / 22193 deselected / 500 selected                                                                          

tests/pytest/unit/message/test_s3_url.py ........................In flush_queue
.....Debug logging for Datadog
n_traces=1
encoded=b'\x91\x91\x89\xa8trace_id\xcfM\x08\xf2\xdcVq\xbeG\xa7span_id\xcf6\x1d\x8aA\xc9\x9c\x99\xce\xa7service\xb8conversations-migrations\xa8resource\xadinitContainer\xa4name\xd9 conversations.run_migrations.run\xa5start\xcf\x17\x14\n\xa3\xdc\x7f\x1a`\xa8duration\xce\x1ae\xc4H\xa4meta\x82\xaaruntime-id\xd9 e1b588fb6af148f9a628ebe93351bc40\xa8_dd.p.dm\xa2-0\xa7metrics\x85\xad_dd.agent_psr\xcb?\xf0\x00\x00\x00\x00\x00\x00\xaasystem.pid\xcd\xbbe\xad_dd.top_level\x01\xb5_sampling_priority_v1\x01\xad_dd.tracer_kr\xcb?\xf0\x00\x00\x00\x00\x00\x00'
os.getenv('DD_TRACE_ENABLED')='false'
2022-09-12 08:57:42.959 | ERROR    | ddtrace.internal.logger:handle:158 - failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces
Traceback (most recent call last):
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 446, in _send_payload
    response = self._put(payload, headers)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 398, in _put
    self._conn.request("PUT", self._endpoint, data, headers)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 941, in connect
    self.sock = self._create_connection(
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/Users/fredrik/code/repos/<REDACTED>.venv/lib/python3.10/site-packages/ddtrace/internal/periodic.py", line 73, in run
    self._target()
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 587, in periodic
    self.flush_queue(raise_exc=False)
> File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 560, in flush_queue
    self._retry_upload(self._send_payload, encoded, n_traces)
  File "/Users/fredrik/code/repos/<REDACTED>.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 361, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x1130932b0 state=finished raised ConnectionRefusedError>]
.....................In flush_queue
......................In flush_queue
......................In flush_queue
......................In flush_queue
......................In flush_queue
.....................In flush_queue
.....................In flush_queue
.......................In flush_queue
......................In flush_queue
.......................In flush_queue
......................In flush_queue
.......................In flush_queue
.......................In flush_queue
.....................In flush_queue
......................In flush_queue
.......................In flush_queue
.....................In flush_queue
.....................In flush_queue
.....................In flush_queue
.......................In flush_queue
......................In flush_queue
..........

======================================================== warnings summary ========================================================
In flush_queue
.venv/lib/python3.10/site-packages/pytest_freezegun.py:17: 45386 warnings
  /Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/pytest_freezegun.py:17: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    if LooseVersion(pytest.__version__) < LooseVersion('3.6.0'):

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.10.5-final-0 ----------
Coverage HTML written to dir htmlcov

===================================== 500 passed, 22193 deselected, 45386 warnings in 26.72s =====================================
In flush_queue

In case you're wondering what In flush_queue means, it's just me adding a print in this function:

    def flush_queue(self, raise_exc=False):
        print("In flush_queue")
        # type: (bool) -> None

majorgreys (Contributor) commented Sep 12, 2022

I'm not using tracer.configure() and I'm only relying on environment variables. However, I also printed out enabled, but it doesn't get printed. It seems importing enabled somehow stops print from printing anything:

@fredrikaverpil That import does not work. The enabled property is on the global tracer object, ddtrace.tracer.

You'll want to use something like this for debugging:

from ddtrace import tracer
print(f"{tracer.enabled=}", flush=True)

Yun-Kim (Contributor) commented Sep 12, 2022

@fredrikaverpil Just to add on, how are you running ddtrace and setting the environment variable DD_TRACE_ENABLED=false?

fredrikaverpil (Author) commented Sep 12, 2022

@fredrikaverpil Just to add on, how are you running ddtrace and setting the environment variable DD_TRACE_ENABLED=false?

We wrap our microservices with ddtrace-run in our Dockerfile:

FROM python:3.10-slim-bullseye

...

ENTRYPOINT ["ddtrace-run", "python"]
CMD ["app.py"]

But when I run pytest I don't run ddtrace at all, and this is where I am having these issues. Somehow tracing is enabled when running pytest.

I'm running this command in my terminal with activated venv:

pytest --count=100 -s -k <test_name>

I have the following pytest packages installed:

pytest                      7.1.2
pytest-cov                  3.0.0
pytest-dotenv               0.5.2
pytest-freezegun            0.4.2
pytest-mock                 3.8.2
pytest-randomly             3.12.0
pytest-repeat               0.9.1
python-dateutil             2.8.2
python-dotenv               0.20.0

and I've got the following in my pyproject.toml:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "--cov --cov-report=html --log-level=INFO -rxXs"

In the .env file, I've got, among other things:

TEST=1
DD_TRACE_ENABLED=false
LOG_LEVEL="INFO"
PYTHONDEVMODE="1"

fredrikaverpil (Author) commented:

You'll want to use something like this for debugging:

from ddtrace import tracer

print(f"{tracer.enabled=}", flush=True)

Right! I got this:

tracer.enabled=True

ZStriker19 (Contributor) commented Sep 12, 2022

Hi Fred, taking over for Yun here.

Could you also try just printing out the environment variable value for DD_TRACE_ENABLED so we can check all our bases?

enabled = os.getenv("DD_TRACE_ENABLED")
print(f"DD_TRACE_ENABLED= {enabled}", flush=True)

It seems that for some reason the tracer is still running with your test code despite you not running with ddtrace-run (unless hidden somewhere?) in the tests.

My original thought was that you were enabling the tracer somewhere in your code by calling patch() or patch_all(). However, I tested this and DD_TRACE_ENABLED=false actually keeps both ddtrace-run and the patch methods from generating traces. So I'm wondering whether the environment variable DD_TRACE_ENABLED=false is getting picked up at all.

fredrikaverpil (Author) commented:

Could you also try just printing out the environment variable value for DD_TRACE_ENABLED so we can check all our bases?

I actually already did that in #4179 (comment)

Yeah, this is a mighty weird one. I'll dig into it tomorrow (it's late night here now) to see if I can narrow it down further. It seems to me that something is enabling the tracer even though the environment variable is present and set to "false".

fredrikaverpil (Author) commented Sep 13, 2022

I think I just found what is causing all this.

I have this in one place in my code that gets executed when running tests in pytest:

@tracer.wrap(service="xxx", resource="yyy")
def run():
    ...

As soon as I comment out the decorator line, I no longer see the error log and traceback. I've reproduced this about 20 times now, so I am 100% sure about this.

The @tracer.wrap decorator is on a function used by a pytest fixture that sets up/tears down db migrations for my integration tests. I added debug prints inside this function again and this is the output:

@tracer.wrap(service="xxx", resource="yyy")
def run():
    logger.info(f"{os.getenv('DD_TRACE_ENABLED')=}")
    logger.info(f"{tracer.enabled=}")
2022-09-13 08:25:04.069 | INFO     | <REDACTED>:34 - os.getenv('DD_TRACE_ENABLED')='false'
2022-09-13 08:25:04.069 | INFO     | <REDACTED>:35 - tracer.enabled=True

What's weird here is that even if I remove the @tracer.wrap, I get the same logged output as above, indicating that the tracer is still enabled.

But I can 100% verify here that when removing @tracer.wrap, I no longer get the ddtrace error log + traceback.

ZStriker19 (Contributor) commented:

That is odd as DD_TRACE_ENABLED should directly toggle tracer.enabled: https://github.com/DataDog/dd-trace-py/blob/1.x/ddtrace/tracer.py#L195

The only other thing you should be able to do to modify tracer.enabled afterwards is calling configure(): https://github.com/DataDog/dd-trace-py/blob/1.x/ddtrace/tracer.py#L317

I could only recreate this behavior when calling configure(). Going to sync with team and see what we can come up with.
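
For illustration, re-enabling the tracer programmatically would look like the sketch below; nothing in this thread suggests the reporter's code does this, it is just the configure() call referred to above:

from ddtrace import tracer

# Sketch: configure() is the documented way to flip the flag after startup.
tracer.configure(enabled=True)
print(tracer.enabled)  # now True, regardless of DD_TRACE_ENABLED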

fredrikaverpil (Author) commented Sep 13, 2022

Could there be a regression issue?

See this issue: #3671

Yun-Kim (Contributor) commented Sep 13, 2022

@fredrikaverpil If this is a regression of the pytest plugin restarting a tracer, could you try disabling the plugin as described in #3674? Specifically, in your pytest config file, add:

[pytest]
addopts = -p no:ddtrace

fredrikaverpil (Author) commented Sep 15, 2022

@Yun-Kim doing that made no difference. I still see the traceback and tracer.enabled=True.

So I removed all ddtrace imports from my codebase, but I still have ddtrace installed in my venv. When I check sys.modules for ddtrace (from within a test running via pytest), I see it has somehow been imported anyway. If I run pip uninstall ddtrace, I no longer see ddtrace in sys.modules (and there are no missing imports or crashes, because my codebase simply doesn't import ddtrace).

Do you have some sort of hook that plugs into pytest and enables the tracer?

Yun-Kim (Contributor) commented Sep 15, 2022

@fredrikaverpil If this is a regression issue of the pytest plugin restarting a tracer, could you try disabling the plugin as specified on #3674? Specifically in your pytest config file, add:

[pytest]
addopts = -p no:ddtrace

Just to confirm, this did not disable the plugin on your end? We do have hooks in our pytest plugin which start a global tracer, but as long as you set DD_TRACE_ENABLED=false and disable the plugin as shown above, this shouldn't matter.

fredrikaverpil (Author) commented Sep 15, 2022

Just to confirm, this did not disable the plugin on your end?

Correct. This did not disable the tracer.

I have this in my pyproject.toml:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "--cov --cov-report=html -rxXs"

...and so I added what you suggested:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "--cov --cov-report=html -rxXs -p no:ddtrace"

But the error log and the traceback still appears in the terminal, if I have a @tracer.wrap.

All in all, it seems to me there has to be something off with the pytest plugin: I've removed all ddtrace imports from my codebase, and ddtrace still shows up in sys.modules.

When I loop through sys.modules, the last key that contains the word ddtrace is ddtrace.contrib.pytest_bdd.plugin. Coincidence?

    import sys

    mods = []
    for mod in sys.modules:
        if "ddtrace" in mod:
            mods.append(sys.modules[mod])
    print(mods)

(Screenshots showing the ddtrace modules present in sys.modules omitted.)

fredrikaverpil (Author) commented Sep 15, 2022

From the ddtrace.contrib.pytest.__init__.py and ddtrace.contrib.pytest_bdd.__init__.py:

The ddtrace.pytest_bdd plugin for pytest-bdd has the side effect of importing the ddtrace package and starting a global tracer.

I'm now also running with these pyproject.toml settings, and now the error log and the traceback are finally GONE! 🎉:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "-p no:ddtrace -p no:ddtrace.pytest_bdd"

fredrikaverpil (Author) commented Sep 15, 2022

@Yun-Kim so it seems adding both options -p no:ddtrace -p no:ddtrace.pytest_bdd solved the issue.

Additional info, not sure if it is helpful:

I've put some logging in the asbool function and in the tracer's __init__ method:

2022-09-15 22:40:51.401 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.422 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.422 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:40:51.422 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns True based on: value=True
2022-09-15 22:40:51.422 | WARNING  | ddtrace.tracer:__init__:197 - self.enabled=True
2022-09-15 22:40:51.423 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.423 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.423 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.423 | WARNING  | ddtrace.tracer:__init__:242 - self.enabled=True

If I then use the pytest opts -p no:ddtrace -p no:ddtrace.pytest_bdd, I see no log output anymore.

And finally, if I keep the pytest opts above enabled, and import ddtrace in my code and also add a @tracer.wrap, I see how these logs appear again:

2022-09-15 22:42:28.215 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:42:28.247 | WARNING  | ddtrace.tracer:__init__:197 - self.enabled=False
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.248 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.248 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.248 | WARNING  | ddtrace.tracer:__init__:242 - self.enabled=False
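
(For context: asbool is roughly a string-to-boolean coercion helper. A minimal sketch of that behaviour, not ddtrace's actual implementation:)

def asbool(value):
    # Real booleans pass through; strings are compared case-insensitively.
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("true", "1")

assert asbool("false") is False  # matches the value='false' lines above
assert asbool(True) is True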

Yun-Kim (Contributor) commented Sep 16, 2022

@fredrikaverpil fantastic! Happy to hear you were able to resolve the issue 🥳

Yun-Kim closed this as completed Sep 16, 2022
fredrikaverpil (Author) commented Sep 17, 2022

@Yun-Kim yes, finally 😅

However, don't you think there's a problem with ddtrace silently logging a traceback (the raised exception's exc_info) about not being able to connect, when you run pytest and have a fixture using code wrapped with @tracer.wrap?

I mean, it seems to me it should have been enough to set DD_TRACE_ENABLED=false, but in this scenario I also needed to add the pytest options -p no:ddtrace -p no:ddtrace.pytest_bdd, which wasn't clear to me at all.

Yun-Kim (Contributor) commented Sep 19, 2022

You're right, it is problematic that our pytest plugins don't check the value of DD_TRACE_ENABLED. I'm reopening this issue to track the problem. For now, you can keep using the workaround -p no:ddtrace -p no:ddtrace.pytest_bdd and we'll work on getting a fix out. Thanks!

Yun-Kim (Contributor) commented Sep 20, 2022

Hi @fredrikaverpil, unfortunately we still weren't able to reproduce your issue on our end. The proposed fix #4204 doesn't directly address the underlying issue, so we can't merge it until we figure out why the pytest plugin disregards the value of DD_TRACE_ENABLED. Could you hop onto the Datadog Public Slack and/or open a support ticket (https://www.datadoghq.com/support/) so that we can take a deeper dive into this issue? Thanks!

fredrikaverpil (Author) commented:

@Yun-Kim hi, sorry for the late reply. I'll drop by in Slack and maybe we can schedule a session or something.

github-actions bot added the stale label Oct 29, 2022
fredrikaverpil (Author) commented Nov 9, 2022

I'm opening a support case as this re-emerged for us.

I'll first post a reproducible example below 👇

fredrikaverpil (Author) commented Nov 9, 2022

@Yun-Kim so I think some of our issues came from using dotenv-file solutions to set DD_TRACE_ENABLED=false: the variable may then have been set only after ddtrace had already enabled the tracer (or something along those lines). So I think we have now finally gotten to the bottom of this problem in CI 🤞

However, the problem remains for local development, as the pytest options -p no:ddtrace -p no:ddtrace.pytest_bdd do not work. I'm almost certain this used to work at some point. I have posted a reproducible example where these pytest options do not disable the tracer: https://github.com/fredrikaverpil/ddcrash

It would be great to have a way to tell pytest not to enable tracing (preferably as a command-line option), without requiring the environment variable DD_TRACE_ENABLED to be set in the environment.
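
To illustrate the timing issue (a sketch, assuming the flag is read once when the global tracer is created at import time):

import os

# Setting the flag *before* ddtrace is imported means the global tracer
# created at import time sees it and starts disabled.
os.environ["DD_TRACE_ENABLED"] = "false"

import ddtrace

print(ddtrace.tracer.enabled)  # False

# If a dotenv plugin only sets the variable after ddtrace (or its pytest
# plugin) has already created the global tracer, that instance is unaffected.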

fredrikaverpil (Author) commented:

You may wonder why my test creates a custom span. The reason is that this is what the source code under test does. For e.g. Kubernetes cronjobs we don't have incoming traces, but we would still like errors reported into APM. We also do this for unexpected crashes via sys.excepthook (roughly as sketched below).
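
For reference, the excepthook-based crash reporting looks roughly like this (a sketch; the span, service and resource names are placeholders):

import sys

from ddtrace import tracer

def report_crash(exc_type, exc_value, exc_traceback):
    # Open a short-lived span so the crash is reported to APM even though
    # there is no incoming trace (e.g. a Kubernetes cronjob).
    with tracer.trace("crash", service="my-cronjob", resource="unhandled-exception") as span:
        span.set_exc_info(exc_type, exc_value, exc_traceback)
    # Delegate to the default hook so the traceback is still printed.
    sys.__excepthook__(exc_type, exc_value, exc_traceback)

sys.excepthook = report_crash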

github-actions bot removed the stale label Nov 10, 2022
firezdog commented:

I am also observing this behavior, although I'm not sure whether it goes away when setting the flag. What's odd is that we use pytest-socket, and the test does not fail even though all socket calls should be blocked and cause failures.

turya commented Dec 9, 2022

I'm observing this as well. pytest always hangs if ddtrace is installed; it always installs the plugin, and none of the suggestions above disable it. As suggested above, the plugin should at the very least respect the DD_TRACE_ENABLED=false environment variable.

turya commented Dec 9, 2022

In fact, I strongly believe that the default behavior should be for it to be off unless it's explicitly enabled.

turya commented Dec 9, 2022

If it helps at all, ddtrace is version 1.6.3:

root@f60ede01fe14:/app# env | grep DD
DD_PROFILING_ENABLED=false
DD_TRACE_ENABLED=false
root@f60ede01fe14:/app# pytest -p no:ddtrace -p no:ddtrace.pytest_bdd
================================================================ test session starts ================================================================
platform linux -- Python 3.9.16, pytest-7.2.0, pluggy-1.0.0
rootdir: /app, configfile: .pytest.ini
collected 17 items      

one of two things happens here

========================================================== 17 passed, 9 warnings in 1.73s ===========================================================
Unable to export profile: ddtrace.profiling.exporter.http.UploadFailed: [Errno 99] Cannot assign requested address. Ignoring.

or it hangs until I Ctrl-C:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/profiler.py", line 88, in stop
    self._profiler.stop(flush)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/internal/service.py", line 85, in stop
    self._stop_service(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/profiler.py", line 293, in _stop_service
    self._scheduler.flush()
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/scheduler.py", line 53, in flush
    exp.export(events, start, self._last_export)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/exporter/http.py", line 217, in export
    self._upload(client, self.endpoint_path, body, headers)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/exporter/http.py", line 222, in _upload
    self._retry_upload(self._upload_once, client, path, body, headers)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 416, in __call__
    self.sleep(do)
  File "/usr/local/lib/python3.9/site-packages/tenacity/nap.py", line 31, in sleep
    time.sleep(seconds)
KeyboardInterrupt

turya commented Dec 9, 2022

FWIW, nosetests also hangs after running, trying to send its profile, even though the environment variables say it should not, so maybe this isn't just a pytest thing. Maybe don't automatically turn on your tooling by default.

emmettbutler (Collaborator) commented:

Manually doing what the GitHub Action was supposed to do (fixed in #7835):

This issue has been automatically closed after six months of inactivity. If it's a feature request, it has been added to the maintainers' internal backlog and will be included in an upcoming round of feature prioritization. Please comment or reopen if you think this issue was closed in error.

emmettbutler closed this as not planned Dec 4, 2023
followben commented:

I am also observing this behavior; in my case the tooling isn't respecting our dotenv under local development. It's pretty disappointing this hasn't been addressed after @fredrikaverpil provided a minimal repro. Please re-open.

romainkomorndatadog removed their assignment Feb 14, 2024
romainkomorndatadog (Collaborator) commented:

This isn't exactly a CI Visibility product issue, so I have to relinquish this back to our core folks. @mabdinur , I'm not sure if you want to reopen this or track it elsewhere.
