[BUG] Datadog container erroring out trying to connect to some redis server #13828

Closed
adimyth opened this issue Oct 11, 2022 · 7 comments

adimyth commented Oct 11, 2022

Agent Environment

  • Agent version 7
  • Environment - docker

Describe what happened

I am running the Datadog Agent container (Agent 7) as one of the services in Docker Compose:

version: "3.9"

services:
  app:
    image: app
    container_name: app
    hostname: app
    build:
      context: .
      dockerfile: Dockerfile
    restart: unless-stopped
    ports:
      - 8080:80
    volumes:
      - shared_volume:/tmp/logs

  datadog:
    container_name: dd-agent
    image: gcr.io/datadoghq/agent:7
    restart: always
    ports:
      - 8125:8125/udp
      - 8126:8126
    environment:
      - DD_API_KEY=${DATADOG_API_KEY}
      - DD_SITE=${DD_SITE}
      - DD_DOGSTATSD_NON_LOCAL_TRAFFIC=${DD_DOGSTATSD_NON_LOCAL_TRAFFIC}
      - DD_LOGS_ENABLED=true
      - DD_APM_ENABLED=true
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
      - DD_CONTAINER_EXCLUDE_LOGS=name:dd-agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      # - /opt/dd-agent/run:/opt/dd-agent/run:rw
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro


volumes:
  shared_volume:

However, the Datadog container logs errors on startup. The error log shows it trying to connect to a Redis server. I am not sure where this is coming from, as I don't recall Redis being a dependency of Datadog.

Error log (pasted below for convenience):

dd-agent  | 2022-10-11 10:13:53 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:php_fpm | Error running check: [{"message": "Detected 1 error while loading configuration model `InstanceConfig`:\n__root__\n  Field `status_url` or `ping_url` must be set", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1091, in run\n    initialization()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 492, in load_configuration_models\n    instance_config = self.load_configuration_model(package_path, 'InstanceConfig', raw_instance_config)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 536, in load_configuration_model\n    raise_from(ConfigurationError('\\n'.join(message_lines)), None)\n  File \"<string>\", line 3, in raise_from\ndatadog_checks.base.errors.ConfigurationError: Detected 1 error while loading configuration model `InstanceConfig`:\n__root__\n  Field `status_url` or `ping_url` must be set\n"}]
dd-agent  | 2022-10-11 10:13:57 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:redisdb | Error running check: [{"message": "Timeout connecting to server", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 611, in connect\n    sock = self.retry.call_with_retry(\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 51, in call_with_retry\n    raise error\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 46, in call_with_retry\n    return do()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 612, in <lambda>\n    lambda: self._connect(), lambda error: self.disconnect(error)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 677, in _connect\n    raise err\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 665, in _connect\n    sock.connect(socket_address)\nsocket.timeout: timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1116, in run\n    self.check(instance)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 556, in check\n    self._check_db()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 205, in _check_db\n    info = conn.info()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/commands/core.py\", line 970, in info\n    return self.execute_command(\"INFO\", **kwargs)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/client.py\", line 1235, in execute_command\n    conn = self.connection or pool.get_connection(command_name, **options)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 1387, in get_connection\n    connection.connect()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 615, in connect\n    raise TimeoutError(\"Timeout connecting to server\")\nredis.exceptions.TimeoutError: Timeout connecting to server\n"}]

Describe what you expected

For the container to boot up without any issues 🤷‍♂️

Steps to reproduce the issue

Just start a Datadog Agent container. It fails with the errors above.

Additional environment details (Operating System, Cloud provider, etc)

  • Local setup (MacBook Pro M1)
  • Running in Docker using docker-compose
@scottopell
Contributor

This looks like it's trying to run the Redis integration and failing due to some misconfiguration. Are you running any Redis containers in Docker?
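
One way to answer that question, as a rough sketch: list every container Docker knows about, running or stopped, and keep the rows whose name or image mentions redis. The helper names `list_containers` and `only_redis` are made up for illustration, and matching on the substring "redis" is only a heuristic (a Redis server under an unrelated image name would be missed).

```shell
# Hypothetical helper: print every container (running or stopped)
# as "name<TAB>image<TAB>status".
list_containers() {
  docker ps -a --format '{{.Names}}\t{{.Image}}\t{{.Status}}'
}

# Hypothetical helper: keep only rows whose name or image column
# contains "redis" (case-insensitive). Reads tab-separated rows on stdin.
only_redis() {
  awk -F '\t' 'tolower($1) ~ /redis/ || tolower($2) ~ /redis/'
}

# Usage:
#   list_containers | only_redis
```

An empty result suggests no container is obviously running Redis; any row that does show up is a candidate for what the Agent is trying to reach.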

@adimyth
Author

adimyth commented Oct 14, 2022

Nope, not at all

@adimyth
Author

adimyth commented Oct 14, 2022

A simple docker run, without passing any configuration, fails as well.

@scottopell
Contributor

Could you provide the output of agent configcheck and agent status? These are commands you can run inside the agent container that will provide more data about what the agent has detected and is trying to run.

The two log lines you posted come from two checks, one called php_fpm and one called redisdb. The output of agent configcheck should show where the configuration for these checks is coming from.
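
A minimal sketch of how that could look from the host, assuming the Agent container is named dd-agent as in the compose file above. The helper names `agent_diagnostics` and `failing_checks` are hypothetical; the second one only relies on the `| ERROR |` and `check:<name>` tokens visible in the logs posted in this issue.

```shell
# Hypothetical helper: run the two diagnostic commands inside the Agent container.
agent_diagnostics() {
  container=${1:-dd-agent}   # container name assumed from the compose file
  docker exec "$container" agent configcheck
  docker exec "$container" agent status
}

# Hypothetical helper: count ERROR lines per check. Reads Agent log lines on stdin.
failing_checks() {
  grep -F '| ERROR |' | grep -o 'check:[A-Za-z0-9_]*' | sort | uniq -c
}

# Usage:
#   agent_diagnostics dd-agent
#   docker logs dd-agent 2>&1 | failing_checks
```

The per-check count gives a quick list of which integrations to look for in the configcheck output.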

@adimyth
Author

adimyth commented Nov 11, 2022

I think I had a stale Redis container, which the Datadog Agent was trying to track as well. I modified my datadog service to explicitly track only a selected container and ignore the rest:

  datadog:
    container_name: dd-agent
    image: gcr.io/datadoghq/agent:7
    restart: always
    ports:
      - 8125:8125/udp
      - 8126:8126
    environment:
      - DD_API_KEY=${DATADOG_API_KEY}
      - DD_SITE=${DD_SITE}
      - DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true
      - DD_LOGS_ENABLED=true
      - DD_APM_ENABLED=true
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
      # exclude all containers from autodiscovery
      - DD_CONTAINER_EXCLUDE=name:.*
      # track only the containers below
      - DD_CONTAINER_INCLUDE=name:my_application
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      # - /opt/dd-agent/run:/opt/dd-agent/run:rw
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
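
To illustrate why this pair of filters leaves only my_application monitored, here is a small sketch of the include-over-exclude rule. It is a model, not the Agent's code: `is_monitored` is a hypothetical helper, it handles only a single name pattern per list, and it uses `grep -E` where the Agent uses its own regex engine, so edge cases may differ.

```shell
# Hypothetical model of name-based container filtering:
# an include match wins, then an exclude match, otherwise the container is kept.
#   $1 = container name, $2 = include regex, $3 = exclude regex
is_monitored() {
  name=$1
  include_re=$2
  exclude_re=$3
  if printf '%s\n' "$name" | grep -Eq "$include_re"; then
    return 0   # explicitly included
  fi
  if printf '%s\n' "$name" | grep -Eq "$exclude_re"; then
    return 1   # excluded and not rescued by the include list
  fi
  return 0     # matched neither list
}

# Usage, with the patterns from the service definition above
# ("stale-redis" is a made-up container name):
#   is_monitored my_application 'my_application' '.*' && echo monitored
#   is_monitored stale-redis    'my_application' '.*' || echo ignored
```

With exclude set to `.*`, every container is dropped unless the include pattern matches it, which is why a stale Redis container would no longer be picked up.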

adimyth closed this as completed Nov 11, 2022

@jason-hwang

jason-hwang commented Aug 8, 2023

Is there any update on this error?
I installed the agent exactly as per the following guide, but I am still getting the error below.

docker command:

export DD_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export DD_AGENT_VERSION=7.36.1

docker run -e "DD_API_KEY=${DD_API_KEY}" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -l com.datadoghq.ad.check_names='["mysql"]' \
  -l com.datadoghq.ad.init_configs='[{}]' \
  -l com.datadoghq.ad.instances='[{
    "dbm": true,
    "host": "<AWS_INSTANCE_ENDPOINT>",
    "port": 3306,
    "username": "datadog",
    "password": "<UNIQUEPASSWORD>"
  }]' \
  gcr.io/datadoghq/agent:${DD_AGENT_VERSION}

errors:

2023-08-08 01:15:13 UTC | TRACE | INFO | (run.go:243 in Info) | No data received
2023-08-08 01:15:15 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:redisdb | Error running check: [{"message": "Timeout connecting to server", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 614, in connect\n    sock = self.retry.call_with_retry(\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 50, in call_with_retry\n    raise error\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 45, in call_with_retry\n    return do()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 615, in <lambda>\n    lambda: self._connect(), lambda error: self.disconnect(error)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 680, in _connect\n    raise err\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 668, in _connect\n    sock.connect(socket_address)\nsocket.timeout: timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1120, in run\n    self.check(instance)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 552, in check\n    self._check_db()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 203, in _check_db\n    info = conn.info()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/commands/core.py\", line 900, in info\n    return self.execute_command(\"INFO\", **kwargs)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/client.py\", line 1192, in execute_command\n    conn = self.connection or pool.get_connection(command_name, **options)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 1386, in get_connection\n    connection.connect()\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 618, in connect\n    raise TimeoutError(\"Timeout connecting to server\")\nredis.exceptions.TimeoutError: Timeout connecting to server\n"}]

@varunpalekar

I am also getting the same error.
