[[inputs.diskio]] not honoring HOST_MOUNT_PREFIX / Missing HOST_DEV environment variable #18671

@Tschebbischeff

Description

Relevant telegraf.conf

[[inputs.net]] # https://docs.influxdata.com/telegraf/v1/input-plugins/net/
  interfaces = ["eth*", "enp*", "wlan*"]

Logs from Telegraf

W! Strict environment variable handling is the new default starting with v1.38.0! If your configuration does not work with strict handling please explicitly add the --non-strict-env-handling flag to switch to the previous behavior!
I! Loading config: /etc/telegraf/telegraf.conf
I! Starting Telegraf 1.38.2 brought to you by InfluxData the makers of InfluxDB
I! Available plugins: 244 inputs, 9 aggregators, 35 processors, 26 parsers, 68 outputs, 8 secret-stores
I! Loaded inputs: cpu disk diskio docker mem mqtt_consumer (2x) net
I! Loaded aggregators:
I! Loaded processors:
I! Loaded secretstores: docker_secrets
I! Loaded outputs: health influxdb_v3
I! Tags enabled: host=__REDACTED__
I! [agent] Config: Interval:10s, Quiet:false, Hostname:"__REDACTED__", Flush Interval:10s
I! [outputs.health] Listening on http://[::]:8080
I! [inputs.mqtt_consumer] Connected [tcp://__REDACTED___mosquitto:1883]
I! [inputs.mqtt_consumer] Connected [tcp://__REDACTED___mosquitto:1883]
W! [inputs.diskio] Unable to gather disk name for "mmcblk0p1": error reading /dev/mmcblk0p1: no such file or directory
W! [inputs.diskio] Unable to gather disk name for "mmcblk0p2": error reading /dev/mmcblk0p2: no such file or directory
W! [inputs.diskio] Unable to gather disk name for "zram0": error reading /dev/zram0: no such file or directory
W! [inputs.diskio] Unable to gather disk name for "mmcblk0": error reading /dev/mmcblk0: no such file or directory

System info

Telegraf 1.38.2, Linux Host (NixOS 25.11), Docker Image: telegraf (latest)

Docker

  telegraf:
    # build: # Removed for minimal diskio-issue-check config, note that I am running telegraf as root to access secrets properly, so I needed a custom entrypoint that just starts telegraf ignoring the root check in the original entrypoint, but this should not affect the diskio issue
    #   context: ./telegraf
    image: telegraf:latest
    container_name: telegraf
    # restart: always # Removed for minimal diskio-issue-check config
    # user: "root" # Removed for minimal diskio-issue-check config
    # env_file: # Removed for minimal diskio-issue-check config
    #   - ./.env
    #   - path: ${ENV_DIR}/.env
    #     required: false
    #   - path: ${ENV_DIR}/telegraf.env
    #     required: false
    environment:
      HOST_PROC: '/hostfs/proc'
      HOST_SYS: '/hostfs/sys'
      HOST_ETC: '/hostfs/etc'
      HOST_VAR: '/hostfs/var'
      HOST_RUN: '/hostfs/run'
      HOST_DEV: '/hostfs/dev'
      HOST_MOUNT_PREFIX: '/hostfs'
      HOST_ROOT: '/hostfs'
    # secrets: # Removed for minimal diskio-issue-check config
    #   - INFLUXDB_ADMIN_TOKEN
    volumes:
      - ./telegraf/config/telegraf.conf:/etc/telegraf/telegraf.conf:ro
      - /proc:/hostfs/proc:ro
      - /sys:/hostfs/sys:ro
      # - /etc:/hostfs/etc:ro
      # - /var:/hostfs/var:ro
      # - /run:/hostfs/run:ro
      - /run/udev:/hostfs/run/udev:ro
      - /dev:/hostfs/dev:ro
      # - /etc/os-release:/hostfs/etc/os-release:ro
    # networks: # Removed for minimal diskio-issue-check config
    #   internal:
    #     ipv4_address: '172.18.0.11'
    # healthcheck: # Removed for minimal diskio-issue-check config
    #   test: [ "CMD", "wget", "--spider", "-q", "http://127.0.0.1:8080/health" ]
    #   interval: 30s
    #   timeout: 5s
    #   retries: 5
    #   start_period: 15s
    # depends_on: # Removed for minimal diskio-issue-check config
    #   socket-proxy:
    #     condition: service_healthy

Steps to reproduce

  1. Start a Docker container with the [[inputs.diskio]] input, without giving the container privileged access, and override all environment variables needed by gopsutil/Telegraf. (I've pasted my config with everything I consider unnecessary for the test commented out; I did not mount /etc, /var, or /run.)
  2. Observe that the telegraf container looks up /dev/$deviceName instead of /hostfs/dev/$deviceName and reports missing devices (this requires the host to have devices that the container doesn't have, of course).

Expected behavior

Devices are searched for in /hostfs/dev, according to the environment variables $HOST_DEV and/or $HOST_MOUNT_PREFIX and/or $HOST_ROOT.

Actual behavior

Host devices are searched for in /dev and reported as not found.

Additional info

This line of code hardcodes /dev/{deviceName}. It should instead use the environment variable, or a function that returns the path derived from the environment variable, similar to getProcPath():

path := "/dev/" + devName

HOST_DEV and HOST_ROOT are variable names I found in the gopsutil documentation. Since the other plugins use the same environment variables, and diskio uses gopsutil as they do, I think HOST_DEV should be used here. If you do not want to support this variable (it's not mentioned in your documentation), the plugin should at least honor HOST_MOUNT_PREFIX.
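To illustrate the lookup order suggested above, here is a minimal sketch of how the device path could be resolved. It is not Telegraf's or gopsutil's actual code: the helper name devicePath, its signature, and the precedence (HOST_DEV first, then HOST_MOUNT_PREFIX, then plain /dev) are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// devicePath is a hypothetical helper (not part of Telegraf or gopsutil).
// It builds the path of a block device node, preferring HOST_DEV, then
// HOST_MOUNT_PREFIX + "/dev", and finally falling back to "/dev".
// The getenv parameter lets callers inject an environment for testing;
// pass os.Getenv in production code.
func devicePath(getenv func(string) string, devName string) string {
	// Assumed precedence: an explicit HOST_DEV wins over everything else.
	if dev := getenv("HOST_DEV"); dev != "" {
		return filepath.Join(dev, devName)
	}
	// Otherwise derive the directory from the host mount prefix.
	if prefix := getenv("HOST_MOUNT_PREFIX"); prefix != "" {
		return filepath.Join(prefix, "dev", devName)
	}
	// Neither variable is set: keep the current behavior.
	return filepath.Join("/dev", devName)
}

func main() {
	// Prints the resolved path for one of the devices from the log above.
	fmt.Println(devicePath(os.Getenv, "mmcblk0p1"))
}
```

With the environment from my compose file, this would resolve to a path under /hostfs/dev rather than /dev, which is where the host devices are actually mounted in the container.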

Notably, I still receive data about these devices in my InfluxDB, so this is not breaking the metrics.
