Error in plugin: performing bulk walk for field myfield-program: Request timeout (after 3 retries) #8271

waqaskhan137 · 2020-10-15T01:19:44Z

Every net-snmp command I tried (snmpwalk, snmptable and snmpbulkwalk) works, but the same requests time out when made by Telegraf.

Relevant telegraf.conf:

[agent]
    interval = "60s"
    debug = true
    hostname = "10.32.7.170"
    round_interval = true
    flush_interval = "10s"
    flush_jitter = "0s"
    collection_jitter = "0s"
    metric_batch_size = 1000
    metric_buffer_limit = 10000
    quiet = false
    logfile = "/var/log/telegraf/telegraf.log"
    omit_hostname = true

[[outputs.influxdb]]
    urls = ["http://influxdb:8086"]
    database = "monitoring"
    timeout = "0s"
    username = "admin"
    password = "P@ssword11"
    retention_policy = ""

[[inputs.snmp]]
  agents = ["udp://172.21.53.59:1061"]
  interval = "60s"
  timeout = "5s"
  version = 3
  sec_name = "nms-user"
  auth_protocol = "SHA"
  auth_password = "secret12"
  sec_level = "authPriv"
  priv_protocol = "AES"
  priv_password = "secret12"

  [[inputs.snmp.table]]
    name = "averageSpeedOfAnswerVdn"
    oid = "OVERSIGHT-MIB::myfield"

System info:

Telegraf 1.15.3

Docker

version: "3"
services:
  influxdb:
    container_name: nms-influxdb
    image: influxdb
    environment:
      - INFLUXDB_DB=monitoring
      - INFLUXDB_ADMIN_USER=admin
      - INFLUXDB_ADMIN_PASSWORD=Password
      -
    ports:
      - "8083:8083"
      - "8086:8086"
    volumes:
      - influxdb-data:/var/lib/influxdb
    restart: always

  grafana:
    container_name: nms-grafana
    image: grafana/grafana
    ports:
      - "3002:3000"
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
    restart: always

  telegraf:
    container_name: nms-telegraf
    image: telegraf
    ports:
      - "162:162/udp"
    volumes:
      - ./telegraf/telegraf.conf:/etc/telegraf/telegraf.conf
      - /var/run/docker.sock:/var/run/docker.sock
    restart: always

volumes:
  influxdb-data:

Steps to reproduce:

1. Start Telegraf with the configuration above.
2. Observe the logs.

Expected behavior:

It should collect the data and insert it into InfluxDB.

Actual behavior:

2020-10-14T21:51:00Z W! [inputs.snmp] Collection took longer than expected; not complete after interval of 1m0s
2020-10-14T21:51:22Z E! [inputs.snmp] Error in plugin: agent udp://172.21.53.59:1061: gathering table myfield: performing bulk walk for field myfield-program: Request timeout (after 3 retries)
2020-10-14T21:52:00Z W! [inputs.snmp] Collection took longer than expected; not complete after interval of 1m0s
2020-10-14T21:52:42Z E! [inputs.snmp] Error in plugin: agent udp://172.21.53.59:1061: gathering table averageSpeedOfAnswerVdn: performing bulk walk for field averageSpeedOfAnswerVdn-program: Request timeout (after 3 retries)
2020-10-14T21:53:00Z W! [inputs.snmp] Collection took longer than expected; not complete after interval of 1m0s

Additional info:

reimda · 2020-10-15T23:28:01Z

From the errors it looks like the snmp agent is not responding. You may want to make sure you can get the table using the snmptable command from net-snmp to confirm that the snmp agent is responding before trying to get it with telegraf.

It might also help to try again with telegraf but without docker to rule out docker as the problem.

waqaskhan137 · 2020-10-15T23:41:48Z

> From the errors it looks like the snmp agent is not responding. You may want to make sure you can get the table using the snmptable command from net-snmp to confirm that the snmp agent is responding before trying to get it with telegraf.
>
> It might also help to try again with telegraf but without docker to rule out docker as the problem.

Thank you for the reply, @reimda.

snmptable, snmpwalk and snmpbulkwalk all work fine; the problem seems to occur only with Telegraf.
I also don't see any problem with Docker; the compose YAML file is included in the question above.

affanshahid · 2020-11-29T23:22:43Z

I'm having the same problem. Running Telegraf in a Docker container causes: ...performing bulk walk for field ...: Request timeout (after 3 retries). If I run Telegraf without Docker, however, everything works fine.

affanshahid · 2020-11-30T00:49:54Z

Some more information: I can open bash inside the telegraf container and run snmptable and snmpget against the agent just fine. The agent is running on the same machine. So snmpget works from the host machine and from inside the telegraf container, but Telegraf itself fails with the above error.

Running Telegraf directly on the machine also works just fine. I also tried adding the agent to the telegraf container network and using Docker's inter-container networking, and everything started working perfectly.

Edit: Running the agent on a separate machine also works fine, as does using https://github.com/qoomon/docker-host as a middleman.
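
For readers trying the inter-container workaround described above, the Telegraf side might look roughly like the sketch below. It rests on assumptions: `snmp-agent` is a hypothetical service name for an SNMP agent container attached to the same Docker network as the telegraf container, and the port and credentials are simply carried over from the original report, not taken from this commenter's setup.

```toml
[[inputs.snmp]]
  # "snmp-agent" is a hypothetical container/service name, assumed to be
  # resolvable over the Docker network shared with the telegraf container.
  # Replace it (and the port) with the values from your own setup.
  agents = ["udp://snmp-agent:1061"]
  timeout = "5s"
  version = 3
  sec_name = "nms-user"
  auth_protocol = "SHA"
  auth_password = "secret12"
  sec_level = "authPriv"
  priv_protocol = "AES"
  priv_password = "secret12"
```

Only the `agents` address differs from the original configuration; whether this avoids the timeout in a given environment is something the thread reports anecdotally and does not explain.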

byrdchris · 2020-12-17T16:28:53Z

I had this exact issue on CentOS 7.9, outside of Docker.
Tests with Telegraf, snmpwalk, snmptable, etc. worked without issue, but every version of Telegraf from 1.14 to 1.16 hit "[inputs.snmp] Collection took longer than expected;" after approximately 10-12 checks.

I adjusted timeouts, intervals, connection jitter, etc., and tested against a few agents, all with similar results.

I then tested the same configuration with SNMP v2 and have had zero failures.
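
The commenter's own configuration is not shown, so the following is only a sketch of what the SNMP v2 variant could look like when applied to the configuration from the original report. The community string `public` is an assumption, and the agent must be set up to accept v2c requests.

```toml
[[inputs.snmp]]
  agents = ["udp://172.21.53.59:1061"]
  interval = "60s"
  timeout = "5s"
  # SNMP v2c instead of v3: the sec_*, auth_* and priv_* settings are dropped.
  version = 2
  # Assumed community string; use the one your agent is configured with.
  community = "public"

  [[inputs.snmp.table]]
    name = "averageSpeedOfAnswerVdn"
    oid = "OVERSIGHT-MIB::myfield"
```

Note that SNMP v2c has no authentication or encryption beyond the plain-text community string, so this is a trade-off and not a like-for-like replacement for authPriv.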

telegraf-tiger · 2021-01-16T16:53:07Z

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem, if not please try posting this question in our Community Slack or Community Page. Thank you!

waqaskhan137 added the bug (unexpected problem or unintended behavior) label Oct 15, 2020

ssoroka added the area/snmp label Oct 15, 2020

reimda added the support (Telegraf questions, may be directed to community site or slack) label and removed the bug (unexpected problem or unintended behavior) label Oct 15, 2020

telegraf-tiger bot closed this as completed Jan 16, 2021

flyinghuman mentioned this issue Sep 28, 2021

Timeout with inputs.snmp polling squid-cache #9286

Closed
