Error while sending encoded metrics: write tcp ... write: broken pipe #1066
Try with
My apologies for not mentioning something,
That's fine, but the context of the error is important to understand the timing. Having the full log for a failed request is required to determine the fault.
Understood. Here's the full log with the error message included:
I noticed this issue seems to be happening on CentOS 7 only, because I have the same node_exporter version running on Ubuntu and it works fine. Could it be kernel related? EDIT: For anyone wondering what the kernels are for both CentOS & Ubuntu, here they are:
It looks like the filesystem collector is having problems on this machine. It's taking 10 seconds to collect data, which is the default timeout.
Are there network filesystems mounted on this machine? Or anything strange going on with the filesystems?
@SuperQ Disabling the filesystem collector seems to have fixed the issue. There is a network filesystem mounted on the CentOS machine, and removing it also fixes the problem. It would be nice if the error message could point out which collector is taking the longest; that would avoid having to pull the debug log and scan through it to find the collector that's taking too long. Thanks for the help!
It's not the default because they are technically valid filesystems, but I would recommend excluding the network filesystem types with
I'm not sure it's easily possible to capture what caused the scrape failure, because the request gets canceled. It could be a combination of collectors, rather than a single obvious one like in this case.
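As an aside, the exclusion idea amounts to filtering mounts by filesystem type before the collector ever touches them. A minimal sketch of that, with a made-up type list and helper names (this is not the actual node_exporter code):

```go
package collector

import (
	"log"
	"regexp"
)

// netFSTypes matches a few common network filesystem types. The pattern here
// is only an example; the real exclusion list depends on the machine.
var netFSTypes = regexp.MustCompile(`^(nfs|nfs4|cifs|smbfs|fuse\.sshfs)$`)

type mount struct {
	mountPoint string
	fsType     string
}

// filterMounts drops mounts whose filesystem type matches the exclusion
// pattern, so a hung network mount is never statted during a scrape.
func filterMounts(mounts []mount) []mount {
	kept := make([]mount, 0, len(mounts))
	for _, m := range mounts {
		if netFSTypes.MatchString(m.fsType) {
			log.Printf("skipping %s (%s): excluded filesystem type", m.mountPoint, m.fsType)
			continue
		}
		kept = append(kept, m)
	}
	return kept
}
```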
@SuperQ Any idea why it timed out yet logged success?
No, I don't have any idea; I haven't really looked at the code closely.
@SuperQ Looking at the code, it seems the mountTimeout is set to 30 seconds. If that were to time out, it should also log something, so I'm not sure what is going on here.
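For context on what a mount timeout like that guards against: the usual pattern is to run the blocking statfs(2) call in a goroutine and give up after a deadline, logging the stuck mount point. A rough sketch of that pattern (illustrative only, not the actual node_exporter code):

```go
package collector

import (
	"fmt"
	"log"
	"time"

	"golang.org/x/sys/unix"
)

// statWithTimeout runs the blocking statfs(2) call in a goroutine and gives
// up after mountTimeout, so a single hung NFS mount cannot stall the whole
// scrape. A timeout is logged and returned as an error.
func statWithTimeout(mountpoint string, mountTimeout time.Duration) (unix.Statfs_t, error) {
	type result struct {
		buf unix.Statfs_t
		err error
	}
	ch := make(chan result, 1) // buffered so the goroutine can exit even after a timeout
	go func() {
		var buf unix.Statfs_t
		err := unix.Statfs(mountpoint, &buf)
		ch <- result{buf, err}
	}()
	select {
	case r := <-ch:
		return r.buf, r.err
	case <-time.After(mountTimeout):
		log.Printf("mount %s is not responding after %s, skipping", mountpoint, mountTimeout)
		return unix.Statfs_t{}, fmt.Errorf("timed out collecting stats for %s", mountpoint)
	}
}
```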
The timeout hit here was the Prometheus scrape timeout of 10 seconds. Prometheus closed the connection to the node_exporter, hence the original error. The Prometheus client library has methods to pass through the server's timeout value, and we could pass this on to collectors. But I don't think it would help fix this problem: the filesystem read is deadlocked on IO anyway, so the timeout would still result in an error.
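For reference, Prometheus advertises its scrape timeout to the target in the X-Prometheus-Scrape-Timeout-Seconds request header, so a handler could in principle turn that into a context deadline for the collectors. A minimal sketch of that wiring (hypothetical, not what node_exporter did at the time):

```go
package handler

import (
	"context"
	"net/http"
	"strconv"
	"time"
)

// withScrapeTimeout derives a per-request deadline from the header Prometheus
// sends with every scrape and hands the resulting context to the next handler,
// which could pass it on to the collectors.
func withScrapeTimeout(next func(ctx context.Context, w http.ResponseWriter, r *http.Request)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()
		if v := r.Header.Get("X-Prometheus-Scrape-Timeout-Seconds"); v != "" {
			if seconds, err := strconv.ParseFloat(v, 64); err == nil {
				var cancel context.CancelFunc
				ctx, cancel = context.WithTimeout(ctx, time.Duration(seconds*float64(time.Second)))
				defer cancel()
			}
		}
		next(ctx, w, r)
	}
}
```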
Ah, got it, so the client closed the connection while the exporter was writing to it in the filesystem collector. The log is a bit confusing; maybe we could include a stack trace. Anyway, the actual problem here, IMO, is that it logged success. But I think this is due to the client_golang design, where there is no way to detect an error when sending something to the collect channel.
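To make that client_golang point concrete: a collector's Collect method only sends metrics into a channel and has no return value, so a send cannot tell the collector that the HTTP write on the other side has already failed; the closest thing to an error path is sending an invalid metric for errors the collector itself detects. A sketch with a made-up metric and helper:

```go
package collector

import "github.com/prometheus/client_golang/prometheus"

// exampleDesc is a hypothetical metric used only for illustration.
var exampleDesc = prometheus.NewDesc(
	"example_filesystem_free_bytes",
	"Free bytes on a filesystem (illustrative).",
	[]string{"mountpoint"},
	nil,
)

type fsCollector struct{}

func (c fsCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- exampleDesc
}

// Collect sends metrics into the channel; the send itself can never fail or
// report that the scrape was cancelled, which is why the exporter can log
// success even though the client has gone away. Sending an invalid metric is
// the only way to surface an error the collector detects on its own.
func (c fsCollector) Collect(ch chan<- prometheus.Metric) {
	free, err := readFreeBytes("/mnt/example") // hypothetical helper
	if err != nil {
		ch <- prometheus.NewInvalidMetric(exampleDesc, err)
		return
	}
	ch <- prometheus.MustNewConstMetric(exampleDesc, prometheus.GaugeValue, free, "/mnt/example")
}

// readFreeBytes is a stub; a real implementation would call statfs(2).
func readFreeBytes(mountpoint string) (float64, error) {
	return 0, nil
}
```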
I think there isn't much more we can do here, so closing. |
Host operating system: output of `uname -a`
node_exporter version: output of `node_exporter --version`
node_exporter command line flags
Are you running node_exporter in Docker?
No.
What did you do that produced an error?
Every time the metrics are viewed through a browser (Firefox) or Prometheus tries to scrape the metrics.
What did you expect to see?
To return the metrics in a timely manner and not report broken-pipe errors.
What did you see instead?
Every time the metrics are viewed, this error is produced:
I noticed viewing the metrics took longer than before (it worked fine before Sept. 6th, and nothing was changed around that time).
For anyone curious, Prometheus reports this error:
context deadline exceeded
Tried testing it through the `curl` command and it does return the output, but it took longer than usual (at least 10 seconds, instead of 1-3 seconds). However, the error is still reported by the node_exporter service.