Centos Stream 9: error getting disk usage ("/sys/kernel/debug/tracing"): permission denied #10897

Closed
shorton3 opened this issue Mar 25, 2022 · 17 comments · Fixed by #10925
Labels
bug unexpected problem or unintended behavior

Comments

@shorton3

Relevant telegraf.conf

I am migrating from CentOS 7 to CentOS Stream 9, and I see this log message when Telegraf runs:

Mar 25 23:09:00 nxtdrp032522-d-centos-stream9 telegraf[927]: 2022-03-25T23:09:00Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/sys/kernel/debug/tracing"): permission denied

Any guidance please?

Logs from Telegraf

Mar 25 23:09:00 nxtdrp032522-d-centos-stream9 telegraf[927]: 2022-03-25T23:09:00Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/sys/kernel/debug/tracing"): permission denied

System info

telegraf-1.22.0-1.x86_64

Docker

No response

Steps to reproduce

1. Start Telegraf and look in /var/log/messages.

Expected behavior

no permission denied logs

Actual behavior

permission denied logs

Additional info

No response

@shorton3 shorton3 added the bug unexpected problem or unintended behavior label Mar 25, 2022
@CalvinSchwartz

CalvinSchwartz commented Mar 26, 2022

Same Issue on Raspbian (64bit) on Raspberrypi, Kernel: 5.10.103-v8+:
[inputs.disk] [SystemPS] => error getting disk usage ("/run/docker/netns/XXXXXXXXXX"): permission denied

@arestifo

Same here on Ubuntu 20.04.4 using Telegraf 1.22.0 (git: HEAD 80a7feb8)
uname -a:
Linux 5.4.0-104-generic #118-Ubuntu SMP Wed Mar 2 19:02:41 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

telegraf output:
E! [inputs.disk] [SystemPS] => error getting disk usage ("/hostfs/run/docker/netns/XXXXXXXXXX"): permission denied

@aratik711
Contributor

@shorton3 @CalvinSchwartz @arestifo https://access.redhat.com/solutions/5914171
Can you check whether it is related to the above?
@srebhan I'm not sure there is anything we can do in this case. Please suggest.

@rwb196884

rwb196884 commented Mar 28, 2022

This is happening on Debian 4.19.0-18-amd64 after upgrading via apt:

Upgrade: telegraf:amd64 (1.21.4-1, 1.22.0-1), ...
...
Mar 28 11:02:40 mini31 telegraf[13116]: 2022-03-28T10:02:40Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/sys/kernel/debug/tracing"): permission denied

Permissions on the directory /sys/kernel/debug/tracing are drwx------. Running chmod a+rx on it followed by systemctl restart telegraf has not fixed it.

/sys/kernel/debug was also drwx------, and chmod a+rx /sys/kernel/debug seems to have silenced the message.

@powersj
Contributor

powersj commented Mar 28, 2022

@reimda it looks like in #10527 you changed a debug to an error, which is causing these messages to show up now.

As an example, on my local system with no configuration of [[inputs.disk]] I get these two messages:

2022-03-28T14:37:37Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/var/lib/docker/btrfs"): permission denied
2022-03-28T14:37:37Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/run/user/1000/doc"): operation not permitted

This does not stop the plugin from correctly collecting data, only produces the above messages. Running on older versions does not produce such a message without --debug.

Should this message get switched back to a debug?

Thanks!

@alsitn

alsitn commented Mar 29, 2022

Same Issue on Raspbian (64bit) on Raspberrypi, Kernel: 5.10.103-v8+: [inputs.disk] [SystemPS] => error getting disk usage ("/run/docker/netns/XXXXXXXXXX"): permission denied

You have to add telegraf to the docker group: usermod -aG docker telegraf

@alsitn

alsitn commented Mar 29, 2022

Refers to: https://access.redhat.com/solutions/5914171

Try this:

[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs", "tracefs"]
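To decide which filesystem type to add to ignore_fs, it helps to know which type backs the path named in the error message. The following is a minimal sketch, assuming the Linux /proc/mounts layout (device, mount point, filesystem type, options). The helper names `parse_mounts` and `fstype_for_path` and the sample mount table are hypothetical and not part of Telegraf.

```python
import os

def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fstype) pairs."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # /proc/mounts escapes spaces in paths as the octal sequence \040
        mount_point = fields[1].replace("\\040", " ")
        entries.append((mount_point, fields[2]))
    return entries

def fstype_for_path(path, entries):
    """Return the fstype of the longest mount point containing path, or None."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fstype in entries:
        mp = os.path.normpath(mount_point)
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            # ">=" lets a later mount on the same point shadow an earlier one
            if best is None or len(mp) >= len(best[0]):
                best = (mp, fstype)
    return best[1] if best else None

# Hypothetical mount table, used only for illustration.
SAMPLE = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,relatime 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    entries = parse_mounts(SAMPLE)
    print(fstype_for_path("/sys/kernel/debug/tracing", entries))
```

On a real host, the text of /proc/mounts can be passed to `parse_mounts` in place of `SAMPLE`; the type it reports for the failing path is the candidate to add to ignore_fs.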

@reimda
Contributor

reimda commented Mar 30, 2022

In plugins/inputs/system/ps.go (https://github.com/powersj/telegraf/blob/b9e66f8b9a2d97e09a7e2c1ff5b2bc08bc4eeebc/plugins/inputs/system/ps.go#L157) it looks like we need to be more careful about the log level we're using. I switched everything from debug level to error level, but a 'permission denied' should be a warning: it doesn't cause the plugin to stop working, it only makes it stop producing metrics for the filesystem involved.
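The distinction described in this comment can be sketched as follows. This is a Python illustration only, not Telegraf's Go code: the function `collect_usage`, its `usage_fn` parameter, and the logger name are hypothetical. It relies on Python mapping both EACCES ("permission denied") and EPERM ("operation not permitted") to `PermissionError`.

```python
import logging
import shutil

log = logging.getLogger("inputs.disk")

def collect_usage(paths, usage_fn=shutil.disk_usage):
    """Return {path: usage}, skipping paths whose usage cannot be read."""
    results = {}
    for path in paths:
        try:
            results[path] = usage_fn(path)
        except PermissionError as err:
            # Only this filesystem is skipped; the rest are still collected,
            # so a warning is logged rather than an error.
            log.warning("error getting disk usage (%r): %s", path, err)
        except OSError as err:
            # Other OS-level failures are reported at error level.
            log.error("error getting disk usage (%r): %s", path, err)
    return results
```

Calling `collect_usage(["/", "/sys/kernel/debug/tracing"])` as an unprivileged user would then still return usage for `/` even if the second path is unreadable.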

@reimda
Contributor

reimda commented Mar 30, 2022

It makes sense to me to add tracefs to the ignore_fs default. I don't think it's a filesystem anyone is going to want to know the free space of.

@tagobar

tagobar commented Mar 30, 2022

Same here on Ubuntu 20.04.4 using Telegraf 1.22.0 (git: HEAD 80a7feb8) uname -a: Linux 5.4.0-104-generic #118-Ubuntu SMP Wed Mar 2 19:02:41 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

telegraf output: E! [inputs.disk] [SystemPS] => error getting disk usage ("/hostfs/run/docker/netns/XXXXXXXXXX"): permission denied

Same problem with telegraf 1.22.0-1 on clean Debian 11
E! [inputs.disk] [SystemPS] => error getting disk usage ("/run/docker/netns/ca3833961fae"): permission denied

Update: adding nsfs to the ignore_fs list solved my problem.

@rowansmithau

Added tracefs to the ignore_fs parameter for inputs.disk, and it seems quiet so far.

powersj added a commit to powersj/telegraf that referenced this issue Mar 31, 2022
The tracefs filesystem is showing up more and is not likely something a
user wishes to have enabled by default. This adds it to the ignore in
the config by default.

This also reduces the error message level from error to warning. This
was previously a debug message, but was bumped all the way to an error.
This message does not prevent any messages from getting read by Telegraf
and not an error that halts Telegraf. Users may still wish to address
the messages so reducing to a warning is better than hiding them.

Fixes: influxdata#10897
@powersj
Contributor

powersj commented Mar 31, 2022

I have put up #10925 which will reduce the error message to a warning and correctly add tracefs to the list of default ignored filesystem types. If anyone has time, it would be great if you could download a build artifact and ensure the messages look right.

Thanks!

@nferch
Contributor

nferch commented Apr 11, 2022

I have a similar problem, but the difference is that the problem is occurring on bind mounts that Nomad is creating:

Apr 11 19:06:00 rumours telegraf[24175]: 2022-04-11T19:06:00Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/var/lib/nomad/alloc/7e95d8a6-db75-b3d3-28d1-d2703df5a017/log-shipper/alloc"): permission denied

#findmnt | grep /var/lib/nomad/alloc/7e95d8a6-db75-b3d3-28d1-d2703df5a017/log-shipper/alloc                      
├─/var/lib/nomad/alloc/7e95d8a6-db75-b3d3-28d1-d2703df5a017/log-shipper/alloc                     /dev/sda[/var/lib/nomad/alloc/7e95d8a6-db75-b3d3-28d1-d2703df5a017/alloc] ext4        rw,noatime,errors=remount-ro

Is there any current way to exclude these?

powersj added a commit to powersj/telegraf that referenced this issue Apr 26, 2022
The tracefs filesystem is showing up more and is not likely something a
user wishes to have enabled by default. This adds it to the ignore in
the config by default.

This also reduces the error message level from error to debug. This
was previously a debug message, but was bumped all the way to an error.
This message does not prevent any messages from getting read by Telegraf
and not an error that halts Telegraf. Users may still wish to address
the messages so reducing to a debug is better than hiding them.

Fixes: influxdata#10897
@aladrin

aladrin commented Apr 29, 2022

I had to add "fuse.gvfsd-fuse" to my ignores on Xfce.

@powersj
Contributor

powersj commented Apr 29, 2022

In addition to adding the filesystem type to the ignore list, v1.22.3 of telegraf, released yesterday, has reduced these log messages back to debug level.

@0xTH0R

0xTH0R commented Jan 8, 2024

I'm getting a permission denied error from my second mount point, which is inside my work directory. However, when I debug with telegraf --debug --config /etc/telegraf/telegraf.conf --input-filter disk --test, the result is correct.

I have also checked the file permissions on the host and inside the container: the owner is root in both.

[Two screenshots attached: 2567-01-08 at 15 28 24 and 2567-01-08 at 15 37 11]

I'm not sure whether this is the same case as this issue.

@srebhan
Contributor

srebhan commented Jan 17, 2024

@0xTH0R Is this a nested mount? If so, can you please open a new issue?
