Cannot get smart plugin to work with passwordless sudo #8690

Closed
courtarro opened this issue Jan 13, 2021 · 10 comments
Labels
area/smart bug unexpected problem or unintended behavior

Comments

@courtarro

I'm unable to get the smart plugin to work with a locally-built and installed version of smartmontools using sudo. My telegraf runs as its own user (telegraf), and I've got a sudoers clause set up to enable passwordless execution of /usr/local/sbin/smartctl by Telegraf, yet I get an error.

The log entries are visible below. I think the key message is "sudo: unable to change to root gid: Operation not permitted". I don't understand what this means or why it's appearing. When I manually run sudo -u telegraf to impersonate Telegraf, I'm able to run sudo -n /usr/local/sbin/smartctl --scan just fine, no password needed. Any idea what might be wrong with my configuration?

Relevant telegraf.conf:

[[inputs.smart]]
  path = "/usr/local/sbin/smartctl"
  use_sudo = true

Relevant sudoers entries:

Cmnd_Alias SMARTCTL = /usr/local/sbin/smartctl
telegraf  ALL=(ALL) NOPASSWD: SMARTCTL
Defaults!SMARTCTL !logfile, !syslog, !pam_session

System info:

Running Telegraf version 1.13.0-1 for Ubuntu Bionic (18.04)

Expected behavior:

Telegraf runs smartctl and gathers the relevant metrics.

Actual behavior:

Telegraf fails and the following log entries appear in its systemd log:

Jan 13 16:37:30 prismo telegraf[2893]: 2021-01-13T21:37:30Z E! [inputs.smart] Error in plugin: failed to run command '/usr/local/sbin/smartctl --scan': exit status 1 - sudo: unable to change to root gid: Operation not permitted
Jan 13 16:37:30 prismo telegraf[2893]: sudo: unable to initialize policy plugin
@courtarro courtarro added the bug unexpected problem or unintended behavior label Jan 13, 2021
@courtarro courtarro changed the title Cannot get smart plugin to work with sudo Cannot get smart plugin to work with passwordless sudo Jan 13, 2021
@p-zak
Collaborator

p-zak commented Jan 13, 2021

@courtarro can you try with Telegraf 1.17.0?

@courtarro
Author

@p-zak I just upgraded via the InfluxDB PPA to 1.17.0 and the result is the same, unfortunately.

@courtarro
Author

For comparison, here's what it looks like when I test from the command line:

root@prismo:/usr/local/sbin# sudo -u telegraf -s
telegraf@prismo:/usr/local/sbin$ whoami
telegraf
telegraf@prismo:/usr/local/sbin$ id
uid=999(telegraf) gid=998(telegraf) groups=998(telegraf)
telegraf@prismo:/usr/local/sbin$ ./smartctl --all /dev/sda
smartctl 7.0 2018-12-30 r5164 [x86_64-linux-4.15.0-129-generic] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/sda failed: Permission denied
telegraf@prismo:/usr/local/sbin$ sudo ./smartctl --all /dev/sda
(WORKS)

@KubaTrojan
Contributor

I have tried to reproduce this behaviour and here is what I got:

System info: Ubuntu 18.04 (bionic), bare-metal
Telegraf version: 1.17.0

  1. Create new telegraf_test user: sudo adduser telegraf_test
  2. Add below entry to the bottom of sudoers file: sudo visudo
Cmnd_Alias SMARTCTL = /usr/sbin/smartctl
telegraf_test  ALL=(ALL) NOPASSWD: SMARTCTL
Defaults!SMARTCTL !logfile, !syslog, !pam_session
  3. Log into created user: sudo su telegraf_test
  4. Run id command and observe output: id
uid=1003(telegraf_test) gid=1003(telegraf_test) groups=1003(telegraf_test)
  5. Run smartctl by telegraf_test user and observe output:
telegraf_test@XXX:/home/XXX/telegraf$ /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
telegraf_test@XXX:/home/XXX/telegraf$ sudo /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
  6. Configure smart plugin in telegraf configuration:
 [[inputs.smart]]
    path = "/usr/sbin/smartctl"
    use_sudo = true
  7. Run telegraf by telegraf_test user and observe output:
telegraf_test@XXX:/home/XXX/telegraf$ ./telegraf --config=telegraf.conf --test
2021-01-18T11:57:18Z I! Starting Telegraf 
2021-01-18T11:57:18Z D! [agent] Initializing plugins
2021-01-18T11:57:18Z D! [agent] Starting service inputs
boot_time=1607077935i,context_switches=8447125900i,entropy_avail=3027i,interrupts=2862299797i,processes_forked=1921046i 1610971038000000000
2021-01-18T11:57:18Z D! [agent] Stopping service inputs
2021-01-18T11:57:18Z D! [agent] Input channel closed
2021-01-18T11:57:18Z D! [agent] Stopped Successfully
> smart_device,capacity=512110190592,device=sda,enabled=Enabled,host=XXX,model=XXX,serial_no=XXX,wwn=XXX exit_status=0i,health_ok=true,temp_c=22i,udma_crc_errors=0i 1610971038000000000
  8. Remove entries provided in point 2.
  9. Run smartctl by telegraf_test user and observe output:
telegraf_test@XXX:/home/XXX/telegraf$ /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
telegraf_test@XXX:/home/XXX/telegraf$ sudo /usr/sbin/smartctl --scan
[sudo] password for telegraf_test: 
telegraf_test is not in the sudoers file.  This incident will be reported.
  10. Run telegraf by telegraf_test user and observe output:
telegraf_test@XXX:/home/XXX/telegraf$ ./telegraf --config=telegraf.conf --test
2021-01-18T13:45:37Z I! Starting Telegraf 
2021-01-18T13:45:37Z D! [agent] Initializing plugins
2021-01-18T13:45:37Z D! [agent] Starting service inputs
boot_time=1607077935i,context_switches=8454885756i,entropy_avail=2214i,interrupts=2865442312i,processes_forked=1924482i 1610977537000000000
2021-01-18T13:45:37Z E! [inputs.smart] Error in plugin: failed to run command '/usr/sbin/smartctl [--scan]': exit status 1 - sudo: a password is required
2021-01-18T13:45:37Z D! [agent] Stopping service inputs
2021-01-18T13:45:37Z D! [agent] Input channel closed
2021-01-18T13:45:37Z D! [agent] Stopped Successfully
2021-01-18T13:45:37Z E! [telegraf] Error running agent: input plugins recorded 1 errors

As you can see, I have no such permission problem. Could you follow the same steps and post your inputs and outputs?
Also, let me know if you have any entries in the sudoers file for the telegraf group.
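
For readers who want to answer the sudoers question quickly, here is a minimal sketch that lists lines granting rules to a user or to the group of the same name. The helper name `list_sudoers_entries` is hypothetical, the name is treated as a plain word rather than an escaped pattern, and reading the real sudoers files normally requires root.

```shell
#!/bin/sh
# Hypothetical helper, not part of sudo or Telegraf.

# list_sudoers_entries NAME FILE...
# Prints file:line:content for every rule that starts with NAME (a user)
# or %NAME (a group). /dev/null is appended so grep always prints file names.
list_sudoers_entries() {
    name=$1
    shift
    grep -nE "^%?${name}[[:space:]]" "$@" /dev/null
}

# Example (assumes the default sudoers locations; run as root):
#   list_sudoers_entries telegraf /etc/sudoers /etc/sudoers.d/*
```

A `%telegraf` line in the output would indicate a group rule, which is what the question above is trying to rule out.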

@p-zak
Collaborator

p-zak commented Jan 27, 2021

@courtarro Did you have time to check this?

@courtarro
Author

courtarro commented Feb 2, 2021

@p-zak Okay, I finally nailed this down. It's an esoteric issue related to a change I made following the ping input readme. I'm also using that input plugin, and to use it with the "native" option, I added the suggested lines to a systemd override file:

[Service]
CapabilityBoundingSet=CAP_NET_RAW
AmbientCapabilities=CAP_NET_RAW

If I understand correctly, this suggestion is not ideal. Because CapabilityBoundingSet is being set, the permissions obtainable via sudo are more restricted than they would otherwise be, and this causes Telegraf to be unable to perform sudo successfully to run the smartctl command.
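
One way to check this reasoning on a live system is to inspect the capability bounding set of the running Telegraf process. The sketch below assumes a Linux `/proc/<pid>/status` file with a hexadecimal `CapBnd:` line; the helper names are hypothetical, and the bit numbers (6 for CAP_SETGID, 7 for CAP_SETUID, 13 for CAP_NET_RAW) come from the kernel's capability list.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Telegraf or sudo.

# has_cap HEXMASK BIT: succeeds if capability number BIT is set in HEXMASK.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# cap_bnd_of STATUS_FILE: prints the CapBnd hex mask from a /proc/<pid>/status file.
cap_bnd_of() {
    awk '$1 == "CapBnd:" { print $2 }' "$1"
}

# report_sudo_caps HEXMASK: reports the capabilities relevant to this thread.
# sudo needs CAP_SETGID and CAP_SETUID to switch to root's gid and uid.
report_sudo_caps() {
    for entry in 6:CAP_SETGID 7:CAP_SETUID 13:CAP_NET_RAW; do
        bit=${entry%%:*}
        name=${entry#*:}
        if has_cap "$1" "$bit"; then
            echo "$name present"
        else
            echo "$name missing"
        fi
    done
}

# Example (assumes a single running process named telegraf):
#   report_sudo_caps "$(cap_bnd_of "/proc/$(pidof telegraf)/status")"
```

If CAP_SETGID is reported as missing, that would be consistent with the "unable to change to root gid" message, since the bounding set also limits what a setuid-root binary such as sudo can obtain.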

Instead, I changed the override file for ping to:

[Service]
AmbientCapabilities=CAP_NET_RAW

This enables the ping input plugin to do its job, while not limiting the sudo command used by the smart plugin. So now everything works. I suggest updating the ping readme to use this particular configuration instead of the current one (remove the CapabilityBoundingSet clause).
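
To compare the two override variants without touching the real service, a transient unit can be started with the same properties. This is a rough sketch, assuming a systemd version whose `systemd-run` supports `--pipe`, `--wait`, `--uid=` and `-p`, and that it is run as root; the function name `try_sudo_in_unit` and the `SYSTEMD_RUN` override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for comparing sudo behaviour with and without a bounding set.
# Set SYSTEMD_RUN=echo to preview the command line instead of running it.

# try_sudo_in_unit USER SMARTCTL_PATH [BOUNDING_SET]
try_sudo_in_unit() {
    user=$1
    cmd=$2
    bounding=${3:-}
    # Properties shared by both variants of the override file.
    set -- --pipe --wait --uid="$user" -p AmbientCapabilities=CAP_NET_RAW
    # Only add the bounding set when one was requested.
    if [ -n "$bounding" ]; then
        set -- "$@" -p CapabilityBoundingSet="$bounding"
    fi
    ${SYSTEMD_RUN:-systemd-run} "$@" sudo -n "$cmd" --scan
}

# Example (run as root):
#   try_sudo_in_unit telegraf /usr/local/sbin/smartctl CAP_NET_RAW   # original override
#   try_sudo_in_unit telegraf /usr/local/sbin/smartctl               # revised override
```

If the explanation above holds, the first call should fail in the same way as the plugin did, while the second should print the scan output.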

@zeus86

zeus86 commented May 13, 2024

For future readers: other limits can also interfere with your configuration. In my case it was the DynamicUser=yes setting, which enables some sandboxing, including NoNewPrivileges=true. Overriding this via systemctl edit won't actually do anything; you need to use systemctl edit --full to have systemd place a new copy in /etc.
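
A quick way to see whether these settings are active is to filter the unit's properties. The sketch below assumes the `KEY=VALUE` output format of `systemctl show` with `yes`/`no` values; the function name `flag_sudo_blockers` is hypothetical, and the unit name `telegraf` in the example is an assumption about the local setup.

```shell
#!/bin/sh
# Hypothetical helper: reads KEY=VALUE lines on stdin (as printed by
# `systemctl show`) and prints the settings that can stop sudo from working.
flag_sudo_blockers() {
    while IFS='=' read -r key value; do
        case "$key=$value" in
            NoNewPrivileges=yes)
                echo "NoNewPrivileges=yes blocks setuid binaries such as sudo" ;;
            DynamicUser=yes)
                echo "DynamicUser=yes implies NoNewPrivileges" ;;
        esac
    done
}

# Example:
#   systemctl show telegraf -p NoNewPrivileges -p DynamicUser | flag_sudo_blockers
# The kernel's view of the running process can be checked as well:
#   grep NoNewPrivs "/proc/$(pidof telegraf)/status"
```

No output from the function means neither setting was reported as enabled, in which case the cause is more likely elsewhere, for example in the capability bounding set discussed earlier.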

@MWP

MWP commented Jan 16, 2025

> for further readers: there are other limits as well, that can mess around with your config. in my case it was the DynamicUser=yes statement, which enabled some sandboxing, including NoNewPrivileges=true

I'm now having this problem with the sudo error.

Was setting DynamicUser=no the solution for you?

@zeus86

zeus86 commented Jan 17, 2025

> I'm now having this problem with the sudo error.
>
> Was setting DynamicUser=no the solution for you?

Yes, indeed it was. I had not noticed that DynamicUser and NoNewPrivileges exclude each other. However, if you have not explicitly enabled DynamicUser, this most likely won't apply to you.

@MWP

MWP commented Jan 17, 2025

Okay thanks.
Unfortunately that option is not fixing the problem for me.
I don't really know what to look at next.

6 participants