Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packaging: Add stricter systemd unit options #38109

Merged
merged 1 commit into from Sep 2, 2021
Merged

Packaging: Add stricter systemd unit options #38109

merged 1 commit into from Sep 2, 2021

Conversation

erdnaxe
Copy link
Contributor

@erdnaxe erdnaxe commented Aug 20, 2021

What this PR does / why we need it:
This increases the isolation of Grafana service on systemd systems.

Special notes for your reviewer:
I only tested these changes on NixOS.

Release notice breaking change

Potential failure to start in Ubuntu 18.04 / Debian 9 / CentOS

In Grafana v8.2.0, this change can prevent the grafana-server service from starting on older versions of systemd, present on Ubuntu 18.04 and slightly older versions of Debian. If running one of those versions, please wait until v8.2.1 is released before upgrading. If you still want to upgrade or have already ugpraded, a simple fix is available here: #40162 (comment)

@CLAassistant
Copy link

CLAassistant commented Aug 20, 2021

CLA assistant check
All committers have signed the CLA.

@grafanabot grafanabot added type/build-packaging pr/external This PR is from external contributor labels Aug 20, 2021
@daniellee
Copy link
Contributor

@erdnaxe thanks for submitting your changes upstream and helping us improve our packaging.

Could you explain the changes and the consequences of them? Are these recommendations from systemd-analyse security tool?

@daniellee daniellee requested review from a team, marefr and ying-jeanne and removed request for a team August 20, 2021 15:47
@erdnaxe
Copy link
Contributor Author

erdnaxe commented Aug 20, 2021

Could you explain the changes and the consequences of them? Are these recommendations from systemd-analyse security tool?

Yes, these are recommendations from systemd-analyse security tool. Some notes:

  • Only Debian systemd unit was setting UMask (9e21a08), I also apply this to rpm.
  • I enable ProtectHome because I believe Grafana should never look in /home/.
  • I set ProtectSystem to full which make /usr/, /etc/ and /boot/ read-only. Setting it to strict might introduce regressions in Grafana as I believe there is a need to write in /var/ and it is not trivial to list all authorized folders.
  • I enable ProtectKernel* because I don't want Grafana to collect information about the kernel.
  • NoNewPrivileges, LockPersonality and RestrictSUIDSGID (and maybe others) might introduce regressions if Grafana needs to call a setuid/setgid executable. For example if Grafana needs to call sendmail then it will fail with these options enabled.
  • RestrictAddressFamilies is restricted to IPv4/IPv6 and Unix sockets as Grafana is a web server that might also talk to local sockets.
  • PrivateUsers is going to create a new user namespace, which might introduce regressions. Maybe we can keep it disabled in a first time. For example it is impossible to make Grafana server listen on port 80 with this option as the CAP_NET_BIND_SERVICE capability will not work.

These parameters help to reduce the attack surface of an attacker that would have managed to get a RCE exploit in Grafana.

Copy link
Contributor

@kminehart kminehart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple comments. I think setting RestrictAddressFamilies multiple times may overwrite it. The docs aren't clear, it seems to suggest that only happens when you set it to "". Regardless I think it'd be clearer if it was all one line since there's no deny-list stuff happening.

Other than that, I think it looks good. All of the options make sense.

NoNewPrivileges, LockPersonality and RestrictSUIDSGID (and maybe others) might introduce regressions if Grafana needs to call a setuid/setgid executable. For example if Grafana needs to call sendmail then it will fail with these options enabled

Grafana uses https://github.com/go-mail/mail/ to send emails, which according to their documentation:

Gomail can only send emails using an SMTP server

So that shouldn't be a problem.

The one area I can see a regression happening is in plugins. It executes plugin processes using os/exec.This is done by github.com/hashicorp/go-plugin. From what I can tell, setuid/setgid is not used, and the plugin processes run as the grafana user.

packaging/deb/systemd/grafana-server.service Outdated Show resolved Hide resolved
packaging/rpm/systemd/grafana-server.service Outdated Show resolved Hide resolved
@kminehart kminehart added this to the 8.2.0 milestone Aug 31, 2021
@kminehart kminehart added this to Under review in Grafana Backend (DO NOT USE!) Aug 31, 2021
@kminehart
Copy link
Contributor

kminehart commented Aug 31, 2021

Looks like there's some CI trouble. I'll see if I can figure out why that's happening. Is your branch / fork up to date with main?

@erdnaxe
Copy link
Contributor Author

erdnaxe commented Aug 31, 2021

Looks like there's some CI trouble. I'll see if I can figure out why that's happening. Is your branch / fork up to date with main?

I have just updated my branch with main.

@xlson
Copy link
Contributor

xlson commented Sep 1, 2021

@marefr do you still want me to review this PR, it looks like @kminehart have taken ownership of it?

@kminehart
Copy link
Contributor

kminehart commented Sep 1, 2021

@xlson up to you. I'm going to test it locally real quick and it if it works I'll approve it. Feel free to add any feedback if there's anything that sticks out to you. Or if you'd like, I can hold off on my approval until you test it locally too?

Edit: tested locally and it works as expected. I tested the main concern I had which was running a backend datasource.

@marefr
Copy link
Member

marefr commented Sep 1, 2021

@xlson I added you because I was added as reviewer and don't have any knowledge of this 😄 As Kevin says, up to you.

@xlson
Copy link
Contributor

xlson commented Sep 2, 2021

@kminehart I'm good, feel free to merge :) Thanks for reviewing it!

@kminehart kminehart merged commit 5f521bd into grafana:main Sep 2, 2021
Grafana Backend (DO NOT USE!) automation moved this from Under review to Done Sep 2, 2021
@kminehart kminehart changed the title packaging: systemd unit hardening Packaging: systemd unit hardening Sep 2, 2021
@kminehart kminehart changed the title Packaging: systemd unit hardening Packaging: Add additional systemd unit options Sep 2, 2021
@kminehart kminehart changed the title Packaging: Add additional systemd unit options Packaging: Add stricter systemd unit options Sep 2, 2021
@stephenjamieson
Copy link

stephenjamieson commented Sep 21, 2021

Just FYI, this does break ubuntu 18.04 installs as its sytemd is too old.

@marefr
Copy link
Member

marefr commented Sep 22, 2021

@stephenjamieson can you please open a new bug report/issue describing the details? Thanks

@stephenjamieson
Copy link

Unfortunately, instead of debugging further I upgraded to Ubuntu 20.04, so I do not have an environment to reproduce again.

@kminehart
Copy link
Contributor

I tested the 8.2.0-beta1 package on ubuntu 18.04 and it seems to be working fine. It complains about some unrecognized options but I haven't had any trouble getting Grafana to start.

@marefr
Copy link
Member

marefr commented Sep 23, 2021

Related to #30755 ?

@henriknoerr
Copy link

henriknoerr commented Oct 7, 2021

after install of Grafana 8.2 I can no loger start Grafana, getting the error:
t=2021-10-07T13:43:01+0200 lvl=eror msg="Stopped background service *api.HTTPServer" logger=server reason="failed to open listener on address 0.0.0.0:443: listen tcp 0.0.0.0:443: bind: permission denied" t=2021-10-07T13:43:01+0200 lvl=eror msg="Server shutdown" logger=server error="*api.HTTPServer run error: failed to open listener on address 0.0.0.0:443: listen tcp 0.0.0.0:443: bind: permission denied"

I have added binding to low ports with command below which fixes this when upgrading, but not this time.
sudo setcap 'cap_net_bind_service=+ep' /usr/sbin/grafana-server

If I uncomment what was added with this commit I can start Grafana.

/usr/lib/systemd/system/grafana-server.service
#CapabilityBoundingSet= DeviceAllow= #LockPersonality=true #MemoryDenyWriteExecute=true #NoNewPrivileges=true #PrivateDevices=true #PrivateTmp=true #PrivateUsers=true #ProcSubset=pid ProtectClock=true #ProtectControlGroups=true #ProtectHome=true #ProtectHostname=true #ProtectKernelLogs=true #ProtectKernelModules=true #ProtectKernelTunables=true #ProtectProc=invisible #ProtectSystem=full #RemoveIPC=true #RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX #RestrictNamespaces=true #RestrictRealtime=true #RestrictSUIDSGID=true #SystemCallArchitectures=native #SystemCallFilter=@system-service #SystemCallFilter=~@privileged #SystemCallFilter=~@resources #UMask=0027

Red Hat Enterprise Linux release 8.4 (Ootpa)

@erdnaxe
Copy link
Contributor Author

erdnaxe commented Oct 7, 2021

after install of Grafana 8.2 I can no loger start Grafana, getting the error: t=2021-10-07T13:43:01+0200 lvl=eror msg="Stopped background service *api.HTTPServer" logger=server reason="failed to open listener on address 0.0.0.0:443: listen tcp 0.0.0.0:443: bind: permission denied" t=2021-10-07T13:43:01+0200 lvl=eror msg="Server shutdown" logger=server error="*api.HTTPServer run error: failed to open listener on address 0.0.0.0:443: listen tcp 0.0.0.0:443: bind: permission denied"

I have added binding to low ports with command below which fixes this when upgrading, but not this time. sudo setcap 'cap_net_bind_service=+ep' /usr/sbin/grafana-server

I believe that you need to change the following in your systemd unit (you can make an override):

# Give the CAP_NET_BIND_SERVICE capability
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE

# A private user cannot have process capabilities on the host's user
# namespace and thus CAP_NET_BIND_SERVICE has no effect.
PrivateUsers=false

Doing so should give the required capability to use low ports. Can you confirm that this solves your issue?

I am not really clear if this procedure should be documented in Grafana documentation, or if we should rollback PrivateUsers and CapabilityBoundingSet options to allow people to set the capability without overriding systemd unit.

@henriknoerr
Copy link

I have overwritten the three sugested settings and it works.

@kminehart
Copy link
Contributor

kminehart commented Oct 7, 2021

I think these settings that were added to this PR were intended to prevent less-than-secure setups like this.

@henriknoerr is serving Grafana on port 443 on all interfaces, which is not the default; I wager in most organizations it'd be inadvisable to use a lower port.

I think it makes sense to add this scenario to the documentation. It also may make sense to preserve backwards compatibility to add the CAP_NET_BIND_SERVICE capability to the default systemd settings, though it will allow the less secure scenario of hosting your Grafana instance on a port < 1024.

What do you think, @erdnaxe ?

@erdnaxe
Copy link
Contributor Author

erdnaxe commented Oct 7, 2021

I think the final choice depends on Grafana team philosophy:

  • If they want the update to be as transparent as possible for users that are already setting CAP_NET_BIND_SERVICE, then it is better to revert PrivateUsers and CapabilityBoundingSet options,
  • Else the systemd unit override to add CAP_NET_BIND_SERVICE could be documented with a huge warning on top explaining why it might be unsecure.

@kminehart
Copy link
Contributor

I think we will backport some additional documentation in the "What's new" page as well as additional instructions for systemd users installing via rpm and deb.

I did some reading and it seems like it is in our best interest to ensure that users explicitly grant the CAP_NET_BIND_SERVICE setting.

@fauust
Copy link
Contributor

fauust commented Oct 8, 2021

Because of this, grafana fails to start on Ubuntu 18.04.5 LTS (but I guess it is related to systemd version so other distribution may be impacted).

See: #40194

@henriknoerr
Copy link

I now see that service failure is not limited to binding on port 443. If image-render plugin is installed, the service will go in a restart loop:

t=2021-10-08T13:46:53+0200 lvl=eror msg="Stopped background service *rendering.RenderingService" logger=server reason="Failed to start renderer plugin: Unrecognized remote plugin message: \n\nThis usually means that the plugin is either invalid or simply\nneeds to be recompiled to support the latest protocol."

t=2021-10-08T13:46:53+0200 lvl=warn msg="plugin failed to exit gracefully" logger=plugins.backend pluginId=grafana-image-renderer

t=2021-10-08T13:46:53+0200 lvl=eror msg="Server shutdown" logger=server error="*rendering.RenderingService run error: Failed to start renderer plugin: Unrecognized remote plugin message: \n\nThis usually means that the plugin is either invalid or simply\nneeds to be recompiled to support the latest protocol."

Again it works with image-render plugin with the new lines in grafana-server.service commented out.

/Henrik

@kminehart
Copy link
Contributor

@fauust yes unfortunately Ubuntu does not have that syscall filter group.

@henriknoerr The line that is causing the image renderer plugin to fail is the MemoryDenyWriteExecute=true line.

For a quick workaround, view this comment: #40187 (comment)

@nsteinmetz
Copy link

nsteinmetz commented Oct 8, 2021

@kminehart but in the comment, it's false whereas it's true in this PR ?!

MemoryDenyWriteExecute=false

So I don"t get how it will fix #40187

@kminehart
Copy link
Contributor

Yeah, @nsteinmetz setting that to true causes the imagerenderer to fail. false is the default behavior, effectively undoing the change that causes it to fail.

This is the PR that introduced the breaking change, not the PR that will fix the issue.

@nsteinmetz
Copy link

ah ok :) thanks for the precision

@kminehart kminehart modified the milestones: 8.2.0-beta1, 8.2.0 Oct 8, 2021
@erdnaxe
Copy link
Contributor Author

erdnaxe commented Oct 9, 2021

If some plugins break with MemoryDenyWriteExecute=true, then maybe this option should be reverted. From experience, this option usually breaks Python CFFI and Java.

@kminehart
Copy link
Contributor

Yeah a PR was merged yesterday to revert that option. It looks like it only affected a small amount of plugins, specifically the Image Renderer plugin.

@ulidtko
Copy link

ulidtko commented Jul 18, 2023

This has broken listening on port 80 too.

A systemd override like below fixed it for me.

$ cat /etc/systemd/system/grafana-server.service.d/grant-cap-listen80.conf 
[Service]
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet