sslh depends on nscd #105353

Closed
symphorien opened this issue Nov 29, 2020 · 8 comments

@symphorien
Member

Describe the bug
Since 20.09, sslh runs as a DynamicUser, but its iptables rules require the user name to resolve. If nscd and sslh are restarted at the same time (which happens on a system update), nscd is not ready yet, the user name does not resolve, and sslh fails.

Nov 29 05:39:19 blitiri sslh-pre-start[22263]: iptables v1.8.5 (legacy): owner: Bad value for "--uid-owner" option: "sslh"
Nov 29 05:39:19 blitiri sslh-pre-start[22263]: Try `iptables -h' or 'iptables --help' for more information.
Nov 29 05:39:19 blitiri systemd[1]: sslh.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
Nov 29 05:39:20 blitiri sslh-post-stop[22290]: iptables v1.8.5 (legacy): owner: Bad value for "--uid-owner" option: "sslh"
Nov 29 05:39:20 blitiri sslh-post-stop[22290]: Try `iptables -h' or 'iptables --help' for more information.
Nov 29 05:39:20 blitiri systemd[1]: sslh.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
Nov 29 05:39:20 blitiri systemd[1]: sslh.service: Failed with result 'exit-code'.
Nov 29 05:39:21 blitiri systemd[1]: sslh.service: Scheduled restart job, restart counter is at 1.
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring file `/etc/passwd` (1)
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring directory `/etc` (2)
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring file `/etc/group` (3)
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring directory `/etc` (2)
Nov 29 05:39:31 blitiri sslh-pre-start[22545]: RTNETLINK answers: File exists
Nov 29 05:39:31 blitiri systemd[1]: sslh.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
Nov 29 05:39:31 blitiri systemd[1]: sslh.service: Failed with result 'exit-code'.
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring file `/etc/passwd` (1)
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring directory `/etc` (2)
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring file `/etc/group` (3)
Nov 29 05:39:31 blitiri nscd[22139]: 22139 monitoring directory `/etc` (2)
Nov 29 05:39:31 blitiri nixos-upgrade-start[21772]: Job for sslh.service failed because the control process exited with error code.
Nov 29 05:39:31 blitiri nixos-upgrade-start[21772]: See "systemctl status sslh.service" and "journalctl -xe" for details.
Nov 29 05:39:32 blitiri systemd[1]: sslh.service: Scheduled restart job, restart counter is at 2.
Nov 29 05:39:32 blitiri nscd[22139]: 22139 monitoring file `/etc/passwd` (1)
Nov 29 05:39:32 blitiri nscd[22139]: 22139 monitoring directory `/etc` (2)
Nov 29 05:39:32 blitiri nscd[22139]: 22139 monitoring file `/etc/group` (3)
Nov 29 05:39:32 blitiri nscd[22139]: 22139 monitoring directory `/etc` (2)
Nov 29 05:39:34 blitiri sslh[22586]: sslh-fork v1.21c started
Nov 29 05:39:34 blitiri sslh[22586]: sslh-fork v1.21c started
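
The log suggests an ordering problem: the `sslh-pre-start` step runs before nscd can answer lookups. A minimal sketch of one possible workaround, assuming that ordering sslh after nscd is sufficient (this is not necessarily what the upcoming PR does, and it only helps if nscd is actually ready by the time systemd considers it started):

```nix
{
  # Assumption: starting sslh only after nscd.service lets the dynamic
  # user name "sslh" resolve when the iptables rules are installed.
  systemd.services.sslh = {
    after = [ "nscd.service" ];
    wants = [ "nscd.service" ];
  };
}
```

`after` only orders the two units when both are started in the same transaction; `wants` pulls nscd in without making sslh fail if nscd does.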

To Reproduce
(Untested)
1. Enable sslh with `services.sslh.transparent = true`.
2. Restart sslh and nscd at the same time.
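
The first step could correspond to a configuration along these lines. The report only names `transparent`; `services.sslh.enable` is assumed here as the way the service is turned on:

```nix
{
  services.sslh = {
    enable = true;       # assumption: not stated explicitly in the report
    transparent = true;  # transparent mode is what installs the iptables rules
  };
}
```

For the second step, restarting both units in one call, for example `systemctl restart nscd.service sslh.service`, is one way to approximate what a system update does. Like the report itself, this is untested.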

Expected behavior
sslh starts successfully on the first try

I'll open a PR shortly

Notify maintainers

@fpletz @koral

Metadata

  • system: "x86_64-linux"
  • host os: Linux 5.8.18, NixOS, 20.09.2090.e111e9d4c05 (Nightingale)
  • multi-user?: yes
  • sandbox: yes
  • version: nix-env (Nix) 2.3.9
  • channels(root): "nixos-20.09.2090.e111e9d4c05, nixos-unstable-21.03pre246543.24c9b05ac53"
  • channels(symphorien): "home-manager-20.09"
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos

Maintainer information:

# a list of nixpkgs attributes affected by the problem
attribute:
# a list of nixos modules affected by the problem
module: sslh
@stale

stale bot commented Jun 3, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 3, 2021
@symphorien
Member Author

It still happened to me today, on 21.05.

The PR fixing this, #106336, is still open.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 3, 2021
flokli added a commit to flokli/nixpkgs that referenced this issue Sep 21, 2021
NSS modules are now globally provided (by providing a `/run/nss-modules`
symlink), similar to how we handle OpenGL drivers.

This removes the need for nscd as a proxy for all NSS requests, and avoids
DNS requests leaking across network namespaces.

While doing this upgrade, existing applications need to be restarted, so
they know how to pick up NSS modules from `/run/nss-modules`.

If you want to defer application restart to a later time, explicitly enable
`nscd` via `services.nscd.enable` until the application restart.

We can mix NSS modules from any version of glibc according to
https://sourceware.org/legacy-ml/libc-help/2016-12/msg00008.html,
so glibc upgrades shouldn't break old userland loading more recent NSS
modules (and most likely, NSS modules are already loaded)

Fixes: NixOS#55276
Fixes: NixOS#135888
Fixes: NixOS#105353
Cc:    NixOS#52411 (comment)
flokli added a commit to flokli/nixpkgs that referenced this issue Sep 22, 2021
NSS modules are now globally provided (by providing a `/run/nss-modules`
symlink).

See the text added to `rl-2111.section.md` for further details.

Fixes: NixOS#55276
Fixes: NixOS#135888
Fixes: NixOS#105353
Cc:    NixOS#52411 (comment)
erikarvstedt added a commit to erikarvstedt/nixpkgs that referenced this issue Oct 18, 2021
NSS modules are now globally provided by a symlink in `/run`.

See the description in `add-extra-module-load-path.patch` for further details.

Fixes: NixOS#55276
Fixes: NixOS#135888
Fixes: NixOS#105353
Cc:    NixOS#52411 (comment)

Co-authored-by: Erik Arvstedt <erik.arvstedt@gmail.com>
erikarvstedt added a commit to erikarvstedt/nixpkgs that referenced this issue Oct 24, 2021
NSS modules are now globally provided by a symlink in `/run`.

See the description in `add-extra-module-load-path.patch` for further details.

Fixes: NixOS#55276
Fixes: NixOS#135888
Fixes: NixOS#105353
Cc:    NixOS#52411 (comment)

Co-authored-by: Erik Arvstedt <erik.arvstedt@gmail.com>
@dasJ
Member

dasJ commented Jan 4, 2022

Can you check if the problem happens less often when you do systemd.services.nscd.stopIfChanged = false?
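
For reference, the suggested setting would look like this in a NixOS configuration. With `stopIfChanged = false`, a changed unit is restarted in a single step with `systemctl restart` under the new configuration, instead of being stopped in the old configuration and started again in the new one. That presumably shortens the window during which nscd is down; whether it is enough to avoid the race is exactly the open question here.

```nix
{
  # Restart nscd in one step on configuration switch rather than
  # stopping it first and starting it again afterwards.
  systemd.services.nscd.stopIfChanged = false;
}
```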

@symphorien
Member Author

I saw your message, but I don't have time to dedicate to this right now (and I don't think I keep logs long enough to get stats from my existing system to compare with). I intend to answer later.

erikarvstedt added a commit to erikarvstedt/nixpkgs that referenced this issue Jan 19, 2022
NSS modules are now globally provided by a symlink in `/run`.

See the description in `add-extra-module-load-path.patch` for further details.

Fixes: NixOS#55276
Fixes: NixOS#135888
Fixes: NixOS#105353
Cc:    NixOS#52411 (comment)

Co-authored-by: Erik Arvstedt <erik.arvstedt@gmail.com>
erikarvstedt pushed a commit to erikarvstedt/nixpkgs that referenced this issue Jan 19, 2022
NSS modules are now globally provided (by providing a `/run/nss-modules`
symlink).

See the text added to `rl-2111.section.md` for further details.

Fixes: NixOS#55276
Fixes: NixOS#135888
Fixes: NixOS#105353
Cc:    NixOS#52411 (comment)
@symphorien
Member Author

So I took the time to make some rough statistics: it happened about twice a month in 2021 and once a month in 2022 (it's not 100% reliable because I don't keep logs so long). I'm adding systemd.services.nscd.stopIfChanged = false and I can get back to you next year :)

@dasJ
Member

dasJ commented Jun 5, 2022

We have that in our downstream repo and we haven't had any issues like this for months (years?), so I hope this fixes it for you.

@dasJ
Member

dasJ commented Jun 6, 2022

I finally wrote nscd-wait, which should make it possible to wait for nscd to start, eliminating most races.
Intended use is:

{
  # Unit settings such as ExecStartPost go under serviceConfig in NixOS.
  systemd.services.nscd.serviceConfig.ExecStartPost = "${pkgs.wait-nscd}/bin/wait-nscd";
}

@symphorien
Member Author

> So I took the time to make some rough statistics: it happened about twice a month in 2021 and once a month in 2022 (it's not 100% reliable because I don't keep logs so long). I'm adding systemd.services.nscd.stopIfChanged = false and I can get back to you next year :)

The issue does not happen anymore. Honestly, I lost track of all the attempted fixes, and anyway nscd is no longer used for non-overlaid packages, so feel free to close or not.

@dasJ dasJ closed this as completed Nov 1, 2022
@flokli flokli moved this from In Progress to Done in systemd Apr 21, 2023