Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/modules: fix systemd start rate-limits #97512

Merged
merged 4 commits into from Oct 31, 2020
Merged

Conversation

@lf-
Copy link
Contributor

@lf- lf- commented Sep 9, 2020

Motivation for this change

These have been broken since 2016:
systemd/systemd@f0367da
since StartLimitIntervalSec got moved into [Unit] from [Service].
StartLimitBurst has also been moved accordingly, so let's fix that one
too by adding an option ourselves.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests) -- unfortunately my computer can't run kvm so I was unable to run tests
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
@lf-
Copy link
Contributor Author

@lf- lf- commented Sep 9, 2020

oopsie, fixed an ofborg eval error. ought to work now

@nh2
Copy link
Contributor

@nh2 nh2 commented Sep 9, 2020

This contributes to #45785, right?

Independently, could you confirm whether

These have been broken since 2016

is really true? In #45785 (comment) I found that systemd did not remove serviceConfig.StartLimitIntervalSec -- its legacy usage is still supported, but undocumented. But I don't know if that's really the case for the burst option. As the other one is not documented, you may not get the answer from the documentation, but only from the code (or trying it).

@lf-
Copy link
Contributor Author

@lf- lf- commented Sep 9, 2020

I believe that's incorrect: StartLimitInterval is likely still accepted in Service (not 100% sure) but StartLimitIntervalSec is definitely not, at least according to my log lines warning on it (see the commit message and the caddy PR). StartLimitBurst I think is accepted there but I was touching the files anyway and it was also moved by upstream, so I fixed it as well.

Regardless, we were relying on undocumented deprecated behaviour and I wanted to fix all of it now that I noticed it.

@lf-
Copy link
Contributor Author

@lf- lf- commented Sep 10, 2020

This contributes to #45785, right?

I guess so! This fixes #45785, arguably.

@nh2
Copy link
Contributor

@nh2 nh2 commented Sep 17, 2020

but StartLimitIntervalSec is definitely not

@lf- You're right, I had typod it in the linked comment and written StartLimitIntervalSec twice (thus making it inconsistent with the nixpkgs code I quoted there). I've fixed it now.

nh2
nh2 approved these changes Sep 17, 2020
Copy link
Contributor

@nh2 nh2 left a comment

I guess so! This fixes #45785, arguably.

Great! +1 for me on the change (but I cannot test the change to each of the services, so just +1'ing the approach).

Optionally: Would you also be up for implementing this note from #45785 that makes any further incorrect use of the deprecated field an evaluation warning?

  • We should probably also try and make usage of StartLimitInterval in the old section emit a warning when it's used, as having it in the new section has benefits

@lf-
Copy link
Contributor Author

@lf- lf- commented Sep 17, 2020

@nh2 unfortunately I'm quite busy this next couple of weeks and probably don't have time to figure out how to do it. I doubt it's complex but I'd have to do a bit of research.

AluisioASG added a commit to AluisioASG/nixexprs that referenced this issue Oct 29, 2020
curbengh pushed a commit to curbengh/blog that referenced this issue Oct 31, 2020
@lf- lf- force-pushed the StartLimitIntervalSec branch from cb7b6f4 to 7f6fad0 Oct 31, 2020
These were broken since 2016:
systemd/systemd@f0367da
since StartLimitIntervalSec got moved into [Unit] from [Service].
StartLimitBurst has also been moved accordingly, so let's fix that one
too.

NixOS systems have been producing logs such as:
/nix/store/wf98r55aszi1bkmln1lvdbp7znsfr70i-unit-caddy.service/caddy.service:31:
Unknown key name 'StartLimitIntervalSec' in section 'Service', ignoring.

I have also removed some unnecessary duplication in units disabling
rate limiting since setting either interval or burst to zero disables it
(https://github.com/systemd/systemd/blob/ad16158c10dfc3258831a9ff2f1a988214f51653/src/basic/ratelimit.c#L16)
@lf- lf- force-pushed the StartLimitIntervalSec branch from a5f4290 to 039200b Oct 31, 2020
@lf-
Copy link
Contributor Author

@lf- lf- commented Oct 31, 2020

@nh2 I've implemented the deprecation warning as requested in 039200b.

(type == "oneshot" && (restart == "always" || restart == "on-success"))
"Service '${name}.service' with 'Type=oneshot' cannot have 'Restart=always' or 'Restart=on-success'")
++ (optional hasDeprecated ''Service '${name}.service' uses the attribute
StartLimitInterval in the Service section, which is deprecated. See
https://github.com/NixOS/nixpkgs/issues/45786.''))
Copy link
Contributor

@nh2 nh2 Oct 31, 2020

I don't know how warnings are rendered, but doesn't using a '' as done here insert spaces in the middle of the message (logic from the manual)?

E.g. should this be:

... hasDeprecated ''
  Service '${name}.service' uses the attribute
  StartLimitInterval in the Service section, which is deprecated. See
  https://github.com/NixOS/nixpkgs/issues/45786.
''

Copy link
Contributor

@nh2 nh2 Oct 31, 2020

I tried it, indeed the formatting is odd this way:

trace: warning: Service 'display-manager.service' uses the attribute
      StartLimitInterval in the Service section, which is deprecated. See
      https://github.com/NixOS/nixpkgs/issues/45786.

Copy link
Contributor

@nh2 nh2 Oct 31, 2020

Suggestion for the whole warnings block:

    warnings = concatLists (
      mapAttrsToList
        (name: service:
          let
            type = service.serviceConfig.Type or "";
            restart = service.serviceConfig.Restart or "no";
            hasDeprecated = builtins.hasAttr "StartLimitInterval" service.serviceConfig;
          in
            concatLists [
              (optional (type == "oneshot" && (restart == "always" || restart == "on-success"))
                "Service '${name}.service' with 'Type=oneshot' cannot have 'Restart=always' or 'Restart=on-success'"
              )
              (optional hasDeprecated
                "Service '${name}.service' uses the attribute 'StartLimitInterval' in the Service section, which is deprecated. See https://github.com/NixOS/nixpkgs/issues/45786."
              )
            ]
        )
        cfg.services
    );

Copy link
Contributor

@nh2 nh2 Oct 31, 2020

I'll force push a refactor that does this in a moment.

Copy link
Contributor

@nh2 nh2 Oct 31, 2020

OK done. Now it renders correctly and the diff looks much nicer.

@nh2 nh2 force-pushed the StartLimitIntervalSec branch from 039200b to 644079e Oct 31, 2020
nh2
nh2 approved these changes Oct 31, 2020
Copy link
Contributor

@nh2 nh2 left a comment

I've tested your new deprecation warnings on my system, works well. Merging.

@nh2 nh2 merged commit 02d0df5 into NixOS:master Oct 31, 2020
2 of 4 checks passed
@nh2
Copy link
Contributor

@nh2 nh2 commented Oct 31, 2020

I found some challenges documenting the changes in the release notes, written up in #102246.

curbengh pushed a commit to curbengh/blog that referenced this issue Nov 8, 2020
roberth added a commit to hercules-ci/nixpkgs that referenced this issue Feb 8, 2021
This is a backport of the new option introduced in

    NixOS#97512

except without the changes to existing services and deprecation
warning.

It is not a full backport because

> that [ServiceConfig] setting is deprecated and now undocumented
> for the service section by systemd upstream, but still effective
> and somewhat buggy there

and do not know how many users rely on the old (buggy) behavior.

This commit should not rule out a full backport.
roberth added a commit to hercules-ci/nixpkgs that referenced this issue Feb 8, 2021
This is a backport of the new option introduced in

    NixOS#97512

except without the changes to existing services and deprecation
warning.

It is not a full backport because

> that [ServiceConfig] setting is deprecated and now undocumented
> for the service section by systemd upstream, but still effective
> and somewhat buggy there

and do not know how many users rely on the old (buggy) behavior.

This commit should not rule out a full backport.
@roberth
Copy link
Member

@roberth roberth commented Feb 8, 2021

Minimal 20.09 backport for compatibility: #112385

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants