Namespace: use rwlock and disable read locking after freeze #9627

Merged: Al2Klimov merged 4 commits into master from namespace-shared-mutex on Jan 19, 2023

julianbrost (Contributor) commented Jan 12, 2023

This PR improves config load performance by reducing the amount of locking required. This results in a noticeable speedup, especially on many-core machines. This is achieved by two related changes.

This PR replaces #9607 which had basically the same goal but would have introduced breaking changes that would affect Director. This PR on the other hand introduces no breaking changes.

Use a rwlock in namespaces

Before this PR, all operations on Namespace objects were synchronized using ObjectLock which is an exclusive lock. This PR adds a std::shared_timed_mutex to Namespace which is now used for synchronizing the read operations, therefore allowing parallel read operations on all namespaces.
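
The read/write locking pattern described above could look roughly like the following. This is a minimal sketch, not the actual Icinga 2 code: `SimpleNamespace` and its members are hypothetical names, and values are plain `int`s for brevity. Only `std::shared_timed_mutex`, `std::shared_lock`, and `std::unique_lock` are real standard library facilities.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical stand-in for a namespace; not Icinga 2's Namespace class.
class SimpleNamespace
{
public:
    // Readers take a shared lock, so many threads can read in parallel.
    bool Get(const std::string& key, int& out) const
    {
        std::shared_lock<std::shared_timed_mutex> lock(m_Mutex);
        auto it = m_Data.find(key);
        if (it == m_Data.end())
            return false;
        out = it->second;
        return true;
    }

    // Writers take an exclusive lock, blocking readers and other writers.
    void Set(const std::string& key, int value)
    {
        std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
        m_Data[key] = value;
    }

private:
    mutable std::shared_timed_mutex m_Mutex;
    std::map<std::string, int> m_Data;
};
```

With an exclusive lock, two concurrent `Get()` calls would serialize; with the shared lock they only wait while a `Set()` is in progress.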

This new mutex is used only internally. If the namespace has to be locked from outside the class, this still has to be done with the ObjectLock. There is a new comment in the code explaining the interaction of these two locks in detail.

Disable locking after Namespace::Freeze()

This PR changes the behavior of Namespace::Freeze() to fully freeze the namespace. This means that no more modifications can be done to it, not even by setting overrideFrozen = true from within C++ code.

As this makes the namespace completely read-only after Freeze() was called, no more locking is required for read operations. Thus, this commit makes all lock operations for read operations no-ops afterwards.
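
One way to sketch this idea, under the assumption that an atomic flag records the frozen state: the read path checks the flag and only acquires the shared lock while the namespace is still mutable. `FreezableNamespace` and `ReadLockUnlessFrozen` are illustrative names for this sketch and may not match the real implementation.

```cpp
#include <atomic>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <stdexcept>
#include <string>

// Hypothetical sketch; not Icinga 2's actual Namespace class.
class FreezableNamespace
{
public:
    void Set(const std::string& key, int value)
    {
        std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
        if (m_Frozen.load())
            throw std::runtime_error("Namespace is read-only and must not be modified.");
        m_Data[key] = value;
    }

    bool Get(const std::string& key, int& out) const
    {
        // Owns the shared lock only while the namespace is not frozen.
        auto lock = ReadLockUnlessFrozen();
        auto it = m_Data.find(key);
        if (it == m_Data.end())
            return false;
        out = it->second;
        return true;
    }

    // Irrevocable: the flag is never cleared again.
    void Freeze()
    {
        std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
        m_Frozen.store(true);
    }

private:
    std::shared_lock<std::shared_timed_mutex> ReadLockUnlessFrozen() const
    {
        // After freezing, no writer can change m_Data, so skip the lock.
        if (m_Frozen.load())
            return std::shared_lock<std::shared_timed_mutex>();
        return std::shared_lock<std::shared_timed_mutex>(m_Mutex);
    }

    mutable std::shared_timed_mutex m_Mutex;
    std::atomic<bool> m_Frozen{false};
    std::map<std::string, int> m_Data;
};
```

The skipped lock is only safe because freezing can never be undone. If a write could still happen after `Freeze()`, lock-free readers would race with it.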

This required some changes to the initialization phase as that heavily depended on overrideFrozen. A new priority for INITIALIZE_ONCE_WITH_PRIORITY is introduced: FreezeNamespaces. It is run after the Default priority (used by INITIALIZE_ONCE) which allows initializers running at the Default priority to still insert into the namespaces.
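
To illustrate the ordering requirement, here is a small self-contained sketch of priority-ordered initializers. It does not use the real `INITIALIZE_ONCE_WITH_PRIORITY` macro; `InitPriority`, `InitRegistry`, and `DemoInitOrder` are hypothetical names, and the numeric priority values are an assumption of this sketch.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical priorities; in this sketch, higher values run first.
enum class InitPriority { FreezeNamespaces = 0, Default = 1 };

class InitRegistry
{
    using Entry = std::pair<InitPriority, std::function<void()>>;

public:
    void Add(InitPriority priority, std::function<void()> func)
    {
        m_Funcs.emplace_back(priority, std::move(func));
    }

    // Runs all functions ordered by priority. std::stable_sort keeps the
    // registration order among functions of the same priority.
    void RunAll()
    {
        std::stable_sort(m_Funcs.begin(), m_Funcs.end(),
            [](const Entry& a, const Entry& b) {
                return static_cast<int>(a.first) > static_cast<int>(b.first);
            });
        for (auto& entry : m_Funcs)
            entry.second();
    }

private:
    std::vector<Entry> m_Funcs;
};

// Registers the freeze step first to show that registration order is irrelevant.
inline std::vector<std::string> DemoInitOrder()
{
    std::vector<std::string> log;
    InitRegistry registry;
    registry.Add(InitPriority::FreezeNamespaces, [&log] { log.push_back("freeze namespaces"); });
    registry.Add(InitPriority::Default, [&log] { log.push_back("insert constants"); });
    registry.RunAll();
    return log;
}
```

`DemoInitOrder()` returns the log with "insert constants" before "freeze namespaces", which is the property the new priority is meant to guarantee: every default-priority initializer has finished before anything gets frozen.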

Some constants were only inserted in Main(); these were moved to an initializer function.

The individual new commits in this PR also have extensive commit messages, so I suggest also looking at them individually.

Benchmark

The performance improvement from this PR grows with the number of CPU cores available. This is reasonable: with the exclusive mutex used before, at some point the mutex is always held by some core. The following histograms show the time it took to run icinga2 daemon -C on this icinga2.conf on a dual Xeon E5-2680v4 machine (28 cores, 56 threads).

Versions benchmarked:

  • blue (locking-base): same baseline commit as the benchmarks in Reduce namespace locking for faster config loads #9607, effectively has the same performance as the master branch at that time
  • orange (freeze-nolock): PR Reduce namespace locking for faster config loads #9607 (freeze globals namespace, don't acquire ObjectLock after freeze, but read operations on unfrozen namespaces still acquire the exclusive ObjectLock)
  • green (shared-mutex): uses a rwlock just as in this PR, but still performs all locking operations on frozen namespaces
  • red (shared-mutex-freeze-nolock): this PR (rwlock, don't acquire read locks after freeze, but globals namespace is not frozen)

Note: the benchmarks were done with an older version of this PR before it was rebased onto the current master. Apart from that, there were no changes to the implementation.

histograms

All three variants show a very significant improvement over the baseline. This PR and #9607 are very close together in all measurements. From the raw numbers, it looks like this PR might perform a little better than #9607, but I'm pretty sure that either one can come out ahead depending on the config: with more reads on globals, #9607 performs better (as globals is frozen there); with a custom namespace (which won't be frozen) and more reads on that, this PR performs better.

The version that only uses a shared mutex but does no optimization on frozen namespaces performs almost as well as the other two with regard to wall clock time, but when considering the user and thus total CPU time used, it's a bit worse than the other two variants. So this PR leaves a few more resources for the rest of the system (which might be a running instance during a config deployment), which makes the rather simple additional optimization on frozen namespaces definitely worth it.

wall clock time

| version | min | avg | median | max | stddev |
|---|---|---|---|---|---|
| icinga/icinga2:namespace-locking-base-v2.13.0-516-gf11b612f8 | 15.801s | 17.408s | 17.343s | 21.134s | 0.740s |
| icinga/icinga2:namespace-freeze-nolock-v2.13.0-519-gf75b77a33 | 9.292s | 12.931s | 13.061s | 14.330s | 0.666s |
| icinga/icinga2:namespace-shared-mutex-v2.13.0-519-g882e26e2d | 9.860s | 13.008s | 13.116s | 15.142s | 0.673s |
| icinga/icinga2:namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2 | 8.756s | 12.614s | 12.724s | 13.983s | 0.714s |

total cpu time

| version | min | avg | median | max | stddev |
|---|---|---|---|---|---|
| namespace-locking-base-v2.13.0-516-gf11b612f8 | 648.668s | 687.974s | 686.662s | 733.894s | 16.500s |
| namespace-freeze-nolock-v2.13.0-519-gf75b77a33 | 268.270s | 322.513s | 320.930s | 362.766s | 13.956s |
| namespace-shared-mutex-v2.13.0-519-g882e26e2d | 325.065s | 358.470s | 357.312s | 400.005s | 14.745s |
| namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2 | 286.420s | 318.268s | 316.029s | 368.705s | 15.280s |

user cpu time

| version | min | avg | median | max | stddev |
|---|---|---|---|---|---|
| namespace-locking-base-v2.13.0-516-gf11b612f8 | 102.693s | 122.502s | 122.112s | 137.975s | 3.524s |
| namespace-freeze-nolock-v2.13.0-519-gf75b77a33 | 234.748s | 284.070s | 282.657s | 323.145s | 13.280s |
| namespace-shared-mutex-v2.13.0-519-g882e26e2d | 286.498s | 319.714s | 318.660s | 361.553s | 14.296s |
| namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2 | 248.377s | 279.257s | 277.182s | 329.789s | 14.686s |

system cpu time

| version | min | avg | median | max | stddev |
|---|---|---|---|---|---|
| namespace-locking-base-v2.13.0-516-gf11b612f8 | 529.753s | 565.472s | 564.495s | 615.121s | 15.750s |
| namespace-freeze-nolock-v2.13.0-519-gf75b77a33 | 33.522s | 38.442s | 38.423s | 42.178s | 1.111s |
| namespace-shared-mutex-v2.13.0-519-g882e26e2d | 35.182s | 38.756s | 38.641s | 44.993s | 0.981s |
| namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2 | 34.804s | 39.010s | 38.935s | 42.291s | 1.004s |
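
To put the averages from the tables into proportion, the ratios between the baseline and this PR can be computed directly. The snippet below is only a convenience sketch: the numbers are copied from the avg columns above, while `Timing`, `Ratio`, and `PrintRatios` are names made up for this example.

```cpp
#include <cstdio>

// Average times in seconds, copied from the avg columns of the tables above.
struct Timing
{
    double wall;
    double total;
    double user;
    double system;
};

constexpr Timing kBase{17.408, 687.974, 122.502, 565.472};   // locking-base
constexpr Timing kThisPr{12.614, 318.268, 279.257, 39.010};  // shared-mutex-freeze-nolock

// Factor by which `after` is smaller than `before`.
constexpr double Ratio(double before, double after)
{
    return before / after;
}

inline void PrintRatios()
{
    std::printf("wall clock reduced by factor %.2f\n", Ratio(kBase.wall, kThisPr.wall));
    std::printf("total CPU reduced by factor  %.2f\n", Ratio(kBase.total, kThisPr.total));
    std::printf("system CPU reduced by factor %.2f\n", Ratio(kBase.system, kThisPr.system));
    // User CPU time goes up rather than down, so the ratio is inverted here.
    std::printf("user CPU increased by factor %.2f\n", Ratio(kThisPr.user, kBase.user));
}
```

In these averages, the drop in system CPU time is far larger than the drop in wall clock time, and it outweighs the increase in user CPU time, so total CPU time still falls to less than half.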

TODO

Blocked by

@Al2Klimov (Member) left a comment

As you said by yourself:

TODO

  • dependencies
  • rebase

This commit adds a new initialization priority `FreezeNamespaces` that is run
last and moves all calls to `Namespace::Freeze()` there. This allows all other
initialization functions to still update namespaces without the use of the
`overrideFrozen` flag.

It also moves the initialization of `System.Platform*` and `System.Build*` to
an initialize function so that these can also be set without setting
`overrideFrozen`.

This is preparation for a following commit that will make the frozen flag in
namespaces final, no longer allowing it to be overridden (freezing the
namespace will disable locking, so performing further updates would be unsafe).

@Al2Klimov (Member) left a comment

Please do me two favours:

  1. Don't rebase against master
  2. Force-push all at once

I'd rather not lose the carefully checked Viewed boxes.

…functions

This commit removes EmbeddedNamespaceValue and ConstEmbeddedNamespaceValue and
reduces NamespaceValue down to a simple struct without inheritance or member
functions. The code from these classes is inlined into the Namespace class. The
class hierarchy determining whether a value is const is moved to an attribute
of NamespaceValue.
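
As a rough illustration of what "constness as an attribute" can mean, consider the sketch below. The type and member names are hypothetical and simplified (an `int` instead of a real value type); the actual `NamespaceValue` layout may differ.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical simplification: a plain struct where constness is a field
// instead of being encoded in a class hierarchy.
struct SketchNamespaceValue
{
    int Val;
    bool Const;
};

class SketchNamespace
{
public:
    // The const check lives directly in the namespace, not in a value subclass.
    void Set(const std::string& field, int value, bool isConst)
    {
        auto it = m_Data.find(field);
        if (it == m_Data.end()) {
            m_Data.emplace(field, SketchNamespaceValue{value, isConst});
            return;
        }
        if (it->second.Const)
            throw std::runtime_error("Constant must not be modified.");
        it->second = SketchNamespaceValue{value, isConst};
    }

    int Get(const std::string& field) const
    {
        return m_Data.at(field).Val;
    }

private:
    std::map<std::string, SketchNamespaceValue> m_Data;
};
```

Because the check is an explicit branch in `Set()`, forgetting to pass or evaluate the `Const` flag silently makes values mutable, so this kind of refactoring benefits from a regression test for constants.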

This is done in preparation for changes to the locking in the Namespace class.
Currently, it relies on a recursive mutex. In the future, a shared mutex
(read/write lock) should be used instead, which cannot allow recursive locking
(without failing or risk deadlocking on lock upgrades). With this change, all
operations requiring a lock for one operation are within one function, so
recursive locking is no longer needed.
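
A common way to avoid recursive locking is to let each public function lock exactly once and delegate to private helpers that expect the lock to be held already. The following is a sketch of that pattern with hypothetical names (`NoRecursionNamespace`, `FindUnlocked`), not the code from this commit.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical sketch of "lock once per public operation".
class NoRecursionNamespace
{
public:
    bool Contains(const std::string& key) const
    {
        std::shared_lock<std::shared_timed_mutex> lock(m_Mutex);
        return FindUnlocked(key) != nullptr;
    }

    int GetOr(const std::string& key, int fallback) const
    {
        // Calling Contains() here would lock m_Mutex a second time on the same
        // thread, which is undefined behavior for std::shared_timed_mutex.
        std::shared_lock<std::shared_timed_mutex> lock(m_Mutex);
        const int* value = FindUnlocked(key);
        return value ? *value : fallback;
    }

    void Set(const std::string& key, int value)
    {
        std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
        m_Data[key] = value;
    }

private:
    // Caller must already hold m_Mutex (shared or exclusive).
    const int* FindUnlocked(const std::string& key) const
    {
        auto it = m_Data.find(key);
        return it == m_Data.end() ? nullptr : &it->second;
    }

    mutable std::shared_timed_mutex m_Mutex;
    std::map<std::string, int> m_Data;
};
```

With a recursive mutex, `GetOr()` could simply have called `Contains()` and then looked up the value; with a shared mutex, the lookup logic has to be shared through an unlocked helper instead.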

This allows multiple parallel read operations, resulting in an overall speedup on
systems with many cores.

This makes freezing a namespace an irrevocable operation but in return allows
omitting further lock operations. This results in a performance improvement as
reading an atomic bool is faster than acquiring and releasing a shared lock.

ObjectLocks on namespaces remain untouched as these mostly affect write
operations, of which there should be none after freezing (if there are any,
they will throw exceptions anyway).
@Al2Klimov Al2Klimov merged commit e38a907 into master Jan 19, 2023
@icinga-probot icinga-probot bot deleted the namespace-shared-mutex branch January 19, 2023 21:44
@julianbrost julianbrost added this to the 2.14.0 milestone Jan 20, 2023
@Al2Klimov (Member) left a comment

Last but not least

  • 56 cores/threads (IDK)
  • 256G RAM
  • Config of our customer 22470
  • icinga2 daemon -C

2.13.5

real 56m40,539s
user 359m47,947s
sys 2518m14,078s

2.13.6

real 56m34,224s
user 345m49,586s
sys 2532m51,450s

master (this PR^)

real 54m53,547s
user 343m9,143s
sys 2488m3,450s

this PR

real 23m26,281s
user 1070m21,426s
sys 183m38,766s

lippserd (Member) commented Jan 20, 2023

This is awesome 👍

julianbrost added a commit that referenced this pull request Feb 1, 2023
This was accidentally broken by #9627 because during config sync, a config
validation happens that uses `--define System.ZonesStageVarDir=...` which fails
on the now frozen namespace.

This commit changes this to use `Internal.ZonesStageVarDir` instead. After all,
this is used for internal functionality, users should not directly interact
with this flag.

Additionally, it no longer freezes the `Internal` namespace which actually
allows using `Internal.ZonesStageVarDir` in the first place. This also fixes
`--define Internal.Debug*` which was also broken by said PR. Freezing of the
`Internal` namespace is not necessary for performance reasons as it's not
searched implicitly (for example when accessing `globals.x`) and should users
actually interact with it, they should know by that name that they are on their
own.
Al2Klimov added a commit that referenced this pull request Feb 10, 2023
master before #9627 (a0286e9):

<1> => namespace n { x = 42; x = 42 }
                             ^^^^^^
Constant must not be modified.
<2> =>

HEAD of #9627 (24b57f0):

<1> => namespace n { x = 42; x = 42 }
null
<2> =>
julianbrost added a commit that referenced this pull request Feb 20, 2023
Repair DSL Namespace values being constant broken in #9627
Al2Klimov added a commit that referenced this pull request Apr 17, 2023
Labels
area/configuration DSL, parser, compiler, error handling cla/signed core/quality Improve code, libraries, algorithms, inline docs