Namespace: use rwlock and disable read locking after freeze #9627

julianbrost · 2023-01-12T14:38:01Z

This PR improves config load performance by reducing the amount of locking required. This results in a noticeable speedup, especially on many-core machines. This is achieved by two related changes.

This PR replaces #9607 which had basically the same goal but would have introduced breaking changes that would affect Director. This PR on the other hand introduces no breaking changes.

Use a rwlock in namespaces

Before this PR, all operations on Namespace objects were synchronized using ObjectLock which is an exclusive lock. This PR adds a std::shared_timed_mutex to Namespace which is now used for synchronizing the read operations, therefore allowing parallel read operations on all namespaces.
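A minimal sketch of the read/write split described above, not the actual Icinga 2 code: `SketchNamespace`, its `int` values and `ExampleRoundTrip()` are made up for illustration, only `std::shared_timed_mutex` is taken from the description.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical stand-in for a namespace; values are plain ints for brevity.
class SketchNamespace
{
public:
	// Read path: a shared lock lets many readers proceed in parallel.
	bool Get(const std::string& key, int& out) const
	{
		std::shared_lock<std::shared_timed_mutex> lock(m_Mutex);
		auto it = m_Data.find(key);
		if (it == m_Data.end())
			return false;
		out = it->second;
		return true;
	}

	// Write path: an exclusive lock blocks readers and other writers.
	void Set(const std::string& key, int value)
	{
		std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
		m_Data[key] = value;
	}

private:
	mutable std::shared_timed_mutex m_Mutex;
	std::map<std::string, int> m_Data;
};

// Small usage example: write a value, then read it back.
int ExampleRoundTrip()
{
	SketchNamespace ns;
	ns.Set("x", 42);
	int value = 0;
	bool found = ns.Get("x", value);
	assert(found);
	return value;
}
```

With an exclusive lock, two threads calling `Get()` would serialize; with the shared lock they only wait for writers.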

This new mutex is only used internally. If the namespace has to be locked externally, this still has to be done with the ObjectLock. There is a new comment in the code explaining the interaction of these two locks in detail.

Disable locking after `Namespace::Freeze()`

This PR changes the behavior of Namespace::Freeze() to fully freeze the namespace. This means that no more modifications can be done to it, not even by setting overrideFrozen = true from within C++ code.

As this makes the namespace completely read-only after Freeze() was called, no more locking is required for read operations. Thus, this commit makes all lock operations for read operations no-ops afterwards.
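The idea can be sketched as follows. This is an illustration under assumptions, not the code from this PR: the class name `FreezableNamespace`, the helper `ReadLockUnlessFrozen()` and the `int` values are chosen for the example. Readers check an atomic flag and skip the shared lock once the namespace is frozen; writers always take the exclusive lock and are rejected after the freeze.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <stdexcept>
#include <string>

class FreezableNamespace
{
public:
	bool Get(const std::string& key, int& out) const
	{
		// Owns the shared lock only while the namespace is not frozen.
		auto lock = ReadLockUnlessFrozen();
		auto it = m_Data.find(key);
		if (it == m_Data.end())
			return false;
		out = it->second;
		return true;
	}

	void Set(const std::string& key, int value)
	{
		std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
		if (m_Frozen.load())
			throw std::runtime_error("Namespace is read-only and must not be modified.");
		m_Data[key] = value;
	}

	// Irrevocable: there is deliberately no way to unfreeze.
	void Freeze()
	{
		// Taking the exclusive lock ensures no writer is active while the flag flips.
		std::unique_lock<std::shared_timed_mutex> lock(m_Mutex);
		m_Frozen.store(true);
	}

private:
	std::shared_lock<std::shared_timed_mutex> ReadLockUnlessFrozen() const
	{
		if (m_Frozen.load())
			return std::shared_lock<std::shared_timed_mutex>(); // owns no mutex
		return std::shared_lock<std::shared_timed_mutex>(m_Mutex);
	}

	mutable std::shared_timed_mutex m_Mutex;
	std::atomic<bool> m_Frozen{false};
	std::map<std::string, int> m_Data;
};

// Usage example: returns true if a write after Freeze() is rejected.
bool ExampleWriteAfterFreezeThrows()
{
	FreezableNamespace ns;
	ns.Set("x", 1);
	ns.Freeze();
	try {
		ns.Set("x", 2);
	} catch (const std::runtime_error&) {
		return true;
	}
	return false;
}
```

Skipping the lock is only safe because the freeze cannot be undone: once a reader has seen the flag set, no writer can change the data any more.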

This required some changes to the initialization phase as that heavily depended on overrideFrozen. A new priority for INITIALIZE_ONCE_WITH_PRIORITY is introduced: FreezeNamespaces. It is run after the Default priority (used by INITIALIZE_ONCE) which allows initializers running at the Default priority to still insert into the namespaces.
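To illustrate the ordering, here is a reduced sketch with hypothetical names (`InitPriority`, `InitRegistry`); the real priority list has more entries and the real macros work differently. Initializers are run in priority order, so anything registered at the default priority can still insert into a namespace before the freeze step runs.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced priority list; declaration order is the run order.
enum class InitPriority { Default, FreezeNamespaces };

class InitRegistry
{
	using Entry = std::pair<InitPriority, std::function<void()>>;

public:
	void Add(InitPriority priority, std::function<void()> fn)
	{
		m_Entries.emplace_back(priority, std::move(fn));
	}

	// Runs all initializers, lower priority values first.
	void RunAll()
	{
		std::stable_sort(m_Entries.begin(), m_Entries.end(),
			[](const Entry& a, const Entry& b) { return a.first < b.first; });
		for (auto& entry : m_Entries)
			entry.second();
	}

private:
	std::vector<Entry> m_Entries;
};

// Usage example: registration order does not matter, the freeze step runs last.
std::vector<std::string> ExampleInitOrder()
{
	std::vector<std::string> log;
	InitRegistry registry;
	registry.Add(InitPriority::FreezeNamespaces, [&log]() { log.push_back("freeze"); });
	registry.Add(InitPriority::Default, [&log]() { log.push_back("insert constants"); });
	registry.RunAll();
	assert(log.size() == 2);
	return log;
}
```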

Some constants were only inserted in Main(), these were moved to an initializer function.

The individual new commits in this PR also have extensive commit messages, so I suggest also looking at them individually.

Benchmark

The performance improvement from this PR grows with the number of available CPU cores. This is reasonable: with the exclusive mutex used before, at some point the mutex is always held by some core. The following histograms show the time it took to run icinga2 daemon -C on this icinga2.conf on a dual Xeon E5-2680v4 machine (28 cores, 56 threads).

Versions benchmarked:

blue (locking-base): same baseline commit as the benchmarks in Reduce namespace locking for faster config loads #9607, effectively has the same performance as the master branch at that time
orange (freeze-nolock): PR Reduce namespace locking for faster config loads #9607 (freeze globals namespace, don't acquire ObjectLock after freeze, but read operations on unfrozen namespaces still acquire the exclusive ObjectLock)
green (shared-mutex): uses a rwlock just as in this PR, but still performs all locking operations on frozen namespaces
red (shared-mutex-freeze-nolock): this PR (rwlock, don't acquire read locks after freeze, but globals namespace is not frozen)

Note: the benchmarks were done with an older version of this PR before it was rebased onto the current master. Apart from that, there were no changes to the implementation.

All three variants show a very significant improvement over the baseline. This PR and #9607 are very close together in all measurements. From the raw numbers, it looks like this PR might perform a little better than #9607, but I'm pretty sure that either one can perform better depending on the config: do more reads on globals and #9607 performs better (as it is frozen there); add a custom namespace (which won't be frozen) and do more reads on that, and this PR performs better.

The version that only uses a shared mutex but does no optimization on frozen namespaces performs almost as well as the other two in terms of wall clock time, but when considering the user and thus total CPU time used, it's a bit worse than the other two variants. So this PR leaves a little more resources for the rest of the system (which might be a running instance during a config deployment), so doing the rather simple additional optimization on frozen namespaces is definitely worth it.

wall clock time

version	min	avg	median	max	stddev
icinga/icinga2:namespace-locking-base-v2.13.0-516-gf11b612f8	15.801s	17.408s	17.343s	21.134s	0.740s
icinga/icinga2:namespace-freeze-nolock-v2.13.0-519-gf75b77a33	9.292s	12.931s	13.061s	14.330s	0.666s
icinga/icinga2:namespace-shared-mutex-v2.13.0-519-g882e26e2d	9.860s	13.008s	13.116s	15.142s	0.673s
icinga/icinga2:namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2	8.756s	12.614s	12.724s	13.983s	0.714s

total cpu time

version	min	avg	median	max	stddev
namespace-locking-base-v2.13.0-516-gf11b612f8	648.668s	687.974s	686.662s	733.894s	16.500s
namespace-freeze-nolock-v2.13.0-519-gf75b77a33	268.270s	322.513s	320.930s	362.766s	13.956s
namespace-shared-mutex-v2.13.0-519-g882e26e2d	325.065s	358.470s	357.312s	400.005s	14.745s
namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2	286.420s	318.268s	316.029s	368.705s	15.280s

user cpu time

version	min	avg	median	max	stddev
namespace-locking-base-v2.13.0-516-gf11b612f8	102.693s	122.502s	122.112s	137.975s	3.524s
namespace-freeze-nolock-v2.13.0-519-gf75b77a33	234.748s	284.070s	282.657s	323.145s	13.280s
namespace-shared-mutex-v2.13.0-519-g882e26e2d	286.498s	319.714s	318.660s	361.553s	14.296s
namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2	248.377s	279.257s	277.182s	329.789s	14.686s

system cpu time

version	min	avg	median	max	stddev
namespace-locking-base-v2.13.0-516-gf11b612f8	529.753s	565.472s	564.495s	615.121s	15.750s
namespace-freeze-nolock-v2.13.0-519-gf75b77a33	33.522s	38.442s	38.423s	42.178s	1.111s
namespace-shared-mutex-v2.13.0-519-g882e26e2d	35.182s	38.756s	38.641s	44.993s	0.981s
namespace-shared-mutex-freeze-nolock-v2.13.0-521-g9e2a782d2	34.804s	39.010s	38.935s	42.291s	1.004s
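To put the averages from the tables in relation: compared to the baseline, this PR needs roughly 27.5% less wall clock time, 53.7% less total CPU time and 93.1% less system CPU time, while user CPU time is roughly 2.28 times the baseline. The small helper below reproduces that arithmetic; the names `RelativeChange`, `AvgTimes`, `kBaseline` and `kThisPr` are made up for this example.

```cpp
#include <cassert>

// Relative change of `after` versus `before` as a fraction (negative = less time).
double RelativeChange(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before;
}

// Average times in seconds, copied from the avg columns of the tables above.
struct AvgTimes
{
	double Wall;
	double Total;
	double User;
	double System;
};

// locking-base (baseline)
const AvgTimes kBaseline{17.408, 687.974, 122.502, 565.472};
// shared-mutex-freeze-nolock (this PR)
const AvgTimes kThisPr{12.614, 318.268, 279.257, 39.010};
```

For example, `RelativeChange(kBaseline.Wall, kThisPr.Wall)` evaluates to about -0.275.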

TODO

Rebase after Namespace: replace behavior classes with a bool #9603 and INITIALIZE_ONCE_WITH_PRIORITY: use enum for priority values and use std::function #9606 were merged.

Blocked by

Namespace: replace behavior classes with a bool #9603 - this PR now makes use of the m_Frozen member variable directly in Namespace.
INITIALIZE_ONCE_WITH_PRIORITY: use enum for priority values and use std::function #9606 - without that, this PR would introduce some new magic priority like -10.

Al2Klimov

As you said by yourself:

TODO

dependencies
rebase

This commit adds a new initialization priority `FreezeNamespaces` that is run last and moves all calls to `Namespace::Freeze()` there. This allows all other initialization functions to still update namespaces without the use of the `overrideFrozen` flag. It also moves the initialization of `System.Platform*` and `System.Build*` to an initialize function so that these can also be set without setting `overrideFrozen`. This is preparation for a following commit that will make the frozen flag in namespaces final, no longer allowing it to be overridden (freezing the namespace will disable locking, so performing further updates would be unsafe).

Al2Klimov

Please do me two favours:

Don't rebase against master
Force-push all at once

I'd like not to lose the carefully checked Viewed boxes.

lib/config/expression.cpp

lib/base/namespace.hpp

lib/base/namespace.cpp

…functions This commit removes EmbeddedNamespaceValue and ConstEmbeddedNamespaceValue and reduces NamespaceValue down to a simple struct without inheritance or member functions. The code from these classes is inlined into the Namespace class. The class hierarchy determining whether a value is const is moved to an attribute of NamespaceValue. This is done in preparation for changes to the locking in the Namespace class. Currently, it relies on a recursive mutex. In the future, a shared mutex (read/write lock) should be used instead, which cannot allow recursive locking (without failing or risking deadlocks on lock upgrades). With this change, all operations requiring a lock for one operation are within one function, so recursive locking is no longer needed.
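A rough sketch of what such a flattened value type could look like, with assumed names and `int` values instead of the real types: the const-ness is a plain attribute, and the check happens inside the one function that already holds the lock, so a non-recursive mutex is enough.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <stdexcept>
#include <string>

// Hypothetical flattened value: no inheritance, no member functions.
struct SketchNamespaceValue
{
	int Val;
	bool Const;
};

class FlatNamespace
{
public:
	// Lookup, const check and update all happen under a single lock acquisition.
	void Set(const std::string& key, int value, bool isConst = false)
	{
		std::lock_guard<std::mutex> lock(m_Mutex);
		auto it = m_Data.find(key);
		if (it != m_Data.end() && it->second.Const)
			throw std::runtime_error("Constant must not be modified.");
		m_Data[key] = SketchNamespaceValue{value, isConst};
	}

	bool Get(const std::string& key, int& out) const
	{
		std::lock_guard<std::mutex> lock(m_Mutex);
		auto it = m_Data.find(key);
		if (it == m_Data.end())
			return false;
		out = it->second.Val;
		return true;
	}

private:
	mutable std::mutex m_Mutex; // deliberately not recursive
	std::map<std::string, SketchNamespaceValue> m_Data;
};

// Usage example: returns true if overwriting a constant is rejected.
bool ExampleConstIsProtected()
{
	FlatNamespace ns;
	ns.Set("x", 42, true);
	try {
		ns.Set("x", 43);
	} catch (const std::runtime_error&) {
		int value = 0;
		bool found = ns.Get("x", value);
		assert(found);
		return value == 42;
	}
	return false;
}
```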

This allows multiple parallel read operations resulting in an overall speedup on systems with many cores.

This makes freezing a namespace an irrevocable operation but in return allows omitting further lock operations. This results in a performance improvement as reading an atomic bool is faster than acquiring and releasing a shared lock. ObjectLocks on namespaces remain untouched as these mostly affect write operations which there should be none of after freezing (if there are some, they will throw exceptions anyways).

Al2Klimov

Last but not least

56 cores/threads (IDK)
256G RAM
Config of our customer 22470
icinga2 daemon -C

2.13.5

real 56m40,539s
user 359m47,947s
sys 2518m14,078s

2.13.6

real 56m34,224s
user 345m49,586s
sys 2532m51,450s

master (this PR^)

real 54m53,547s
user 343m9,143s
sys 2488m3,450s

this PR

real 23m26,281s
user 1070m21,426s
sys 183m38,766s

lippserd · 2023-01-20T12:58:56Z

This is awesome 👍

This was accidentally broken by #9627 because during config sync, a config validation happens that uses `--define System.ZonesStageVarDir=...` which fails on the now frozen namespace. This commit changes this to use `Internal.ZonesStageVarDir` instead. After all, this is used for internal functionality, users should not directly interact with this flag. Additionally, it no longer freezes the `Internal` namespace which actually allows using `Internal.ZonesStageVarDir` in the first place. This also fixes `--define Internal.Debug*` which was also broken by said PR. Freezing of the `Internal` namespace is not necessary for performance reasons as it's not searched implicitly (for example when accessing `globals.x`) and should users actually interact with it, they should know by that name that they are on their own.

Repair DSL Namespace values being constant broken in #9627

master before #9627 (a0286e9):

<1> => namespace n { x = 42; x = 42 }
^^^^^^
Constant must not be modified.
<2> =>

HEAD of #9627 (24b57f0):

<1> => namespace n { x = 42; x = 42 }
null
<2> =>

cla-bot bot added the cla/signed label Jan 12, 2023

icinga-probot bot added area/configuration DSL, parser, compiler, error handling core/quality Improve code, libraries, algorithms, inline docs labels Jan 12, 2023

This was referenced Jan 12, 2023

Reduce namespace locking for faster config loads #9607

Closed

Forbid modifications to globals namespace after evaluting top-level configuration #9628

Closed

Lock not StatsFunctions for iterating, but copies #9202

Closed

julianbrost marked this pull request as ready for review January 16, 2023 08:38

julianbrost requested a review from Al2Klimov January 16, 2023 08:38

Al2Klimov reviewed Jan 18, 2023

julianbrost force-pushed the namespace-shared-mutex branch from 9e2a782 to bc733f2 Compare January 18, 2023 15:55

julianbrost force-pushed the namespace-shared-mutex branch from bc733f2 to 48eb1aa Compare January 19, 2023 08:54

Al2Klimov self-requested a review January 19, 2023 09:14

Al2Klimov requested changes Jan 19, 2023

julianbrost added 3 commits January 19, 2023 17:55

Use a shared_mutex for read Namespace operations

cc0e2ec

This allows multiple parallel read operations resulting in an overall speedup on systems with many cores.

julianbrost force-pushed the namespace-shared-mutex branch from 48eb1aa to 24b57f0 Compare January 19, 2023 16:57

julianbrost requested a review from Al2Klimov January 19, 2023 16:58

Al2Klimov approved these changes Jan 19, 2023

Al2Klimov enabled auto-merge January 19, 2023 17:16

Al2Klimov merged commit e38a907 into master Jan 19, 2023

icinga-probot bot deleted the namespace-shared-mutex branch January 19, 2023 21:44

julianbrost added this to the 2.14.0 milestone Jan 20, 2023

Al2Klimov reviewed Jan 20, 2023

This was referenced Jan 23, 2023

Type::GetByName(): cache results for faster lookup #9553

Closed

Checkable#MakeLocalsForApply(): make regex(), match() and cidr_match() "hot functions" #9581

Open

This was referenced Jan 23, 2023

Move Types namespace into type.cpp and simplify Type::GetByName() #9608

Merged

Speed up config object lookup #8118

Merged

julianbrost mentioned this pull request Jan 25, 2023

Cluster config file sync broken in master #9639

Closed

Al2Klimov mentioned this pull request Feb 1, 2023

Assign where filter rendered as match("a*",x)||match("b*",x)||match("c*",x)||... Icinga/icingaweb2-module-director#2661

Closed

julianbrost mentioned this pull request Feb 1, 2023

Fix config sync after freezing namespaces #9648

Merged

This was referenced Feb 8, 2023

Use a shared_mutex for read Dictionary operations #9657

Merged

Dictionary#*(): remove bool overrideFrozen if unused #9658

Merged

Dictionary: store data in a HybridMap #9659

Open

julianbrost added a commit that referenced this pull request Feb 20, 2023

Merge pull request #9662 from Icinga/Repair#9627

bda8be3

Repair DSL Namespace values being constant broken in #9627
