-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Namespace: use rwlock and disable read locking after freeze #9627
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you said by yourself:
TODO
- dependencies
- rebase
9e2a782
to
bc733f2
Compare
This commit adds a new initialization priority `FreezeNamespaces` that is run last and moves all calls to `Namespace::Freeze()` there. This allows all other initialization functions to still update namespaces without the use of the `overrideFrozen` flag. It also moves the initialization of `System.Platform*` and `System.Build*` to an initialize function so that these can also be set without setting `overrideFrozen`. This is preparation for a following commit that will make the frozen flag in namespaces finial, no longer allowing it to be overriden (freezing the namespace will disable locking, so performing further updates would be unsafe).
bc733f2
to
48eb1aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do me two favours:
- Don't rebase against master
- Force-push all at once
I'd like not to loose the carefully checked Viewed boxes.
…functions This commit removes EmbeddedNamespaceValue and ConstEmbeddedNamespaceValue and reduces NamespaceValue down to a simple struct without inheritance or member functions. The code from these clases is inlined into the Namespace class. The class hierarchy determining whether a value is const is moved to an attribute of NamespaceValue. This is done in preparation for changes to the locking in the Namespace class. Currently, it relies on a recursive mutex. In the future, a shared mutex (read/write lock) should be used instead, which cannot allow recursive locking (without failing or risk deadlocking on lock upgrades). With this change, all operations requiring a lock for one operation are within one function, no recursive locking is not needed any more.
This allows multiple parallel read operations resulting in a overall speedup on systems with many cores.
This makes freezing a namespace an irrevocable operation but in return allows omitting further lock operations. This results in a performance improvement as reading an atomic bool is faster than acquiring and releasing a shared lock. ObjectLocks on namespaces remain untouched as these mostly affect write operations which there should be none of after freezing (if there are some, they will throw exceptions anyways).
48eb1aa
to
24b57f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last but not least
- 56 cores/threads (IDK)
- 256G RAM
- Config of our customer 22470
- icinga2 daemon -C
2.13.5
real 56m40,539s
user 359m47,947s
sys 2518m14,078s
2.13.6
real 56m34,224s
user 345m49,586s
sys 2532m51,450s
master (this PR^)
real 54m53,547s
user 343m9,143s
sys 2488m3,450s
this PR
real 23m26,281s
user 1070m21,426s
sys 183m38,766s
This is awesome 👍 |
This was accidentally broken by #9627 because during config sync, a config validation happens that uses `--define System.ZonesStageVarDir=...` which fails on the now frozen namespace. This commit changes this to use `Internal.ZonesStageVarDir` instead. After all, this is used for internal functionality, users should not directly interact with this flag. Additionally, it no longer freezes the `Internal` namespace which actually allows using `Internal.ZonesStageVarDir` in the first place. This also fixes `--define Internal.Debug*` which was also broken by said PR. Freezing of the `Internal` namespace is not necessary for performance reasons as it's not searched implicitly (for example when accessing `globals.x`) and should users actually interact with it, they should know by that name that they are on their own.
Repair DSL Namespace values being constant broken in #9627
This PR improves the config load performance by reduce the amount of locking required. This results in a noticeable speedup, especially on many-core machines. This is achieved by two related changes.
This PR replaces #9607 which had basically the same goal but would have introduced breaking changes that would affect Director. This PR on the other hand introduces no breaking changes.
Use a rwlock in namespaces
Before this PR, all operations on
Namespace
objects were synchronized usingObjectLock
which is an exclusive lock. This PR adds astd::shared_timed_mutex
toNamespace
which is now used for synchronizing the read operations, therefore allowing parallel read operations on all namespaces.This new mutex is only used internally, if the namespace has to be locked internally, this still has to be done with the
ObjectLock
. There is a new comment in the code explaining the interaction of these two locks in detail.Disable locking after
Namespace::Freeze()
This PR changes the behavior of
Namespace::Freeze()
to fully freeze the namespace. This means that no more modifications can be done to it, not even by settingoverrideFrozen = true
from within C++ code.As this makes the namespace completely read-only after
Freeze()
was called, no more locking is required for read operations. Thus, this commit makes all lock operations for read operations no-ops afterwards.This required some changes to the initialization phase as that heavily depended on
overrideFrozen
. A new priority forINITIALIZE_ONCE_WITH_PRIORITY
is introduced:FreezeNamespaces
. It is run after theDefault
priority (used byINITIALIZE_ONCE
) which allows initializers running at theDefault
priority to still insert into the namespaces.Some constants were only inserted in
Main()
, these were moved to an initializer function.The individual new commits in this PR also have extensive commit messages, so I suggest also looking at them individually.
Benchmark
The performance improvements by this PR show better, the more CPU cores are available. This is quite reasonable, as there was an exclusive mutex before, at some time you reach a point where the mutex is always locked by some core. The following histograms show the time it took to do
icinga2 daemon -C
on this icinga2.conf on a dual Xeon E5-2680v4 machine (28 cores, 56 threads).Versions benchmarked:
ObjectLock
after freeze, but read operations on unfrozen namespaces still acquire the exclusiveObjectLock
)Note: the benchmarks were done with an older version of this PR before it was rebased onto the current master. Apart from that, there were no changes to the implementation.
All three variants show a very significant improvement over the baseline. This PR and #9607 are very close together in all measurements. From the raw numbers, it looks like this PR might perform a little better than #9607 but I'm pretty sure that either one can perform better based on the config (do more reads on
globals
and #9607 performs better (as it's frozen), add a custom namespace (that won't be frozen) and do more reads on that and this PR will perform better).The version that only uses a shared mutex but does no optimization on frozen namespaces performs almost as good as the other two in regards to wall clock time, but when considering the user and thus total CPU time used, it's a bit worse that the other two variants. So this PR leaves a little more resources for the rest of the system (which might be a running instance during a config deployment), so doing the rather simple additional optimization on frozen namespaces is definitely worth it.
wall clock time
total cpu time
user cpu time
system cpu time
TODO
Blocked by
m_Frozen
member variable directly inNamespace
.-10
.