Improve performance of MeterAdapter #443
base: master
Conversation
The main remaining bottleneck now is the RWLock wrapping the LifetimeManager; in a profile, 88% of the time is spent entering and exiting the lock. It's not trivial to remove, but I have an idea for how to make this lock-free as well (and also do away with the race which can currently occur when deleting and incrementing a metric at the same time). Would you be interested in that?
Force-pushed 2acff56 to ba895bf
Thanks! There have also been some other optimization-related changes that cause some code conflicts here, but I will try to manually merge some of the good ideas from this PR. One thing that has already been done is reducing lifetime management from the metric instance level to the metric template level, although I am wondering whether it might be feasible to move it up one more level, to the metric factory, to reduce the overhead further.
The single-threaded Counter benchmark goes from 41.2 MB to 27.47 MB allocated. Runtime is only about 4% faster, which might as well be noise.
Force-pushed ba895bf to a30ca40
I added a cache (Instrument, tags) -> Prometheus metric, so all the label processing is done only once per label combination. More importantly, the results of the user-specified options.ResolveHistogramBuckets are now cached. The improvement to single-threaded performance is about 2x, with about 1/6 of the allocated memory:

Counter: 116.1 ms + 158.69 MB -> 52.38 ms + 27.47 MB
Histogram: 179.3 ms + 196.84 MB -> 59.40 ms + 33.57 MB

Multithreaded performance is unfortunately unaffected (maybe a bit worse), since the limiting factor is the locks in the ManagedLifetimeMetricHandle class.
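Conceptually, such a cache could be sketched as below. This is a rough illustration, not the code in this PR: the `MetricCache` class, its members, and the string-based key are all hypothetical, and the returned `object` stands in for the Prometheus child metric.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of an (instrument, tags) -> metric cache, so the
// expensive label processing runs only once per unique label combination.
static class MetricCache
{
    static readonly ConcurrentDictionary<string, object> _cache = new();

    public static int ResolveCount; // only to observe cache effectiveness in this sketch

    public static object GetOrCreate(string instrumentName, IEnumerable<KeyValuePair<string, string>> tags)
    {
        // Build a canonical key. Real hot-path code would avoid this string
        // allocation, e.g. with a custom IEquatable<T> key struct.
        string key = instrumentName + "|" +
            string.Join(",", tags.OrderBy(t => t.Key).Select(t => t.Key + "=" + t.Value));

        return _cache.GetOrAdd(key, _ =>
        {
            ResolveCount++;
            return new object(); // stand-in for the resolved Prometheus child metric
        });
    }
}
```

Note that `ConcurrentDictionary.GetOrAdd` may invoke the factory more than once under contention; that is harmless here because the losing result is simply discarded.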
We only need to be able to determine whether a lease is active or whether there was any activity in the last timer period. That is now done with 2 atomic counters: a count of taken leases and a count of returned leases. The timer still has a potential race if a new lease is taken right after the timer checks that none occurred and attempts to delete the metric. This race existed even with the locks, and this change doesn't make it any worse (AFAIK).

Performance of the single-threaded benchmark is unaffected; the 16-thread test sees a 1.5-3x improvement (it is noisy :/):

Counter (4 runs): 1339 ms...1971 ms -> 881 ms...939 ms
Histogram (4 runs): 2021 ms...2158 ms -> 771 ms...928 ms

The improvement is this small only because there is also an RWLock which is not that trivial to remove (and no, the .NET RWLock is not lock-free even under a purely read workload).
Force-pushed a30ca40 to 4e86977
Great! Moving the lifetime timer higher up is definitely good news. At least for me, it doesn't really matter whether it's global or per metric template. I don't have thousands of meters, just many different label values. I rebased it on the current master branch; there was only a minor conflict. Do you have these changes in a private branch?
Yeah, I am currently working in the "optimizing" branch
(sorry for an empty email message, I accidentally pressed enter in the title)
The MeterAdapter is quite slow compared to using prometheus-net directly. I noticed that especially in a multi-threaded scenario - it isn't that hard to hammer one metric in parallel on a many-core machine. This PR adds benchmarks to demonstrate the improvement, and improves performance in both the single-threaded and the multi-threaded case.
Main changes are:
- `_options.ResolveHistogramBuckets` is not called on each observation; it is now cached together with the rest of the metric initialization code.
- The `LifetimeManager` locking overhead is reduced.
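The bucket-resolution caching could look roughly like the sketch below. This is illustrative only: `BucketCache`, its members, and the `Func<string, double[]>` resolver signature are assumptions standing in for the real `options.ResolveHistogramBuckets` plumbing.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch: resolve histogram buckets once per instrument and
// cache the result, instead of invoking the user callback on every observation.
class BucketCache
{
    readonly Func<string, double[]> _resolveHistogramBuckets; // stand-in for options.ResolveHistogramBuckets
    readonly ConcurrentDictionary<string, double[]> _buckets = new();

    public BucketCache(Func<string, double[]> resolver)
        => _resolveHistogramBuckets = resolver;

    // Hot path: a dictionary lookup instead of a user-callback invocation.
    public double[] GetBuckets(string instrumentName)
        => _buckets.GetOrAdd(instrumentName, _resolveHistogramBuckets);
}
```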
Other details should be described in the commit messages or comments.
Before
After