Fixing the Registry threading/design issues #153
@dmagliola please take a look and merge it if you think it's fine; this fixes the issue within the collector. Also, could we please have another release? :(
Signed-off-by: Ahmad Tolba <tolpa1@gmail.com>
Test/broken integration
Fix thread safeness issues
Hi @ahmgeek, thank you for your PR! This PR also highlights a few other small problems, like the confusion between strings and symbols in the [...]. Sorry for the delay, and thanks for the good work!
@dmagliola Oh man, I'm really sorry to disturb you during your vacation; please forget about it. Have fun and enjoy the vacation.
Unfortunately, on my local development machine, randomly the [...]. In production it's always frozen; there's no way to increment them. Thanks ❤️
What does it fix #151
There are two issues with the current implementation: one in the collector and one in the registry.
I will go over them one by one.
The collector
client_ruby/lib/prometheus/middleware/collector.rb
Lines 36 to 37 in 80adc30
These `init_*` methods get called every single time a request happens.
client_ruby/lib/prometheus/middleware/collector.rb
Lines 46 to 47 in b3bbe23
As you can see, on every request we try to register the default `http_*` metrics.
If you think this is not an issue with MRI, it actually is: it raises `AlreadyRegisteredError` with MRI.
This is the spec (d4f4e32) that reproduces the bug.
The fix
Before adding a metric to the registry, check whether it is already there. If it is, use it; otherwise, push the metric to the registry.
In fb4504e I introduce the `registry.get(metric)` check.
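The check-then-register idea described above can be sketched roughly as follows. This is only an illustration: `SketchRegistry` and `get_or_register` are hypothetical names made up for this example, not the client's real classes, and the metric is represented by a plain object.

```ruby
# Hypothetical stand-ins for illustration only; not the real client classes.
class AlreadyRegisteredError < StandardError; end

class SketchRegistry
  def initialize
    @metrics = {}
  end

  # Returns the metric registered under +name+, or nil if there is none.
  def get(name)
    @metrics[name]
  end

  # Strict registration: a second registration of the same name raises.
  def register(name, metric)
    raise AlreadyRegisteredError, "#{name} has already been registered" if @metrics.key?(name)

    @metrics[name] = metric
  end
end

# Look the metric up first; build and register it only when it is missing.
def get_or_register(registry, name)
  registry.get(name) || registry.register(name, yield)
end

registry = SketchRegistry.new
first  = get_or_register(registry, :http_server_requests_total) { Object.new }
second = get_or_register(registry, :http_server_requests_total) { Object.new }
# Both calls hand back the same object, and no error is raised.
```

Note that this version is only safe when a single thread touches the registry, since the lookup and the registration are two separate steps.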
I think Daniel mentioned something regarding symbols/strings. I would vote for strings. I can try to fix this too, but it would be a breaking change, if I understand correctly? Maybe we can have a coercing method under the hood and warn about the deprecation, or something similar.
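One possible shape for such a coercing method is sketched below. The helper name `coerce_metric_name` and the wording of the warning are assumptions for illustration; nothing like this exists in the PR yet.

```ruby
# Hypothetical helper: accept both Strings and Symbols, normalise to a String,
# and emit a deprecation warning on stderr when a Symbol is passed.
def coerce_metric_name(name)
  case name
  when String
    name
  when Symbol
    warn "[DEPRECATION] Symbol metric names are deprecated; pass #{name.to_s.inspect} instead."
    name.to_s
  else
    raise ArgumentError, "metric name must be a String or Symbol, got #{name.class}"
  end
end

coerce_metric_name("http_server_requests_total") # returned unchanged
coerce_metric_name(:http_server_requests_total)  # warns, returns the String form
```

This keeps existing callers working while nudging them towards a single key type, so the breaking change could be deferred to a later release.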
The registry
It turned out that, in a multi-threaded world, two threads could race to register the metric and bypass the `@registry.get(metric)` check. So one would register the metric and the other would raise `AlreadyRegisteredError`.
The way I see it, if multiple threads try to register the same metric (with locking introduced) and the metric is already registered, no error should be raised; just return the metric and use it to perform the preferred action.
Hence, in 80adc30 I changed from Ruby's mutex to something more efficient for the use case.
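To make the intended behaviour concrete, here is a minimal sketch in which the lookup and the insertion happen as one atomic step. It deliberately uses a plain `Mutex` for clarity, which is not what the commit does (the commit replaces the mutex with something else), and `LockedRegistry` and `fetch_or_register` are hypothetical names.

```ruby
# Hypothetical registry for illustration: the check and the write share one
# lock, so two threads can never both observe the metric as missing.
class LockedRegistry
  def initialize
    @mutex = Mutex.new
    @metrics = {}
  end

  # Returns the existing metric, or stores and returns the one built by the block.
  def fetch_or_register(name)
    @mutex.synchronize do
      @metrics[name] ||= yield
    end
  end

  def size
    @mutex.synchronize { @metrics.size }
  end
end

registry = LockedRegistry.new
threads = Array.new(20) do
  Thread.new { registry.fetch_or_register(:http_server_requests_total) { Object.new } }
end
metrics = threads.map(&:value)
# Every thread receives the same metric object, and nothing raises.
```

With this shape, a losing thread simply gets the metric the winner registered instead of an `AlreadyRegisteredError`.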
The specs for thread safety were relying on the `raise` to make it thread-safe; to me, this design is more of a workaround. If the registry is thread-safe, a given metric will only ever be written to the registry once.
The registry again
In the last commit I redesigned the registry to be used as a `singleton` object. This way we don't need the rather ~~bad~~ not thread-safe memoization.
Points to discuss
@dmagliola Please let me know how the new design works for you; maybe I have misunderstood things, or my implementation has flaws. Also, if you could keep me in the loop while you discuss this with your team, that would be great.