Replace Patcher#do_once with OnlyOnce helper #1398
Codecov Report
@@           Coverage Diff           @@
##           master    #1398   +/-   ##
=======================================
  Coverage   98.16%   98.16%
=======================================
  Files         768      770     +2
  Lines       36667    36745    +78
=======================================
+ Hits        35993    36071    +78
  Misses        674      674
I think the explanation about how the "do once" flag isn't applied until after the operation completes makes sense, and is a good catch regarding thread safety.
However I don't understand why the existing function needs to be entirely removed in this way. I think the point of the existing module was to compose in some kind of "do once" tracking per object or class. Why do we need to add them as constants to each class instead?
Also do keep in mind, users use do_once so it's very possible this will be a breaking change.
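To make the thread-safety point concrete, here is a minimal sketch of a "flag set after the block" helper. `NaiveDoOnce` and `Widget` are hypothetical names used only for illustration, not the actual `Datadog::Patcher` code. The queues force the interleaving where a second caller passes the check while the first caller is still inside its block.

```ruby
# Hypothetical illustration of a check-then-act "do once" helper.
# This is NOT the real Datadog::Patcher implementation.
module NaiveDoOnce
  def do_once(key)
    @done_once ||= {}
    return if @done_once[key]   # check

    result = yield              # act: other threads can pass the check meanwhile
    @done_once[key] = true      # flag is only set after the block finishes
    result
  end
end

class Widget
  include NaiveDoOnce
end

widget = Widget.new
runs = Queue.new     # records which blocks actually ran
entered = Queue.new  # signals that the first thread is inside its block
release = Queue.new  # lets the first thread finish

first = Thread.new do
  widget.do_once(:setup) do
    runs << :first
    entered << true
    release.pop # stay inside the block until the main thread has had its turn
  end
end

entered.pop # the first block is running, but the flag is not set yet
widget.do_once(:setup) { runs << :second }
release << true
first.join

puts "blocks executed: #{runs.size}"
```

With this interleaving both blocks execute, even though both callers asked for "once".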
This looks very good! I didn't expect you to start replacing do_once in the contrib folder so soon :)
I think this is worth pursuing further. A lot of code was cleaned up here, which is awesome to see.
Feel free to reduce scope if you think the task is too large, but I like what's already here.
I favor this approach because I see the previous one as extension, not composing: the
IMO this was rather confusing -- it took me a while to "reverse engineer" as well. That's why I mostly went with the manual addition of
I chose to convert a lot of
I hadn't considered that. My suggestion then would be to restore the patcher and to include a warning saying that its usage is deprecated (possibly something to remove for 1.0?), and still move away from it in our code. I've searched to see if anyone was publicly using the class on GitHub and I didn't find any usage, which leads me to believe that it may be quite rare.
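A rough sketch of what "restore the patcher and warn that it is deprecated" could look like. Everything here is an assumption made for illustration: the `LegacyPatcher` module, the `output:` keyword, and the message text are hypothetical and are not the ddtrace API.

```ruby
require 'stringio'

# Hypothetical sketch: keep a legacy do_once around, but warn once per process.
module LegacyPatcher
  DEPRECATION_LOCK = Mutex.new
  @deprecation_warned = false

  # Prints the deprecation message the first time it is called, then stays quiet.
  def self.warn_deprecated_once(output = $stderr)
    DEPRECATION_LOCK.synchronize do
      return if @deprecation_warned

      @deprecation_warned = true
    end
    output.puts('do_once is deprecated and may be removed in a future release')
  end

  # Legacy entry point: warns, then runs the block at most once per key.
  def do_once(key, output: $stderr)
    LegacyPatcher.warn_deprecated_once(output)
    @done_once ||= {}
    return if @done_once[key]

    @done_once[key] = true
    yield
  end
end

class LegacyUser
  include LegacyPatcher
end

out = StringIO.new
user = LegacyUser.new
first_result = user.do_once(:a, output: out) { :ran_a }
second_result = user.do_once(:a, output: out) { :ran_a_again }
user.do_once(:b, output: out) { :ran_b }

puts out.string
```

Note that this sketch keeps its "already warned" flag in a module-level instance variable, which is exactly the kind of state the discussion below points out is not Ractor friendly.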
@delner and @marcotc I need some guidance on what the next steps are here. Thus far I have identified two items missing: a) tests for My question is: If I do those two, is this good to go? Or are there other concerns? I'm asking this because I don't want to get deeper into implementing this only for us to decide on a different path altogether.
Maybe if we think of I would just be wary of backwards compatibility and compatibility with Ractors (which don't like mutable constants). If we can address that, I think it's fine to keep pushing in this direction.
You drive a hard bargain, David of Michigan :) The current
[3] pry(main)> Ractor.new { Datadog::Patcher.do_once(:foo) { :hello } }.value
#<Thread:0x00007f8cfbaa3158 run> terminated with exception (report_on_exception is true):
/Users/ivo.anjo/datadog/dd-trace-rb/lib/ddtrace/patcher.rb:26:in `do_once': can not access instance variables of classes/modules from non-main Ractors (Ractor::IsolationError)
from (pry):3:in `block in __pry__'
NoMethodError: undefined method `value' for #<Ractor:#3 (pry):3 terminated>
from (pry):3:in `__pry__'
[4] pry(main)> module MyModule
[4] pry(main)* include Datadog::Patcher
[4] pry(main)* module_function
[4] pry(main)* def do_something
[4] pry(main)* do_once(:my_module) { :hello }
[4] pry(main)* end
[4] pry(main)* end
=> :do_something
[5] pry(main)> Ractor.new { MyModule.do_something }.value
#<Thread:0x00007f8d0c324010 run> terminated with exception (report_on_exception is true):
/Users/ivo.anjo/datadog/dd-trace-rb/lib/ddtrace/patcher.rb:26:in `do_once': can not access instance variables of classes/modules from non-main Ractors (Ractor::IsolationError)
from (pry):8:in `do_something'
from (pry):11:in `block in __pry__'
NoMethodError: undefined method `value' for #<Ractor:#4 (pry):11 terminated>
from (pry):11:in `__pry__'

This is equivalent-ish to the behavior we get out of I've made a few more previous uses of So with this PR, right now, I claim we are in the "haven't really made it worse than it already was". Now, looking forward, we have a few options to make these usable from Ractors. For instance, I can provide a

class RactorSafeOnceSpentOnlyOnce
  def initialize
    @state = Available.new(self)
  end
  def run
    @state.run { yield }
  end
  def ran?
    @state.ran?
  end
  private
  def mark_spent
    @state = Spent.new
    freeze
  end
  def current?(state)
    @state == state
  end
  class Available
    def initialize(only_once)
      @mutex = Mutex.new
      @only_once = only_once
    end
    def run
      @mutex.synchronize do
        return unless @only_once.send(:current?, self)
        @only_once.send(:mark_spent)
        yield
      end
    end
    def ran?
      @mutex.synchronize { !@only_once.send(:current?, self) }
    end
  end
  class Spent
    def run
      nil
    end
    def ran?
      true
    end
  end
end

There's a few tricks here around the order of grabbing the locks and mutating This implementation of If we do want lazy-style global OnlyOnce semantics, this approach is still not enough. But that direction lies the "we're trying to shove the traditional model of programming into Ractors". Global only-once semantics are equivalent to having a global mutable variable, so it faces the same challenges and can be solved by the same solutions. That said, I'm not sure it's useful to start with the Does this answer your concerns?
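For comparison with the state-swapping version above, a plain mutex-guarded helper can be sketched as follows. `SimpleOnlyOnce` is a hypothetical name and this is not the helper that ships in ddtrace. It keeps a mutable flag for its whole lifetime, which is the "global mutable variable" shape mentioned above. The usage part checks that ten competing threads execute the block only once.

```ruby
# Hypothetical mutex-guarded helper, for illustration only.
class SimpleOnlyOnce
  def initialize
    @mutex = Mutex.new
    @ran_once = false
  end

  # Runs the block at most once, even when called from several threads.
  def run
    @mutex.synchronize do
      return if @ran_once

      @ran_once = true # the flag flips before the block runs
      yield
    end
  end

  def ran?
    @mutex.synchronize { @ran_once }
  end
end

once = SimpleOnlyOnce.new
calls = Queue.new

threads = Array.new(10) do
  Thread.new { once.run { calls << Thread.current } }
end
threads.each(&:join)

puts "block executed #{calls.size} time(s), ran? = #{once.ran?}"
```

Because the block runs while the mutex is held, calling `run` on the same helper from inside its own block would raise a `ThreadError`, so a sketch like this only suits non-reentrant setup code.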
As discussed in <#1391>, I've extracted the `OnlyOnce` helper to be available generically throughout the ddtrace codebase, completely replacing all usage of `Datadog::Patcher#do_once`.

Note that the semantics of what "only once" means have slightly shifted between both implementations. Having gone through every usage site I'm reasonably confident that the change is harmless (and in some cases it's even a fix for incorrect behavior), but note:

* In a bunch of cases, the `Datadog::Patcher` was attached to class instances. This meant that you'd get a `run_once` per class instance, whereas most of those have been patched to have a shared `OnlyOnce` for all such instances.
* The old `do_once` would only set the internal state AFTER running the code, which meant in particular that if the code raised an exception, it didn't count as "having run once". `OnlyOnce` sets the flag BEFORE running the code and thus failures still count. This change is debatable, but again in the context of what we were doing with `only_once`, I think it's acceptable.
The deprecation warnings are implemented using `OnlyOnce` 😈
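The first point in the note above (one "once" per instance versus one shared by all instances) can be sketched like this. `TinyOnce`, `PerInstancePatcher`, `SharedPatcher`, and `PATCH_ONLY_ONCE` are hypothetical names chosen for the example and are not taken from the ddtrace source.

```ruby
# Hypothetical stand-in for an only-once helper.
class TinyOnce
  def initialize
    @mutex = Mutex.new
    @ran = false
  end

  def run
    @mutex.synchronize do
      return if @ran

      @ran = true
      yield
    end
  end
end

# Each instance carries its own helper, so every new instance patches again.
class PerInstancePatcher
  def initialize(log)
    @once = TinyOnce.new
    @log = log
  end

  def patch
    @once.run { @log << :patched }
  end
end

# All instances share one helper held in a constant, so patching happens once.
class SharedPatcher
  PATCH_ONLY_ONCE = TinyOnce.new

  def initialize(log)
    @log = log
  end

  def patch
    PATCH_ONLY_ONCE.run { @log << :patched }
  end
end

per_instance_log = []
2.times { PerInstancePatcher.new(per_instance_log).patch }

shared_log = []
2.times { SharedPatcher.new(shared_log).patch }

puts "per instance: #{per_instance_log.size}, shared: #{shared_log.size}"
```

The shared constant gives process-wide "once" behavior, at the cost of holding mutable state in a constant, which is the Ractor concern raised earlier in the thread.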
I believe I have addressed most of the feedback, marking as ready for review.
Looks great to me!
I don't see any breaking behaviour either 👍
Yeah the explanation makes sense; I don't have strong reasons to hold this back, and it will be good to move away from global state stored in modules given they aren't Ractor friendly. This implementation gives us some options, so I'm good to roll with this.
As discussed in #1391, I've extracted the `OnlyOnce` helper to be available generically throughout the ddtrace codebase, completely replacing all usage of `Datadog::Patcher#do_once`.

Note that the semantics of what "only once" means have slightly shifted between both implementations. Having gone through every usage site I'm reasonably confident that the change is harmless (and in some cases it's even a fix for incorrect behavior), but note:

* In a bunch of cases, the `Datadog::Patcher` was attached to class instances. This meant that you'd get a `run_once` per class instance, whereas most of those have been patched to have a shared `OnlyOnce` for all such instances.
* The old `do_once` would only set the internal state AFTER running the code, which meant in particular that if the code raised an exception, it didn't count as "having run once". `OnlyOnce` sets the flag BEFORE running the code and thus failures still count. This change is debatable, but again in the context of what we were doing with `only_once`, I think it's acceptable.

Missing: I timeboxed this work and so I haven't had time to add specs for `OnlyOnce`. If we're happy with the approach, I can add those.
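The difference in exception handling described above can be sketched with two tiny helpers. `FlagAfter` and `FlagBefore` are hypothetical names that only model the two orderings. Neither is the real `do_once` or `OnlyOnce`, and both are deliberately not thread-safe to keep the contrast visible.

```ruby
# Models the described old behavior: the flag is set only after the block succeeds.
class FlagAfter
  def initialize
    @done = false
  end

  def run
    return if @done

    result = yield
    @done = true # never reached if the block raises
    result
  end
end

# Models the described new behavior: the flag is set before the block runs.
class FlagBefore
  def initialize
    @done = false
  end

  def run
    return if @done

    @done = true # a failing block still counts as "ran once"
    yield
  end
end

# Calls helper.run twice with a block that always raises,
# and reports how many times the block was actually entered.
def attempts_with(helper)
  attempts = 0
  2.times do
    begin
      helper.run do
        attempts += 1
        raise 'boom'
      end
    rescue RuntimeError
      # Ignored on purpose: we only care about how often the block ran.
    end
  end
  attempts
end

after_attempts = attempts_with(FlagAfter.new)
before_attempts = attempts_with(FlagBefore.new)

puts "flag after: #{after_attempts} attempts, flag before: #{before_attempts} attempt"
```

With the flag set afterwards, a failing block is retried on every call. With the flag set first, the failure is final, which is the trade-off the description calls debatable.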