Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Patcher#do_once with OnlyOnce helper #1398

Merged
merged 4 commits into from
Mar 16, 2021
Merged

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Mar 8, 2021

As discussed in #1391, I've extracted the OnlyOnce helper to be available generically throughout the ddtrace codebase, completely replacing all usage of Datadog::Patcher#do_once.

Note that the semantics of what "only once" means have slightly shifted between both implementations. Having gone through every usage site I'm reasonably confident that the change is harmless (and in some cases it's even a fix for incorrect behavior), but note:

  • In a bunch of cases, the Datadog::Patcher was attached to class instances. This meant that you'd get a run_once per class instance, whereas most of those have been patched to have a shared OnlyOnce for all such instances.

  • The old do_once would only set the internal state AFTER running the code, which meant in particular that if the code raised an exception, it didn't count as "having run once". OnlyOnce sets the flag BEFORE running the code and thus failures still count.

    This change is debatable, but again in the context of what we were doing with only_once, I think it's acceptable.

Missing: I timeboxed this work and so I haven't had time to add specs for OnlyOnce. If we're happy with the approach, I can add those.

@codecov-io
Copy link

codecov-io commented Mar 8, 2021

Codecov Report

Merging #1398 (97eee95) into master (d015782) will increase coverage by 0.00%.
The diff coverage is 94.69%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1398   +/-   ##
=======================================
  Coverage   98.16%   98.16%           
=======================================
  Files         768      770    +2     
  Lines       36667    36745   +78     
=======================================
+ Hits        35993    36071   +78     
  Misses        674      674           
Impacted Files Coverage Δ
spec/ddtrace/contrib/patcher_spec.rb 100.00% <ø> (ø)
lib/ddtrace/sync_writer.rb 83.87% <66.66%> (+1.11%) ⬆️
lib/ddtrace/writer.rb 96.42% <66.66%> (+0.08%) ⬆️
lib/ddtrace/contrib/httpclient/patcher.rb 89.47% <75.00%> (+1.23%) ⬆️
lib/ddtrace/contrib/httprb/patcher.rb 89.47% <75.00%> (+1.23%) ⬆️
lib/ddtrace/contrib/presto/patcher.rb 88.88% <75.00%> (+1.38%) ⬆️
lib/ddtrace/tracer.rb 93.60% <80.00%> (+0.11%) ⬆️
lib/ddtrace/contrib/configuration/settings.rb 100.00% <100.00%> (ø)
lib/ddtrace/contrib/elasticsearch/patcher.rb 98.63% <100.00%> (+0.01%) ⬆️
lib/ddtrace/contrib/patcher.rb 100.00% <100.00%> (ø)
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d015782...97eee95. Read the comment docs.

Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the explanation about how the "do once" flag isn't applied until after the operation completes makes sense, and is a good catch regarding thread safety.

However I don't understand why the existing function needs to be entirely removed in this way. I think the point of the existing module was to compose in some kind of "do once" tracking per object or class. Why do we need to add them as constants to each class instead?

Also do keep in mind, users use do_once so it's very possible this will be a breaking change.

Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good! I didn't expect you to start replacing do_once in the contrib folder so soon :)

I think this is worth pursuing further. A lot of code was cleaned up here, which is awesome to see.

Feel free to reduce scope if you think the task is too large, but I like what's already here.

@ivoanjo
Copy link
Member Author

ivoanjo commented Mar 9, 2021

However I don't understand why the existing function needs to be entirely removed in this way. I think the point of the existing module was to compose in some kind of "do once" tracking per object or class. Why do we need to add them as constants to each class instead?

I favor this approach because I see the previous one as extension, not composing: the Patcher added state to the instance it was included in, which in our code base could mean three different behaviors:

  • Calling the patcher via Datadog::Patcher.do_once yielded global behavior, with the state being saved inside Datadog::Patcher itself (which is not thread-safe, so better not use do_once from a background thread at the same time as an unrelated do_once is running elsewhere)
  • Whenever the patcher was included in a module that was used for its module_functions, the state was kept in that module itself, yielding global behavior, but with the state being stored locally in the module
  • Finally, when including the patcher in a class, we got do_once-per-instance semantics, with the state being stored per instance.

IMO this was rather confusing -- it took me a while to "reverse engineer" as well.

That's why I mostly when with the manual addition of OnlyOnce instances somewhere: it clarifies it's supposed to be global, or per-instance, or something else as desired.

Why do we need to add them as constants to each class instead?

I chose to convert a lot of do_once-per-instance usage to global usage because for use cases such as warnings it seemed like the intent was just to nicely warn, but not flood user's logs (which could still happen for things that got the per-instance behavior, if new instances were to be created), and for attaching our modules to third-party code because that's also a global change.

Also do keep in mind, users use do_once so it's very possible this will be a breaking change.

I hadn't considered that. My suggestion then would be to restore the patcher and to include a warning saying that its usage is deprecated (possibly something to remove for 1.0?), and still move away from it on our code. I've searched to see if anyone was publically using the class on GitHub and I didn't find any usage, which leads me to believe that it may be quite rare.

Feel free to reduce scope if you think the task is too large, but I like what's already here.

@delner and @marcotc I need some guidance on what the next steps are here.

Thus far I have identified two items missing: a) tests for OnlyOnce and b) partially restoring the Datadog::Patcher class so that outside users can still use it, marking it as deprecated.

My question is: If I do those two, is this good to go? Or are there other concerns? I'm asking this because I don't want to get deeper into implementing this only for us to decide on a different path altogether.

@delner
Copy link
Contributor

delner commented Mar 11, 2021

Maybe if we think of OnlyOnce as a Mutex like object, in the sense that you instantiate a mutex and use the object instead of the class, then I think this pattern makes more sense.

I would just be wary of backwards compatibility and compatibility with Ractors (which don't like mutable constants.) If we can address that, I think its fine to keep pushing in this direction.

@ivoanjo
Copy link
Member Author

ivoanjo commented Mar 12, 2021

compatibility with Ractors (which don't like mutable constants.) If we can address that, I think its fine to keep pushing in this direction.

You drive a hard bargain, David of Michigan :)

The current Datadog::Patcher is not Ractor-friendly when used in a global or per-module fashion:

[3] pry(main)> Ractor.new { Datadog::Patcher.do_once(:foo) { :hello } }.value
#<Thread:0x00007f8cfbaa3158 run> terminated with exception (report_on_exception is true):
/Users/ivo.anjo/datadog/dd-trace-rb/lib/ddtrace/patcher.rb:26:in `do_once': can not access instance variables of classes/modules from non-main Ractors (Ractor::IsolationError)
	from (pry):3:in `block in __pry__'
NoMethodError: undefined method `value' for #<Ractor:#3 (pry):3 terminated>
from (pry):3:in `__pry__'
[4] pry(main)> module MyModule
[4] pry(main)*   include Datadog::Patcher
[4] pry(main)*   module_function
[4] pry(main)*   def do_something
[4] pry(main)*     do_once(:my_module) { :hello }
[4] pry(main)*   end
[4] pry(main)* end
=> :do_something
[5] pry(main)> Ractor.new { MyModule.do_something }.value
#<Thread:0x00007f8d0c324010 run> terminated with exception (report_on_exception is true):
/Users/ivo.anjo/datadog/dd-trace-rb/lib/ddtrace/patcher.rb:26:in `do_once': can not access instance variables of classes/modules from non-main Ractors (Ractor::IsolationError)
	from (pry):8:in `do_something'
	from (pry):11:in `block in __pry__'
NoMethodError: undefined method `value' for #<Ractor:#4 (pry):11 terminated>
from (pry):11:in `__pry__'

This is equivalent-ish to the behavior we get out of OnlyOnce with this PR.

I've made a few more previous uses of do_once move to the constant approach whereas previously they were per-instance, but I still think that that's the correct semantics for the context they are used in.

So with this PR, right now, I claim we are in the "haven't really made it worse than it already was".

Now, looking forward, we have a few options to make these usable from Ractors. For instance, I can provide a OnlyOnce implementation that becomes frozen once spent:

    class RactorSafeOnceSpentOnlyOnce
      def initialize
        @state = Available.new(self)
      end

      def run
        @state.run { yield }
      end

      def ran?
        @state.ran?
      end

      private

      def mark_spent
        @state = Spent.new
        freeze
      end

      def current?(state)
        @state == state
      end

      class Available
        def initialize(only_once)
          @mutex = Mutex.new
          @only_once = only_once
        end

        def run
          @mutex.synchronize do
            return unless @only_once.send(:current?, self)

            @only_once.send(:mark_spent)

            yield
          end
        end

        def ran?
          @mutex.synchronize { !@only_once.send(:current?, self) }
        end
      end

      class Spent
        def run
          nil
        end

        def ran?
          true
        end
      end
    end

There's a few tricks here around the order of grabbing the locks and mutating @state, but I'm reasonably confident this is correct (the intuition is --> if a thread sees the Available state, then it will go through the lock and thus observe anything that any other thread that saw the Available state does or did; and if a thread sees the Spent state, then it's done and no lock is needed).

This implementation of OnlyOnce can be interesting for things we do at initialization-time -- it's safe to touch the OnlyOnce instance all we want afterwards.

If we do want lazy-style global OnlyOnce semantics, this approach is still not enough. But that direction lies the "we're trying to shove the traditional model of programming into Ractors". Global only-once semantics are equivalent to having a global mutable variable, so it faces the same challenges and can be solved by the same solutions.

That said, I'm not sure it's useful to start with the RactorSafeOnceSpentOnlyOnce for now -- I think it makes sense to just leave a reference in the code to it, and we can come back and add it once we need it (YAGNI and all).

Does this answer your concerns?

As discussed in <#1391>,
I've extracted the `OnlyOnce` helper to be available generically
throughout the ddtrace codebase, completely replacing all usage of
`Datadog::Patcher#do_once`.

Note that the semantics of what "only once" means have slightly shifted
between both implementations. Having gone through every usage site
I'm reasonably confident that the change is harmless (and in some cases
it's even a fix for incorrect behavior), but note:

* In a bunch of cases, the `Datadog::Patcher` was attached to
  class instances. This meant that you'd get a `run_once` per class
  instance, whereas most of those have been patched to have a
  shared `OnlyOnce` for all such instances.

* The old `do_once` would only set the internal state AFTER running the
  code, which meant in particular that if the code raised an exception,
  it didn't count as "having run once". `OnlyOnce` sets the flag
  BEFORE running the code and thus failures still count.

  This change is debatable, but again in the context of what we were
  doing with `only_once`, I think it's acceptable.
The deprecation warnings are implemented using `OnlyOnce` 😈
@ivoanjo ivoanjo changed the title RFC: Replace Patcher#do_once with OnlyOnce helper Replace Patcher#do_once with OnlyOnce helper Mar 12, 2021
@ivoanjo
Copy link
Member Author

ivoanjo commented Mar 12, 2021

I believe I have addressed most of the feedback, marking as ready for review.

@ivoanjo ivoanjo marked this pull request as ready for review March 12, 2021 11:10
@ivoanjo ivoanjo requested a review from a team March 12, 2021 11:10
Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!
I don't see any breaking behaviour as well 👍

Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the explanation makes sense; I don't have strong reasons to hold this back, and it will be good to move away from global state stored in modules given they aren't Ractor friendly. This implementation gives us some options, so I'm good roll with this.

@ivoanjo ivoanjo merged commit 9ef4f44 into master Mar 16, 2021
@ivoanjo ivoanjo deleted the ivoanjo/wip-only-once branch March 16, 2021 08:59
@github-actions github-actions bot added this to the 0.47.0 milestone Mar 16, 2021
@marcotc marcotc added the core Involves Datadog core libraries label Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Involves Datadog core libraries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants