Skip to content

Commit

Permalink
[PROF-10241] Use Process._fork hook in at_fork monkey patch on Ruby…
Browse files Browse the repository at this point in the history
… 3.1+

**What does this PR do?**

This PR solves the long-standing "TODO" we had on the `at_fork` monkey
patch where on Ruby 3.1+ there's a VM hook for instrumenting fork
operations that avoids us needing to monkey patch `Kernel` and `Object`.

To use the new hook we need to monkey patch `Process._fork` instead
(see ruby/ruby#5017 and
https://bugs.ruby-lang.org/issues/17795).

Thus, I've added this monkey patch and disabled patching of
`Kernel` and `Object` on Ruby 3.1+.

**Motivation:**

Avoid monkey patching core classes that no longer need to.

**Additional Notes:**

I'm not quite happy with either the AtForkMonkeyPatch or its specs, but
I decided to not do a full revamp for now.

**How to test the change?**

This change includes test coverage.

Here's a tiny test app that also exercises this feature:

```ruby
puts RUBY_DESCRIPTION

require 'datadog/core/utils/at_fork_monkey_patch'

Datadog::Core::Utils::AtForkMonkeyPatch.apply!

Process.datadog_at_fork(:child) { puts "Hello from child process: #{Process.pid} "}

puts "Parent pid: #{Process.pid}"

puts "fork { }"
fork { }
Process.wait

puts "Kernel.fork { }"
Kernel.fork { }
Process.wait

puts "Process.fork { }"
Process.fork { }
Process.wait

puts "Foo.new.call"
class Foo
  def call
    fork { }
  end

  def self.do_fork
    fork {  }
  end
end
Foo.new.call
Process.wait

puts "Foo.do_fork"
Foo.do_fork
Process.wait

puts "Fork from outside"
Foo.new.send(:fork) { }
Process.wait

class BasicFoo < BasicObject
  include ::Kernel

  def call
    fork { }
  end

  def self.do_fork
    fork { }
  end
end

puts "BasicFoo.new.call"
BasicFoo.new.call
Process.wait

puts "BasicObject:"
Class.new(BasicObject) { include(::Kernel); def call; fork { }; end }.new.call
Process.wait

puts "BasicFork.do_fork"
BasicFoo.do_fork
Process.wait
```

P.s.: You may notice when you run this example on < 3.1 that
`BasicFoo.new.call` and `BasicObject` are not instrumented!

This was actually a known-gap from the old instrumentation.
  • Loading branch information
ivoanjo committed Aug 7, 2024
1 parent 4c69eb3 commit b55c33f
Show file tree
Hide file tree
Showing 2 changed files with 121 additions and 90 deletions.
75 changes: 44 additions & 31 deletions lib/datadog/core/utils/at_fork_monkey_patch.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,8 @@
module Datadog
module Core
module Utils
# Monkey patches `Kernel#fork` and similar functions, adding a `Kernel#datadog_at_fork` callback mechanism which
# Monkey patches `Kernel#fork` and similar functions, adding a `Process#datadog_at_fork` callback mechanism which
# is used to restart observability after the VM forks (e.g. in multiprocess Ruby apps).
#
# TODO: Use `Process._fork` on Ruby 3.1+, see
# https://github.com/ruby/ruby/pull/5017 and https://bugs.ruby-lang.org/issues/17795
#
# Known limitations: Does not handle `BasicObject`s that include `Kernel` directly; e.g.
# `Class.new(BasicObject) { include(::Kernel); def call; fork { }; end }.new.call`.
# This will be fixed once we move to hooking into `Process._fork`
module AtForkMonkeyPatch
def self.supported?
Process.respond_to?(:fork)
Expand All @@ -20,20 +13,22 @@ def self.supported?
def self.apply!
return false unless supported?

[
::Process.singleton_class, # Process.fork
::Kernel.singleton_class, # Kernel.fork
::Object, # fork without explicit receiver (it's defined as a method in ::Kernel)
# Note: Modifying Object as we do here is irreversible. During tests, this
# change will stick around even if we otherwise stub `Process` and `Kernel`
].each { |target| target.prepend(KernelMonkeyPatch) }
if RUBY_VERSION < '3.1'
[
::Process.singleton_class, # Process.fork
::Kernel.singleton_class, # Kernel.fork
::Object, # fork without explicit receiver (it's defined as a method in ::Kernel)
# Note: Modifying Object as we do here is irreversible. During tests, this
# change will stick around even if we otherwise stub `Process` and `Kernel`
].each { |target| target.prepend(KernelMonkeyPatch) }
end

::Process.singleton_class.prepend(ProcessDaemonMonkeyPatch)

true
end

# Adds `Kernel#datadog_at_fork` behavior; see parent module for details.
# Adds `datadog_at_fork` behavior; see parent module for details.
module KernelMonkeyPatch
def fork
# If a block is provided, it must be wrapped to trigger callbacks.
Expand All @@ -60,17 +55,6 @@ def fork
result
end

# NOTE: You probably want to wrap any calls to datadog_at_fork with a OnlyOnce so as to not re-register
# the same block/behavior more than once.
def datadog_at_fork(stage, &block)
raise ArgumentError, 'Bad \'stage\' for ::datadog_at_fork' unless stage == :child

datadog_at_fork_blocks[stage] ||= []
datadog_at_fork_blocks[stage] << block

nil
end

module_function

def datadog_at_fork_blocks
Expand All @@ -80,11 +64,23 @@ def datadog_at_fork_blocks
end
end

# A call to Process.daemon ( https://rubyapi.org/3.1/o/process#method-c-daemon ) forks the current process and
# keeps executing code in the child process, killing off the parent, thus effectively replacing it.
#
# This monkey patch makes the `Kernel#datadog_at_fork` mechanism defined above also work in this situation.
# Adds `datadog_at_fork` behavior; see parent module for details.
module ProcessDaemonMonkeyPatch
# Hook provided by Ruby 3.1+ for observability libraries that want to know about fork, see
# https://github.com/ruby/ruby/pull/5017 and https://bugs.ruby-lang.org/issues/17795
def _fork
datadog_at_fork_blocks = Datadog::Core::Utils::AtForkMonkeyPatch::KernelMonkeyPatch.datadog_at_fork_blocks

pid = super

datadog_at_fork_blocks[:child].each(&:call) if datadog_at_fork_blocks.key?(:child) && pid == 0

pid
end

# A call to Process.daemon ( https://rubyapi.org/3.1/o/process#method-c-daemon ) forks the current process and
# keeps executing code in the child process, killing off the parent, thus effectively replacing it.
# This is not covered by `_fork` and thus we have some extra code for it.
def daemon(*args)
datadog_at_fork_blocks = Datadog::Core::Utils::AtForkMonkeyPatch::KernelMonkeyPatch.datadog_at_fork_blocks

Expand All @@ -94,6 +90,23 @@ def daemon(*args)

result
end

# NOTE: You probably want to wrap any calls to datadog_at_fork with a OnlyOnce so as to not re-register
# the same block/behavior more than once.
def datadog_at_fork(stage, &block)
ProcessDaemonMonkeyPatch.datadog_at_fork(stage, &block)
end

# Also allow calling without going through Process for tests
def self.datadog_at_fork(stage, &block)
raise ArgumentError, 'Bad \'stage\' for ::datadog_at_fork' unless stage == :child

datadog_at_fork_blocks = Datadog::Core::Utils::AtForkMonkeyPatch::KernelMonkeyPatch.datadog_at_fork_blocks
datadog_at_fork_blocks[stage] ||= []
datadog_at_fork_blocks[stage] << block

nil
end
end
end
end
Expand Down
136 changes: 77 additions & 59 deletions spec/datadog/core/utils/at_fork_monkey_patch_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -9,57 +9,44 @@
let(:toplevel_receiver) { TOPLEVEL_BINDING.receiver }

context 'when forking is supported' do
around do |example|
# NOTE: Do not move this to a before, since we also want to skip the around as well
skip 'Forking not supported' unless described_class.supported?

if ::Process.singleton_class.ancestors.include?(Datadog::Core::Utils::AtForkMonkeyPatch::KernelMonkeyPatch)
skip 'Unclean Process class state.'
before do
if ::Process.singleton_class.ancestors.include?(Datadog::Core::Utils::AtForkMonkeyPatch::ProcessDaemonMonkeyPatch)
skip 'Monkey patch already applied (unclean state)'
end
end

unmodified_process_class = ::Process.dup
unmodified_kernel_class = ::Kernel.dup

example.run

# Clean up classes
Object.send(:remove_const, :Process)
Object.const_set('Process', unmodified_process_class)
context 'on Ruby 3.0 or below' do
before { skip 'Test applies only to Ruby 3.0 or below' if RUBY_VERSION >= '3.1' }

Object.send(:remove_const, :Kernel)
Object.const_set('Kernel', unmodified_kernel_class)
it 'applies the monkey patch' do
expect_in_fork do
apply!

# Check for leaks (make sure test is properly cleaned up)
expect(::Process <= described_class::KernelMonkeyPatch).to be nil
expect(::Process <= described_class::ProcessDaemonMonkeyPatch).to be nil
expect(::Kernel <= described_class::KernelMonkeyPatch).to be nil
# Can't assert this because top level can't be reverted; can't guarantee pristine state.
# expect(toplevel_receiver.class.ancestors.include?(described_class::KernelMonkeyPatch)).to be false
expect(::Process.ancestors).to include(described_class::KernelMonkeyPatch)
expect(::Process.ancestors).to include(described_class::ProcessDaemonMonkeyPatch)
expect(::Kernel.ancestors).to include(described_class::KernelMonkeyPatch)
expect(toplevel_receiver.class.ancestors).to include(described_class::KernelMonkeyPatch)

expect(::Process.method(:fork).source_location).to be nil
expect(::Kernel.method(:fork).source_location).to be nil
expect(::Process.method(:daemon).source_location).to be nil
# Can't assert this because top level can't be reverted; can't guarantee pristine state.
# expect(toplevel_receiver.method(:fork).source_location).to be nil
expect(::Process.method(:fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(::Process.method(:daemon).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(::Kernel.method(:fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(toplevel_receiver.method(:fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
end
end
end

it 'applies the Kernel patch' do
# NOTE: There's no way to undo a modification of the TOPLEVEL_BINDING.
# The results of this will carry over into other tests...
# Just assert that the receiver was patched instead.
# Unfortunately means we can't test if "fork" works in main Object.
context 'on Ruby 3.1 or above' do
before { skip 'Test applies only to Ruby 3.1 or above' if RUBY_VERSION < '3.1' }

apply!
it 'applies the monkey patch' do
expect_in_fork do
apply!

expect(::Process.ancestors).to include(described_class::KernelMonkeyPatch)
expect(::Process.ancestors).to include(described_class::ProcessDaemonMonkeyPatch)
expect(::Kernel.ancestors).to include(described_class::KernelMonkeyPatch)
expect(toplevel_receiver.class.ancestors).to include(described_class::KernelMonkeyPatch)

expect(::Process.method(:fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(::Process.method(:daemon).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(::Kernel.method(:fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(toplevel_receiver.method(:fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(::Process.ancestors).to include(described_class::ProcessDaemonMonkeyPatch)
expect(::Process.method(:daemon).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
expect(::Process.method(:_fork).source_location.first).to match(/.*at_fork_monkey_patch.rb/)
end
end
end
end

Expand Down Expand Up @@ -108,7 +95,7 @@ def fork(&block)
let(:child) { double('child') }

before do
fork_class.datadog_at_fork(:child) { child.call }
Datadog::Core::Utils::AtForkMonkeyPatch::ProcessDaemonMonkeyPatch.datadog_at_fork(:child) { child.call }
end

after do
Expand All @@ -121,7 +108,6 @@ def fork(&block)

it do
is_expected.to respond_to(:fork)
is_expected.to respond_to(:datadog_at_fork)
end

describe '#fork' do
Expand Down Expand Up @@ -185,23 +171,17 @@ def fork(&block)
let(:callback) { double('callback') }
let(:block) { proc { callback.call } }

context 'given a stage' do
subject(:datadog_at_fork) do
fork_class.datadog_at_fork(stage, &block)
end

context ':child' do
let(:stage) { :child }
subject(:datadog_at_fork) do
Datadog::Core::Utils::AtForkMonkeyPatch::ProcessDaemonMonkeyPatch.datadog_at_fork(:child, &block)
end

it 'adds a child callback' do
datadog_at_fork
it 'adds a child callback' do
datadog_at_fork

expect(child).to receive(:call).ordered
expect(callback).to receive(:call).ordered
expect(child).to receive(:call).ordered
expect(callback).to receive(:call).ordered

fork_class.fork {}
end
end
fork_class.fork {}
end
end
end
Expand Down Expand Up @@ -230,7 +210,15 @@ def fork(&block)
end

describe Datadog::Core::Utils::AtForkMonkeyPatch::ProcessDaemonMonkeyPatch do
let(:process_module) { Module.new { def self.daemon(nochdir = nil, noclose = nil); end } }
let(:_fork_result) { nil }
let(:process_module) do
result = _fork_result

Module.new do
def self.daemon(nochdir = nil, noclose = nil); end
define_singleton_method(:_fork) { result }
end
end
let(:child_callback) { double('child', call: true) }

before do
Expand Down Expand Up @@ -264,5 +252,35 @@ def fork(&block)

expect(process_module.daemon).to be :process_daemon_result
end

describe 'Process._fork monkey patch' do
context 'in the child process' do
let(:_fork_result) { 0 }

it 'monkey patches _fork to call the child datadog_at_fork callbacks on the child process' do
expect(child_callback).to receive(:call)

expect(process_module._fork).to be 0
end

it 'returns the result from _fork' do
expect(process_module._fork).to be _fork_result
end
end

context 'in the parent process' do
let(:_fork_result) { 1234 }

it 'does not trigger any callbacks' do
expect(child_callback).to_not receive(:call)

process_module._fork
end

it 'returns the result from _fork' do
expect(process_module._fork).to be _fork_result
end
end
end
end
end

0 comments on commit b55c33f

Please sign in to comment.