
question: Is it safe to use with unicorn-worker-killer? #92

Closed

kyontan opened this issue Feb 28, 2024 · 6 comments

Comments

@kyontan

kyontan commented Feb 28, 2024

Hi, I'm new to Pitchfork, and I hope it will work well and improve memory efficiency in our large Rails application.

Currently we are using Unicorn with kzk/unicorn-worker-killer, a well-known(?) middleware for Unicorn that provides the following features for dealing with memory-leak-like issues:

  • Unicorn::WorkerKiller::Oom
  • Unicorn::WorkerKiller::MaxRequests

Some of our endpoints consume a lot of memory during a request, and this sometimes causes an OOM of the whole process (and container), so we currently rely on these middlewares.

Now we are testing Pitchfork, and I guess we should fork unicorn-worker-killer to support Pitchfork as well, but I'm not sure whether this middleware would work just by replacing Unicorn with Pitchfork in its source code.

Has anyone tried this, or does anyone know of technical restrictions around it?

Thanks!

@casperisfine
Contributor

Hello @kyontan, I remember meeting you in Matsumoto :)

Now we are testing Pitchfork, and I guess we should fork unicorn-worker-killer to support Pitchfork as well, but I'm not sure whether this middleware would work just by replacing Unicorn with Pitchfork in its source code.

So you could do that, but it would need a bit more adjusting, because this gem works by monkey patching Unicorn's process_client method, which takes a different list of arguments in Pitchfork.

However, Pitchfork does provide some additional callbacks that make this monkey patching unnecessary.

See the after_request_complete callback: https://github.com/Shopify/pitchfork/blob/master/docs/CONFIGURATION.md#after_request_complete.

You can simply check the memory usage or request count in it and call exit when desired. The provided worker instance already contains all the information you need, so in the end, instead of adding a gem, you can do something like:

after_request_complete do |_server, worker, _env|
  if worker.requests_count > MAX_REQUESTS
    exit
  end

  if too_much_memory?
    exit
  end
end

You can then implement too_much_memory? as you like. Pitchfork has a utility class to get the RSS and PSS (more relevant) of a process: https://github.com/Shopify/pitchfork/blob/7ff21fb29e9a9b58f0291411da5c2887244fa184/lib/pitchfork/mem_info.rb. It's not exactly official public API, so it's probably best if you write your own, but feel free to copy the code.
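
For example, a minimal sketch of such a helper that reads PSS straight from /proc (Linux only; the 512 MiB threshold is an arbitrary placeholder, not something from this thread):

MAX_MEMORY_BYTES = 512 * 1024 * 1024 # arbitrary limit, tune it for your app

def too_much_memory?
  # /proc/<pid>/smaps_rollup reports sizes in kB; PSS accounts for pages shared with other workers.
  pss_line = File.readlines("/proc/#{Process.pid}/smaps_rollup").find { |line| line.start_with?("Pss:") }
  pss_line.split[1].to_i * 1024 > MAX_MEMORY_BYTES
end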

Hope this helps. I'll close this as answered, but feel free to respond if you need more information.

@Earlopain
Contributor

Earlopain commented May 3, 2024

Here's an example that more closely mimics unicorn-worker-killer:

# Each worker will have its own copy of this data structure.
WorkerData = Data.define(:max_requests, :max_mem)
worker_data = nil

# Proportional set size (PSS) of the given pid, in bytes; smaps_rollup reports values in kB.
def worker_pss(pid)
  data = File.read("/proc/#{pid}/smaps_rollup")
  pss_line = data.lines.find { |line| line.start_with?("Pss:") }
  pss_line.split(" ")[1].to_i * 1024
end

after_worker_ready do |server, worker|
  max_requests = Random.rand(5_000..10_000)
  max_mem = Random.rand((386 * (1024**2))..(768 * (1024**2)))
  worker_data = WorkerData.new(max_requests: max_requests, max_mem: max_mem)

  server.logger.info("worker=#{worker.nr} gen=#{worker.generation} ready, serving #{max_requests} requests, #{max_mem} bytes")
end

after_request_complete do |server, worker, _env|
  if worker.requests_count > worker_data.max_requests
    server.logger.info("worker=#{worker.nr} gen=#{worker.generation}) exit: request limit (#{worker_data.max_requests})")
    exit # rubocop:disable Rails/Exit
  end

  if worker.requests_count % 16 == 0
    pss_bytes = worker_pss(worker.pid)
    if pss_bytes > worker_data.max_mem
      server.logger.info("worker=#{worker.nr} gen=#{worker.generation}) exit: memory limit (#{pss_bytes} bytes > #{worker_data.max_mem} bytes), after #{worker.requests_count} requests")
      exit # rubocop:disable Rails/Exit
    end
  end
end

V2

# Each worker will have its own copy of this data structure
WorkerData = Data.define(:max_requests, :max_mem)
worker_data = nil

after_worker_ready do |server, worker|
  max_requests = Random.rand(5_000..10_000)
  max_mem = Random.rand((386 * (1024**2))..(768 * (1024**2)))
  worker_data = WorkerData.new(max_requests: max_requests, max_mem: max_mem)

  server.logger.info("worker=#{worker.nr} gen=#{worker.generation} ready, serving #{max_requests} requests, #{max_mem} bytes")
end

after_request_complete do |server, worker, _env|
  if worker.requests_count > worker_data.max_requests
    server.logger.info("worker=#{worker.nr} gen=#{worker.generation}) exit: request limit (#{worker_data.max_requests})")
    exit
  end

  if worker.requests_count % 16 == 0
    mem_info = Pitchfork::MemInfo.new(worker.pid)
    if mem_info.pss > worker_data.max_mem
      server.logger.info("worker=#{worker.nr} gen=#{worker.generation}) exit: memory limit (#{mem_info.pss} bytes > #{worker_data.max_mem} bytes)")
      exit
    end
  end
end

V1

after_worker_ready do |server, worker|
  max_requests = Random.rand(5_000..10_000)
  worker.instance_variable_set(:@_max_requests, max_requests)
  max_mem = Random.rand((386*(1024**2))..(768*(1024**2)))
  worker.instance_variable_set(:@_max_mem, max_mem)

  server.logger.info("worker=#{worker.nr} gen=#{worker.generation} ready, serving #{max_requests} requests, #{max_mem} bytes")
end

after_request_complete do |server, worker, _env|
  if worker.requests_count > worker.instance_variable_get(:@_max_requests)
    server.logger.info("worker=#{worker.nr} gen=#{worker.generation}) exit: request limit (#{worker.instance_variable_get(:@_max_requests)})")
    exit
  end

  if worker.requests_count % 16 == 0
    mem_info = Pitchfork::MemInfo.new(worker.pid) # Copy this implementation
    if mem_info.pss > worker.instance_variable_get(:@_max_mem) # pss over rss
      server.logger.info("worker=#{worker.nr} #{worker.generation}) exit: memory limit (#{mem_info.pss} bytes > #{worker.instance_variable_get(:@_max_mem)} bytes)")
      exit
    end
  end
end

It's a bit ugly with the instance variables, but since these values aren't static I need to stash them somewhere. I looked around but didn't find any worker-specific data storage I could use. @casperisfine, is there a better way I can accomplish this?

@casperisfine
Contributor

Here's an example that more closely mimics unicorn-worker-killer:

Honestly, I'm not sure this randomization of the limits makes sense. It probably did with Unicorn 5 and older, but in 6+ (or 6.1+?) and in Pitchfork there is a distribution bias toward the lowest PID because of the API used, so you don't incur the risk of all your workers restarting at the same time.

The request limit also doesn't make that much sense with Pitchfork, assuming you enabled reforking. If you didn't, then yes, it may make sense.
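
For reference, reforking is opt-in and is enabled with refork_after in the config; roughly like this, with illustrative thresholds (see CONFIGURATION.md for the exact semantics):

# Promote a worker to become the new mold after it has served
# the given number of requests, per generation.
refork_after [50, 100, 1000]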

I looked around but didn't find any worker-specific data storage

There's none. You can use a weak map:

WORKER_DATA = ObjectSpace::WeakKeyMap.new # Ruby 3.3+
WorkerData = Struct.new(:max_requests, :max_mem)
# ....
worker_info = (WORKER_DATA[worker] ||= WorkerData.new(...))
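
Filled in, a sketch of how that could replace the instance variables (reusing the limits from the example above; only the request check is shown, the memory check would work the same way):

WORKER_DATA = ObjectSpace::WeakKeyMap.new # Ruby 3.3+: entries vanish when the worker object is GC'd
WorkerData = Struct.new(:max_requests, :max_mem)

after_worker_ready do |_server, worker|
  WORKER_DATA[worker] = WorkerData.new(
    Random.rand(5_000..10_000),
    Random.rand((386 * (1024**2))..(768 * (1024**2)))
  )
end

after_request_complete do |_server, worker, _env|
  data = WORKER_DATA[worker]
  exit if worker.requests_count > data.max_requests
end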

@Earlopain
Contributor

Earlopain commented May 6, 2024

Oh, that's pretty nice with the WeakKeyMap, thanks for the tip. I'm on 3.3 so no trouble making use of that.

I can't tell if the randomization is necessary; however, for me it's safer to just go with it anyway, since it's been that way for years already. It doesn't cost much to keep around, though the implementation without it is no doubt simpler.

I use ruby-vips, so no reforking for me unfortunately.

@Earlopain
Contributor

Earlopain commented May 6, 2024

What I'm observing with the WeakKeyMap approach is that its size will always be one. I naively expected this to run in the context of the monitor, but I guess each worker instead has its own copy of the data structure. I don't believe it matters much, it's just a bit unexpected. Seems to work great otherwise.

@casperisfine
Contributor

Yeah, after_request_complete is run in the worker context, so you indeed don't really even need the weak map; you can just have global state and reset it in after_worker_fork.
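
In other words, a rough sketch of that approach (the limit below is a placeholder, not something from this thread):

worker_limits = nil

after_worker_fork do |_server, _worker|
  # Runs in the freshly forked worker process, so this state is per worker.
  worker_limits = { max_requests: Random.rand(5_000..10_000) }
end

after_request_complete do |_server, worker, _env|
  exit if worker.requests_count > worker_limits[:max_requests]
end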
