Not processing unfinished jobs across server restarts using async_server mode on Iodine server #244

Closed
daya opened this issue May 7, 2021 · 6 comments · Fixed by #246

daya commented May 7, 2021

First of all, a bunch of thanks for doing such a good job 😍

FYI, I am using the Iodine server instead of Puma, in async_server mode. To test whether unfinished jobs are retried across server restarts, I did the following:

class TestReliabilityJob < ApplicationJob
  queue_as :test_reliability

  def perform(num)
    begin
      if ENV["ALLOW_TO_PASS"]
        puts "I am allowed to finish, arg was #{num}"
      else
        raise "Forcibly not allowed, hence failing a test job"
      end
    rescue RealTimeSyncException => e
      puts "RealTimeSyncException thrown"
      puts e.message
      puts e.backtrace
      raise e
    end
  end
end

In application.rb I have:

    config.good_job = {
        execution_mode: :async_server,
        queues: "test_reliability:1;cdr_sns:4",
        max_threads: 5,
        poll_interval: 30
    }

In application_job.rb:

require 'real_time_sync'
class ApplicationJob < ActiveJob::Base
  retry_on StandardError, wait: :exponentially_longer, attempts: Float::INFINITY

  retry_on ::RealTimeSyncException, attempts: 100, wait: :exponentially_longer  do |_job, exception|
    puts exception.message
    puts exception.backtrace
  end

  JobTimeoutError = Class.new(StandardError)

  around_perform do |_job, block|
    # Timeout jobs after 10 minutes
    Timeout.timeout(10.minutes, JobTimeoutError) do
      block.call
    end
  rescue StandardError => e
    puts "Job Error: #{e.message}"
    puts e.backtrace
    raise
  end

end

Then I submitted the job:

TestReliabilityJob.set(wait_until: 15.seconds).perform_later 911

The job was tried 3 times within roughly 15-20 seconds and then showed up as unfinished on the dashboard.
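
For context, here is a minimal sketch of the retry schedule implied by wait: :exponentially_longer. It assumes the Rails 6-era Active Job formula of executions**4 + 2 seconds and ignores any jitter that newer Rails versions add. The method name is hypothetical and not part of Active Job's public API.

```ruby
# Hypothetical helper: approximate delay (in seconds) before the retry that
# follows attempt number `executions`, assuming Active Job's
# :exponentially_longer formula of executions**4 + 2, without jitter.
def exponentially_longer_delay(executions)
  (executions**4) + 2
end

delays = (1..4).map { |n| exponentially_longer_delay(n) }
puts delays.inspect # => [3, 18, 83, 258]
```

Under that assumption, the second and third attempts come about 3 and 21 seconds after the first, and the fourth is not due until roughly 83 seconds after the third.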

I then stopped the server and restarted it, this time with:

ALLOW_TO_PASS=true WEB_CONCURRENCY=5 RAILS_MAX_THREADS=10 GOOD_JOB_EXECUTION_MODE=async_server GOOD_JOB_MAX_THREADS=1 GOOD_JOB_POLL_INTERVAL=30 bundle exec rails s

Not sure what I am missing but the job is never retried.

So I am worried that if I try this approach in production, I am going to have a bunch of unfinished jobs that will never be retried 😒. Could you enlighten me please?

Also, is there a way to manually retry unfinished jobs in case I occasionally need to?

bensheldon added this to Inbox in Backlog on May 7, 2021

daya commented May 7, 2021

Further investigation into the issue using the external good_job worker

GOOD_JOB_QUEUES="test_reliability:1;cdr_sns:4" ALLOW_TO_PASS=true be good_job start

shows the unfinished jobs are re-processed and ultimately finished.

So this means async_server execution mode is not really production-ready. What do you think, @bensheldon?

bensheldon (Owner) commented

@daya thanks for opening this issue and documenting your configuration. This is strange and unexpected; unfinished jobs should be executed continuously until they are finished or errored.

Let me try to reproduce the problem with the configuration that you shared. I haven't used Iodine before. This is the code that async_server mode uses to detect that it is running within a server process:

def in_server_process?
  return @_in_server_process if defined? @_in_server_process

  @_in_server_process = Rails.const_defined?('Server') ||
                        caller.grep(%r{config.ru}).any? || # EXAMPLE: config.ru:3:in `block in <main>' OR config.ru:3:in `new_from_string'
                        (Concurrent.on_jruby? && caller.grep(%r{jruby/rack/rails_booter}).any?) # EXAMPLE: uri:classloader:/jruby/rack/rails_booter.rb:83:in `load_environment'
end
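
To illustrate how backtrace-based detection of this kind can miss a server, here is a minimal standalone sketch. It reproduces only the two caller checks quoted above (not the Rails::Server constant check or the JRuby guard). The helper name and the sample frames are hypothetical, and the second frame is made up rather than taken from any real server.

```ruby
# Hypothetical helper for illustration; not GoodJob's implementation.
# The patterns mirror the two backtrace checks quoted above.
SERVER_FRAME_PATTERNS = [
  %r{config\.ru},              # rackup-style boot
  %r{jruby/rack/rails_booter}  # JRuby-Rack boot
].freeze

# Returns true when any backtrace frame matches any known server pattern.
def server_backtrace?(frames, patterns = SERVER_FRAME_PATTERNS)
  frames.any? { |frame| patterns.any? { |pattern| pattern.match?(frame) } }
end

rackup_frames = ["config.ru:3:in `block in <main>'"]
# Made-up frame standing in for a server that boots through its own launcher.
custom_frames = ["/gems/some_server/lib/some_server/launcher.rb:13:in `start'"]

puts server_backtrace?(rackup_frames) # => true
puts server_backtrace?(custom_frames) # => false
```

With a check like this, a process whose boot path matches none of the patterns would be treated as a non-server process, so async_server mode would not start executing jobs there.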


daya commented May 7, 2021

is there a way to manually retry unfinished jobs in case I need to do it sometimes?
@bensheldon ^^

bensheldon (Owner) commented

is there a way to manually retry unfinished jobs in case I need to do it sometimes?

You can execute jobs with GoodJob::Job.where(id: JOB_ID).perform_with_advisory_lock, but that's not intended to be a public API.

Backlog automation moved this from Inbox to Done May 10, 2021
bensheldon changed the title from "Not processing unfinished jobs across server restarts using async_server mode" to "Not processing unfinished jobs across server restarts using async_server mode on Iodine server" on May 10, 2021
bensheldon (Owner) commented

@daya I released GoodJob v1.9.3 which includes support for Iodine. Please let me know if it addresses the issue you experienced with async_server mode!

Also, I'm curious about the benefits you're seeing from Iodine. I'd never heard of that server before. Thanks!


daya commented May 10, 2021

@bensheldon thanks for such a quick turnaround.

The performance is better than Puma. Apparently there is also better native support for WebSockets than ActionCable offers, plus Redis-free pub/sub. I haven't used those features yet, but they tempted me to choose Iodine, as I know I will need them later.
