Redis is busy running script and script never terminates #441

Closed
tacticiankerala opened this issue Nov 20, 2019 · 14 comments · Fixed by #443

@tacticiankerala

Describe the bug
I was trying to upgrade to v7-beta2 of this gem. After I upgraded and restarted Sidekiq, I started getting Redis errors. Upon investigating, I found that a script is blocking all Redis operations. I think it might be the lock migration script that is causing this. Here are the details of my current setup:

OS: Mac OS High Sierra
Redis: v3.2.9
Sidekiq: v5.2.7
Sidekiq Unique Jobs: 6.0.13
Rails: 5.2.3
Ruby: 2.6.2

Expected behavior
The Sidekiq/Rails server should start as expected, connect to Redis, and operate as usual.

Current behavior
All operations on Redis are blocked because a non-terminating script is running. I've kept it running for a few hours but it never finished.

Additional context
(Two screenshots attached: Screen Shot 2019-11-13 at 9:26:42 AM and Screen Shot 2019-11-13 at 9:26:30 AM)

Could you guide me on debugging this further, or let me know if you need more information/context on the problem?

@mhenrixon
Owner

Hi @tacticiankerala thanks for the report!

Could you try to switch to the ruby version of the reaper?

SidekiqUniqueJobs.configure do |config|
  config.reaper = :ruby
end

There are other settings you can use to control the reaper, like reaper_interval, reaper_frequency, and reaper_timeout, to reduce the problem.
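
For example, something along these lines (the values are illustrative only, and the units are assumed to be seconds):

SidekiqUniqueJobs.configure do |config|
  config.reaper          = :ruby # use the Ruby reaper instead of the Lua script
  config.reaper_interval = 600   # assumed: seconds between reaper runs
  config.reaper_timeout  = 10    # assumed: seconds a single reaper run may take
end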

@tacticiankerala
Author

Sure. Let me try this out!

@tacticiankerala
Author

@mhenrixon switching it to the :ruby reaper did not solve this. I had only 4-5 uniquejobs keys in Redis. They seem to have been upgraded when I checked after killing the running script. See the keys below:

1) "uniquejobs:digests"
2) "uniquejobs:9c7f35913990a103b81278b123f62756_lx_default:LOCKED"
3) "uniquejobs:live"
4) "uniquejobs:034265e1ff086338d94d46973328f259_lx_default:LOCKED"
5) "uniquejobs:live:UPGRADED"

@tacticiankerala
Author

Actually, I've modified the unique key digest to make it play nicely with the different tenants we have. That could be messing this up as well. Could you confirm?

module SidekiqUniqueJobs
  class UniqueArgs
    # The Sidekiq Apartment middleware is inserted after the unique args
    # middleware, hence item['apartment'] is not yet available, so we use
    # the current tenant here.
    def create_digest_with_apartment
      # this method is called twice even when the job already has a unique digest
      # follow this thread for more information - https://github.com/mhenrixon/sidekiq-unique-jobs/issues/391
      return item[UNIQUE_DIGEST_KEY] if item[UNIQUE_DIGEST_KEY]

      "#{create_digest_without_apartment}_#{Apartment::Tenant.current}"
    end

    alias_method :create_digest_without_apartment, :create_digest
    alias_method :create_digest, :create_digest_with_apartment
  end
end
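
To illustrate what the patch does, here is a hypothetical walk-through (the tenant name and the exact return value of create_digest are assumed, but the shape matches the LOCKED keys listed above):

# Hypothetical example, assuming the current Apartment tenant is "lx_default"
# and that create_digest_without_apartment returns the plain v6-style digest:
#
#   create_digest_without_apartment # => "uniquejobs:9c7f35913990a103b81278b123f62756"
#   Apartment::Tenant.current       # => "lx_default"
#   create_digest                   # => "uniquejobs:9c7f35913990a103b81278b123f62756_lx_default"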

@mhenrixon
Owner

No, you are right. It must be the migration script then. I'll have to think a little about this... baffled by it, to be honest.

@tacticiankerala
Author

I think I was wrong and you are right: the source of the problem is not the migration script. I see the following logs:

2019-11-20 19:15: Sidekiq -- Start - Upgrading Locks
2019-11-20 19:15: Sidekiq -- Start - Converting v6 locks to v7
2019-11-20 19:15:  Sidekiq -- Done - Converting v6 locks to v7
2019-11-20 19:15: Sidekiq -- Start - Deleting v6 keys
2019-11-20 19:15: Sidekiq -- Done - Deleting v6 keys
2019-11-20 19:15: Sidekiq -- Done - Upgrading Locks
2019-11-20 19:15: Sidekiq -- Starting Reaper
2019-11-20 20:05: Sidekiq -- Sidekiq 6.0 will require Redis 4.0+, you are using Redis v3.2.9
2019-11-20 20:05: Sidekiq -- Executed update_version.lua in 1ms
2019-11-20 20:05: Sidekiq -- Already upgraded to 7.0.0.beta2
2019-11-20 20:05: Sidekiq -- Starting Reaper

Let me investigate a bit more and come back. Meanwhile, if you have any more pointers or specific things I should look for, please let me know.

@tacticiankerala
Author

I was able to narrow down the issue to SidekiqUniqueJobs::Orphans::Reaper#belongs_to_job?

Sidekiq.redis {|conn| SidekiqUniqueJobs::Orphans::Reaper.new(conn).belongs_to_job? 'uniquejobs:9c7f35913990a103b81278b123f62756_lx_default' }

This is what is causing the issue. I'll look into the exact script and let you know my further findings.

@mhenrixon
Owner

Try beta3, @tacticiankerala, this problem should be no more 👍 Thank you for taking the time to report it!

@tacticiankerala
Author

Thank you so much for the quick turnaround! I'll try it out and report back.

@tacticiankerala
Author

Redis is no longer getting blocked after upgrading to beta3. But I'm noticing the exception below in the logs:

2019-11-25 12:03:46.770563 I [64875:70117505070760] Sidekiq -- (2019-11-25 12:03:46 +0530) Cleanup failed with error undefined local variable or method `size' for #<SidekiqUniqueJobs::Orphans::Reaper:0x00007f8b01f76618>
2019-11-25 12:03:46.770600 E [64875:70117505070760 logging.rb:75] Sidekiq -- Exception: NameError: undefined local variable or method `size' for #<SidekiqUniqueJobs::Orphans::Reaper:0x00007f8b01f76618>

@mhenrixon
Owner

Ah, of course. Thanks @tacticiankerala, I will fix it in a couple of hours!

@mhenrixon
Owner

@tacticiankerala v7.0.0.beta4 fixes the last issue. Keen to hear how it is working out for you. Keep in mind that you can tweak the reaper: how often it runs, how long it is allowed to run, and how many orphans it cleans on every run.

Let me know how you get along with it!

@tacticiankerala
Author

@mhenrixon: Just got a chance to test this and it works great! Thank you so much for fixing the issues in no time. Really appreciate all your effort :)
