
Sidekiq stops processing tasks, queue grows to huge size #9389

Closed
joenepraat opened this issue Nov 29, 2018 · 8 comments
@joenepraat
Contributor

joenepraat commented Nov 29, 2018

As I told @Gargron yesterday, Sidekiq stopped processing tasks and the queue kept growing, reaching ~40,000. Restarting Sidekiq got processing going again (and temporarily increasing the number of Sidekiq threads cleared the queue).

This morning I encountered the same issue. The number of processed tasks was no longer increasing, and the queue was at ~20,000 tasks and growing. I found the following error in the systemd journal 8 times between 01:16 and 03:08. This is the last one:

Nov 29 03:08:31 todon.nl bundle[6377]: /home/mastodon/live/vendor/bundle/ruby/2.5.0/gems/http-3.3.0/lib/http/connection.rb:129: warning: HTTP::Timeout::Null#closed? at /home/mastodon/.rbenv/versions/2.5.3/lib/ruby/2.5.0/forwardable.rb:157 forwarding to private method NilClass#closed?

Restarting Sidekiq restarted processing tasks.
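For anyone hitting the same symptom, here is a rough sketch of how you might confirm it from the Rails console. The `Sidekiq::Queue` and `Sidekiq::Stats` calls in the comment are Sidekiq's real API (they need a live Redis); the `stalled?` helper below is just a hypothetical, plain-Ruby illustration of the pattern described in this report: the backlog grows while the processed counter stands still.

```ruby
# With a live Redis you could take samples from the Rails console:
#
#   require "sidekiq/api"
#   Sidekiq::Queue.new("default").size   # current backlog
#   Sidekiq::Stats.new.processed         # cumulative processed counter
#
# Hypothetical helper: returns true when every queue sample grows
# while the processed counter never moves between samples -- the
# exact symptom described in this issue.
def stalled?(queue_sizes, processed_counts)
  queue_growing     = queue_sizes.each_cons(2).all? { |a, b| b > a }
  nothing_processed = processed_counts.uniq.size == 1
  queue_growing && nothing_processed
end

# Samples like the ones reported here: queue climbing toward ~20,000
# while the processed count stays flat.
puts stalled?([5_000, 12_000, 20_000], [941_200, 941_200, 941_200]) # => true
# A healthy instance: queue draining, counter advancing.
puts stalled?([5_000, 4_000, 3_500], [941_200, 952_000, 960_000])   # => false
```

A few samples a minute apart are enough to tell a genuine stall from a temporary burst of incoming jobs.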

Specifications

Mastodon 2.6.2
Sidekiq 5.2.2
Debian 9.6

@joenepraat
Contributor Author

joenepraat commented Nov 29, 2018

P.S. This started after updating to Mastodon 2.6.2 (from 2.6.1)

@joenepraat
Contributor Author

I think the error message I included here (which, on closer inspection, is really just a warning) is not related. I had already seen these messages a few times today without any impact.

Please tell me where to look.

@Gargron
Member

Gargron commented Nov 29, 2018

I have asked you to check if some Sidekiq jobs are taking unnaturally long, that would shed light on the cause of the issue. On the Sidekiq "busy" tab, there is a listing of all jobs that are currently being processed, with a column that displays when the job started processing. Usually they should all say "just now" or "20 seconds ago". You will notice a problem if some of them say "20 minutes ago" or "21 hours ago".
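The same check can be done programmatically. In Sidekiq 5.x, `Sidekiq::Workers` enumerates in-flight jobs and each work entry carries a `"run_at"` epoch timestamp (this needs a live Redis). The `long_running` helper below is a hypothetical, self-contained sketch of the rule just described: anything that has been running far longer than "20 seconds ago" is suspect.

```ruby
# With a live Redis, in-flight jobs can be enumerated via Sidekiq's API:
#
#   require "sidekiq/api"
#   Sidekiq::Workers.new.each do |process_id, thread_id, work|
#     # work["run_at"] is the epoch second the job started processing
#   end
#
# Hypothetical helper: returns the start times (epoch seconds) of jobs
# that have been running longer than `threshold` seconds as of `now`.
def long_running(run_ats, now:, threshold: 300)
  run_ats.select { |started| now - started > threshold }
end

now = Time.now.to_i
# One job started two hours ago, two started moments ago.
flagged = long_running([now - 7200, now - 5, now - 20], now: now)
puts flagged.map { |t| "#{(now - t) / 60} minutes" }.inspect
# Only the multi-hour job is flagged.
```

A threshold of a few minutes is a reasonable default; jobs showing "20 minutes ago" or "21 hours ago" in the busy tab would all be caught by it.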

Lacking this information, I can guess that this might be related to c39d7e7, which I intend to release in 2.6.3 asap, but it would be nice to get confirmation.

@joenepraat
Contributor Author

I checked that column, but they were all recent, showing either "just now" or a few seconds ago.

Let me see how things work after upgrading to 2.6.3.

@ClearlyClaire
Contributor

I've seen it happen on a fellow admin's instance, with ThreadResolveWorker, ProcessingWorker and DeliveryWorker.
I suspect it's because of #9329 and would thus be fixed by #9381. I instructed them to cherry-pick the latter, we will see if that works out.

@kit-ty-kate

kit-ty-kate commented Nov 29, 2018

Cherry-picking those two PRs has worked for me so far. It unblocked the frozen processes.

@Gargron
Member

Gargron commented Nov 30, 2018

I released 2.6.3.

@Gargron closed this as completed Nov 30, 2018
@joenepraat
Contributor Author

Thanks everyone!
