This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

Fresque has stopped processing jobs once a month #9

Closed
andrewryno opened this issue Apr 12, 2013 · 3 comments
@andrewryno

I can't really figure out why, and it's hard for me to debug, but last month (March 1st) and this month (April 5th) fresque stopped picking up jobs. It was still polling for new jobs, but it just wouldn't see that they were there and grab them. resque-web showed 900+ on the queue, but none were being processed. Restarting the workers fixes it.

I'm using php-resque-ex as the library. Everything else is pretty much default.

Not sure if this is something you've seen?

@wa0x6e
Owner

wa0x6e commented Apr 12, 2013

What does it say when you restart the workers?

Does it kill the old workers correctly? Or is there an error message saying that the PID does not exist?

Are your workers still running? Sometimes there's a glitch when forking a process, so the workers die when the fork dies.

What platform are you on?

Next time it happens, check whether your workers are still running with "ps aux | grep resque". If they are not, something weird is killing the workers.
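
A minimal sketch of that check as a reusable shell function. The function name `count_resque_workers` is made up for illustration; the only thing taken from the thread is the "resque" pattern. It reads `ps aux`-style output on stdin and skips any line that mentions grep, so the search process itself is not counted as a worker.

```shell
# Count worker processes in `ps aux`-style output read from stdin.
# Hypothetical helper: matches "resque" case-insensitively and ignores
# lines mentioning grep, so a search process is never counted as a worker.
count_resque_workers() {
    awk 'tolower($0) ~ /resque/ && $0 !~ /grep/ { n++ } END { print n + 0 }'
}

# Usage on a live machine:
#   ps aux | count_resque_workers
```

A result of 0 while resque-web still lists workers would suggest the processes died without unregistering themselves.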


@andrewryno
Author

Yeah, sorry for the lack of info. I've tried to debug it but haven't seen anything out of the ordinary. Killing the workers finished fine, but when starting them up again it says it failed, even though the workers are actually there and still process jobs. So I don't know if that's part of it. resque-web said the workers were still running, they have a PID, etc. :\ Feel free to close this; if it happens again I can comment again. Just wasn't sure if you've seen this before.

@wa0x6e
Owner

wa0x6e commented Apr 13, 2013

I'm encountering this sometimes too, but I don't know if it's the same issue.

Sometimes some of my workers die unexpectedly. I can confirm that by running "ps aux | grep resque", because it will not find the worker processes.

But you're telling me your workers are still running, just not processing jobs. You should check the log files; they'll tell you what your workers are doing. With full verbosity, they log each action: sleep for x seconds, check queue X, found X jobs on queue X, etc.
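
One quick thing to look at before reading the log itself is whether it is still being written to at all. The sketch below is only an illustration: `log_is_stale` is a made-up helper, the log path in the usage comment is a placeholder, and `-mmin` is a GNU/BSD `find` extension rather than POSIX. A verbose worker that is really polling should touch its log every few seconds, so a log that has been silent for minutes points at a hung process rather than an empty queue.

```shell
# log_is_stale FILE MINUTES
# Exit 0 if FILE was last modified more than MINUTES minutes ago,
# exit 1 if it was modified more recently, exit 2 if FILE does not exist.
# Hypothetical helper; relies on `find -mmin` (GNU and BSD find).
log_is_stale() {
    [ -f "$1" ] || return 2
    [ -n "$(find "$1" -mmin +"$2")" ]
}

# Usage (the path is a placeholder, not a fresque default):
#   if log_is_stale /path/to/resque-worker.log 5; then
#       echo "worker log has been silent for more than 5 minutes"
#   fi
```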

It's the most important piece of evidence to solve the mystery.

As for the restart problem, where it says that it failed when in reality it didn't: that's because the Redis server is overloaded. Once restarted, the workers start processing all the 900+ jobs, and fresque has some difficulty connecting to Redis to confirm that the workers have really started.
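
To double-check a "failed" start by hand while Redis is busy, it helps to retry the check a few times instead of trusting a single attempt. This is a generic sketch, not how fresque does its own confirmation; the `retry` helper and the numbers in the usage comment are assumptions.

```shell
# retry ATTEMPTS DELAY COMMAND...
# Run COMMAND until it succeeds or ATTEMPTS tries are used up, sleeping
# DELAY seconds between tries. Exit 0 on success, 1 if every try failed.
# Hypothetical helper for manual checks.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt remains
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage: look for a worker process up to 5 times, 2 seconds apart.
# The [r] keeps grep from matching its own command line.
#   retry 5 2 sh -c 'ps aux | grep -q "[r]esque"'
```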

Feel free to open another ticket with your log if it happens again.

@wa0x6e wa0x6e closed this as completed Apr 13, 2013
@ghost ghost assigned wa0x6e Apr 13, 2013