
worker jobs not running frequently #29

Closed
rahulbot opened this issue Nov 28, 2017 · 5 comments

@rahulbot (Collaborator)

I set up a Flower monitor for our Redis queue and the jobs are running super slowly: just one job every two minutes or so! :-(

[screenshot: Flower dashboard]

At first I thought maybe some service was slow, but this chart suggests each job ran quickly and just took forever to get pulled off the queue for execution, right?

@rahulbot (Collaborator, Author)

Celery is also chewing up all of the CPU, so it is doing something compute-intensive, but that isn't the tasks themselves. I must be missing something...
[screenshot: SSH terminal showing Celery CPU usage]

@rahulbot (Collaborator, Author)

I restarted Postgres and all the other services, and now the queue is being serviced quickly. I think there were too many DB handles open and Postgres was the slow part. Of course this means I flushed the queue of jobs, so now I have to run update_tasks again manually to get the latest posts and have them analyzed by all the algorithms.
[screenshot: Flower dashboard]
Note: these jobs are being serviced right away, but the average execution time for algorithm jobs looks like just over one second, which seems pretty long.

@rahulbot (Collaborator, Author)

Reopening, because after I ran add_tasks_to_queue this slowed to a crawl again. Why? Perhaps the queue can't handle all the adding and processing at the same time? Perhaps the DB can't handle that many open connections (>50)? What else could be causing this?
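One quick way to check the open-connection theory would be to count what Postgres itself sees in pg_stat_activity. A rough sketch (the connection string is a placeholder, not our real config):

```python
# Rough check of the "too many DB handles" theory.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://localhost/mcservices")  # hypothetical DSN

with engine.connect() as conn:
    open_conns = conn.execute(text("SELECT count(*) FROM pg_stat_activity")).scalar()
    print("open Postgres connections: {}".format(open_conns))
```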

Workaround: what if we rethink our strategy for fetching? Perhaps something like this:

  1. add last_twitter_fetch and last_facebook_fetch timestamp columns to User
  2. a new add_incremental_tasks_to_queue that queues up tasks for the ~30 users with the oldest last_twitter_fetch (i.e. people who haven't been updated in a while), filtered to users that haven't been updated in the last 4 hours (same for last_facebook_fetch)
  3. run add_incremental_tasks_to_queue every 5 minutes on cron

This would have the effect of spreading the queue-filling load, hopefully leaving time for the queue to empty out a bit. It wouldn't reduce the number of DB handles open (one for each worker).
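A rough sketch of what steps 2 and 3 could look like, assuming a SQLAlchemy User model and Celery; db, User, and fetch_twitter_posts are stand-ins for whatever the real names are in this repo:

```python
# Sketch only: db, User, and fetch_twitter_posts are stand-ins, not real code.
from datetime import datetime, timedelta

from celery.schedules import crontab

FETCH_WINDOW = timedelta(hours=4)  # skip users fetched within the last 4 hours
BATCH_SIZE = 30                    # queue roughly 30 users per run


def add_incremental_tasks_to_queue():
    """Queue fetch tasks for the users who have gone longest without a fetch."""
    cutoff = datetime.utcnow() - FETCH_WINDOW
    stale_users = (
        db.session.query(User)
        .filter((User.last_twitter_fetch.is_(None)) | (User.last_twitter_fetch < cutoff))
        .order_by(User.last_twitter_fetch.asc())  # least recently fetched first
        .limit(BATCH_SIZE)
        .all()
    )
    for user in stale_users:
        fetch_twitter_posts.delay(user.id)  # hypothetical Celery task
    # the same pattern would apply to last_facebook_fetch / a Facebook fetch task


# Instead of cron, this could also run every 5 minutes via Celery beat:
CELERYBEAT_SCHEDULE = {
    "incremental-fetch": {
        "task": "tasks.add_incremental_tasks_to_queue",
        "schedule": crontab(minute="*/5"),
    },
}
```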

@rahulbot reopened this Nov 29, 2017
@jasrub (Contributor) commented Dec 6, 2017

Yes, it is probably all the DB connections or all the adding to the queue.
This strategy sounds good to me.

Another option would be to refactor the DB logic to somehow keep fewer connections open.
But since we're not sure that's actually the cause of the slowness, your idea seems better.
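If we did go the refactoring route, one lighter-weight option (assuming SQLAlchemy is what's holding the connections) would be to cap the pool each Celery worker keeps open, roughly like this; the DSN is again a placeholder:

```python
# Sketch: bound how many Postgres connections each Celery worker can hold.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://localhost/mcservices",  # hypothetical DSN
    pool_size=2,        # at most 2 persistent connections per worker process
    max_overflow=0,     # don't open extra connections beyond the pool
    pool_recycle=3600,  # recycle connections hourly to avoid stale handles
)
```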

I won't have time to do this until the holidays. Feel free to take a stab at it in the meantime.

@rahulbot (Collaborator, Author)

split off to a new issue to try solving.
