
worker jobs not running frequently #29

Closed
rahulbot opened this issue Nov 28, 2017 · 5 comments

@rahulbot (Collaborator)

I set up a Flower monitor for our Redis queue and the jobs are running super slowly: just one job every two minutes or so! :-(

[screenshot: Flower dashboard]

At first I thought maybe some service was slow, but this chart suggests each job ran quickly and just took forever to get pulled off the queue for execution, right?

@rahulbot (Collaborator, Author)

Celery is also chewing up all of the CPU, so it is doing something compute-intensive, but that isn't the tasks themselves. I must be missing something...
[screenshot: SSH terminal showing Celery CPU usage]

@rahulbot (Collaborator, Author)

I restarted Postgres and all the other services, and now the queue is being serviced quickly. I think there were too many DB handles open and Postgres was the slow part. Of course this means I flushed the queue of jobs, so now I have to run update_tasks again manually to get the latest posts and have them analyzed by all the algorithms.
[screenshot: Flower dashboard]
Note: these jobs are being serviced right away, but the average execution time for algorithm jobs looks like just over one second, which seems pretty long.

@rahulbot (Collaborator, Author)

Reopening, because after I ran add_tasks_to_queue this slowed to a crawl again. Why? Perhaps the queue can't handle all the adding and processing at the same time? Perhaps the DB can't handle that many open connections (>50)? What else could be causing this?
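One quick way to check the open-connection theory would be to count what Postgres itself sees in pg_stat_activity. A rough sketch (the connection string is a placeholder, not our real config):

```python
# Rough check of the "too many DB handles" theory.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://localhost/mcservices")  # hypothetical DSN

with engine.connect() as conn:
    open_conns = conn.execute(text("SELECT count(*) FROM pg_stat_activity")).scalar()
    print("open Postgres connections: {}".format(open_conns))
```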

Workaround: what if we rethink our strategy for fetching? Perhaps something like this:

  1. add last_twitter_fetch and last_facebook_fetch timestamp columns to User
  2. a new add_incremental_tasks_to_queue that queues up tasks for the ~30 users with the oldest last_twitter_fetch (i.e. people who haven't been updated in a while), filtered to users that haven't been updated in the last 4 hours (same for last_facebook_fetch)
  3. run add_incremental_tasks_to_queue every 5 minutes on cron

This would have the effect of spreading the queue-filling load, hopefully leaving time for the queue to empty out a bit. It wouldn't reduce the number of DB handles open (one for each worker).
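A rough sketch of what steps 2 and 3 could look like, assuming a SQLAlchemy User model and Celery; db, User, and fetch_twitter_posts are stand-ins for whatever the real names are in this repo:

```python
# Sketch only: db, User, and fetch_twitter_posts are stand-ins, not real code.
from datetime import datetime, timedelta

from celery.schedules import crontab

FETCH_WINDOW = timedelta(hours=4)  # skip users fetched within the last 4 hours
BATCH_SIZE = 30                    # queue roughly 30 users per run


def add_incremental_tasks_to_queue():
    """Queue fetch tasks for the users who have gone longest without a fetch."""
    cutoff = datetime.utcnow() - FETCH_WINDOW
    stale_users = (
        db.session.query(User)
        .filter((User.last_twitter_fetch.is_(None)) | (User.last_twitter_fetch < cutoff))
        .order_by(User.last_twitter_fetch.asc())  # least recently fetched first
        .limit(BATCH_SIZE)
        .all()
    )
    for user in stale_users:
        fetch_twitter_posts.delay(user.id)  # hypothetical Celery task
    # the same pattern would apply to last_facebook_fetch / a Facebook fetch task


# Instead of cron, this could also run every 5 minutes via Celery beat:
CELERYBEAT_SCHEDULE = {
    "incremental-fetch": {
        "task": "tasks.add_incremental_tasks_to_queue",
        "schedule": crontab(minute="*/5"),
    },
}
```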

@rahulbot reopened this Nov 29, 2017
@jasrub (Contributor) commented Dec 6, 2017

Yes, it is probably all the DB connections or all the adding to the queue.
This strategy sounds good to me.

Another option would be to refactor the DB logic to somehow keep fewer connections open.
But since we're not sure that's actually the cause of the slowness, your idea seems better.
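If we did go the refactoring route, one lighter-weight option (assuming SQLAlchemy is what's holding the connections) would be to cap the pool each Celery worker keeps open, roughly like this; the DSN is again a placeholder:

```python
# Sketch: bound how many Postgres connections each Celery worker can hold.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://localhost/mcservices",  # hypothetical DSN
    pool_size=2,        # at most 2 persistent connections per worker process
    max_overflow=0,     # don't open extra connections beyond the pool
    pool_recycle=3600,  # recycle connections hourly to avoid stale handles
)
```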

I won't have time to do this until the holidays. Feel free to take a stab at it in the meantime.

@rahulbot (Collaborator, Author)

split off to a new issue to try solving.
