
RQ: periodically clear failed jobs #4306

Merged
rauchy merged 11 commits into master from clean-failed-jobs on Nov 7, 2019

Conversation

@rauchy (Contributor) commented Nov 6, 2019

As per rq/rq#1143, failed jobs stay in Redis forever. If this is true, we should implement our own periodic cleanup of these jobs.
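For context, periodic execution could be wired up with rq-scheduler's cron-style scheduling along these lines. This is only an illustrative sketch, not necessarily the mechanism this PR uses; it assumes rq_redis_connection is the redis-py client and purge_failed_jobs is the cleanup function added here.

```python
# Illustrative sketch: run purge_failed_jobs once a day via rq-scheduler.
# Assumes rq_redis_connection (redis-py client) and purge_failed_jobs exist.
from rq_scheduler import Scheduler

scheduler = Scheduler(connection=rq_redis_connection)
scheduler.cron(
    '0 0 * * *',            # every day at midnight
    func=purge_failed_jobs,
)
```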

@arikfr arikfr added the Backend label Oct 28, 2019
@arikfr arikfr added this to To do in Switch from Celery to RQ via automation Oct 28, 2019
@rauchy rauchy moved this from To do to In progress in Switch from Celery to RQ Nov 4, 2019
@rauchy rauchy requested review from arikfr and jezdez November 6, 2019 11:30
def purge_failed_jobs():
    jobs = rq_redis_connection.scan_iter('rq:job:*')

    is_idle = lambda key: rq_redis_connection.object('idletime', key) > settings.JOB_DEFAULT_FAILURE_TTL
Member:

This subcommand is available when maxmemory-policy is set to an LRU policy or noeviction.

(The subcommand in question being OBJECT IDLETIME.)

Is this the default config for Redis?

rauchy (Author):

The default is noeviction. On AWS, for example, it defaults to volatile-lru.
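(For reference, a quick way to verify the configured policy before relying on OBJECT IDLETIME is sketched below, assuming rq_redis_connection is a redis-py client. Note that some managed Redis offerings restrict the CONFIG command, in which case the policy has to be checked in the service's configuration instead.)

```python
# Sketch: fail fast if the eviction policy makes OBJECT IDLETIME unreliable.
# Assumes rq_redis_connection is a redis.Redis instance.
policy = rq_redis_connection.config_get('maxmemory-policy')['maxmemory-policy']
if policy != 'noeviction' and not policy.endswith('-lru'):
    raise RuntimeError('OBJECT IDLETIME is not reliable with maxmemory-policy=%s' % policy)
```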

    stale_jobs = [key for key in jobs if is_idle(key) and has_failed(key) and not_in_any_failed_registry(key)]

    for key in stale_jobs:
        rq_redis_connection.delete(key)
Member:
Maybe worth removing it from the FailedJobRegistry while we're at it?

rauchy (Author):

We could do that, but the point is that we want to let the FailedJobRegistry handle its own state. From what I can tell, there aren't any dire consequences to having ghost job IDs in the FailedJobRegistry (it is only used for requeueing, and in that case those jobs will simply not get requeued).

rauchy (Author):

If we delete from FailedJobRegistry, we might as well just do that and avoid checking for job inclusion (and avoid the whole comment+bypass at the top of the function).

🤔

Member:

I'm just worried that over a year it could accumulate quite a lot of job IDs there, which might have some consequences for performance, or at least memory usage.

Re. avoiding the job-inclusion check: I guess we can skip it.

rauchy (Author):

Yeah, 02555b9 makes things simpler.
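For readers following along, here is a minimal sketch of the registry-driven direction discussed above. It is not the code from 02555b9; it assumes stock rq >= 1.0 APIs, a redis-py connection, and a failure TTL in seconds, and the function and parameter names are illustrative.

```python
# Sketch only: purge failed jobs via FailedJobRegistry instead of scanning rq:job:* keys.
from rq import Queue
from rq.job import Job
from rq.exceptions import NoSuchJobError


def purge_failed_jobs_sketch(connection, queue_names, failure_ttl):
    for name in queue_names:
        registry = Queue(name, connection=connection).failed_job_registry
        for job_id in registry.get_job_ids():
            try:
                job = Job.fetch(job_id, connection=connection)
            except NoSuchJobError:
                # Ghost id: the job hash is already gone, so just drop the registry entry.
                connection.zrem(registry.key, job_id)
                continue
            # OBJECT IDLETIME reports seconds since the rq:job:<id> key was last touched.
            if connection.object('idletime', job.key) > failure_ttl:
                registry.remove(job)
                job.delete()
```

Deleting through the registry keeps the job hash and the FailedJobRegistry entry in sync, which is the concern raised in this thread.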

Review comment on redash/settings/__init__.py (outdated, resolved).
Co-Authored-By: Arik Fraimovich <arik@arikfr.com>
@arikfr (Member) left a comment:

👍

@rauchy merged commit a33d11d into master on Nov 7, 2019
Switch from Celery to RQ automation moved this from In progress to Done on Nov 7, 2019
@rauchy deleted the clean-failed-jobs branch on November 7, 2019