
job-manager: add inactive job purge capability #4286

Merged
merged 14 commits into master on Apr 18, 2022

Conversation

@garlick (Member) commented Apr 15, 2022

This adds two methods of purging inactive jobs:

One can purge interactively, e.g.

$ flux job purge --num-limit=10
purged 90 inactive jobs, 10 remaining

or one can configure automatic purging via TOML

[job-manager]

inactive-age-limit = "7d"
inactive-num-limit = 10000

The job manager retains inactive jobs in a hash and places them in a list ordered by the time they became inactive. When jobs are purged, the entry in the hash is removed along with the corresponding KVS directory.
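
For concreteness, here is a minimal sketch of what purging one inactive job involves under that design. The struct fields, function names, and the use of flux_job_kvs_key() with a NULL key to obtain the job's KVS directory are illustrative assumptions, not the actual job-manager code:

#include <flux/core.h>
#include <czmq.h>

struct job {                    /* illustrative stand-in for the real job struct */
    flux_jobid_t id;
    double t_clean;             /* time the job became inactive */
    void *list_handle;          /* handle returned by zlistx_insert() */
};

/* Unlink the job's KVS directory in the pending transaction, then drop
 * the job from the inactive hash and the time-ordered queue. */
static int purge_one (flux_kvs_txn_t *txn,
                      zhashx_t *inactive_jobs,
                      zlistx_t *inactive_queue,
                      struct job *job)
{
    char key[64];

    if (flux_job_kvs_key (key, sizeof (key), job->id, NULL) < 0
        || flux_kvs_txn_unlink (txn, 0, key) < 0)
        return -1;
    zlistx_delete (inactive_queue, job->list_handle);
    zhashx_delete (inactive_jobs, &job->id);    /* hash destructor frees job */
    return 0;
}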

@grondo (Contributor) commented Apr 15, 2022

This is great!

This does seem to have an impact on job throughput, though I'm not sure if it is something to worry about at this point.
For example, on master:

$ flux jobs --stats-only
0 running, 32768 completed, 0 failed, 0 pending
$ src/test/throughput.py -sn 4096
  4096 jobs:   4096 submitted,      0 running, 4096 completed
number of jobs: 4096
submit time:    3.690 s (1110.1 job/s)
script runtime: 29.725s
job runtime:    29.180s
throughput:     140.4 job/s (script: 137.8 job/s)

vs this branch:

$ flux jobs --stats-only
0 running, 32768 completed, 0 failed, 0 pending
$ src/test/throughput.py -sn 4096
  4096 jobs:   4096 submitted,      0 running, 4096 completed
number of jobs: 4096
submit time:    3.444 s (1189.3 job/s)
script runtime: 35.197s
job runtime:    34.764s
throughput:     117.8 job/s (script: 116.4 job/s)

Purging jobs (down to 1024) in this case restores the throughput:
(though of course job-list still shows all 36K jobs)

$ flux job purge --num-limit=1024 --batch=100
purged 35840 inactive jobs, 1024 remaining
$ flux jobs --stats-only
0 running, 36864 completed, 0 failed, 0 pending
$ src/test/throughput.py -sn 4096
  4096 jobs:   4096 submitted,      0 running, 4096 completed
number of jobs: 4096
submit time:    3.719 s (1101.5 job/s)
script runtime: 27.975s
job runtime:    27.574s
throughput:     148.5 job/s (script: 146.4 job/s)

Since the purge is really only meant for the system instance, which is by design the least common instance type, I wonder if we should consider only enabling the job-manager inactive cache by configuration?

Having said that, the throughput numbers here may not even be relevant, since the throughput of real jobs is limited in other ways by the reality of executing actual processes. However, since we've worked so hard to keep these numbers high in the past, I wanted to bring this up so we stay aware of the issue.

@garlick (author) commented Apr 15, 2022

Thank you for testing that! That result is a little surprising to me. I wonder if the list insertion into purge->queue is the problem. Let me poke at that a bit.

@grondo (Contributor) commented Apr 15, 2022

Some other random thoughts:

  • The dependency plugins in the job-manager could also make great use of the inactive job cache to allow jobs to have dependencies on other, completed jobs (right now jobs can only have dependencies on active jobs). This might only work if the job "result" for inactive jobs is available.
  • Apropos of the previous bullet, it might be nice to have a function to look up a jobid that checks the active hash and falls back to the inactive hash (see the sketch after this list). Callers can then check the job state to determine whether the job was found in the active or inactive hash.
  • The job-list module of course doesn't purge its cache of inactive jobs. So, after a purge (currently) a job can be listed, but flux job info JOBID would fail (since the jobid is no longer in the kvs). That might be confusing, but probably isn't a big issue at this point.
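
A minimal sketch of such a lookup helper, assuming both tables are czmq zhashx hashes keyed by jobid; the function and table names are hypothetical:

/* Hypothetical helper: check the active hash first, then fall back
 * to the inactive hash.  Returns NULL if the id is unknown. */
struct job *job_lookup_any (zhashx_t *active_jobs,
                            zhashx_t *inactive_jobs,
                            flux_jobid_t id)
{
    struct job *job;

    if ((job = zhashx_lookup (active_jobs, &id)))
        return job;                             /* state will be active */
    return zhashx_lookup (inactive_jobs, &id);  /* inactive, or NULL */
}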

@grondo (Contributor) commented Apr 15, 2022

That result is a little surprising to me.

Hm, you may also want to make sure the result is reproducible for you!

@garlick (author) commented Apr 15, 2022

Good observations!

The job-list module of course doesn't purge its cache of inactive jobs. So, after a purge (currently) a job can be listed, but flux job info JOBID would fail (since the jobid is no longer in the kvs). That might be confusing, but probably isn't a big issue at this point.

I do have a development branch in which the journal is modified to include inactive jobs (optionally) so that job-list doesn't have to scan the KVS. In that branch, the entire eventlog for each job is retained in memory and so a new journal consumer can completely "catch up" with the job manager and track ongoing job events in one stream.

In that scenario, it would make sense to add a message to the journal stream when a job gets purged even though it's not an event per se. That would let a journal consumer like job-list keep the size of its caches in check without the need for something separate like #3688. This is an "open loop" or eventually consistent design, so the situation you describe could still arise although it would be far less likely.

I should probably rebase that branch on top of this and post it as a WIP. If nothing else it's something concrete to contribute to the discussion about #4273 (job-db).

@grondo (Contributor) commented Apr 15, 2022

Thanks for that update! I apologize, I now realize my comment was a bit off-topic for this specific PR.

@garlick (author) commented Apr 15, 2022

I now realize my comment was a bit off-topic for this specific PR.

Not at all - that was a good observation and a loose end I failed to articulate.

Hm, you may also want to make sure the result is reproducible for you!

I'm not really seeing it. Or at least the variability in results is too high to really conclude anything. Here are throughput results (src/test/throughput.py -sn 4096) for the master and purge branches:

jobs added      total           master TP       purge TP

+4k jobs        4k              288.4           300.6
+4k jobs        8k              235.7           280.3
+4k jobs        12k             233.3           224.7
+4k jobs        16k             269.8           244.7
+4k jobs        20k             225.7           229.9
+4k jobs        24k             224.6           247.7
+4k jobs        28k             252.5           226.2
+4k jobs        32k             235.7           246.1

I might be missing something or maybe I should have tried with more jobs?

@grondo (Contributor) commented Apr 15, 2022

Yeah, I was not getting that much variability on my system. You could try more jobs since your system is clearly double the speed of mine. But if it is not reproducible for you, then this might not be an actual issue.

@chu11 (Member) commented Apr 15, 2022

High level comment: is this something we want to hold off on until we have job archiving (such as #4273) more in place?

Or is the idea that, since this is configurable, the default behavior stays the same and we don't have to worry about archiving? If users want to lose inactive data in the kvs permanently, that's OK.

@garlick (author) commented Apr 15, 2022

Or is the idea that, since this is configurable, the default behavior stays the same and we don't have to worry about archiving? If users want to lose inactive data in the kvs permanently, that's OK.

Yeah this is what I was thinking. I might go a little further and recommend that our system instance early adopters purge at 1 week or so just to ensure that we don't have an unsustainable situation. Then we'll add in some form of offline garbage collection and we will be on a healthy trajectory. job-archive will grab all the data pretty aggressively so as long as the configured purge age is on the order of days, we shouldn't be losing data as far as flux-accounting is concerned.

However it's up for debate. Purging would limit the ability to ask what or who ran on a "bad node" beyond the purge period for example. Maybe we can swoop in with a good job-db solution in the near term that would allow longer term preservation and querying of job data?

@garlick (author) commented Apr 15, 2022

I did reproduce the slowdown by testing with more jobs. Here's some data that shows it on my system:

jobs added      total           master TP       purge TP
+32k jobs       32k             203.7           207.7
+8k jobs        40k             210.4           223.1
+8k jobs        48k             228.9           197.0
+8k jobs        56k             210.8           178.8
+8k jobs        64k             206.0           150.9
+8k jobs        72k             203.0           134.4
+8k jobs        80k             205.4           113.3
+8k jobs        88k             216.9           100.5

@garlick (author) commented Apr 15, 2022

I was using the list backwards so I had to scan the whole thing to insert the next job! With that fixed (last column) the numbers at scale are better:

jobs added      total   master  purge   nolist  revlist
+32k jobs       32k     203.7   207.7   198.7   204.5
+8k jobs        40k     210.4   223.1   237.9   225.0
+8k jobs        48k     228.9   197.0   225.5   208.5
+8k jobs        56k     210.8   178.8   217.6   216.9
+8k jobs        64k     206.0   150.9   217.1   211.0
+8k jobs        72k     203.0   134.4   215.5   211.5
+8k jobs        80k     205.4   113.3   197.5   207.4
+8k jobs        88k     216.9   100.5   191.9   209.5

I'll post the fix shortly. I wanted to add a test to ensure the oldest jobs are purged first since I was purging the youngest and no tests were catching it.
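
For reference, a sketch of the general shape of the fix (not the actual patch): czmq's zlistx_insert() searches from the head of a sorted list when its low_value argument is true and from the tail when it is false. Since jobs become inactive in roughly increasing t_clean order, sorting the queue oldest-first and searching from the tail makes each insertion effectively O(1), while searching from the wrong end degrades to a full scan as the list grows:

/* order inactive jobs oldest-first by the time they became inactive */
static int inactive_cmp (const void *a1, const void *a2)
{
    const struct job *j1 = a1;
    const struct job *j2 = a2;

    if (j1->t_clean < j2->t_clean)
        return -1;
    return j1->t_clean > j2->t_clean ? 1 : 0;
}

/* setup, once */
zlistx_set_comparator (inactive_queue, inactive_cmp);

/* on each transition to INACTIVE: the new job belongs near the tail,
 * so search from the tail (low_value = false) */
job->list_handle = zlistx_insert (inactive_queue, job, false);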

@grondo (Contributor) commented Apr 15, 2022

Nice catch!

@garlick (author) commented Apr 15, 2022

Hmm, the bionic - py3.7,clang-6.0 builder seems to have gotten stuck in the broker unit tests, on test_runat.t specifically. Restarting, since this doesn't seem likely to be related to this PR.

2022-04-15T22:46:21.3560385Z make  check-TESTS
2022-04-15T22:46:21.5422684Z                test_attr.t:  PASS: N=53  PASS=53  FAIL=0 SKIP=0 XPASS=0 XFAIL=0
2022-04-15T22:46:21.5439307Z             test_pmiutil.t:  PASS: N=9   PASS=9   FAIL=0 SKIP=0 XPASS=0 XFAIL=0
2022-04-15T22:46:21.5592137Z             test_service.t:  PASS: N=18  PASS=18  FAIL=0 SKIP=0 XPASS=0 XFAIL=0
2022-04-15T22:46:21.5598217Z             test_liblist.t:  PASS: N=6   PASS=6   FAIL=0 SKIP=0 XPASS=0 XFAIL=0
2022-04-15T22:46:21.6522470Z         test_boot_config.t:  PASS: N=54  PASS=54  FAIL=0 SKIP=0 XPASS=0 XFAIL=0
2022-04-15T22:46:22.7412790Z             test_overlay.t:  PASS: N=83  PASS=83  FAIL=0 SKIP=0 XPASS=0 XFAIL=0
2022-04-15T23:42:25.5427971Z ##[error]The operation was canceled.

@garlick (author) commented Apr 16, 2022

It failed again, but in the purge sharness test. I realize I have a test that depends on a previous test that has the NO_CHAIN_LINT conditional, so I'm adding that conditional to the other dependent test.

@garlick (author) commented Apr 16, 2022

The job-list module of course doesn't purge its cache of inactive jobs. So, after a purge (currently) a job can be listed, but flux job info JOBID would fail (since the jobid is no longer in the kvs). That might be confusing, but probably isn't a big issue at this point.

I had purging set to 8h on my test system and just spent a minute confused by the fact that flux jobs was still showing jobs that should have been purged. Well I guess that brought home this point for me!

Hmm, I wonder if it would be OK to do something simple for now like publish the list of purged jobs so caches in job-list can be pruned?

@garlick (author) commented Apr 16, 2022

It was pretty easy to implement so I went ahead and pushed a change to have the job manager publish an event listing job IDs being purged, and then have job-list subscribe to that message and remove jobs.

This invites comparison to PR #3688 (pending forever), where a lot of work was done to implement job-list purging. I think the job manager is the correct place to perform the purge since it also solves the problem of the KVS growing without bound and is the "source of truth" for a lot of other job information. But some good discussion happened over in that PR, e.g.

  • flux job list-purge lists job info that is being purged (like dmesg -c)
  • purged jobids are added by job-list to a hash so they can still be distinguished from invalid jobids

I'm not sure either of those points makes as much sense when the job manager is doing the purging, but I thought I'd raise them in case there should be some discussion, or in case other aspects of that PR come to mind that would be applicable here.

Edit: just realized that the job-list stats are not updated, so purged jobs will still be included in flux jobs --stats-only counts. I should probably fix that.
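
A sketch of what that plumbing looks like with the flux-core event API. The topic string "job-purge-inactive" matches the change described here; the payload key "jobs" and the function names are illustrative assumptions rather than the exact code:

#include <flux/core.h>
#include <jansson.h>

/* job-manager side: publish the ids just purged (ids is a json array) */
static void announce_purged (flux_t *h, json_t *ids)
{
    flux_future_t *f;

    if (!(f = flux_event_publish_pack (h, "job-purge-inactive", 0,
                                       "{s:O}", "jobs", ids)))
        flux_log_error (h, "error publishing job-purge-inactive");
    flux_future_destroy (f);
}

/* job-list side: event callback, registered after
 * flux_event_subscribe (h, "job-purge-inactive") */
static void purge_event_cb (flux_t *h, flux_msg_handler_t *mh,
                            const flux_msg_t *msg, void *arg)
{
    json_t *ids;
    size_t index;
    json_t *entry;

    if (flux_event_unpack (msg, NULL, "{s:o}", "jobs", &ids) < 0) {
        flux_log_error (h, "error decoding job-purge-inactive event");
        return;
    }
    json_array_foreach (ids, index, entry) {
        flux_jobid_t id = json_integer_value (entry);
        /* remove id from job-list's index hash and inactive list here */
        (void)id;
    }
}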

* It doesn't seem worth the effort to structure the code to avoid this
* due to high complexity of solution, low probability of error, and
* minor consequences.
*/
Contributor commented:

I didn't have a chance to fully review this PR yet, but FLUX_TEST_VALGRIND=t ./t2809-job-purge.t caught a leak of jobs on successful return here.

==89443== 1,168 (320 direct, 848 indirect) bytes in 8 blocks are definitely lost in loss record 21 of 29
==89443==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==89443==    by 0x4A25070: json_array (in /usr/lib/x86_64-linux-gnu/libjansson.so.4.13.0)
==89443==    by 0xA39A93C: purge_inactive_jobs (purge.c:110)
==89443==    by 0xA39AC50: purge_request_cb (purge.c:283)
==89443==    by 0x487B8B7: call_handler (msg_handler.c:344)
==89443==    by 0x487BC49: dispatch_message (msg_handler.c:380)
==89443==    by 0x487BC49: handle_cb (msg_handler.c:481)
==89443==    by 0x48AACE2: ev_invoke_pending (ev.c:3770)
==89443==    by 0x48AE02F: ev_run (ev.c:4190)
==89443==    by 0x48AE02F: ev_run (ev.c:4021)
==89443==    by 0x487A9FE: flux_reactor_run (reactor.c:128)
==89443==    by 0xA392C11: mod_main (job-manager.c:220)
==89443==    by 0x119B20: module_thread (module.c:183)
==89443==    by 0x4AC4B42: start_thread (pthread_create.c:442)

Author replied:

Ah, most likely that was in the code I just added for updating job-list. Thank you for catching that!

Author replied:

Fixed (squashed into 3966478)

@grondo (Contributor) left a review:

This LGTM! I ran through some random manual testing and didn't find any issues. Even tried running a set of flux job purge commands in parallel with a small batch count and everything worked great!

Since flux job purge is an unrecoverable operation, I wonder if we want to require a --force option, but otherwise report how many jobs would be purged. I'm not sure how difficult that would be to add. Another option would be an are you sure? (yes/no) type prompt, unless --force is given. I don't think this has to be done before merging this PR, it was just something that occurred to me.

"batch", &batch) < 0)
goto error;
if (age_limit != INACTIVE_AGE_UNLIMITED && age_limit < 0) {
errmsg = "age limit must be >= 0 (0=unlimited)";
@grondo (Contributor) commented Apr 18, 2022:

Is errmsg correct? (INACTIVE_AGE_UNLIMITED is -1. not 0, but perhaps there's some subtlety I missed)

goto error;
}
if (num_limit != INACTIVE_NUM_UNLIMITED && num_limit < 0) {
errmsg = "num limit must be >= 0 (0=unlimited)";
Contributor commented:

Same question as above, isn't -1 unlimited?

Author replied:

oops yeah, I changed the values but didn't update the message!

Comment on lines +310 to +392
return errprintf (error,
"job-manager.inactive-num-limit: must be >= 0");
Contributor commented:

Should the error message be job-manager.inactive-num-limit: must be >= -1?

Author replied:

Actually, for the TOML config my thought was that users wouldn't specify a limit if they don't want one. Does that sound OK?

O/w we should probably define a special value for unlimited age too, like "unlimited".

Contributor replied:

That makes sense to me.

@garlick (author) commented Apr 18, 2022

I like the idea of reporting number of jobs purged only, unless --force is given!

Commit messages from the 14 commits in this PR:

Problem: various job manager services cannot distinguish
incorrect job IDs from inactive jobs.

Instead of removing jobs when they become inactive, store them
in an inactive hash.  For now, jobs remain there until the job
manager is unloaded.

Problem: various job manager requests return "unknown job id"
on an inactive but valid jobid.

Now that inactive jobs are available in their own hash, try to
look up the jobid in the inactive hash and if found, return
"job is inactive" instead.
Problem: there is no way to externally probe how many active
and inactive jobs the job manager is tracking.

Add active_jobs, inactive_jobs keys to the stats object so that
'flux module stats job-manager' can show the number of jobs
of each type that are tracked.

Problem: inactive job purging should be based on the time
the job actually enters INACTIVE state but that time is not
captured.

Add job->t_clean which captures the timestamp on the clean event,
which is the only event that transitions a job to inactive.

Problem: a zlistx comparator function used to order jobs by priority
is named job_comparator(), which makes naming another comparator
difficult.

Rename internal function to job_priority_comparator().

Update users.

Problem: the overhead of inactive jobs grows without bound in a
system instance.

Add a mechanism to trim the oldest inactive jobs from the job manager
internal hash and KVS.  There are two ways to do it.  One way is to
make a job-manager.purge RPC specifying either a maximum number of
retained inactive jobs, or a maximum age (time since transition to
inactive) of retained inactive jobs, or both.

The other way is to configure those criteria via TOML, e.g.

  [job-manager]
  inactive-age-limit = "7d"
  inactive-num-limit = 10000

The KVS transactions used to unlink batches of jobs are limited to 100
entries to avoid giant messages and head of line blocking in the broker.
When purging via configuration, this limits the number of jobs purged per
heartbeat interval.  When purging via RPC, the caller specifies the batch
size (up to the maximum) and must check the returned job count, and repeat
the RPC until a count less than the batch size is returned.

Note: KVS job.* directories that become empty after all the jobs in them
have been purged are NOT removed by the purge.  This can be accomplished
later by an offline dump/restore of the content backing store, since empty
directories are not included in dumps.
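
For illustration, a sketch of the batch-and-repeat client loop this protocol implies. The topic "job-manager.purge" and the payload keys (num_limit, age_limit, batch, force, count) are assumptions based on this discussion, not necessarily the exact wire format:

#include <flux/core.h>

/* Repeatedly request purges of up to 'batch' jobs until the job manager
 * reports a batch smaller than requested; returns total purged or -1. */
static int purge_all (flux_t *h, int num_limit, double age_limit, int batch)
{
    int total = 0;
    int count;

    do {
        flux_future_t *f;

        if (!(f = flux_rpc_pack (h, "job-manager.purge", FLUX_NODEID_ANY, 0,
                                 "{s:i s:f s:i s:b}",
                                 "num_limit", num_limit,
                                 "age_limit", age_limit,
                                 "batch", batch,
                                 "force", 1))
            || flux_rpc_get_unpack (f, "{s:i}", "count", &count) < 0) {
            flux_future_destroy (f);
            return -1;
        }
        flux_future_destroy (f);
        total += count;
    } while (count == batch);   /* a short batch means nothing is left to purge */

    return total;
}
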
Problem: there is no convenient way to clean unwanted inactive
jobs if purge limits were not configured.

Add 'flux job purge', which can interactively remove inactive
jobs using limits set on the command line.  For example:

flux job purge --num-limit=100
  purges all inactive jobs except the 100 newest, where "newest"
  is gauged by the time the job became inactive.

flux job purge --age-limit=3d
  purges all jobs that have been inactive longer than 3 days.

Problem: t2201-job-cmd.t includes some tests that could pass even if
the 'flux job' command fails in unusual ways.

Use "FLUX_URI=value test_must_fail cmd..."
not "! FLUX_URI=value cmd..."

Also, enclose those in a subshell because apparently in some environments
but not others, this causes the environment variable setting to leak
to other tests, as noted in 9a77e05.

Problem: there is no test coverage for attempting to perform
inappropriate job operations on inactive jobs.

Add a few tests to t2201-job-cmd.t and t2280-job-memo.t to cover
these cases.

Problem: job-manager may purge jobs from the KVS but other parts
of the system like job-list may have cached inactive job IDs and
are not informed that they are no longer valid.

Publish a 'job-purge-inactive' event message containing a list of
job IDs each time a batch of jobs is purged by the job manager.

Problem: flux job list -a still shows inactive jobs that have been
purged.

Subscribe to the 'job-purge-inactive' event message and remove purged
jobs from the 'index' hash and the 'inactive' list when notified.

Problem: flux jobs --stats-only includes purged jobs in stats.

Update the stats just before purging a job.

Problem: the job manager purge function has no test coverage

Add a new sharness test script.

Problem: job purge functionality is undocumented.

Add an entry for the purge subcommand to flux-job(1),
and add configuration details to flux-config-job-manager(5).
@codecov bot commented Apr 18, 2022

Codecov Report

Merging #4286 (a854fd4) into master (77f1708) will increase coverage by 0.01%.
The diff coverage is 84.48%.

❗ Current head a854fd4 differs from pull request most recent head 3cb8819. Consider uploading reports for the commit 3cb8819 to get more accurate results

@@            Coverage Diff             @@
##           master    #4286      +/-   ##
==========================================
+ Coverage   83.56%   83.58%   +0.01%     
==========================================
  Files         388      389       +1     
  Lines       64848    65139     +291     
==========================================
+ Hits        54191    54445     +254     
- Misses      10657    10694      +37     
Impacted Files Coverage Δ
src/modules/job-list/stats.c 75.00% <45.45%> (-7.23%) ⬇️
src/modules/job-manager/getattr.c 58.18% <50.00%> (-1.08%) ⬇️
src/modules/job-manager/purge.c 82.69% <82.69%> (ø)
src/modules/job-manager/job-manager.c 60.37% <90.90%> (+3.23%) ⬆️
src/cmd/flux-job.c 87.36% <93.33%> (+0.11%) ⬆️
src/modules/job-list/job-list.c 79.22% <95.00%> (+5.53%) ⬆️
src/modules/job-manager/alloc.c 82.80% <100.00%> (ø)
src/modules/job-manager/annotate.c 84.09% <100.00%> (+0.18%) ⬆️
src/modules/job-manager/event.c 76.92% <100.00%> (+0.27%) ⬆️
src/modules/job-manager/job.c 90.71% <100.00%> (+0.20%) ⬆️
... and 14 more

@garlick (author) commented Apr 18, 2022

Just force pushed with the fixes discussed, adding flux job purge --force, and squashing some incremental development (hope that's OK). I left the publication of the purged job ids and subscription in job-list broken out as separate commits since that might be a temporary solution, and it should be made clear what to back out later if we come up with something better.

@grondo (Contributor) commented Apr 18, 2022

Nice! I tested this out and it works as described! Feel free to set MWP when you're ready.

@grondo (Contributor) commented Apr 18, 2022

BTW, this feature satisfies one of my minor use cases: when running the Flux testsuite as a set of jobs within an instance, it is convenient to use flux jobs --stats-only to get a report of whether any jobs failed. The only way to do this multiple times, though, is to restart the instance so that the stats and previously completed jobs are cleared.

Now flux job purge may be used as an alternative (e.g. when you do not want to launch another large instance) 👍
