etc: support content.backing-module=none #4492

chu11 · 2022-08-13T05:04:47Z

Per discussion in #4267, always loading the backing module content-sqlite can lead to performance issues, especially for shorter lived single user instances. Default to not loading a backing module.

- to enable this, cache checkpoint put/get in the broker even when a backing module does not exist. This allows a number of checkpoint operations to still work even if a backing module isn't loaded.
- update rc1/rc3 to not load the checkpoint module by default
- update systemd service file to configure loading content-sqlite by default
- update tests that require content-sqlite to specifically configure for it

one possible remaining gotcha is that if the user doesn't set a backing module, they are still allowed to load one and that becomes the backing module. Dunno if we'd like to support a special backing module config of "none" (or equivalent word) to say "do not allow a backing module to be loaded"?

garlick · 2022-08-13T14:21:37Z

In #4267:

> I'm not sure about making this the default, since there is unexplained job throughput degradation in this mode. However it is not currently possible to even select this mode. Let's allow this issue to be closed once setting content.backing-module=none works as you'd expect.

Could we cut this PR there and save the discussion about making it the default for another time?

chu11 · 2022-08-13T14:37:50Z

> Could we cut this PR there and save the discussion about making it the default for another time?

ahhh, I missed that. I'll tweak the PR. Should be simpler as a result as many testsuite updates don't need to happen now.

garlick · 2022-08-13T14:44:28Z

One other thought is would it make sense to split the broker checkpoint stuff off to another source file? content-cache.c is already long and complex and this seems to be somewhat disjoint.

chu11 · 2022-08-16T04:38:42Z

re-pushed

- split out content checkpoint code into another file
  - the "context" that the content-checkpoint API creates is actually stored within struct content_cache and not struct broker. I didn't initially intend for this, but content-cache needs to call content-checkpoint at one point, so I felt it better to do it this way.
- rc scripts do not load none by default now

grondo · 2022-08-16T14:13:06Z

> rc scripts do not load none by default now

You are probably already on it, but just in case: don't forget to update PR description if it no longer describes the overall change.

garlick

Thanks!

I'm noting that while I can shut down a "none" instance with flux shutdown --dump=foo.tgz, I can't restart from that file:

$ flux start -o,-Scontent.backing-module=none,-Scontent.restore=foo.tgz
2022-08-21T14:05:21.398977Z broker.err[0]: rc1.0: flux-restore: error flushing content cache: Function not implemented

It might be a good idea to pause and question whether the "flush" operation should fail when there is no backing store. Maybe we should redefine it as "flush to backing store, if any"? I know that would require other content tests to be updated. I'm not sure if that's the right answer or not, but if it is, it could also enable the "startlog" stuff to remain in the rc files, and an instance that is restarted several times using dump/restore could have a startlog like one that restarts with a backing store.

Another thing that's probably worth testing is that you can reload the kvs module in a "none" instance, and its content remains valid.

Edit: just realized this checkpoint service is created on all ranks, but I think it should only be created on rank 0? If for some reason it is accessed from other ranks, it should just let the broker forward the requests to rank 0 (which is what happens if the message handlers are only registered on rank 0).

garlick · 2022-08-21T13:02:47Z

src/broker/content-checkpoint.c

+        return;
+    }


Leaks msgcpy on this return path

garlick · 2022-08-21T13:07:36Z

src/broker/content-checkpoint.c

@@ -55,24 +128,36 @@ void content_checkpoint_get_request (flux_t *h, flux_msg_handler_t *mh,
 {
    struct content_checkpoint *checkpoint = arg;
    const char *topic = "content-backing.checkpoint-get";
-    const char *s = NULL;
    const flux_msg_t *msgcpy = flux_msg_incref (msg);


Consider renaming msgcpy to msgref, since it's only a reference, not a copy, and the code relies on the fact that key, which points to memory allocated within msg, remains valid after msg is destroyed.

garlick · 2022-08-21T13:11:44Z

src/broker/content-checkpoint.c

+            flux_log_error (checkpoint->h, "%s: flux_respond_pack", __FUNCTION__);
+            goto error;


After respond fails, just log it, don't try to send an error response since that's likely to fail too.

Also log a better message like "error responding to checkpoint-get".

garlick · 2022-08-21T13:16:08Z

src/broker/content-checkpoint.c

-    if (!(f = flux_rpc (h, topic, s, 0, 0))
+    if (!(f = flux_rpc_pack (h, topic, 0, 0, "{s:s}", "key", key))
        || flux_future_aux_set (f,
                                "msg",
                                (void *)msgcpy,
                                (flux_free_f)flux_msg_decref) < 0
+        || flux_future_aux_set (f, "key", (void *)key, NULL) < 0
        || flux_future_then (f,
                             -1,
                             checkpoint_get_continuation,


If this fails after the aux_set of msgcpy succeeds, there is a double free when both f and msgcpy are freed.

Also, instead of logging "error starting checkpoint-get", can you just set errstr to that message and let the caller deal with it?

It would simplify things a bit if key were not stored directly in the aux hash, and instead just retrieved from msg again in the continuation.

garlick · 2022-08-21T13:32:48Z

src/broker/content-checkpoint.c

+    if (!(f = flux_rpc_pack (h, topic, 0, 0,
+                             "{s:s s:O}",
+                             "key", key,
+                             "value", value))
        || flux_future_aux_set (f,


If this fails after the aux_set, there is a double free when both f and msgcpy are freed.

Also, send textual error response back to requestor rather than logging it.

garlick · 2022-08-21T13:34:08Z

src/broker/content-checkpoint.c

+            flux_log_error (checkpoint->h, "%s: flux_respond", __FUNCTION__);
+            goto error;
+        }
+        return;


leaks msgcpy on this return path

If respond fails, log a better message and don't try to send an error response.

garlick · 2022-08-21T13:39:44Z

etc/rc1

+    backingmod=${backingmod:-content-sqlite}
+    echo ${backingmod}


just echo ${backingmod:-content-sqlite} and skip the second assignment? (repeated in rc3)
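A minimal sketch of that suggestion, assuming the surrounding code is a small lookup helper. The function name `backing_module` and passing the configured value as `$1` are hypothetical; the actual rc1/rc3 code obtains the value differently.

```sh
#!/bin/sh
# Hypothetical helper: print the backing module name, falling back to
# content-sqlite when the value passed in is unset or empty.
backing_module() {
    backingmod=$1
    # ${var:-default} expands to the default in place, so no separate
    # reassignment of backingmod is needed before the echo.
    echo "${backingmod:-content-sqlite}"
}

backing_module ""      # prints content-sqlite
backing_module none    # prints none
```

The `:-` form treats an empty value the same as an unset one, so both cases fall back to content-sqlite.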

chu11 · 2022-08-22T20:09:34Z

> Edit: just realized this checkpoint service is created on all ranks, but I think it should only be created on rank 0? If for some reason it is accessed from other ranks, it should just let the broker forward the requests to rank 0 (which is what happens if the message handlers are only registered on rank 0).

Oh good catch, although I think we want to load the checkpoint service all of the time. We just don't want the checkpoint caching service used on rank != 0. This was actually a bug back in #4463, ENOSYS should only be returned when the backing module isn't loaded on rank == 0.

chu11 · 2022-08-22T21:49:34Z

re-pushed, addressing all of the comments above except the discussion about if content.flush should be an error or not. Of particular note:

- created a bunch of new functions to deal with the cleanup paths better. Those mem-leaks / double frees were embarrassing :P
- add some more tests per comments above (reload kvs, forwarding from rank != 0 works as intended)

chu11 · 2022-08-22T21:56:20Z

> It might be a good idea to pause and question whether the "flush" operation should fail when there is no backing store. Maybe we should redefine it as "flush to backing store, if any"? I know that would require other content tests to be updated. I'm not sure if that's the right answer or not, but if it is, it could also enable the "startlog" stuff to remain in the rc files, and an instance that is restarted several times using dump/restore could have a startlog like one that restarts with a backing store.

Hmmmm, my initial feeling is that when someone does content.flush there should be an error returned. Presumably you'd want to know that things weren't backed up properly?

Here's a thought, could content.flush take an argument that is something like --quiet? i.e. don't return an error if the backing isn't there?

Alternately, we could simply add some options to startlog, restore, etc. to not flush when the option is set, and we could set the option when backing module == none.

Edit: not necessarily for this PR, could be a follow up one

chu11 · 2022-08-23T00:14:32Z

argh, re-pushed, fixed up a mem-leak and fixed some bash-isms in my tests that were affecting the CI

codecov · 2022-08-23T04:46:35Z

Codecov Report

Merging #4492 (e9cb6c8) into master (11b0680) will decrease coverage by 0.02%.
The diff coverage is 77.43%.

❗ Current head e9cb6c8 differs from pull request most recent head 08ffb5f. Consider uploading reports for the commit 08ffb5f to get more accurate results

@@            Coverage Diff             @@
##           master    #4492      +/-   ##
==========================================
- Coverage   83.36%   83.34%   -0.03%     
==========================================
  Files         401      402       +1     
  Lines       67649    67771     +122     
==========================================
+ Hits        56397    56481      +84     
- Misses      11252    11290      +38

Impacted Files	Coverage Δ
src/broker/content-checkpoint.c	`76.71% <76.71%> (ø)`
src/broker/content-cache.c	`85.83% <100.00%> (+0.09%)`	⬆️
src/modules/content-files/content-files.c	`77.43% <0.00%> (-1.83%)`	⬇️
src/modules/job-archive/job-archive.c	`62.62% <0.00%> (-0.70%)`	⬇️
src/modules/job-info/guest_watch.c	`76.21% <0.00%> (-0.55%)`	⬇️
src/cmd/builtin/restore.c	`87.50% <0.00%> (-0.38%)`	⬇️
src/common/libsubprocess/subprocess.c	`87.89% <0.00%> (-0.30%)`	⬇️
src/modules/kvs/kvs.c	`70.46% <0.00%> (-0.14%)`	⬇️
src/broker/overlay.c	`86.39% <0.00%> (-0.11%)`	⬇️
src/cmd/flux-job.c	`87.50% <0.00%> (+0.12%)`	⬆️
... and 5 more

chu11 · 2022-08-23T05:24:22Z

and re-pushed, fixing an s3 CI issue

garlick

Much improved though I still had a couple of comments/questions

garlick · 2022-08-23T18:14:50Z

src/broker/content-checkpoint.c

+    if (flux_future_aux_set (f,
+                             "msg",
+                             (void *)msgref,
+                             (flux_free_f)flux_msg_decref) < 0) {
+        flux_msg_decref (msgref);


The msgref temporary variable isn't really accomplishing anything at this point. Might as well just use

    if (flux_future_aux_set (f,
                             "msg",
                             (void *)flux_msg_incref (msg),
                             (flux_free_f)flux_msg_decref) < 0) {
        flux_msg_decref (msg);

Same comment applies to put

garlick · 2022-08-23T18:16:53Z

src/broker/content-checkpoint.c

+    if (content_checkpoint_get_backing (checkpoint, msg, key, &errstr) < 0)
+        goto error;
+


For rank 0, this is forcing the RPC to use the backing store, so it would get ENOSYS in the "none" case. The request should go to the broker RPC. Same comment applies to put.

It looks like t0028-content-backing-none.t includes a test that expects this behavior. Shouldn't it work?

I hate to keep dragging this out. Do we have a use case for not returning ENOSYS to the checkpoint operations on rank > 0? This is primarily used internally by the kvs on rank 0 only, and by the dump/restore and startlog tools, all of which one would expect to run on rank 0, I think. Up to you but we could just prune that case and call it good if it cuts down the code. I maybe shouldn't have brought it up, but I was thinking that we could just load the service on rank 0 and let requests be forwarded "naturally", forgetting that the service name is now shared with the content load/store ops that must work on all ranks.

chu11

My initial thought upon reading your comments was that I screwed up. We should have content.checkpoint-{get,put} forward to rank 0's content.checkpoint-{get,put} on non-rank 0 brokers. But your comment makes sense that maybe we don't have to, we should just ENOSYS right off the bat for ranks != 0.

For myself, I tend to lean towards "consistency", b/c I just dislike seeing something different in the code.

Let me have rank > 0 forward appropriately to rank 0. I'll spin that off into another PR since it's sort of independent of this and really a mistake from #4463

chu11 · 2022-08-24T20:24:24Z

re-pushed, building on top of #4519, adjust some tests as a result of #4519

chu11 · 2022-08-25T17:58:28Z

rebased and re-pushed now that #4519 is merged, this PR is a lot smaller now :-)

garlick

LGTM, thanks for sticking it out through all the review comments :-)

chu11 · 2022-08-27T02:28:53Z

re-started one builder that had a ton of failures not related to this PR. assumption is workflow/container borked and affected bunch of tests.

Problem: content checkpoints presently only work when content backing modules are loaded. Solution: Cache checkpoint data so that checkpoint put/get works regardless if the backing module is loaded. Update tests in t2807-dump-cmd that need to check for new error messages.

Problem: There are presently no tests to ensure that checkpoint get/put work correctly when backing modules are loaded / unloaded. Solution: Add tests to content-sqlite, content-files, and content-s3 to ensure checkpoint get/put work as expected when backing modules are loaded and unloaded. Add additional tests in a new content "none" testfile.

Problem: By default, rc scripts always assume a content backing module will be loaded. There is no way to specify "no" backing module. Solution: Support "none" as a special input to not load a content backing module. Fixes flux-framework#4267

Problems: No test exists to ensure content.backing-module "none" works in the rc scripts. Solution: Add a test.
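As a rough illustration of the "none" handling these commit messages describe, the sketch below skips the module load when the configured value is `none`. It is not the actual rc1 code: `load_backing_module` and `load_module` are hypothetical names, and `load_module` is a stub standing in for however the rc script really loads a module.

```sh
#!/bin/sh
# Stub standing in for the real module loader (hypothetical).
load_module() {
    echo "loading $1"
}

# Hypothetical helper: load the configured backing module unless the
# configuration asks for "none". Unset or empty values fall back to
# content-sqlite, matching the default discussed in this PR.
load_backing_module() {
    backingmod=${1:-content-sqlite}
    if test "${backingmod}" != "none"; then
        load_module "${backingmod}"
    fi
}

load_backing_module ""      # prints: loading content-sqlite
load_backing_module none    # prints nothing
```

Treating `none` as a reserved word keeps the default behavior unchanged for anyone who does not set the attribute.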

chu11 · 2022-08-27T03:20:54Z

rebased & re-pushed, mergifyio seemed to get stuck on something

chu11 force-pushed the issue4267_content_backing_module_none branch from ee8fae5 to 7944d68 Compare August 16, 2022 04:35

chu11 changed the title ~~broker: do not load content backing module by default~~ etc: support content.backing-module=none Aug 16, 2022

garlick requested changes Aug 21, 2022

chu11 force-pushed the issue4267_content_backing_module_none branch from 7944d68 to d866c69 Compare August 22, 2022 21:49

chu11 force-pushed the issue4267_content_backing_module_none branch 3 times, most recently from 0b90df9 to 3c3c02a Compare August 23, 2022 00:13

chu11 force-pushed the issue4267_content_backing_module_none branch 2 times, most recently from e9cb6c8 to 08ffb5f Compare August 23, 2022 04:33

chu11 force-pushed the issue4267_content_backing_module_none branch from 08ffb5f to 2846ca1 Compare August 23, 2022 05:24

garlick reviewed Aug 23, 2022

This was referenced Aug 24, 2022

testsuite: add more checkpoint sequence tests #4518

Merged

broker: forward content.checkpoint-{get,put} RPCs to rank 0 #4519

Merged

chu11 force-pushed the issue4267_content_backing_module_none branch from 2846ca1 to 293b313 Compare August 24, 2022 20:23

chu11 force-pushed the issue4267_content_backing_module_none branch 2 times, most recently from 2a051fa to 5877028 Compare August 25, 2022 17:57

chu11 mentioned this pull request Aug 26, 2022

[WIP] broker: do not drop dirty cache entries on error #4524

Closed

garlick approved these changes Aug 26, 2022

chu11 added the merge-when-passing label Aug 26, 2022

jameshcorbett force-pushed the issue4267_content_backing_module_none branch from 5877028 to aee7c40 Compare August 26, 2022 23:12

chu11 added 4 commits August 26, 2022 20:20

etc: support "none" backing module

5eb618c

Problem: By default, rc scripts always assume a content backing module will be loaded. There is no way to specify "no" backing module. Solution: Support "none" as a special input to not load a content backing module. Fixes flux-framework#4267

testsuite: cover backing module "none" case

f3c2437

Problems: No test exists to ensure content.backing-module "none" works in the rc scripts. Solution: Add a test.

chu11 force-pushed the issue4267_content_backing_module_none branch from aee7c40 to f3c2437 Compare August 27, 2022 03:20

mergify bot merged commit 7deb4f3 into flux-framework:master Aug 27, 2022

chu11 deleted the issue4267_content_backing_module_none branch August 29, 2022 18:56
