New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
etc: support content.backing-module=none #4492
etc: support content.backing-module=none #4492
Conversation
In #4267
Could we cut this PR there and save the discussion about making it the default for another time? |
ahhh, I missed that. I'll tweak the PR. Should be simpler as a result as many testsuite updates don't need to happen now. |
One other thought is would it make sense to split the broker checkpoint stuff off to another source file? content-cache.c is already long and complex and this seems to be somewhat disjoint. |
ee8fae5
to
7944d68
Compare
re-pushed
|
You are probably already on it, but just in case: don't forget to update PR description if it no longer describes the overall change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I'm noting that while I can shut down a "none" instance with flux shutdown --dump=foo.tgz
, I can't restart from that file:
$ flux start -o,-Scontent.backing-module=none,-Scontent.restore=foo.tgz
2022-08-21T14:05:21.398977Z broker.err[0]: rc1.0: flux-restore: error flushing content cache: Function not implemented
It might be a good idea to pause and question whether the "flush" operation should fail when there is no backing store. Maybe we should redefine it as "flush to backing store, if any"? I know that would require other content tests to be updated. I'm not sure if that's the right answer or not, but if it is, it could also enable the "startlog" stuff to remain in the rc files, and an instance that is restarted several times using dump/restore could have a startlog like one that restarts with a backing store.
Another thing that's probably worth testing is that you can reload the kvs module in a "none" instance, and its content remains valid.
Edit: just realized this checkpoint service is created on all ranks, but I think it should only be created on rank 0? If for some reason it is accessed from other ranks, it should just let the broker forward the requests to rank 0 (which is what happens if the message handlers are only registered on rank 0).
src/broker/content-checkpoint.c
Outdated
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaks msgcpy on this return path
src/broker/content-checkpoint.c
Outdated
@@ -55,24 +128,36 @@ void content_checkpoint_get_request (flux_t *h, flux_msg_handler_t *mh, | |||
{ | |||
struct content_checkpoint *checkpoint = arg; | |||
const char *topic = "content-backing.checkpoint-get"; | |||
const char *s = NULL; | |||
const flux_msg_t *msgcpy = flux_msg_incref (msg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider renaming msgcpy
to msgref
since it's only a reference not a copy, and code relies on the fact that key
, which points to memory allocated wtihin msg
, remains valid after msg
is destroyed.
src/broker/content-checkpoint.c
Outdated
flux_log_error (checkpoint->h, "%s: flux_respond_pack", __FUNCTION__); | ||
goto error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After respond fails, just log it, don't try to send an error response since that's likely to fail too.
Also log a better message like "error responding to checkpoint-get".
src/broker/content-checkpoint.c
Outdated
if (!(f = flux_rpc (h, topic, s, 0, 0)) | ||
if (!(f = flux_rpc_pack (h, topic, 0, 0, "{s:s}", "key", key)) | ||
|| flux_future_aux_set (f, | ||
"msg", | ||
(void *)msgcpy, | ||
(flux_free_f)flux_msg_decref) < 0 | ||
|| flux_future_aux_set (f, "key", (void *)key, NULL) < 0 | ||
|| flux_future_then (f, | ||
-1, | ||
checkpoint_get_continuation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this fails after the aux_set of msgcpy
succeeds, there is a double free when both f
and msgcpy
are freed.
Also, instead of logging "error starting checkpoint-get", can you just set errstr
to that message and let the caller deal with it?
It would simplify things a bit if key
were not stored directly in the aux hash, and instead just retrieved from msg
again in the continuation.
src/broker/content-checkpoint.c
Outdated
if (!(f = flux_rpc_pack (h, topic, 0, 0, | ||
"{s:s s:O}", | ||
"key", key, | ||
"value", value)) | ||
|| flux_future_aux_set (f, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IF this fails after the aux_set, double free when both f
and msgcpy
are freed.
Also, send textual error response back to requestor rather than logging it.
src/broker/content-checkpoint.c
Outdated
flux_log_error (checkpoint->h, "%s: flux_respond", __FUNCTION__); | ||
goto error; | ||
} | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaks msgcpy
on this return path
If respond fails, log a better message and don't try to send an error response.
etc/rc1
Outdated
backingmod=${backingmod:-content-sqlite} | ||
echo ${backingmod} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just echo ${backingmod:-content-sqlite}
and skip the second assignment? (repeated in rc3)
Oh good catch, although I think we want to load the checkpoint service all of the time. We just don't want the checkpoint caching service used on rank != 0. This was actually a bug back in #4463, ENOSYS should only be returned when the backing module isn't loaded on rank == 0. |
7944d68
to
d866c69
Compare
re-pushed, addressing all of the comments above except the discussion about if
|
Hmmmm, my initial feeling is that when someone does Here's a thought, could Alternately, we could simply add some options to startlog, restore, etc. to not flush when the option is set, and we could set the option when backing module == Edit: not necessarily for this PR, could be a follow up one |
0b90df9
to
3c3c02a
Compare
argh, re-pushed, fixed up a mem-leak and fixed some bash-isms in my tests that were affecting the CI |
e9cb6c8
to
08ffb5f
Compare
Codecov Report
@@ Coverage Diff @@
## master #4492 +/- ##
==========================================
- Coverage 83.36% 83.34% -0.03%
==========================================
Files 401 402 +1
Lines 67649 67771 +122
==========================================
+ Hits 56397 56481 +84
- Misses 11252 11290 +38
|
and re-pushed, fixing a s3 CI issue |
08ffb5f
to
2846ca1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much improved though I still had a couple of comments/questions
src/broker/content-checkpoint.c
Outdated
if (flux_future_aux_set (f, | ||
"msg", | ||
(void *)msgref, | ||
(flux_free_f)flux_msg_decref) < 0) { | ||
flux_msg_decref (msgref); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The msgref
temporary variable isn't really accomplishing anything at this point. Might as well just use
if (flux_future_aux_set (f,
"msg",
(void *)flux_msg_incref (msg),
(flux_free_f)flux_msg_decref) < 0) {
flux_msg_decref (msg);
Same comment applies to put
src/broker/content-checkpoint.c
Outdated
if (content_checkpoint_get_backing (checkpoint, msg, key, &errstr) < 0) | ||
goto error; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For rank 0, this is forcing the RPC to use the backing store, so it would get ENOSYS in the "none" case. The request should go to the broker RPC. Same comment applies to put
.
It looks like t0028-content-backing-none.t
includes a test that expects this behavior. Shouldn't it work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate to keep dragging this out. Do we have a use case for not returning ENOSYS to the checkpoint operations on rank > 0? This is primarily used internally by the kvs on rank 0 only, and by the dump/restore and startlog tools, all of which one would expect to run on rank0 I think. Up to you but we could just prune that case and call it good if it cuts down the code. I maybe shouldn't have brought it up, but I was thinking that we could just load the service on rank 0 and let requests be forwarded "naturally", forgetting that the service name is now shared with the content load/store ops that must work on all ranks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial thought upon reading your comments was that I screwed up. We should have content.checkpoint-{get,put}
forward to rank 0's content.checkpoint-{get,put}
on non-rank 0 brokers. But your comment makes sense that maybe we don't have to, we should just ENOSYS
right off the bat for ranks != 0.
For myself, I tend to lean towards "consistency", b/c I just dislike seeing something different in the code.
Let me have rank > 0 forward appropriately to rank 0. I'll spin that off into another PR since it's sort of independent of this and really a mistake from #4463
2846ca1
to
293b313
Compare
2a051fa
to
5877028
Compare
rebased and re-pushed now that #4519 is merged, this PR is a lot smaller now :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for sticking it out through all the review comments :-)
5877028
to
aee7c40
Compare
re-started one builder that had a ton of failures not related to this PR. assumption is workflow/container borked and affected bunch of tests. |
Problem: content checkpoints presently only work when content backing modules are loaded. Solution: Cache checkpoint data so that checkpoint put/get works regardless if the backing module is loaded. Update tests in t2807-dump-cmd that need to check for new error messages.
Problem: There are presently no tests to ensure that checkpoint get/put work correctly when backing modules are loaded / unloaded. Solution: Add tests to content-sqlite, content-files, and content-s3 to ensure checkpoint get/put work as expected when backing modules are loaded and unloaded. Add additional tests in a new content "none" testfile.
Problem: By default, rc scripts always assume a content backing module will be loaded. There is no way to specify "no" backing module. Solution: Support "none" as a special input to not load a content backing module. Fixes flux-framework#4267
Problems: No test exists to ensure content.backing-module "none" works in the rc scripts. Solution: Add a test.
aee7c40
to
f3c2437
Compare
rebased & re-pushed, mergifyio seemed to get stuck on something |
Per discussion in #4267, always loading the backing module content-sqlite can lead to performance issues, especially for shorter lived single user instances. Default to not loading a backing module.
content-sqlite
one possible remaining gotcha is that if the user doesn't set a backing module, they are still allowed to load one and that becomes the backing module. Dunno if we'd like to support a special backing module config of "none" (or equivalent word) to say "do not allow a backing module to be loaded"?