libsubprocess: support FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF #5975
Conversation
Force-pushed from 5bb7b6e to 406d9ac
Just a couple of quick comments on a first pass
    /* This function is similar to flux_subprocess_read() but is used
     * exclusively with FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF.
     *
     * Unlike flux_subprocess_read() the returned data is not NUL
     * terminated. This can be useful for performance improvements,
     * especially when data is written out to storage instead of a user's
     * terminal.
     */
    int flux_subprocess_read_local_unbuf (flux_subprocess_t *p,
                                          const char *stream,
                                          const char **bufp);
Why not just have the user call flux_subprocess_read() and change the behavior based on the flag? The additional public function seems unnecessary?
I guess the lack of a guaranteed NUL at the end made me want to create a new function and make that fact explicit, since it is a pretty big difference. It certainly could be done with flux_subprocess_read(), but I didn't like having returned data be different depending on flags.
So I guess a style choice. Shall we get another vote?
Sure on another vote.
Documentation that flux_subprocess_read() does not NULL terminate seems sufficient to me, but maybe I like to live dangerously :-)
Alternatively, maybe we could just change iodecode() to add the termination?
I agree it is fine to change the behavior of the flux_subprocess_read() function based on the flag.
I haven't re-familiarized myself with the code enough to comment on @garlick's proposal to change iodecode() as an alternative though.
Well, if we've got two votes to not worry about changing the behavior, then I would say let's not get into changing iodecode(), if @chu11 agrees.
ok, let's just change flux_subprocess_read() based on flags.
src/modules/job-exec/bulk-exec.c (Outdated)
    if (flags & FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF) {
        if ((len = flux_subprocess_read_local_unbuf (p, stream, &s)) < 0) {
            flux_log_error (exec->h, "flux_subprocess_read_local_unbuf");
            return;
I think it would be fine to assume the UNBUF flag, e.g. just |= it into the flags provided to bulk_exec_push_cmd() and drop the new public function.
sounds good
Force-pushed from c7b178a to ea18008
re-pushed given the comments above, did an additional profiling run and it showed some small incremental improvements.
I made a first pass and had some comments/questions
    @@ -420,35 +420,37 @@ static int remote_channel_setup (flux_subprocess_t *p,
        if (wflag)
Commit message still mentions flux_subprocess_read_local_unbuf()
src/common/libsubprocess/remote.c (Outdated)
    /* N.B. we technically should clear/memset c->unbuf_data
     * before returning EOF, as a user could read from it during
     * the eof callback. But for performance reasons, we won't do
     * that and assume the caller is smart enough not to do that.
     */
    c->read_eof_received = true;
    c->unbuf_len = 0;
    c->unbuf_eof = true;
If the remote returns data + EOF, doesn't setting c->unbuf_len to zero cause the data to be lost?
N.B. we technically should clear/memset c->unbuf_data before returning EOF, as a user could read from it during the eof callback. But for performance reasons, we won't do that and assume the caller is smart enough not to do that.
Well, also it's a const char * so it's not possible to memset it.
If setting c->unbuf_len to zero, why not set c->unbuf_data to NULL?
ahh that comment was leftover from a prior iteration of this PR. I should remove it.
If setting c->unbuf_len to zero why not set c->unbuf_data to NULL?
I guess it wasn't completely necessary, but you are correct, it wouldn't hurt.
If the remote returns data + EOF, doesn't setting c->unbuf_len to zero cause the data to be lost?
Within this function, all the data is returned in the first block in the following code, then the EOF is sent in the block after it. So the second setting of unbuf_len = 0 is intentional, to clear out that data and send EOF to the caller.
if (data && len) {
c->unbuf_data = data;
c->unbuf_len = len;
c->unbuf_consumed = false;
if (eof)
c->read_eof_received = true;
c->unbuf_eof = false;
c->output_cb (c->p, c->name);
c->unbuf_consumed = false;
}
/* N.B. any data not consumed by the user is lost, so if eof is
* seen, we can send it in the local unbuf immediately */
if (eof) {
c->read_eof_received = true;
c->unbuf_data = NULL;
c->unbuf_len = 0;
c->unbuf_eof = true;
c->output_cb (c->p, c->name);
c->eof_sent_to_caller = true;
c->p->channels_eof_sent++;
}
Whoops! My bad, I did not catch the output_cb() call in there.
    /* If user called this function twice, return nothing. They
     * will hopefully call flux_subprocess_read_stream_closed() to
     * verify EOF was not yet reached.
     */
    if (c->unbuf_consumed)
        return 0;
The flag and the check seem unnecessary. We can probably just document that reading twice returns the same data and it won't be very surprising given the lack of a buffer.
See doc comment below.
The flag and the check seem unnecessary. We can probably just document that reading twice returns the same data and it won't be very surprising given the lack of a buffer.
I was trying to be consistent with how the buffered case works. But yeah, we could tweak that.
    /* flux_rexec(): In order to improve performance, do not locally
     * copy and buffer output from the remote subprocess. Immediately
     * call output callbacks. Users can directly retrieve data via
     * flux_subprocess_read(). Data will not NUL terminated. Use of
     * other read functions will result in a EPERM error.
     */
    FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF = 8,
Users can directly retrieve data via flux_subprocess_read().
We probably should specify that flux_subprocess_read() must be called exactly once when the output callback is invoked. If not called, data is lost. If called more than once, the same data is returned.
Data will not NUL terminated.
Missing a "be".
Use of other read functions will result in a EPERM error.
EPERM is kind of strange here. Would EINVAL be better?
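From the caller's side, the "read exactly once, data not NUL terminated" contract being proposed here could look roughly like the following sketch. The callback name and the fwrite destination are illustrative; only flux_subprocess_read() and the (p, stream) callback shape come from the discussion above.

```c
/* Sketch: an output callback for a subprocess launched with
 * FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF, per the proposed semantics.
 */
static void unbuf_output_cb (flux_subprocess_t *p, const char *stream)
{
    const char *buf;
    int len;

    /* Read exactly once per callback: data not read here is lost,
     * and a second read would just return the same data. */
    if ((len = flux_subprocess_read (p, stream, &buf)) < 0)
        return; /* error handling elided */

    /* buf is NOT NUL terminated under this flag, so always honor len
     * rather than treating buf as a C string. */
    if (len > 0)
        fwrite (buf, 1, len, stderr);
}
```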
EPERM is kind of strange here. Would EINVAL be better?
I currently use EPERM for another case where it is "illegal" to call a certain function. It seems an ok choice. To me EINVAL just makes you think bad input. Hmmm, a better option isn't coming to me. Will sleep on it.
No big deal I suppose!
    - return subprocess_read (p, stream, bufp, false, false, false, NULL);
    + return subprocess_read (p, stream, bufp, false, false, false, true, NULL);
This method of reducing duplicated code with a common function with booleans that alter its behavior is hard to follow. Instead of adding one more boolean to this function, how about adding a check to flux_subprocess_read_line() et al like this:
if (p && (p->flags & FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF)) {
errno = EINVAL;
return -1;
}
src/common/libsubprocess/remote.c (Outdated)
    c->read_eof_received = true;
    c->unbuf_len = 0;
    c->unbuf_eof = true;
Why do we have both c->eof_received and c->unbuf_eof?
I think unbuf_eof is left over from a prior attempt. Should remove it.
    len = flux_subprocess_read_line (p, stream, &line);
    ok (len < 0
        && errno == EPERM,
        "flux_subprocess_read_line fails w/ EPERM w/ LOCAL_UNBUF");
Tests like this should zero errno before testing that the function sets it.
src/modules/job-exec/bulk-exec.c (Outdated)
    /* bulk-exec defaults to use unbuffered for performance */
    c->flags = flags | FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF;
I'd just say "bulk-exec always uses unbuffered reads for performance" since calling it a default sort of implies that there could be a way to change it.
Force-pushed from ea18008 to f3562da
re-pushed w/ tweaks per comments above. The only notable thing is that I kept EPERM vs EINVAL for the "can't use these functions" error case.
Sounds good. I'll get this on my test system that is configured for sdexec and see how it goes. One concern is the lack of output buffering in
The shell barrier protocol is fine since I believe it's the only user of stdout, and the protocol consists of the shell sending a line of output and then job exec sending one line back. There is not much opportunity for buffering (or lack thereof) to screw that up. @grondo suggested I print something from a userrc to check how

    -- foo.lua
    io.stderr:write("foo bar\n")
    io.stderr:write("foo bar\n")
    io.stderr:write("foo bar\n")
    io.stderr:write("foo bar\n")

With master
With this PR there is evidence that lines are not being buffered.
Oh! Even though the remote subprocess is line buffered (the default), we are using
This seems to fix that problem.

diff --git a/src/common/libsubprocess/server.c b/src/common/libsubprocess/server.c
index fca01444e..139923f23 100644
--- a/src/common/libsubprocess/server.c
+++ b/src/common/libsubprocess/server.c
@@ -246,7 +246,10 @@ static void proc_output_cb (flux_subprocess_t *p, const char *stream)
const char *buf;
int len;
- if ((len = flux_subprocess_read (p, stream, &buf)) < 0) {
+ len = flux_subprocess_getline (p, stream, &buf);
+ if (len < 0 && errno == EPERM) // not line buffered
+ len = flux_subprocess_read (p, stream, &buf);
+ if (len < 0) {
llog_error (s,
"error reading from subprocess stream %s: %s",
                    stream,
Hmmmm. There's a part of me wondering if I originally messed up and had intended to NOT line buffer on the server side, i.e. all line buffer handling occurring on the client side. But now that we have LOCAL_UNBUF, perhaps that no longer matters and the approach you have is what needs to be done.
I guess it's a trade-off. The client has turned out to be pretty heavy with the buffering on that end. OTOH, if we buffer on the server end then we can end up with more message framing overhead when there are bursts of messages. But since it's a common idiom to have many clients in one place spread out across many servers, moving the buffering to the server side probably makes sense. Should we add the server side change in this PR, and also maybe a test to show that we get one callback per line in UNBUF mode with the libsubprocess server? Then we could follow up with a PR for sdexec to add some buffering to it. I can look at that.
Force-pushed from 1d82dc8 to ef54471
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

    @@ Coverage Diff @@
    ##           master   #5975      +/-   ##
    ==========================================
    - Coverage   83.30%   83.29%   -0.01%
    ==========================================
      Files         518      518
      Lines       83439    83481     +42
    ==========================================
    + Hits        69506    69534     +28
    - Misses      13933    13947     +14
Force-pushed from ef54471 to e116ac8
re-pushed, adding a commit to update the server side to send each line buffered chunk, and added a multiline test.
Seems like we're good!
Force-pushed from e116ac8 to 1a567ee
Problem: When output is returned from a remote process, data is copied into a local buffer before it is read by callers. This can become a performance issue at larger scales. Support a new FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF flag. When launching a remote subprocess, do not locally buffer the data. Instead, immediately call the respective output callback and give the data to the user via flux_subprocess_read().
Problem: If line buffering is enabled on a remote subprocess, the server does not respond with output containing a single line of data. This was not an issue when output data was buffered on the client side, but with the new LOCAL_UNBUF flag, multiple lines of data could be read in a single callback. In the server, use flux_subprocess_getline() over flux_subprocess_read(). Only use flux_subprocess_read() if line buffering was not enabled.
Problem: There is currently no testing for the new FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF flag. Add coverage in test/remote.c. Add a test in test/subprocess.c to ensure the flag only works with remote subprocesses.
Problem: Tests show that when launching many remote subprocesses through job-exec, there can be a performance slowdown due to libsubprocess. Use the new libsubprocess FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF flag to improve performance. This flag will skip the buffering of data before it is sent to the caller. In addition, it will not bother to NUL terminate the data, which job-exec does not need since the data is immediately sent to the eventlog.
hmmm alpine builder failed
These are all job-exec reload module tests. Wondering if the recent additions (kill-timeout, etc.) have issues when the module is reloaded; on alpine, the globals may not be reset. Here's one troubling failure
I'll restart the builder; this obviously passed on other runs, allowing this to be merged. Edit: failed again the same way.
Problem: Per a comment from trws in 97421e8, the musl libc loader doesn't actually unload objects on dlclose, so a subsequent dlopen doesn't re-clear globals and similar. Since we support config reloading in job-exec, we need to re-initialize globals on each load, otherwise invalid data will exist in globals. The new signaling globals are not re-initialized on reload. Solution: Re-initialize signaling globals in job_exec_set_config_globals().
Force-pushed from 1a567ee to a35f615
Without any better ideas, re-pushed with an extra commit that re-initializes the new signaling globals. I'm guessing my changes to job-exec tickled things enough to make things fail.
Ah, yeah, that makes sense. Probably we were only lucky to get away with it previously.
oh whoops, i should have removed
Well, that change overwrites the previous setting on any configuration reload with the defaults. However, probably not a big deal since these are mostly used only for testing at this point...
Gah ... that's right, should have passed the saved argc/argv into it like we do with the exec_config ... I'll do a fix.
Per discussion in #5919, some notes:
flux_subprocess_get_flags() function and calling an appropriate "read" function depending on flags. This could be done differently of course; perhaps the flags aren't passed in from the API and are internalized within bulk-exec?