Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shell: Support stdout/stderr redirect to a file #2395

Merged
merged 11 commits into from Sep 28, 2019

Conversation

@chu11
Copy link
Contributor

commented Sep 25, 2019

This PR adds support to the shell to output stdout/stderr to a file instead of writing to the KVS. All task stdout/stderr is written to a single file.

Notes:

  • To set this, the user must set the attributes.system.shell.options.output.<stream>.type field to "file". The setting defaults to "kvs".

  • The file to be written uses attributes.system.shell.options.output.<stream>.path.

  • attributes.system.shell.options.output.<stream>.label will label each line of output with the task number it came from.

  • I wasn't sure what the mode should be on files created. I just did 644 for now.

  • Writes to files just calls write(2) everytime there is data to write. It'll block if the filesystem is busy. Could attach to a FD watcher instead?

Copy link
Contributor

left a comment

Nice! Tried this out and it seems to work well!

Just some fairly minor comments and questions inline.

One larger issue is that the common case for a job output file will be a single file for both stdout and stderr, e.g. this is the default with srun and wreckrun --output. I'm probably reading the code wrong, but it looks like currently if stdout and stderr point the same file, you have two separately opened file descriptors with O_TRUNC pointing to the same file.

We may want to think about how to specify the output options spec such that, unless specified, stderr goes to the same file as stdout, and data is written to the same fd for both streams.

Finally, as you suggest, even though write(2) would probably rarely block, I think we do want to do buffered output here since repercussions are severe if the shell blocks.

}
else if (!strcmp (name, "output-kvs-ready")) {
/* Note that this event will only be output if atleast one of
* stdout/stderr will output to the kvs */

This comment has been minimized.

Copy link
@grondo

grondo Sep 25, 2019

Contributor

Instead of a slew of output-*-ready events in the guest.exec.eventlog, what if we keep the current approach of a single output-ready event, and then amend RFC 24 to include a new redirect event for output streams that have been diverted out the kvs? Output consumers can treat a redirect as an EOF, and could optionally print an informational message about where output went.

This seems like it would reduce the amount of code handling the special exec events, and the redirect event would also log what file output was sent to for provenance.

This comment has been minimized.

Copy link
@garlick

garlick Sep 26, 2019

Member

I opened #2399 for this idea.

@@ -944,6 +984,15 @@ void attach_exec_event_continuation (flux_future_t *f, void *arg)
context);
}

if (ctx->output_stdout_parsed && ctx->output_stderr_parsed) {
/* If either stdout or stderr aren't logged to KVS, nothing to

This comment has been minimized.

Copy link
@grondo

grondo Sep 25, 2019

Contributor

I think the comment should read If neither stdout or stderr are logged to KVS, nothing to...

@@ -85,9 +86,9 @@ struct shell_task *shell_task_new (void)
return NULL;
}

struct shell_task *shell_task_create (struct shell_info *info,
int index)
struct shell_task *shell_task_create (flux_shell_t *shell, int index)

This comment has been minimized.

Copy link
@grondo

grondo Sep 25, 2019

Contributor

Why the arbitrary change in shell_task_create here? Rebase issue?

This comment has been minimized.

Copy link
@chu11

chu11 Sep 25, 2019

Author Contributor

ahh, it's no longer necessary b/c "per-task" output is all in output.c now

task_output_cb, out) < 0)
return -1;
if (output_type_requires_service (out->stdout_type)) {
if (flux_shell_task_channel_subscribe (task, "stdout",

This comment has been minimized.

Copy link
@grondo

grondo Sep 25, 2019

Contributor

Couldn't there be a case where the output_type_requires_service returns false, but the output plugin still needs to subscribe to stdout/err? E.g. output file per shell?

This comment has been minimized.

Copy link
@chu11

chu11 Sep 25, 2019

Author Contributor

I suppose so, although none of the currently supported ones will have that pattern. Perhaps punt to when the situation arises?

@@ -4,6 +4,12 @@ test_description='Test flux-shell in --standalone mode'

. `dirname $0`/sharness.sh

jq=$(which jq 2>/dev/null)
if test -z "$jq" ; then
skip_all 'no jq found, skipping all tests'

This comment has been minimized.

Copy link
@grondo

grondo Sep 25, 2019

Contributor

All tests don't need to be skipped if jq not found.

Instead use

test -n "$jq" && test_set_prereq HAVE_JQ

Then use HAVE_JQ prereq on only the tests that use jq.

@@ -4,13 +4,20 @@ test_description='Test flux-shell'

. `dirname $0`/sharness.sh

jq=$(which jq 2>/dev/null)
if test -z "$jq" ; then
skip_all 'no jq found, skipping all tests'

This comment has been minimized.

Copy link
@grondo

grondo Sep 25, 2019

Contributor

Same comment, I don't think all tests need jq in this test script.

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 25, 2019

during meeting discussion determined highest priority for this PR is

  • stdout & stderr redirect going to single file

  • adding special case mustache template of {{id}}

  • If it can be completed in time, the nice to haves would be:

    • "amend RFC 24 to include a new redirect event for output streams" per @grondo comments above
    • non-blocking writes

but these can go in after this PR if necessary, need to create issues for them if they don't.

Thinking about it for a bit, I think "amend RFC 24 ..." is probably enough work that it should be punted to after this PR. It requires mods to RFC24, behavior of shell, behavior / assumptions of flux-job, and maybe some tests.

@chu11 chu11 force-pushed the chu11:shell_output_file branch from b7dee4f to 92a8e21 Sep 26, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

re-pushed

  • if stdout output type is specified, stderr defaults to it as well, including redirecting to the same file
  • added mustanche template support for {{id}}
  • fixed comment nit per @grondo comments
  • fixed tests using HAVE_JQ pre-req

will try and do non-blocking writes tomorrow

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

re-pushed with some minor refactoring and fixing a mem-leak, just in case I can't finish non-blocking writes in a timely manner.

@chu11 chu11 force-pushed the chu11:shell_output_file branch from 93abb1f to 139ed06 Sep 26, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

re-pushed with non-blocking write support. Added support by using a fd watcher and storing off data to be written onto a list. I dislike the fact I'm mallocing a new buffer to store this data, but seems like the only way to handle partial writes that can block.

Also did some cleanup along the way. Hopefully all the code makes sense :-)

@chu11 chu11 force-pushed the chu11:shell_output_file branch 2 times, most recently from 80e0df8 to 1e5d3c4 Sep 26, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

re-pushed, forgot a PATH_MAX check.

@chu11 chu11 force-pushed the chu11:shell_output_file branch from 1e5d3c4 to ca91585 Sep 26, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

re-pushed, missed a chain-lint &&

@chu11 chu11 force-pushed the chu11:shell_output_file branch from ca91585 to 7f244b5 Sep 26, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

re-pushed, minor cleanup in tests, I don't need to generate a new R1 & R2, I can just use the ones created by other tests.

@chu11 chu11 force-pushed the chu11:shell_output_file branch 2 times, most recently from 15d1194 to 7c27373 Sep 26, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2019

restart builders, hit two spurious valgrinds

==10086== 
==10086== HEAP SUMMARY:
==10086==     in use at exit: 22,311 bytes in 59 blocks
==10086==   total heap usage: 101,119 allocs, 101,060 frees, 58,728,217 bytes allocated
==10086== 
==10086== LEAK SUMMARY:
==10086==    definitely lost: 0 bytes in 0 blocks
==10086==    indirectly lost: 0 bytes in 0 blocks
==10086==      possibly lost: 0 bytes in 0 blocks
==10086==    still reachable: 22,311 bytes in 59 blocks
==10086==         suppressed: 0 bytes in 0 blocks
==10086== Reachable blocks (those to which a pointer was found) are not shown.
==10086== To see them, rerun with: --leak-check=full --show-leak-kinds=all
@chu11 chu11 force-pushed the chu11:shell_output_file branch from 7c27373 to 1a02ee4 Sep 27, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2019

oops, fixed a chain lint

@grondo

This comment has been minimized.

Copy link
Contributor

commented Sep 27, 2019

added support by using a fd watcher and storing off data to be written onto a list. I dislike the fact I'm mallocing a new buffer to store this data, but seems like the only way to handle partial writes that can block.

One question -- would it be simpler to use a flux_buffer_write_watcher instead? (This is exactly its intended use anyway.)

@chu11 chu11 referenced this pull request Sep 27, 2019
@chu11 chu11 force-pushed the chu11:shell_output_file branch from 1a02ee4 to 64e7673 Sep 27, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2019

Per discussion, re-did PR to output "redirect" events to the guest.output:

  • Notes

  • I output "0" if there's only 1 task, but "[0-N]" for > 1 task.
    Should I do brackets regardless?

  • When a file is redirected, you can get a message like "0: stdout
    redirected to foo.out" by default. A new '--quiet' option can
    suppress this output.

  • There's a lot of tests, I'm going to put them in some new tests
    files. If I can't finish that by the end of work, I'll do a push
    when I get home.

@chu11 chu11 force-pushed the chu11:shell_output_file branch from 64e7673 to f2b51f5 Sep 27, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2019

re-pushed, all the new tests are in two new files now. Also, one additional note I forgot to mention. If both stdout & stderr are redirected to the same file, flux job attach outputs two messages indicating the redirection:

[0-1]: stdout redirected to foo
[0-1]: stderr redirected to foo

I couldn't think of an obvious clean way to deal with for the time being.

@chu11 chu11 force-pushed the chu11:shell_output_file branch from f2b51f5 to bf19063 Sep 27, 2019
@chu11 chu11 force-pushed the chu11:shell_output_file branch from bf19063 to fcd5e54 Sep 28, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2019

re-pushed, fixed a mem-leak code coverage found

Copy link
Member

left a comment

A few comments/questions inline.

if (!strcmp (stream, "stdout"))
ctx->stdout_redirect_count += idset_count (idset);
else
ctx->stderr_redirect_count += idset_count (idset);

This comment has been minimized.

Copy link
@garlick

garlick Sep 28, 2019

Member

It seems like this code shouldn't actually need to count redirects and cancel the RPC. The RPC stream will eventually end with ENODATA when the job ends.

Since we're ignoring EOFs it seems like we can ignore these too, except for printing the warning message.

The block comment at the top of the function should be updated to mention redirect events.

This comment has been minimized.

Copy link
@chu11

chu11 Sep 28, 2019

Author Contributor

ahh yes, i had done things a little differently in an earlier incarnation of the code.

int len;
int offset;
};

This comment has been minimized.

Copy link
@garlick

garlick Sep 28, 2019

Member

Is the "buffer watcher" not useful here? E.g. flux_buffer_write_watcher_create()?

This comment has been minimized.

Copy link
@chu11

chu11 Sep 28, 2019

Author Contributor

main issue was buffer watcher has to have a buffer size set when it's created. So I elected to go with the fd watcher.

Thinking this through, could possible save data from out->output somewhere else when buffer is full. Not sure if it's a net win code wise.

It did just occur me that I should probably add a completion reference for fd-watcher writes, since on exit, there's no way to know if there's still data for the fd watcher to keep on writing. As a side effect, would clean up some of the shell_output_destroy() path.

This comment has been minimized.

Copy link
@garlick

garlick Sep 28, 2019

Member

Ok, I'll have a closer look at this code then. Are there tests that can make write(2) return EAGAIN/EWOULDBLOCK (seems like that would be hard to do if output is always a file).

This comment has been minimized.

Copy link
@chu11

chu11 Sep 28, 2019

Author Contributor

Nah, no tests for that specific path. Hard to create a test.

BUT, I only checked for EAGAIN, totally forgot about EWOULDBLOCK. Need to fix that up.

This comment has been minimized.

Copy link
@chu11

chu11 Sep 28, 2019

Author Contributor

oh wait, EWOULDBLOCK is only for sockets. I had it right :-)

This comment has been minimized.

Copy link
@grondo

grondo Sep 28, 2019

Contributor

No problem!

I really like generalizing the "file" output to "fd" output and switch terminal output to that same handler. However, I wonder if we are running out of time for that approach in this PR?

We could delay merging the PR and pursue this cleanup, or merge the (working) PR soon and do the cleanup as a follow-on PR. (I also kind of want to investigate moving the "output modes" into their own plugins, but haven't thought to hard about how that would work yet. But the output plugin could change regardless..)

I leave the approach up to @chu11 and @garlick

This comment has been minimized.

Copy link
@garlick

garlick Sep 28, 2019

Member

That works for me. @chu11, if that makes sense to you, then my vote would be to drop the "support nonblocking writes" commit, and then this is probably good to go in IMHO.

This comment has been minimized.

Copy link
@chu11

chu11 Sep 28, 2019

Author Contributor

Wait a second, the data is already "buffered" in the output JSON array, so we shouldn't need to buffer it again in the zlist.

So the main reason I copy to another list is b/c both stdout and stderr both go to the same output JSON array, but the streams could go to different locations, like file/kvs/terminal. or different files One stream could be blocking and one not.

If I leave everything on the output array, then I have to iterate through the entire array everytime, even if one stream isn't blocked. Perhaps this isn't the end of the world and not as big a deal as I originally thought. Some logic has to be changed since I can't call json_array_clear() anymore, gotta remove elements from the output array as they are processed.

if partial write, add size to offset and keep writing until EAGAIN/EWOULDBLOCK

(Note, I now see a mistake I did in output.c, so following sentence is in principle) One of the side reasons I did the list is b/c when you call iodecode() the data from the log entry is malloced when returned to the caller. This decode+malloc has to be done. It seemed like little cost to save off this already-malloced region of memory, rather than leave it on the output array to be re-decoded/malloced the next time your try to read from the array.

(Edit: Whoops, I see you recognized this in the comments above).

This comment has been minimized.

Copy link
@garlick

garlick Sep 28, 2019

Member

Oh I missed that detail, sorry. The streams mixed up in one array does make my proposal a bit trickier.

This comment has been minimized.

Copy link
@chu11

chu11 Sep 28, 2019

Author Contributor

That works for me. @chu11, if that makes sense to you, then my vote would be to drop the "support nonblocking writes" commit, and then this is probably good to go in IMHO.

Ok, let me remove that commit and I'll re-push

id=$(cat j1echostdout-0 | flux job submit) &&
flux job wait-event $id clean &&
grep stdout:foo out0
'

This comment has been minimized.

Copy link
@garlick

garlick Sep 28, 2019

Member

Now that flux mini run has been merged, and has options for file redirection, I think many of these tests can be shortened and HAVE_JQ dropped, e.g. the above could be

flux mini run -n1 --output=out0 ${TEST_SUBPROCESS_DIR}/test_echo -P -O foo &&
grep stdout:foo out0

In addition to --output and --error, there is also -o, --setopt to directly set shell option attributes.

chu11 added 7 commits Sep 18, 2019
In the shell output builtin, split up the primary shell_output_flush()
and shell_output_commit() functions into "init" and "core" functions.
The "init" functions handle the output header and/or any eventlog entries
that need to be written out.  The "core" functions handle the actual
processing of data.

As a result, the output header does not need to be written to the output
array.  Other flag variables such as "output_ready_sent" are
no longer needed as well.
Rename all "flush" functions to "term", as in terminal output.
Rename all "commit" functions to "kvs", as these functions commit to
the kvs.
In preparation for supporting multiple output types in the shell,
support an internal configuration for stdout/stderr output types.  The
output types are either "term" (for writing to terminal) or "kvs".
There is no behavior change with the current implementation.

Adjust the output builtin to not assume it is always handling stdout
and stderr.  Only perform setup/config if data will be going to the
terminal or KVS.  Adjust functions to handle the possibility that
they may handle only one of stdout/stderr, but not both.

Adjust logic to call shell_output_kvs_init() as long as we're not
standalone.
Support stdout/stderr file redirection in jobs via the

attributes.system.shell.options.output.<stream>.type

jobspec setting and the type "file".  This option will redirect all
stdout/stderr from all tasks into a single file.  A path to the file can be
specified via:

attributes.system.shell.options.output.<stream>.path

As a custom, when a stdout type is specified, but a stderr type is not, the
stderr type will follow the stdout type.  This includes writing to the
same file.

Note that this initial implementation blocks on calls to write(2).
For stdout/stderr redirected files, support the ability to specify
if I/O should be labeled by rank via the:

attributes.system.shell.options.output.<stream>.label

boolean setting.
When specifying an output file name to redirect stdout/stderr to,
allow users to use the mustache template '{{id}}' to have the jobid
substituted into the filename.
@chu11 chu11 force-pushed the chu11:shell_output_file branch from fcd5e54 to 4d83030 Sep 28, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2019

re-based and re-pushed.

  • removed redirect counts in flux-job per comments above and adjusted some comments
  • added a completion ref and some comments for the fd watcher in shell output.c
  • updated tests in t2606 to use flux-mini.
chu11 added 4 commits Sep 27, 2019
Per RFC24, emit output redirect events to the guest.output eventlog
to indicate that stdout and/or stderr have been redirected, and indicate
the file it has been redirected to.

Fixes #2399
Per RFC24, handle redirect events in the guest.output log.  By
default information about which tasks redirected to files and
the path that output was redirected to is output.  A new option
"quiet" can be used to suppress this output.
Add new test file t2605-job-shell-output-redirection-standalone.t, which
will test output redirection in the standalone job shell.
Add new test file t2606-job-shell-output-redirection.t, which
will test output redirection in the job shell and output when
attaching to the job via flux job attach.
@chu11 chu11 force-pushed the chu11:shell_output_file branch from 4d83030 to 200c7d4 Sep 28, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2019

re-pushed, removing non-blocking writes per discussion above.

Shall we create an issue to track the blocking issue?

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 28, 2019

@codecov-io

This comment has been minimized.

Copy link

commented Sep 28, 2019

Codecov Report

Merging #2395 into master will increase coverage by 0.01%.
The diff coverage is 82.2%.

@@            Coverage Diff             @@
##           master    #2395      +/-   ##
==========================================
+ Coverage   81.09%   81.11%   +0.01%     
==========================================
  Files         224      224              
  Lines       35770    36022     +252     
==========================================
+ Hits        29009    29218     +209     
- Misses       6761     6804      +43
Impacted Files Coverage Δ
src/cmd/flux-job.c 85.26% <81.81%> (+0.23%) ⬆️
src/shell/output.c 80.35% <82.21%> (+6.76%) ⬆️
src/modules/connector-local/local.c 73.11% <0%> (-1.17%) ⬇️
src/modules/job-info/guest_watch.c 74.14% <0%> (-0.57%) ⬇️
Copy link
Member

left a comment

Thanks, just gave it another quick test and looks good!
I think this can go in.

@mergify mergify bot merged commit 14c4aeb into flux-framework:master Sep 28, 2019
4 checks passed
4 checks passed
Summary 1 rule matches
Details
codecov/patch 82.2% of diff hit (target 81.09%)
Details
codecov/project 81.11% (+0.01%) compared to fc88c25
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@garlick garlick referenced this pull request Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.