-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework output redirection #5084
Conversation
@alalazo: this reworks a lot of the major changes from your last reworking of I/O redirection. Can you take a look at it? |
Note that this currently includes #5016 but probably shouldn't. This should go in first -- I'll separate it out later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much agree with the design and changes here, but enabling / removing verbosity is not working for me (hitting v does not alter echoing - didn't have time to debug the cause yet).
EDIT: Once in a while it seems the logger daemon hangs, and stops the whole application.
lib/spack/llnl/util/tty/log.py
Outdated
|
||
or:: | ||
with logger.force_echo(): | ||
# things here will be echoed *and* logged |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a silence_echo
method for consistency? Something like:
with log_output('logfile.txt', echo=True) as logger:
 # do things ... output will be logged and printed out
with logger.silence_echo():
# things here will not be echoed, but will be logged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost added this and with no_logging
, but I think we could easily add those when the need arises. We might need to adjust the protocol a bit, and instead of using XON/XOFF to delineate echo'd regions of actual log, we'd use XON/OFF to start/stop some control protocol. e.g.:
\xonTF\xoff\n # echo, no log
\xonTT\xoff\n # echo and log
\xonFT\xoff\n # no echo, log
Personally, I'd save this for later, as I don't think it is needed right now.
lib/spack/llnl/util/tty/log.py
Outdated
self.echo = not self.echo | ||
if stdin in rlist: | ||
if stdin.read(1) == 'v': | ||
echo = not echo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am currently not able to get this part working. I tested the PR on:
$ uname -a
Linux nuvolari 4.4.0-87-generic #110~14.04.1-Ubuntu SMP Tue Jul 18 14:51:32 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
using different versions of python for spack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take a look... wonder if this is a race to see who gets to EOF first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a case of using self.echo
instead of echo
down below 😄
When the application hangs and I hit Ctrl+C to quit, the stacktrace looks like: ==> WRITE LOCK: /home/mculpo/PycharmProjects/spack/var/spack/stage/.lock[6195093728828230899:1] [Released]
Traceback (most recent call last):
File "/usr/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
self.run()
File "/usr/lib/python3.4/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/mculpo/PycharmProjects/spack/lib/spack/llnl/util/tty/log.py", line 314, in _writer_daemon
if stdin.read(1) == 'v':
KeyboardInterrupt
==> Error: Keyboard interrupt. |
7a784fc
to
625008e
Compare
@alalazo: this should be ready for another review. This fixes the I learned in the process that |
@alalazo: I also could not reproduce the hang you're seeing, but I put a generic exception handler around the main writer loop, so that if something goes wrong, we'll still get out. I'm not sure the best way to wait on the logger daemon. I couldn't get it to hang, and I think that |
625008e
to
86132f5
Compare
For the record: I re-started |
I was wracking my brains trying to figure that out. I installed a local python 3.5 to try to reproduce but couldn't. |
@tgamblin I tried a lot of times to reproduce this error locally and a few times on Travis (by restarting unit test in #4994 ), but was not able to do so. I also noticed that the error is not specific to a single python version. I really don't get why sometimes we end up with the separator being deduced as |
@alalazo: I was able to reproduce and (I think) fix it locally but I still don't know exactly why it happens. I can reproduce it running |
@tgamblin You're right. Before I tried this: $ for ii in {1..100}; do spack info openmpi 1> /dev/null ; done which for me calls $ for ii in {1..100}; do python3 bin/spack info openmpi 1> /dev/null ; done
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o'
==> Error: 'o' where python $ python3 --version
Python 3.4.3 |
See if this fixes it in - _separators = '[%s]' % ''.join(color_formats.keys())
+ _separators = '[\\%s]' % '\\'.join(color_formats.keys()) It seems to for me. That escapes all the special chars in the |
Before I dive into the debugger: are we relying on the order of keys in a dictionary when we iterate over them? |
@alalazo: oh interesting. That might be it. If the order of the characters in _separators = '[\\%s]' % '\\'.join(color_formats.keys()) I relied on it being last, because if it's first it means "match anything NOT in this list", which would include So escaping them as I did above will fix this... if |
@tgamblin Check this slide that shows a summary of how dictionaries are implemented in various python versions. The order of keys is scrambled for python 2.7, randomized for python 3.5, ordered for python 3.6. I always found that really evil on users 😄 At this point I wonder if the other issues we are frequently seeing (like |
Personally I like the randomized option :), because it makes you get rid of bugs like this. Go does this too -- it enables them to make more performant parallel map implementations, too. |
I'm sure someone would argue that "randomized keys for better parallel map implementations" violates some part of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I tried the PR again on hdf5~mpi+szip
, using both python 3.4.3 and python 2.7.6 and it worked correctly (no hangs, output echoing controlled by pressing v
).
86132f5
to
7a7a29f
Compare
One test failing: _____________________________ test_package_output ______________________________
install_mockery = None, mock_fetch = None
def test_package_output(install_mockery, mock_fetch):
"""Ensure output printed from pkgs is captured by output redirection."""
out, err = install('-v', 'printing-package')
> assert "BEFORE INSTALL\n==> './configure'" in out
E assert "BEFORE INSTALL\n==> './configure'" in ' Fetch: 0.07s. Build: 0.35s. Total: 0.42s.\n/tmp/pytest-of-travis/pytest-0/test_package_output0/test-debian6-x86_64/clang-3.3/printing-package-1.0-xpnz2byzii5mgzx62d2tpgfwffy7qbii\n'
/home/travis/build/LLNL/spack/lib/spack/spack/test/cmd/install.py:94: AssertionError |
7a7a29f
to
439525e
Compare
- add fail_on_error argument - record exception and return code when the command fails
- Simplify interface to log_output. New interface requires only one context handler instead of two. Before: with log_output('logfile.txt') as log_redirection: with log_redirection: # do things ... output will be logged After: with log_output('logfile.txt'): # do things ... output will be logged If you also want the output to be echoed to ``stdout``, use the `echo` parameter:: with log_output('logfile.txt', echo=True): # do things ... output will be logged and printed out And, if you just want to echo *some* stuff from the parent, use ``force_echo``: with log_output('logfile.txt', echo=False) as logger: # do things ... output will be logged with logger.force_echo(): # things here will be echoed *and* logged A key difference between this and the previous implementation is that *everything* in the context handler is logged. Previously, things like `Executing phase 'configure'` would not be logged, only output to the screen, so understanding phases in the build log was difficult. - The implementation of `log_output()` is different in two major ways: 1. This implementation avoids race cases by using only one pipe (before we had a multiprocessing pipe and a unix pipe). The logger daemon stops naturally when the input stream is closed, which avoids a race in the previous implementation where we'd miss some lines of output because the parent would shut the daemon down before it was done with all output. 2. Instead of turning output redirection on and off, which prevented some things from being logged, this version uses control characters in the output stream to enable/disable forced echoing. We're using the time-honored xon and xoff codes, which tell the daemon to echo anything between them AND write it to the log. This is how `logger.force_echo()` works. - Fix places where output could get stuck in buffers by flushing more aggressively. This makes the output printed to the terminal the same as that which would be printed through a pipe to `cat` or to a file. Previously these could be weirdly different, and some output would be missing when redirecting Spack to a file or pipe. - Simplify input and color handling in both `build_environment.fork()` and `llnl.util.tty.log.log_output()`. Neither requires an input_stream parameter anymore; we assume stdin will be forwarded if possible. - remove `llnl.util.lang.duplicate_stream()` and remove associated monkey-patching in tests, as these aren't needed if you just check whether stdin is a tty and has a fileno attribute.
- Previously we would use `os._exit()` in to avoid Spack error handling in the parent process when build processes failed. This isn't necessary anymore since build processes propagate their exceptions to the parent process. - Use `sys.exit` instead of `os._exit`. This has the advantage of automatically flushing output streams on quit, so output from child processes is not lost when Spack exits.
- 'v' toggle was previously only good for the current install. - subsequent installs needed user to press 'v' again. - 'v' state is now preserved across dependency installs.
- Python I/O would not properly interleave (or appear) with output from subcommands. - Add a flusing wrapper around sys.stdout and sys.stderr when redirecting, so that Python output is synchronous with that of subcommands.
439525e
to
aed383a
Compare
@tgamblin awesome thanks for this fix -- looks good:
vs before:
|
@tgamblin sounds good! On failure, I am also thinking of automatically copying the build log out from the stage somewhere into the spack space ('last.build.error.txt'), b/c if we fail while building in a temp space on a random node, we may not be able to access the build log. |
This reworks I/O redirection in Spack to avoid race cases, to make output more deterministic, and to ensure that everything gets logged in
build.out
. Previously output could be different if you just ranspack install
or ranspack install | cat
. This PR tries to make the output more deterministic and to simplify the interface to thelog_output
context handler.Simplify interface to
log_output
. New interface requires only one context handler instead of two. Before:After:
If you also want the output to be echoed to
stdout
, use theecho
parameter::And, if you just want to echo some stuff from the parent, use
force_echo
:A key difference between this and the previous implementation is that everything in the context handler is logged. Previously, things like
Executing phase 'configure'
would not be logged, only output to the screen, so understanding phases in the build log was difficult.The implementation of
log_output()
is different in two major ways:This implementation avoids race cases by using only one pipe (before we had a multiprocessing pipe and a unix pipe). The logger daemon stops naturally when the input stream is closed, which avoids a race in the previous implementation where we'd miss some lines of output because the parent would shut the daemon down before it was done with all output.
Instead of turning output redirection on and off, which prevented some things from being logged, this version uses control characters in the output stream to enable/disable forced echoing. We're using the time-honored xon and xoff codes, which tell the daemon to echo anything between them AND write it to the log. This is how
logger.force_echo()
works.Fix places where output could get stuck in buffers by flushing more aggressively. This makes the output printed to the terminal the same as that which would be printed through a pipe to
cat
or to a file. Previously these could be weirdly different, and some output would be missing when redirecting Spack to a file or pipe.Simplify input and color handling in both
build_environment.fork()
andllnl.util.tty.log.log_output()
. Neither requires an input_stream parameter anymore; we assume stdin will be forwarded if possible.remove
llnl.util.lang.duplicate_stream()
and remove associated monkey-patching in tests, as these aren't needed if you just check whether stdin is a tty and has a fileno attribute.fix exit call in
SpackError.die()
:Previously we would use
os._exit()
in to avoid Spack error handling in the parent process when build processes failed. This isn't necessary anymore since build processes propagate their exceptions to the parent process.Use
sys.exit
instead ofos._exit
. This has the advantage of automatically flushing output streams on quit, so output from child processes is not lost when Spack exits.add test for output capture
pytest
'scapsys
/capfd
where necessaryPreserve verbosity setting across installs
v
, and it's valid for the entire Spack run!