Support log rotation, don't append crash.log files but use per-peer. #1847
Force-pushed: d6be27f to f5d5624
Rebased after applying #1863
| # determine whether we succeeded or failed. | ||
| if request.node.rep_call.outcome == 'passed': | ||
| shutil.rmtree(directory) | ||
| pass #shutil.rmtree(directory) |
That's probably not intended to be in this queue
| logpath = os.path.join(l1.daemon.lightning_dir, 'logfile') | ||
| logpath_moved = os.path.join(l1.daemon.lightning_dir, 'logfile_moved') | ||
|
|
||
| # FIXME: I couldn't get super(TailableProc, l1.daemon).start() to work? |
I found super to be rather obscure. This should work however:
TailableProc.start(l1.daemon)
This works because it looks up the plain function start on TailableProc (it is an ordinary method, not a classmethod) and passes in l1.daemon as self.
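A likely reason the `super(TailableProc, l1.daemon).start()` form in the FIXME did not work: two-argument `super` starts the method lookup after the named class in the MRO, so naming `TailableProc` skips `TailableProc.start` itself. A minimal sketch, using hypothetical stand-in classes rather than the real test helpers:

```python
class TailableProc:
    """Hypothetical stand-in for the test helper base class."""

    def start(self):
        return "TailableProc.start"


class LightningD(TailableProc):
    """Hypothetical subclass that overrides start()."""

    def start(self):
        return "LightningD.start"


daemon = LightningD()

# Explicit call: fetch the plain function from the base class and
# pass the instance in as self.
explicit = TailableProc.start(daemon)

# Naming the subclass makes super() begin the lookup after LightningD,
# so it finds TailableProc.start.
via_super = super(LightningD, daemon).start()

# Naming TailableProc begins the lookup after TailableProc, i.e. at
# object, which has no start() method.
try:
    super(TailableProc, daemon).start()
    skipped_error = None
except AttributeError as exc:
    skipped_error = exc
```

Both `explicit` and `via_super` reach the base-class method; the third form raises `AttributeError` in this sketch.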
| static void handle_sighup(int sig) | ||
| { | ||
| /* This may fail if we're hammered with SIGHUP. We don't care. */ | ||
| if (write(signalfds[1], "", 1)); |
Ok, this took me a while to figure out. This'll just write a 0x00 byte, since any C string is null-terminated, right?
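The handler above follows the self-pipe pattern: the signal handler only writes one byte to a pipe, and the main loop notices the byte later and does the real work (here, reopening the log). A rough Python sketch of the same idea, assuming a non-blocking pipe; the names `read_fd`, `write_fd` and `drain_wakeups` are hypothetical and not taken from the patch:

```python
import os
import signal

# Hypothetical stand-in for the daemon's signalfds pipe.
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
os.set_blocking(write_fd, False)


def handle_sighup(signum, frame):
    """Write a single 0x00 byte, mirroring write(signalfds[1], "", 1) in C."""
    try:
        os.write(write_fd, b"\0")
    except BlockingIOError:
        # The pipe is full because we are being hammered with SIGHUP.
        # A wake-up is already pending, so dropping this one is harmless.
        pass


def drain_wakeups():
    """Return how many wake-up bytes were pending; 0 means no SIGHUP seen."""
    try:
        return len(os.read(read_fd, 4096))
    except BlockingIOError:
        return 0


# Registration would look like this (must run in the main thread):
# signal.signal(signal.SIGHUP, handle_sighup)
```

In the main loop, a non-zero result from `drain_wakeups()` would be the cue to close and reopen the log file.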
| fd = open(logfile, O_WRONLY|O_CREAT, 0600); | ||
| } | ||
| /* We expect to be in config dir. */ | ||
| snprintf(logfile, sizeof(logfile), "crash.log.%u", getpid()); |
I'm wondering if pid is the best identifier here; we might want to use something like strftime(buffer, 15, "%Y%m%d%H%M%S", tm_info); (14 characters plus the terminating NUL) in order to easily identify the last crash or correlate a crash dump with external monitoring.
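To illustrate the suggestion: a `%Y%m%d%H%M%S` suffix is fixed-width, so sorting the file names as strings also sorts them chronologically. A small sketch, where the helper name `crash_log_name` and the use of UTC are assumptions made for the example:

```python
import time


def crash_log_name(now=None):
    """Hypothetical helper: build 'crash.log.<YYYYmmddHHMMSS>' from a struct_time."""
    if now is None:
        now = time.gmtime()  # assumption: UTC; the patch may use local time
    return "crash.log." + time.strftime("%Y%m%d%H%M%S", now)


# Two fixed instants one day apart, so the result is deterministic.
older = crash_log_name(time.gmtime(0))
newer = crash_log_name(time.gmtime(86400))

# Fixed-width timestamps compare correctly as plain strings.
latest = max([older, newer])
```

With a pid suffix, by contrast, finding the most recent crash means checking file modification times.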
| if (write(signalfds[1], "", 1)); | ||
| /* Writes a single 0x00 byte to the signalfds pipe. This may fail if | ||
| * we're hammered with SIGHUP. We don't care. */ | ||
| if (write(signalfds[1], "", 1)) |
clang will complain if this is not on a separate line to signal explicit opt-out of this case.
I took the liberty to add ACK 734b5f8
Actually Travis-CI keeps complaining about one of the two tests (it fails This happens both before and after my datetime crashlog commit, and I can't figure out why...
This might actually be
Backtrace does not play nicely with valgrind... We disable it in dev mode, but not in non-dev mode: I'll work around it in the test itself...
Closes: ElementsProject#1623 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Someone had a 21GB crash.log, which doesn't help anyone! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This should make it easier to identify the latest crash file and correlate crashes with external monitoring tools.
The remaining test failure seems to be an instance of the flaky test tracked in #1866, restarting to see if it unflakes :-)
ACK d5425bc
This is based on ~~#1846~~ #1863 because I needed the start flag to get_node(), but it's basically the last two commits.