logger.c: Logging can get stuck on remote consoles #13

Closed
InterLinked1 opened this issue Jul 4, 2023 · 0 comments

InterLinked1 commented Jul 4, 2023

Split off from #12, since this is a separate issue:

Logging can also get stuck here due to write blocking forever:

RWLIST_RDLOCK(&remote_log_fds);
RWLIST_TRAVERSE(&remote_log_fds, rfd, entry) {
	if (fd_logging[rfd->fd]) {
		write(rfd->fd, fullbuf, (size_t) bytes);
	}
}
RWLIST_UNLOCK(&remote_log_fds);

A fairly reliable way to reproduce this is to exit a remote sysop console using ^C while the consoles are being spammed with log messages. Perhaps the console file descriptors are going away while they are being written to, but that doesn't entirely make sense either.

This causes a deadlock, but only at the thread level, i.e. not all logging is broken. Because the stuck threads still hold a RDLOCK on the remote_log_fds list, no thread can obtain a WRLOCK on it, which blocks sysop console registration/unregistration. Other logging and other threads remain nominally unaffected.
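The blocking behaviour can be illustrated in isolation. The following is a minimal sketch, not code from this project: it uses a pipe that nobody reads from as a stand-in for a console whose reader has stopped draining data, and the helper names `set_nonblocking` and `fill_unread_pipe` are hypothetical. With the write end left in blocking mode, the loop would hang in write() once the pipe buffer fills, much like the logger thread above; with O_NONBLOCK set, write() fails with EAGAIN instead.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: put fd into nonblocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0) {
		return -1;
	}
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical demonstration: keep writing to a pipe that is never read.
 * Returns the errno of the first failed write, or -1 if setup failed.
 * Without set_nonblocking(), the write() call would block forever
 * once the pipe buffer is full.
 */
static int fill_unread_pipe(void)
{
	int pfd[2];
	char chunk[4096];
	int saved_errno;

	if (pipe(pfd) != 0) {
		return -1;
	}
	if (set_nonblocking(pfd[1]) != 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	memset(chunk, 'x', sizeof(chunk));
	for (;;) {
		ssize_t res = write(pfd[1], chunk, sizeof(chunk));
		if (res < 0) {
			saved_errno = errno; /* Buffer is full: the call fails instead of blocking */
			break;
		}
	}
	close(pfd[0]);
	close(pfd[1]);
	return saved_errno;
}
```

In the logger, a thread stuck in the blocking variant of this call never reaches RWLIST_UNLOCK, which is why the read lock stays held.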

@InterLinked1 InterLinked1 added the bug Something isn't working label Jul 4, 2023
@InterLinked1 InterLinked1 self-assigned this Jul 4, 2023
InterLinked1 added a commit that referenced this issue Jul 5, 2023
When a large amount of data is being logged,
libc_write will get stuck because the file
descriptors for remote console logging are
blocking. We now make them nonblocking
whenever we write to them, and this resolves
that issue.

Granted, we may lose some messages on remote
consoles by doing this, but if that's happening,
there's likely another bug triggering an
avalanche of log messages that needs investigating.

Other related changes that improve handling
of large amounts of I/O:

* bbs_write can fail to write all of the bytes.
  We retry this in a loop (as we should), but
  instead of a tight loop or a loop with an
  arbitrary delay, wait for the POLLOUT event
  so that each retry is attempted only when
  the file descriptor is writable.
* If a pseudoterminal write operation returns -1,
  shut down the PTY.

Fixes #13
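The POLLOUT-timed retry described in the commit message could look roughly like the sketch below. This is an illustration under stated assumptions, not the actual bbs_write implementation: the function name `write_all_pollout` and the `timeout_ms` parameter are hypothetical, and the descriptor is assumed to already be in nonblocking mode so that write() itself can never hang.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical sketch: write up to len bytes, waiting for POLLOUT before each attempt.
 * Returns the number of bytes written (possibly fewer than len if poll times out),
 * or -1 on error or hangup.
 */
static ssize_t write_all_pollout(int fd, const char *buf, size_t len, int timeout_ms)
{
	size_t written = 0;

	assert(buf != NULL);
	while (written < len) {
		struct pollfd pfd = { .fd = fd, .events = POLLOUT };
		int pres = poll(&pfd, 1, timeout_ms);
		if (pres < 0) {
			if (errno == EINTR) {
				continue; /* Interrupted by a signal: poll again */
			}
			return -1;
		}
		if (pres == 0) {
			break; /* Timed out: give up and report the partial count */
		}
		if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL)) {
			return -1; /* Peer went away or fd is invalid */
		}
		ssize_t res = write(fd, buf + written, len - written);
		if (res < 0) {
			if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) {
				continue; /* Not writable after all: wait for POLLOUT again */
			}
			return -1;
		}
		written += (size_t) res;
	}
	return (ssize_t) written;
}
```

A caller would treat a return of -1 as a reason to tear down the descriptor (as the commit does for PTYs) and a short count as dropped output, which matches the trade-off accepted above of possibly losing some messages on remote consoles.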