Can't log in via SSH to Gentoo container on Gentoo host. Can't restart sshd or reboot the OS from the container. #3569
Comments
syslog-ng/syslog-ng#2595 looks like this issue.
steps to reproduce:
Indeed it does. In my case I'm also unable to log in on, for example, tty1, so it's particularly difficult to troubleshoot.
We are also contemplating a kernel bug. Systems on the 5.8.14 kernel, which we started deploying a while back, seem to behave better. We did find one kernel backtrace on one host where we were lucky enough to have a terminal open at the time the host went into this state; as far as I recall, it pointed at IO issues. Unfortunately I can't find the dmesg output we gathered at the time, but there were IO requests that were stuck.
We make heavy use of LVM and snapshots (both thin and traditional/exception based), and the above kernel version included a number of device-mapper deadlock fixes which we suspect may be related. Scanning through the later kernel changelogs I also see "tty: serial:" fixes and some ext4 changes, any of which could be the actual underlying cause; it's always hard to say. Since we noticed in that one case that other IO was also blocked, not just syslog-ng, it hints at a larger underlying problem, but we could also be looking at multiple issues. I find these problems really hard to troubleshoot.
You are, however, able to simply reboot the container, which seems to hint that it is not a kernel issue, especially since it works correctly if you disable tty output (assuming I understand you correctly?). What further bugs me about the tty output theory: yes, ttys are extremely slow while they are being viewed, but when they are not, I get the impression they're really fast. For example, if you're watching tty12 you can sometimes see lag; switch away and back and the backlog is "instantly" cleared. We were actually able to measure the difference in time for a kernel compile on a tty when it was the active tty versus when it was not.
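For anyone trying to confirm the stuck-IO theory mentioned above, one thing that may be worth checking is whether the hung processes (sshd, syslog-ng, rc-service) are in uninterruptible sleep, shown as state "D" by ps. The following is only a minimal sketch, not something taken from this thread: the helper name filter_d_state is made up for illustration, and it assumes a procps-style ps that accepts -eo pid=,stat=,comm=.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): print "PID COMMAND"
# for every process whose state starts with "D" (uninterruptible sleep,
# which usually means the process is blocked on I/O).
# It reads "PID STAT COMM" lines on stdin so it can be tried on canned input.
# Only the first word of the command name is printed.
filter_d_state() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# On a live system (on the host, or in the container via lxc-attach):
#   ps -eo pid=,stat=,comm= | filter_d_state
```

If processes stay in that list for a long time, the dmesg output from the host at that moment is probably the most useful thing to attach here.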
Required information
Distribution: Gentoo
Distribution version: Latest
The output of
lxc-start --version
lxc-checkconfig
uname -a
Linux mainserver 5.4.72-gentoo-x86_64 #1 SMP Sun Oct 25 14:41:42 MSK 2020 x86_64 Intel(R) Xeon(R) CPU E5640 @ 2.67GHz GenuineIntel GNU/Linux
cat /proc/self/cgroup
cat /proc/1/mounts
Issue description
A few days after a reboot, I can no longer log in via SSH to a privileged container; lxc-attach still works correctly.
Steps to reproduce
I set up a new Gentoo container one year ago. One or two months ago, I found I could no longer log in to my container via SSH. I connected to the container via lxc-attach and tried to restart sshd with rc-service sshd restart, but the command froze. I then tried to reboot the container with the reboot command from a new bash started via lxc-attach, but the same thing happened.
Then I tried to stop the container from the host with lxc-stop; after a very long time the container stopped.
Then I started the container again, and everything worked correctly for a few days.
This situation repeats again and again.
I updated all packages in the container; nothing helped.
I updated all packages on the host, including the latest LXC, but nothing helped.
I updated my kernel to the latest green (stable) version from gentoo-sources, but sshd in the container still does not work as expected.
With ssh -vvv I can see that sshd is running, responds to packets, and authenticates me, but it never completes the login, and sshd cannot be restarted.
There is nothing in the sshd logs. :(
What's wrong?
Information to attach
dmesg
lxc-start -n <c> -l TRACE -o <logfile>