[enhancement]: Deal with empty ssh key files #5305
Comments
I am working on a patch. Will post once I get some positive test results.
Sometimes, due to file system corruption from a Linux kernel crash or for some other reason, the system may be left with zero-byte ssh key files. These zero-byte files are of no use and generate errors like the following upon next reboot:

sshd[1454]: Unable to load host key: /etc/ssh/ssh_host_rsa_key
sshd[1454]: Unable to load host key "/etc/ssh/ssh_host_ecdsa_key": invalid format

Therefore, we should check for the presence of zero-byte files and, if any are found, delete and regenerate the key files in order to recover from the error.

Fixes: canonicalGH-5305
Signed-off-by: Ani Sinha <anisinha@redhat.com>
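The check described in the commit message can be sketched roughly as follows. This is only an illustration and not the code from the actual patch: the function names `find_empty_host_keys` and `remove_empty_host_keys`, the glob pattern `ssh_host_*_key*`, and the default directory `/etc/ssh` are assumptions made for this example. Regeneration is left out; once the empty files are gone, a tool such as `ssh-keygen -A` (which creates missing default host keys) could be run afterwards.

```python
from pathlib import Path


def find_empty_host_keys(ssh_dir="/etc/ssh"):
    """Return zero-byte files matching ssh_host_*_key* under ssh_dir.

    The glob pattern is an assumption for this sketch; it matches both
    private keys (ssh_host_rsa_key) and public keys (ssh_host_rsa_key.pub).
    """
    return sorted(
        path
        for path in Path(ssh_dir).glob("ssh_host_*_key*")
        if path.is_file() and path.stat().st_size == 0
    )


def remove_empty_host_keys(ssh_dir="/etc/ssh"):
    """Delete zero-byte host key files and return the paths that were removed.

    Removing the empty files lets a later key-generation step treat them
    as missing and recreate them.
    """
    removed = find_empty_host_keys(ssh_dir)
    for path in removed:
        path.unlink()
    return removed
```

Non-empty key files are left untouched, so a healthy system is unaffected by the check.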
Thank you @ani-sinha for both filing this issue and pursuing an upstream fix for it. I think the consensus reached on the related PR was that this error condition, where the kernel hasn't sync'd multiple files to disk due to a reboot in early boot, is beyond the typical use cases that cloud-init needs to cope with, as there are likely many files corrupted beyond just the SSH host keys. Adding a single fix to SSH host key checking is not a complete solution and appears to add logic around a case that is not aligned with the majority of support cases cloud-init should be resolving. I'll close this issue as "NOT planned"; please speak up if there is a significant objection that I missed.
No objection to closing the ticket.
Enhancement
Start an instance running CentOS on an AWS t2.large system. After rebooting the system via sysrq 'b', it is not accessible via SSH.
Checking the disk, we found that all SSH host key files are empty after the reboot.
Version-Release number of selected component (if applicable):
cloud-init 22.1-8
How reproducible:
The CentOS image on AWS t2.large reproduces the problem 100% of the time.
Steps to Reproduce:
Reproduce with the automated test:
$ os-tests --user ec2-user --keyfile /home/virtqe_s1.pem --platform_profile /home/aws.yaml -p test_reboot_simultaneous
Actual results:
The system cannot be accessed via SSH after it boots up.
Expected results:
The system can be accessed normally.
Additional info:
After adding a 'sudo sync' command before running 'echo b > /proc/sysrq-trigger & echo b > /proc/sysrq-trigger' and then rerunning the case, all runs PASS.
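The workaround above amounts to flushing pending writes to disk before triggering the immediate reboot. A minimal sketch of that ordering, assuming a hypothetical helper named `sync_then_sysrq_reboot` (it is not part of os-tests or cloud-init), with the trigger path as a parameter so the function can be pointed at a scratch file instead of the real `/proc/sysrq-trigger`:

```python
import os


def sync_then_sysrq_reboot(trigger_path="/proc/sysrq-trigger"):
    """Flush dirty pages to disk, then write 'b' to the sysrq trigger.

    On a real system, writing 'b' to /proc/sysrq-trigger reboots
    immediately without syncing or unmounting, so the os.sync() call
    beforehand is what gives recently written files (such as freshly
    generated SSH host keys) a chance to reach the disk.
    """
    os.sync()
    with open(trigger_path, "w") as trigger:
        trigger.write("b")
```

Calling it with the default path requires root and reboots the machine at once, so it should only be tried on a disposable instance.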
Let's discuss if cloud-init can deal with this issue.