CI: LVH VM provisioning failed (kex_exchange_identification: read: Connection reset by peer) #26012
Comments
Hit another couple of times: https://github.com/cilium/cilium/actions/runs/5240409273/attempts/1 |
I have submitted a PR to LVH that should prevent the race condition with SSH server startup. cilium/little-vm-helper#75 |
Merging #26425 (updating the LVH GitHub Action to LVH 0.0.7) is related to cilium/little-vm-helper#77, which includes the actual fix, cilium/little-vm-helper#75. |
Update: it looks like there are cases where retrying 5 times isn't enough, so I'm going to re-open this.
Either there is an actual issue that prevents connecting via SSH, or the total wait time of 5s (5 retries * 1s wait) is too short. Example: https://github.com/cilium/cilium/actions/runs/5383703702/jobs/9770799136 |
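To make the retry budget discussed above concrete, here is a minimal sketch of a bounded retry loop with a configurable attempt count and delay. It is not the code used by the LVH action; `wait_until`, `probe`, and the parameter names are hypothetical and only illustrate why a fixed budget of 5 attempts at 1s each can be too short for a slow-booting VM.

```python
import time
from typing import Callable


def wait_until(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay_s` after each failure.

    Hypothetical helper: `probe` stands in for "can I SSH into the VM?".
    Returns True as soon as the probe succeeds, False once the budget is spent.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        # No need to sleep after the final failed attempt.
        if attempt < attempts:
            sleep(delay_s)
    return False
```

With the defaults, a VM that only becomes reachable on the seventh probe is reported as failed, while raising `attempts` (or `delay_s`) lets the same VM pass.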
Hit again in #26662 |
I'm working on a PR to lvh to allow for dmesg logs to be exported to a file, which we can then tail to get SSH status and collect more information on what is going on. |
This commit updates the GHA action to watch for sshd status within the VM's console log file. Success and failure messages for sshd are continually searched for in the console log file over the course of 30s. If a failure is observed, then the step fails. If success is observed, then the step moves on to attempting to ssh into the VM.
Continuing to check if ssh access is available is important, because at the end of the day, this is the functionality that is needed to continue. Parsing through the console log file is an added step to help gain insight into what is going on in the VM.
Regex match-alls are used in place of whitespace between keywords when searching, as console color characters may be printed between them. Additionally, '\.service' is not appended to the end of 'ssh' in these search strings, as sometimes this part of the string gets truncated in the log file.
Ref: cilium/cilium#26012
Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
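The log-scanning approach described in the commit message can be sketched roughly as follows. This is an illustration only: the actual search strings used by the action are not shown in this thread, so the `Started`/`Failed` patterns and the `sshd_status` helper are assumptions. The sketch does follow the two choices the commit message names: a match-all (`.*`) between keywords to tolerate color escape codes, and no `.service` suffix after `ssh`.

```python
import re
from typing import Optional

# Assumed patterns; the real strings used by the action are not given here.
# ".*" between keywords tolerates ANSI color codes printed by the console.
# "ssh" carries no ".service" suffix because that part may be truncated.
SUCCESS = re.compile(r"Started.*ssh", re.IGNORECASE)
FAILURE = re.compile(r"Failed.*ssh", re.IGNORECASE)


def sshd_status(console_log: str) -> Optional[bool]:
    """Scan console log text line by line.

    Returns False if a failure line is seen, True if a success line is seen,
    and None if sshd has not reported either state yet.
    """
    for line in console_log.splitlines():
        if FAILURE.search(line):
            return False
        if SUCCESS.search(line):
            return True
    return None
```

A caller would re-read the log file and call this repeatedly until it returns something other than `None` or the 30s window expires, and would still attempt a real SSH connection after a `True` result.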
This has been implemented with cilium/little-vm-helper#89 and integrated with #26819 (currently on Thanks @learnitall Please continue reporting detected failures related to this issue so we can analyze the log 🙏 |
Observed here again: https://github.com/cilium/cilium/actions/runs/5589813647/jobs/10218698894 dmesg output:
|
@gandro Thanks! The console.log output doesn't include any actual information besides the fact that the VM wasn't ready after 30s. I addressed some of its issues in the action (proper waiting, configurable wait duration, default wait time increased to 300s...) with cilium/little-vm-helper#96. The next integration into cilium contains more information: https://github.com/cilium/cilium/actions/runs/5619475924/job/15226671415
So far there is no better news than this; it might be related to the kernel version. |
Do we know on which kernels it has happened so far? |
Mostly 5.4, from what I've analyzed so far.
|
Another occurrence: https://github.com/cilium/cilium/actions/runs/5644600029/job/15288779889 |
@brb can you take a look at this? 🥺 |
Sorry, no cycles, as I am involved in other testing projects. |
This issue has been automatically marked as stale because it has not |
This issue has not seen any activity since it was marked stale. |
CI failure
Hit a few times, for instance: https://github.com/cilium/cilium/actions/runs/5200516663/jobs/9379487447
My guess is that this is due to a race condition, with the Set DNS resolver step attempting to connect through ssh while the ssh server in the VM has not yet fully started.
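If the guess above is right, one way to avoid the race is to check that the server actually sends its SSH identification string before running any step that depends on SSH. The sketch below is an assumption-laden illustration, not what the workflow does: `ssh_banner_ready` is a hypothetical helper, and the host, port, and timeout are placeholders. A server that accepts the TCP connection and then resets or closes it (as in `kex_exchange_identification: read: Connection reset by peer`) is treated as not ready.

```python
import socket


def ssh_banner_ready(host: str, port: int = 22, timeout_s: float = 2.0) -> bool:
    """Return True only if the peer sends an SSH identification string.

    Simplification: assumes the identification string ("SSH-...") is the
    first data the server sends after the TCP connection is established.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            banner = sock.recv(256)
    except OSError:
        # Covers refused connections, resets by peer, and timeouts.
        return False
    return banner.startswith(b"SSH-")
```

A step such as Set DNS resolver could poll this check until it returns `True` and only then open the real SSH session.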