ci(deploy): retry gcloud ssh connections if it fails
#5292
Previous behavior

From time to time, SSH connections to deployed VMs fail with the following error:
`kex_exchange_identification: Connection closed by remote host`

Expected behavior

If the connection fails, attempt to reconnect at least once more (or multiple times).

Solution

Add the `ConnectionAttempts` and `ConnectTimeout` SSH options with values of 20 and 5 respectively, so that SSH makes up to 20 connection attempts (the first try plus 19 retries), each with a 5-second timeout.
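As a rough illustration of the Solution above, the two options could be passed through `gcloud compute ssh` with its `--ssh-flag` argument. This is only a sketch: the instance name, zone, and remote command are hypothetical placeholders, and the script prints the command instead of running it so it can be tried without GCP credentials.

```shell
#!/bin/sh
set -eu

# Values taken from the PR description.
CONNECTION_ATTEMPTS=20   # total tries: the first attempt plus 19 retries
CONNECT_TIMEOUT=5        # seconds to wait for each attempt

# Print the gcloud flags that hand the retry options through to ssh.
ssh_retry_flags() {
  printf '%s %s' \
    "--ssh-flag=-oConnectionAttempts=${CONNECTION_ATTEMPTS}" \
    "--ssh-flag=-oConnectTimeout=${CONNECT_TIMEOUT}"
}

# Dry run: print the command rather than executing it.
# "example-instance" and "us-central1-a" are hypothetical placeholders.
# The flags contain no whitespace, so the unquoted expansion splits safely.
echo gcloud compute ssh example-instance --zone us-central1-a \
  $(ssh_retry_flags) --command "'echo connected'"
```

Dropping the leading `echo` on the last command would run it for real.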
Thanks, this seems like it might work. I guess we have to merge to find out.
Next time we need to change these parameters, let's put the settings in one place, using an SSH config file or workflow environment variables?
That way we don't have to modify the workflow in 10 places for the same change.
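One way the suggestion above might look is a single step that writes the options to an SSH config file, with the values read from environment variables that could be set once at the workflow level. This is a sketch rather than what the workflow actually does: the variable names `SSH_CONFIG_FILE`, `SSH_CONNECTION_ATTEMPTS`, and `SSH_CONNECT_TIMEOUT` are made up for illustration, and it assumes the `ssh` process that ends up running reads this file (for example `~/.ssh/config`, or a path given with `ssh -F`).

```shell
#!/bin/sh
set -eu

# Hypothetical path: a temp file here; in practice ~/.ssh/config or a file passed with `ssh -F`.
SSH_CONFIG_FILE="${SSH_CONFIG_FILE:-$(mktemp)}"

# Hypothetical variable names. The defaults match the PR values and can be
# overridden from one place, such as a workflow-level `env:` block.
: "${SSH_CONNECTION_ATTEMPTS:=20}"
: "${SSH_CONNECT_TIMEOUT:=5}"

# Write the options once so every later ssh call picks them up.
cat > "$SSH_CONFIG_FILE" <<EOF
Host *
    ConnectionAttempts ${SSH_CONNECTION_ATTEMPTS}
    ConnectTimeout ${SSH_CONNECT_TIMEOUT}
EOF

cat "$SSH_CONFIG_FILE"
```

With this in place, changing a value means editing one variable instead of every SSH step.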
Yeah, I think we should even see how to accomplish a DRY(er) approach on these workflows.
Another thing we can do is move workflow and Docker scripts into
* refactor(ssh): connect using `ssh-compute` action by Google

  Previous behavior:
  From time to time, SSH connections to deployed VMs fail with the following error:
  `kex_exchange_identification: Connection closed by remote host`

  This was still happening after implementing #5292

  Expected behavior:
  Ensure we're not creating SSH key pairs on the fly, to improve our connection guarantees

  Solution:
  - Enable the Cloud Identity-Aware Proxy API in GCP
  - Create a firewall rule to enable connections from IAP
  - Grant the required IAM permissions to enable IAP TCP forwarding
  - Generate an SSH key pair and set the private key as an input param
  - Authorize the GitHub Action SA to make SSH connections to the VMs
  - Implement the `google-github-actions/ssh-compute` action to connect

* fix(ssh): id `compute-ssh` cannot be used more than once within the same scope
* fix(ci): try to enclose commands to override parsing issues
* tmp: remove ssh_args
* fix(action): secrets must be inherited to be used
* tmp: validate command enclosing fixes execution
* fix(ssh): ssh_args are not implemented correctly
* fix(ssh): login with the root user
* fix(privilege): use sudo with docker commands
* tmp: add sudo
* fix(ssh): use sudo for all docker commands
* fix(ssh): add missing `sudo` commands
* fix(ssh): get sync height from ssh stdout
* fix(height): get the height correctly
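The first three Solution steps in the commit message above (enable the IAP API, open the firewall to IAP, grant the tunnel role) might be done with `gcloud` roughly as follows. This is a sketch under assumptions, not the commands the authors ran: the project ID, service account email, and firewall rule name are hypothetical, and `35.235.240.0/20` is the source range Google documents for IAP TCP forwarding. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical identifiers, for illustration only.
PROJECT_ID="example-project"
SA_EMAIL="github-actions@example-project.iam.gserviceaccount.com"
FIREWALL_RULE="allow-ssh-ingress-from-iap"
# Source range Google documents for IAP TCP forwarding.
IAP_SOURCE_RANGE="35.235.240.0/20"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Enable the Cloud Identity-Aware Proxy API.
run gcloud services enable iap.googleapis.com --project "$PROJECT_ID"

# 2. Allow SSH (tcp:22) ingress from the IAP range.
run gcloud compute firewall-rules create "$FIREWALL_RULE" \
  --project "$PROJECT_ID" \
  --direction=INGRESS --action=allow --rules=tcp:22 \
  --source-ranges="$IAP_SOURCE_RANGE"

# 3. Let the service account open IAP TCP tunnels.
run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/iap.tunnelResourceAccessor"
```

The remaining steps (key pair generation and the `google-github-actions/ssh-compute` step itself) depend on workflow and secret configuration that the commit message does not spell out, so they are left out here.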
Review
Anyone can review this
Reviewer Checklist
Follow Up Work
If this issue keeps occurring even after the retries, we might want to apply some SSH best practices for Google Cloud (which we might want to apply anyway), as described here: https://cloud.google.com/compute/docs/instances/connecting-advanced