Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Support for reconnects in the Gateway port forwarder #736

Merged
merged 19 commits into from Jun 21, 2023

Conversation

laurentluce
Copy link
Contributor

@laurentluce laurentluce commented Jun 15, 2023

Description:

The Gateway port forwarding logic does not try to reconnect when there are network interruptions. The current workaround is to restart the Gateway. This PR adds reconnect logic around the Gateway port forwarder. The retries will stop when a service is removed via RemoveService or a Starlark script.

Is this change user facing?

YES

References (if applicable):

Closes #726

@laurentluce laurentluce changed the title feat: Support for reconnects in the Gateway port forwarder fix: Support for reconnects in the Gateway port forwarder Jun 15, 2023
Copy link
Contributor

@adschwartz adschwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laurentluce Looks good to me. Have you tested this change in a k8s cluster (or some other way) where you can test the impact of the change, and verify that the reconnection indeed works? Running in GKE showed many instances of pods being rescheduled so I imagine we can try it there if nothing else.

@laurentluce laurentluce marked this pull request as draft June 16, 2023 12:14
@laurentluce laurentluce removed the request for review from gbouv June 16, 2023 12:15
@laurentluce
Copy link
Contributor Author

laurentluce commented Jun 20, 2023

@laurentluce Looks good to me. Have you tested this change in a k8s cluster (or some other way) where you can test the impact of the change, and verify that the reconnection indeed works? Running in GKE showed many instances of pods being rescheduled so I imagine we can try it there if nothing else.

I ran some tests against minikube by modifying the pod restart policy, killing the container and issuing some requests to the locally mapped port. The gateway is able to reconnect when a new container is up. The gateway is also re-using the same local ports. I also tested with services removal and made sure the reconnect logic does not happen.

@laurentluce laurentluce marked this pull request as ready for review June 20, 2023 12:12
@laurentluce laurentluce merged commit 4944ccd into main Jun 21, 2023
27 checks passed
@laurentluce laurentluce deleted the laurent/gateway_port_forwarder_reconnect branch June 21, 2023 09:32
h4ck3rk3y pushed a commit that referenced this pull request Jun 21, 2023
🤖 I have created a release *beep* *boop*
---


##
[0.79.0](0.78.5...0.79.0)
(2023-06-21)


### ⚠ BREAKING CHANGES

* removed workdir from run_sh and fixed some typos on the doc
([#739](#739))

### Features

* allow to pop a shell on Kubernetes
([#748](#748))
([3c706e5](3c706e5))


### Bug Fixes

* removed workdir from run_sh and fixed some typos on the doc
([#739](#739))
([6406f10](6406f10))
* Support for reconnects in the Gateway port forwarder
([#736](#736))
([4944ccd](4944ccd)),
closes [#726](#726)
* typos ([#742](#742))
([800e523](800e523))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: kurtosisbot <kurtosisbot@users.noreply.github.com>
laurentluce added a commit that referenced this pull request Jun 23, 2023
## Description:
Most of the work for the K8S backend was done as part of #694. In
addition, the port forwarder reconnect feature reduced the amount of
work we had to finish. Here are the changes made in this PR:
- We keep the K8S service untouched when we stop the user service so it
can be restarted easily by creating a new pod. This change does not
impact the remove service expected behavior.
- We don't try to create a port forwarder connection if the service is
stopped.

A service stopped here is a k8s service with no pod running. I think it
is time to return the service status (registered, started, stopped) as
part of the GetServices response so we can simplify the second item
above. I will do that in a follow-up PR.

## Is this change user facing?
YES

## References (if applicable):
Closes #673 
#694 
#736
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The gateway port forwarder should handle network interruptions
2 participants