Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

peering: support peering establishment over mesh gateways, remove ability to pass external addresses to token generation endpoint #1610

Merged
merged 22 commits into from Oct 20, 2022

Conversation

ndhanushkodi
Copy link
Contributor

@ndhanushkodi ndhanushkodi commented Oct 11, 2022

Changes proposed in this PR:

In Consul 1.13, the way to establish a peering across 2 different networks was to use the expose-servers service to make the Consul servers reachable for peering establishment, and pass the externally reachable addresses of that service into the generate-token endpoint. This way, the peering token would contain these externally reachable server addresses rather than the Consul server pod ips.

Now, with support for peering establishment over mesh gateways, we can use the mesh gateways to reach the servers instead. So this PR removes support entirely for using the expose-servers service for peering. However, the service itself should remain since that will be used for admin partitions.

Now, the peering test uses mesh gateways for peering establishment by configuring the mesh config entry.

How I've tested this PR:

Ran the test locally, and verified via logs that peerings are being established both through the local and remote mesh gateways.

Ran the acceptance tests against all the clouds, passing build: https://app.circleci.com/pipelines/github/hashicorp/consul-k8s/8588/workflows/8fdab6f0-0fc6-48d4-abbc-5d86535ce55b

How I expect reviewers to test this PR:
👀

Checklist:

  • Tests added
  • CHANGELOG entry added

    HashiCorp engineers only, community PRs should not add a changelog entry.
    Entries should use present tense (e.g. Add support for...)

when running with these images:
go test -run TestPeering_Connect -p 1 -timeout 15m -kubeconfig=/Users/nitya/.kube/gke-peering-mesh-1 -secondary-kubeconfig=/Users/nitya/.kube/gke-peering-mesh-2 -enable-enterprise -enable-multi-cluster -consul-k8s-image=ndhanushkodi/consul-k8s-dev:pmgw1 -consul-image=ndhanushkodi/consul-dev:peermeshgw2 -failfast -no-cleanup-on-failure

on clusters peering-mesh-1 and peering-mesh-2
@@ -237,7 +237,7 @@ spec:
{{- if .Values.global.adminPartitions.enabled }}
-service-partition={{ .Values.global.adminPartitions.name }} \
{{- end }}
-log-level={{ default .Values.global.logLevel }} \
-log-level="trace" \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will remove

@ndhanushkodi ndhanushkodi changed the title changes to support mesh gw peering: support peering establishment over mesh gateways, remove ability to pass external addresses to token generation endpoint Oct 20, 2022
@ndhanushkodi ndhanushkodi marked this pull request as ready for review October 20, 2022 16:05
@ndhanushkodi ndhanushkodi requested review from a team, curtbushko and kschoche and removed request for a team October 20, 2022 16:06
@@ -96,7 +96,7 @@ func CheckStaticServerConnectionMultipleFailureMessages(t *testing.T, options *k
expectedOutput = expectedSuccessOutput
}

retrier := &retry.Timer{Timeout: 160 * time.Second, Wait: 2 * time.Second}
retrier := &retry.Timer{Timeout: 320 * time.Second, Wait: 2 * time.Second}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was because I noticed that by the time the peering is established and the exported service is actually replicated to the other side, this curl was timing out. It is taking a bit longer for the peering to get set up-- things that cause this are dialing a follower node and needing to retry the peering, a delta xds bug that causes the mesh gateways to take extra time to get their config.

@@ -53,10 +53,14 @@ func TestPeering_Connect(t *testing.T) {

for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
fmt.Println("starting test")
Copy link
Contributor

@kschoche kschoche Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we replace the Println's with Logger.Log or t.Log or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops thanks i forgot to remove those, I was just using those locally so i could see what was going on while running the test!

@@ -90,14 +95,13 @@ func TestPeering_Connect(t *testing.T) {
staticServerPeerHelmValues["server.exposeGossipAndRPCPorts"] = "true"
staticServerPeerHelmValues["meshGateway.service.type"] = "NodePort"
staticServerPeerHelmValues["meshGateway.service.nodePort"] = "30100"
staticServerPeerHelmValues["server.exposeService.type"] = "NodePort"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: why are we removing the grpc port configs for each of these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer need the exposeService at all for peering connections, so this removes any configuration of it.

@@ -237,7 +237,7 @@ spec:
{{- if .Values.global.adminPartitions.enabled }}
-service-partition={{ .Values.global.adminPartitions.name }} \
{{- end }}
-log-level={{ default .Values.global.logLevel }} \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this intentional?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitya mentioned above that she will remove it before merge. I have not idea why it didn't show up in the diffs for review. 🤷

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird!

Copy link
Contributor

@curtbushko curtbushko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fantastic! I just have a few comments but nothing breaking/blocking!

staticServerPeerClusterContext := env.DefaultContext(t)
staticClientPeerClusterContext := env.Context(t, environment.SecondaryContextName)

commonHelmValues := map[string]string{
"global.imageK8S": "ndhanushkodi/consul-k8s-dev:pmgw1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget these before merge!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

CHANGELOG.md Outdated
* Peering: Add support for `PeerThroughMeshGateways` in Mesh CRD. [[GH-1478](https://github.com/hashicorp/consul-k8s/pull/1478)]
* Peering:
* Add support for `PeerThroughMeshGateways` in Mesh CRD. [[GH-1478](https://github.com/hashicorp/consul-k8s/pull/1478)]
* move support for customizing the server addresses in peering token generation. Instead, mesh gateways should be used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure but should the token changes from the helm chart be listed as a breaking change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup added it to breaking instead!

Copy link
Contributor

@kschoche kschoche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I have not had a chance to test it yet but assuming we're 🍏 everywhere I don't have any blocking comments beyond seconding Curt's comment about probably marking this as a breaking change in the changelog!

@ndhanushkodi ndhanushkodi merged commit 5668d4c into main Oct 20, 2022
@ndhanushkodi ndhanushkodi deleted the nd/peering-mesh-gw branch October 20, 2022 23:56
t-eckert pushed a commit that referenced this pull request Oct 31, 2022
…lity to pass external addresses to token generation endpoint (#1610)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants