Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add validation that externalServers.hosts is not set to HCP-managed cluster's addresses when global.cloud.enabled #3315

Merged
merged 2 commits into from Dec 12, 2023

Conversation

zalimeni
Copy link
Member

@zalimeni zalimeni commented Dec 6, 2023

This reintroduces #3218 with an additional small change to ensure Helm template validation is applied with necessary short-circuits. I've added the fix as a separate commit to simplify review based on the original.

Original PR description below:


Overview

If we set externalServers.hosts to the public or private address of an HCP-manged cluster, we run into the following because we're hardcoding -tls-server-name to server.{{ .Values.global.datacenter}}.{{ .Values.global.domain}}:

2023-11-10T01:32:31.386Z [ERROR] consul-server-connection-manager: connection error: error="rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate is valid for 517429dd-fcab-4fe6-bc4a-57750f69df7e.aws.hashicorp.cloud, consul-cluster.consul.e4065767-bd61-4f00-8b2b-81f631f7d4d1.aws.hashicorp.cloud, consul-cluster.private.consul.e4065767-bd61-4f00-8b2b-81f631f7d4d1.aws.hashicorp.cloud, not server.consul-cluster.consul\”"

We hard-code -tls-server-name in many places like:

{{- if .Values.global.cloud.enabled }}
-tls-server-name=server.{{.Values.global.datacenter}}.{{.Values.global.domain}} \
{{- end}}

Changes proposed in this PR:

  • Add validation that users don't set global.cloud.enabled while also using externalServers.hosts that point at an HCP-managed cluster (any address that has the suffix ".hashicorp.cloud")
  • Add note to values.yaml describing how global.cloud is really for linking clusters, not for HCP-managed cloud clusters.

How I've tested this PR:

  • bats, new bats test, ran it locally

How I expect reviewers to test this PR:

Checklist:

@zalimeni zalimeni force-pushed the zalimeni/fix-acceptance-externalserver-hosts-template branch from aae9b78 to 8089391 Compare December 6, 2023 19:37
@zalimeni zalimeni added backport/1.1.x Backport to release/1.1.x branch backport/1.2.x 1.2.x release branch backport/1.3.x labels Dec 6, 2023
@zalimeni zalimeni changed the title fix: reorder template validation of externalHosts Add validation that externalServers.hosts is not set to HCP-managed cluster's addresses when global.cloud.enabled Dec 6, 2023
@zalimeni zalimeni force-pushed the zalimeni/fix-acceptance-externalserver-hosts-template branch from 8089391 to a1768e7 Compare December 6, 2023 19:46
Comment on lines +9 to +11
{{- if and .Values.externalServers.enabled .Values.global.cloud.enabled }}
{{- if and (gt (len .Values.externalServers.hosts) 0) (regexMatch ".+.hashicorp.cloud$" ( first .Values.externalServers.hosts )) }}{{fail "global.cloud.enabled cannot be used in combination with an HCP-managed cluster address in externalServers.hosts. global.cloud.enabled is for linked self-managed clusters."}}{{- end }}
{{- end }}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the check directly above this one, we should never attempt to evaluate the inner if unless .Values.externalServers.hosts has been set.

Joshua Timmons and others added 2 commits December 6, 2023 14:50
…luster's addresses when global.cloud.enabled (#3218)
This ensures that we validate prerequisite conditions before applying
the more specific validation for `global.cloud.enabled`.
@zalimeni zalimeni force-pushed the zalimeni/fix-acceptance-externalserver-hosts-template branch from a1768e7 to 2dd26bf Compare December 6, 2023 19:50
@zalimeni
Copy link
Member Author

zalimeni commented Dec 6, 2023

Updated changelog filename to match this PR

Copy link
Contributor

@jjti jjti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @zalimeni for this fix! I'm checking since I didn't last time: are these acceptance test failures an issue or possibly broken because of this change? https://github.com/hashicorp/consul-k8s-workflows/actions/runs/7119514288

@zalimeni zalimeni marked this pull request as ready for review December 12, 2023 18:59
@zalimeni
Copy link
Member Author

@jjti sure thing! Re: the acceptance test failure, I think you are ok. Given the vanilla acceptance suite passed and this change isn't specific to CNI, I expect you just hit a flake, which unfortunately isn't uncommon r/n w/ the acceptance tests (particularly CNI).

Consider me "approved" even though I opened the PR, and feel free to merge + backport when it makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.1.x Backport to release/1.1.x branch backport/1.2.x 1.2.x release branch backport/1.3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants