Add support for transparent connect proxies #10628
Hi @apollo13! Much of what transparent proxy provides is for filling gaps in k8s (because it's a bit more "kit of parts" than combining Nomad and Consul). There's a conflict here in the security model between the two: Consul transparent proxy assumes and requires 1 service per network namespace, whereas Nomad supports n services per network namespace. We already provide most of what transparent proxies support, with the major exception of preventing service mesh circumvention. So we're still brainstorming what exactly transparent proxy support would look like in Nomad. It's likely to be implemented as a set of features that together make up all the same things as the Consul tproxy, rather than a standalone "support Consul tproxy" feature. Some ideas we've considered, not all of which are compatible with each other:
In any case, just a heads up that this isn't on our very near-term roadmap. |
Ok, my main goal is to get rid of the manual declaration of all the connect services that I need. If there are ideas on how to do that without manually specifying all, then it doesn't have to be the transparent proxy. Thank you for the clarifications! |
@apollo13 if you're talking about no longer needing to define `upstreams`, then transparent proxy only helps in that regard in that upstreams are inherited from `intentions`, so that you only need to declare them once. (Unless intentions are disabled, in which case the default behavior is a free-for-all.) Making services connect-native (https://www.consul.io/docs/connect/native#connect-native-app-integration) is another way to eliminate the need for upstreams, but of course that only helps if you own the code. (And you'd still need to declare intentions.) |
Declaring them only once would imo already be a win. connect-native is certainly one way of doing things but often not easily possible for existing code (for example, changing a database connection in Django to be connect native is probably close to impossible). So in that sense the proxy would still be a massive improvement (at least for me). I usually do not care about inbound connections that much (they are solved differently). But it would be really great if I could limit outgoing connections of a group to whatever the intentions allow (and not more). |
We would like to see this implemented for Nomad as well. Mainly echoing @apollo13's comment from above, 100%. |
I'll be picking this work up after the new year. The rough plan of action is going to be:
|
That sounds great!
Assuming this is not ending up as an enterprise only feature, do you mind sharing the design doc publicly (once you have written it ;))? Maybe the community can spot some errors etc in there? |
This is planned for Nomad CE. We don't make a practice of sharing the design docs directly because they usually have a bunch of private background info in there like specific customer requests or internal business requirements. But I'd love to extract what I can and share it ahead of implementation. That'd be a great habit for us to get into! |
As mentioned above, I wanted to surface our internal design doc ("RFC") here. Note that this is an excerpt with internal discussions removed. Please note this is a design document that we use to build rough consensus within the team and with our sibling teams (like the Consul folks). It may not 100% match up with the final implementation as I progress through it. No promises here! So if you're reading this in the future after the feature has shipped, please instead refer to the documentation! 😀
Background
On Linux, both Kubernetes and Nomad typically run workloads in network namespaces. Both orchestrators use Container Network Interface (CNI) plugins to configure the network namespace by creating network bridges, iptables rules, etc. In Kubernetes, containers within a pod share the same network namespace. In Nomad, tasks within an allocation share the same network namespace. This arrangement is what allows Consul Connect to run an Envoy proxy in one task while the user's application runs in another task in the same allocation.
Consul Connect provides for secure communication between workloads in the service mesh. However, it does not by default enforce that workloads only communicate over the mesh. Transparent proxy mode directs all inbound and outbound traffic for a workload through the Envoy sidecar via iptables. This forces workloads to use only the service mesh, with the option to configure exceptions for specific CIDRs, ports, and non-mesh destinations (or to enforce mesh destinations only). The primary benefit of transparent proxy is the ability for applications to use Consul DNS URLs to access upstreams, e.g.
The diagram below shows a typical Connect allocation with two tasks. The group has a
Nomad clients run a series of "allocrunner hooks" for the allocation as a whole before running a task runner for each task within the allocation, and the task runner has its own set of per-task "taskrunner hooks".
Since Nomad 0.10, network configuration is set up in one of the allocrunner hooks. Nomad network blocks can have one of 4 modes:
Nomad clients have a hard-coded CNI configuration template that includes calls to the
When an allocation with bridge networking starts, the
On success, the resulting
Note that Connect is not involved in this existing networking workflow at any point. When the job is submitted, the server automatically adds an Envoy proxy sidecar task to each group that requires one for Connect. The taskrunner for that task invokes Consul's
Under Kubernetes, the Consul CNI plugin receives its configuration from the k8s API. The consul-k8s control plane annotates pods with the expected iptables configuration. When the CNI plugin is run, it makes requests to the k8s API to determine how to configure iptables rules for the pod, and to update the k8s control plane with the status of that work. The configuration is a JSON-encoded blob that deserializes into the Consul SDK’s
Proposal
Nomad will support Connect transparent proxy mode by invoking Consul CNI during network setup when requested by the user. This will require changes to both Nomad and the Consul CNI plugin.
Updates to Nomad Job Spec
The Nomad job’s
A minimum
    proxy {
      transparent_proxy {}
    }
Applications may need to expose additional ports for health checks or other external traffic. A
    proxy {
      transparent_proxy {
        uid                    = 101   # default, see iptables.Config
        outbound_port          = 15001 # default
        exclude_inbound_ports  = []    # default, can be set with a name
                                       # that matches a network.port
                                       # label or a port number.
        exclude_outbound_ports = []    # default
        exclude_outbound_cidrs = []    # default
        exclude_uids           = []    # default
      }
      expose {
        path {
          path            = "/metrics"
          protocol        = "http"
          local_path_port = 9001
          # Any expose.path.listener_port will be automatically
          # added to the exclude_inbound_ports set.
          listener_port   = "metrics"
        }
      }
      # Note that when using tproxy, upstreams blocks are no longer
      # required. But a user might want to have both while migrating
      # their services to use tproxy. Nomad does not automatically create
      # Consul intentions from the upstream blocks.
      upstreams {
        destination_name = "count-api"
        local_bind_port  = 8080
      }
    }
Any task group that includes a
Updates to Consul CNI
The Consul CNI plugin currently accepts
Nomad has no need for "pod annotations" from the CNI plugin, so the workflow where the CNI plugin updates its status is unnecessary. This allows for a much simpler implementation for Nomad and the Consul CNI plugin. The Nomad client will provide the JSON-encoded
Updates to Nomad bridge networking and CNI configuration
The
|
my rough implementation checklist:
|
Hi tgross, this all sounds very exciting. A few small questions/remarks:
Probably the only thing limiting theory from becoming reality here is that Nomad hardcodes and checks for
It would be great if it were possible to disable this behavior (or at least have it for normal allocations as well?). As it currently stands I am already pushing a DNS server that is aware of the
Can you expand a little bit on how the transparent proxy works (I guess I haven't understood it fully yet) and especially what the virtual addresses do? Am I correct that it works somewhat like this:
Assuming I am using virtual services where I guess each service gets its own IP: Can my application now access
What would
Sorry if the questions about tproxy itself are kinda out of scope, but I am trying hard to understand what is happening here :) |
Good call. We discussed that a bit but didn't have a reasonable use case in mind, so having your example is valuable.
Right. From a high-level view, it's adding some iptables rules to the existing Connect implementation. Those rules are applied internal to the network namespace, so that outbound traffic from the task flows through the Envoy proxy we've configured for Connect, instead of just wherever the task wants.
The virtual IP address in this case is just Envoy load balancing between the real IP addresses that Nomad is advertising for the service to Consul. The |
Understood, the missing link for me is if I request |
Okay, learned something. There is SO_ORIGINAL_DST: https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/listener_filters/original_dst_filter EDIT:// And even more interesting with the TPROXY target https://blog.cloudflare.com/how-we-built-spectrum |
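For anyone else following along, here is a minimal sketch of what a proxy does with `SO_ORIGINAL_DST` after iptables has redirected a connection to it. This is an illustration in Python and not Envoy's implementation; the function names are made up for this example, and the constant value 80 is the Linux definition from `linux/netfilter_ipv4.h` (Python's `socket` module does not export it).

```python
import socket
import struct
from typing import Tuple

# Linux-only socket option; the value comes from <linux/netfilter_ipv4.h>.
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw: bytes) -> Tuple[str, int]:
    """Decode a 16-byte IPv4 sockaddr_in into (ip, port).

    Layout: 2 bytes address family, 2 bytes port (network byte order),
    4 bytes IPv4 address, 8 bytes of padding.
    """
    port, packed_ip = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(packed_ip), port


def original_destination(conn: socket.socket) -> Tuple[str, int]:
    """Return the destination the client dialed before the REDIRECT rule.

    Only meaningful on Linux for an accepted TCP connection that was
    redirected by netfilter; hypothetical helper for illustration.
    """
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

A redirecting proxy would call `original_destination(conn)` on each accepted socket and use the result to decide where to forward the traffic, which is the piece of information that would otherwise be lost when the packet's destination is rewritten.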
When transparent proxy is enabled for a service, Consul allocates a virtual IP (VIP) for that service from the 240.0.0.0/4 address range. The outbound listener on the downstream service's Envoy proxy has a series of filter chains that match destination VIP addresses to the corresponding Envoy cluster for the target upstream service. For example,
    {
      "filter_chains": [
        {
          "filter_chain_match": {
            "prefix_ranges": [
              {
                # Consul assigned virtual IP for the service `fake-service`
                "address_prefix": "240.0.0.1",
                "prefix_len": 32
              }
            ]
          },
          "filters": [
            {
              "name": "envoy.filters.network.tcp_proxy",
              "typed_config": {
                "@type": "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy",
                "stat_prefix": "upstream.fake-service.default.default.dc2",
                "cluster": "fake-service.default.dc2.internal.fabb7415-0d8e-5230-5455-d31c35d0ddd9.consul"
              }
            }
          ]
        }
      ]
    }
The cluster contains a list of each endpoint / service instance for the logical service and their actual addresses in the cluster.
    {
      "dynamic_endpoint_configs": [
        {
          "endpoint_config": {
            "@type": "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment",
            "cluster_name": "fake-service.default.dc2.internal.fabb7415-0d8e-5230-5455-d31c35d0ddd9.consul",
            "endpoints": [
              {
                "locality": {},
                "lb_endpoints": [
                  {
                    "endpoint": {
                      "address": {
                        "socket_address": {
                          # IP address of the upstream service instance
                          "address": "10.42.1.78",
                          "port_value": 20000
                        }
                      },
                      "health_check_config": {}
                    },
    ...snip...
In this configuration, downstream applications need to use the
When
    {
      "filter_chains": [
        {
          "filter_chain_match": {
            "prefix_ranges": [
              {
                # upstream instance 1
                "address_prefix": "10.42.1.78",
                "prefix_len": 32
              },
              {
                # upstream instance 2
                "address_prefix": "10.42.1.79",
                "prefix_len": 32
              }
            ]
          },
          "filters": [
            {
              "name": "envoy.filters.network.tcp_proxy",
              "typed_config": {
                "@type": "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy",
                "stat_prefix": "upstream.fake-service.default.default.dc2",
                "cluster": "passthrough~fake-service.default.dc2.internal.fabb7415-0d8e-5230-5455-d31c35d0ddd9.consul"
              }
            }
          ]
    ...snip...
Instead of load balancing traffic across available upstream instances, Envoy is configured to send connections to a passthrough / original destination cluster that forwards the connection directly to the original destination IP and port. In this configuration, downstream applications need to use the |
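To make the matching step concrete, here is a rough Python sketch of how a destination address could be mapped to a cluster using prefix ranges like the ones above. It is a simplified model and not Envoy's code: the `FILTER_CHAINS` structure is flattened, the cluster names are shortened, both example chains are placed in one list purely for illustration, and `select_cluster` is a hypothetical helper. It picks the most specific matching prefix.

```python
import ipaddress

# Simplified stand-in for the listener's filter chains (illustrative only).
FILTER_CHAINS = [
    {
        # Chain matching the Consul-assigned virtual IP
        "prefix_ranges": [{"address_prefix": "240.0.0.1", "prefix_len": 32}],
        "cluster": "fake-service",
    },
    {
        # Chain matching the real instance addresses
        "prefix_ranges": [
            {"address_prefix": "10.42.1.78", "prefix_len": 32},
            {"address_prefix": "10.42.1.79", "prefix_len": 32},
        ],
        "cluster": "passthrough~fake-service",
    },
]


def select_cluster(dest_ip, chains=FILTER_CHAINS):
    """Return the cluster whose prefix range most specifically matches dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    best = None  # (prefix length, cluster name) of the best match so far
    for chain in chains:
        for rng in chain["prefix_ranges"]:
            net = ipaddress.ip_network(
                f'{rng["address_prefix"]}/{rng["prefix_len"]}', strict=False
            )
            if addr in net and (best is None or net.prefixlen > best[0]):
                best = (net.prefixlen, chain["cluster"])
    return best[1] if best else None
```

With this model, a connection to the VIP resolves to the load-balanced cluster, a connection to an instance address resolves to the passthrough cluster, and any other destination matches nothing.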
Thanks for the assist @blake! 😀 |
In order to provide a DNS address and port to Connect tasks configured for transparent proxy, we need to fingerprint the Consul DNS address and port. The client will pass this address/port to the iptables configuration provided to the `consul-cni` plugin. Ref: #10628
While working on #10628 I discovered that the `expose` block was missing an implementation of Diff, which means it doesn't show up correctly in `job plan` output. Also, fix field comparison in `ServiceCheck.Equal`. This is a bug in the method but it doesn't look like it impacts production code.
Add a transparent proxy block to the existing Connect sidecar service proxy block. This changeset is plumbing required to support transparent proxy configuration on the client. Ref: #10628
When `transparent_proxy` block is present and the network mode is `bridge`, use a different CNI configuration that includes the `consul-cni` plugin. Before invoking the CNI plugins, create a Consul SDK `iptables.Config` struct for the allocation. This includes: * Use all the `transparent_proxy` block fields * The reserved ports are added to the inbound exclusion list so the alloc is reachable from outside the mesh * The `expose` blocks and `check` blocks with `expose=true` are added to the inbound exclusion list so health checks work. The `iptables.Config` is then passed as a CNI argument to the `consul-cni` plugin. Ref: #10628
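The assembly of the inbound exclusion list described above can be sketched roughly as follows. This is a Python illustration and not Nomad's Go code: the function name, argument names, and dictionary keys are assumptions that mirror the job spec fields from the RFC, and the real Consul SDK `iptables.Config` field names may differ. The defaults (`uid = 101`, `outbound_port = 15001`) are taken from the RFC's example block.

```python
import json


def build_iptables_config(tproxy, port_labels, reserved_ports, expose_listener_ports):
    """Merge transparent_proxy settings with ports that must bypass the mesh.

    tproxy: dict of transparent_proxy block fields (hypothetical shape)
    port_labels: mapping of network.port labels to port numbers
    reserved_ports: ports reserved for the allocation
    expose_listener_ports: listener ports (labels or numbers) from expose paths
    """

    def resolve(port):
        # A port can be given as a number or as a network.port label.
        if isinstance(port, int):
            return port
        if port in port_labels:
            return port_labels[port]
        return int(port)

    inbound = {resolve(p) for p in tproxy.get("exclude_inbound_ports", [])}
    inbound.update(reserved_ports)
    inbound.update(resolve(p) for p in expose_listener_ports)

    return {
        "uid": tproxy.get("uid", 101),
        "outbound_port": tproxy.get("outbound_port", 15001),
        "exclude_inbound_ports": sorted(inbound),
        "exclude_outbound_ports": list(tproxy.get("exclude_outbound_ports", [])),
        "exclude_outbound_cidrs": list(tproxy.get("exclude_outbound_cidrs", [])),
        "exclude_uids": list(tproxy.get("exclude_uids", [])),
    }


# Serialize the result the way a JSON-encoded CNI argument would carry it.
example = build_iptables_config(
    {"exclude_inbound_ports": ["admin", 9100]},
    {"admin": 8081, "metrics": 9001},
    [22],
    ["metrics"],
)
encoded = json.dumps(example)
```

Using a set for the inbound ports keeps the list free of duplicates when the same port is both reserved and exposed.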
Nomad will implement support for Connect transparent proxy. Unlike in K8s, the CNI plugin can't contact the Nomad API to read allocation metadata (pod labels) to get the iptables configuration, and doesn't use the rest of the Consul-K8s control plane to inject that metadata. Instead, Nomad will pass the iptables configuration JSON-serialized in the CNI arguments. This changeset implements the behavior switch by detecting the `CONSUL_IPTABLES_CONFIG` argument in the CNI arguments. This hypothetically allows for non-Nomad workflows to use the same code path, if desired. Ref: hashicorp/nomad#10628
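The behavior switch described above might look something like the following Python sketch. It is not the plugin's actual Go code. It assumes the arguments arrive as a semicolon-separated `KEY=VALUE` string in the style of the CNI `CNI_ARGS` convention, and the function names and the `"k8s"`/`"args"` labels are invented for this example. Only the `CONSUL_IPTABLES_CONFIG` name comes from the changeset description.

```python
import json

IPTABLES_ARG = "CONSUL_IPTABLES_CONFIG"


def parse_cni_args(cni_args):
    """Split a 'K=V;K2=V2' string into a dict, splitting each pair on the first '='."""
    args = {}
    for pair in cni_args.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        args[key] = value
    return args


def config_source(cni_args):
    """Decide where the iptables configuration comes from.

    Returns ("args", config_dict) when the JSON config is passed in the CNI
    arguments, or ("k8s", None) when the plugin should fall back to the
    existing Kubernetes lookup.
    """
    args = parse_cni_args(cni_args)
    if IPTABLES_ARG in args:
        return "args", json.loads(args[IPTABLES_ARG])
    return "k8s", None
```

One limitation of this sketch is that a JSON value containing a semicolon would be split incorrectly, so a real implementation would need a more careful parser or encoding.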
Add support for Consul Connect transparent proxies Fixes: #10628
This has been merged to `main` and will ship in Nomad 1.8.0. I'm working on Tutorial updates now in the private repository for those. |
Nice work, and I thought 1.8 would be a stabilization release 😜 |
Awesome |
While working on a patch to allow compatibility with Nomad transparent proxy, I discovered we'd never removed a now-failing unit test of the plugin for the validation step. It looks like the remaining unit tests still cover the remaining validation, so we can safely remove this test. Ref: #1527 Ref: hashicorp/nomad#10628 * [NET-8412] Fix order of APIGW ACL policy/role creation (#3779) * Reorder gateway policy and role creation to avoid error messages in consul when policy/role already exists * refactor for readability * fix spacing * Added changelog * improve reliability of acceptance tests (#3800) * improve reliability of acceptance tests * remove update to timeout * add output to error * [net-8411] bug: fix premature token and service instance deletion due to pod fetch errors (#3758) * API gateway metrics (#3811) * First metrics pass * Fix up build * move to non-deprecated chart options * Fix up charts and defaults * Add changelog * Fix bad merge * Fix test * fix linter error * Fix extra yaml block from bad merge * Switch == true check to use ParseBool * Add support for Nomad transparent proxy (#3795) Nomad will implement support for Connect transparent proxy. Unlike in K8s, the CNI plugin can't contact the Nomad API to read allocation metadata (pod labels) to get the iptables configuration, and doesn't use the rest of the Consul-K8s control plane to inject that metadata. Instead, Nomad will pass the iptables configuration JSON-serialized in the CNI arguments. This changeset implements the behavior switch by detecting the `CONSUL_IPTABLES_CONFIG` argument in the CNI arguments. This hypothetically allows for non-Nomad workflows to use the same code path, if desired. Ref: hashicorp/nomad#10628 * fix version output for `consul-cni` (#3829) The `consul-cni` plugin emits "version unknown" because the CNI library's `PluginMain` uses a global variable that isn't being set as part of our build process. 
Import the `control-plane/version` package so that we have an identical version in builds across both binaries. * [NET-8601] Upgrade `vault/api` and `docker/docker` to resolve open CVEs (#3837) * security: upgrade vault/api to remove go-jose.v2 * security: upgrade docker/docker to v25.0.5 * add changelog * Remove anyuid SCC requirement for OpenShift (#3813) Remove SCC requirement for anyuid for OpenShift * Cleanup formatting to follow consul-k8s standard (#3852) * Datadog Unix Socket Path Custom Path fix (#3635) * Update dogstatsd hostPath rendering for Unix domain sockets -- override customizable and volumeMount/volume should align * changelog update * changelog: reviewer update to include datadog specific context * readd dev image tags for fips ubi (#3881) * readd dev image tags for fips ubi * fix up bad copy paste * [net-7710] don't overwrite prometheus path annotation if it's already been specified (#3846) don't overwrite prometheus path annotation if it's already been specified * feat: Add startup-grace-period-seconds and graceful-startup-path (#3878) * feat: Add startup-grace-period-seconds and graceful-startup-path * Add changelog --------- Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> * NET-8594: Disable TestSyncCatalog (#3815) * [NET-8946 NET-8947 NET-8948] security: bump go, x/net and envoy versions (#3893) security: bump go and x/net * NET-8594: Disable TestSyncCatalogIngress (#3904) * Helm: support sync-lb-services-endpoints for sync catalog (#3905) * Helm: support sync-lb-services-endpoints for sync catalog * add test * fix template tag order --------- Co-authored-by: jukie <10012479+Jukie@users.noreply.github.com> * Datadog Integration Acceptance Tests / Bug fixes (#3685) * datadog: acceptance tests - initial commit (not fully working yet) * server-statefulset: update logic for prometheus annotations (only enabled if using dogstatsd, otherwise disabled) * datadog: acceptance test working with dd-client api and operator deployment 
frameword * datadog-acceptance: main branch rebase merge conflict cherry-pick * datadog: acceptance testing update to metric name matching using regex * datadog: acceptance testing helper update for backoff retry * datadog: acceptance testing working timeseries query verification udp + uds * datadog: update helpers for /v1/query * server-statefulset.yaml: update to correct release name prepend to consul-server URL * datadog: acceptance testing consul integration checks working * server-statefulset: yaml and bats updates for datadog openmetrics and consul integration check URLs to use consul.fullname-server * PR3685: changelog update * datadog: openmetrics acceptance test update * datadog: added OTEL_EXPORTER_OTLP_ENDPOINT to consul telemetry collector deployment for dd-agent ingestion (passes tag info to DD) * otlp: datadog otlp acceptance test updates for telemetry-collector (grpc => http prefix) | staged otlp acceptance test * datadog-acceptance: fake-intake fixture addition * datadog-acceptance: update _helpers.tpl for consul version sanitization (truncate to <64) * datadog-acceptance: update base fixture for fake-intake * datadog-acceptance: add DogstatsD stats enablement (required for curling agent local endpoint) * datadog-acceptance: add DogstatsD stats enablement (required for curling agent local endpoint) * datadog-acceptance: first-round fake-intake testing - works but is innaccurate * datadog-acceptance: datadog framework - remove dd client agent requirement (fake-intake) * datadog-acceptance: update flags to not require API and APP key (fake-intake) * datadog-acceptance: go mod updates for uuid downgrade * acceptance-test: remove otlp acceptance test -- no fake-intake or agent endpoint to verify * datadog-acceptance: acceptance test lint fixes * acceptance-test: update control-plane/cni/main.go l:272 comment with period for lint testing. 
* acceptance-test: retry lint fixes * acceptance-test: correct telemetry collector URL from grpc:// to http:// * [NET-8412] Fix APIGW policy creation ordering for upgrade path (#3918) * fix policy creation for upgrading * Added changelog * Add post-release changelogs (#3867) Add changelogs * GH-3406 - Only error for config entries from different datacenters when the config entries are different (#3873) * GH-3406 - Only error for config entries from different datacenters when the config entries are different * add changelog * fixing tests and logic * refactoring code to make tests pass and also use a switch statement for readability and also get rid of intermediate state flag of requireMigration in a long iterative section of code. * add missing license file (#3921) * add missing license file * missed copying the license file to workdir * make up missing value and remove redundant directory creation * [COMPLIANCE] Add Copyright and License Headers (#3936) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> * Net 9069/xw add license file to all bin (#3942) * debug: missing LICENSE * use abs path * [NET-6466] Remove secrets from termgw role (#3928) * remove unnecessary permissions for terminating gateways * add changelog * Net 9069/fix local brokerage (#3948) * make copy of license file into control plane * remove redundant copy in gh workflow * use env instead of arg * [NET-8091] Use file-system-certificate in Consul instead of inline-certificate (#3767) * Use file-system-certificate in Consul instead of inline-certificate * Actually update correctly from merges * Adds changelog * Updates go.mod in acceptance tests with latest consul api, updates the acceptance gateway lifecycle test * Small updates * Update comment --------- Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com> * chore: remove workstream from JIRA sync (#3960) * NET-9154: Update Kubernetes version (#3958) Update Kubernetes version * chore: 
fix JIRA workflow (#3965) * [NET-9097, NET-8174] Upgrade controller-runtime (#3935) * Consume controller-runtime v0.16.3 This is the version required by gateway-api v1.0.0, which will be consumed in a future PR * Reconcile breaking changes in controller-runtime * Fix linter errors * gofmt * Update controller tests to handle new fake client requirements * Update test assertion to handle changes in controller-runtime * Restore incorrectly-removed flags * Use a proper delete on the fake client since DeletionTimestamp is immutable * Update enterprise tests to specify status subresources * Update controller-runtime dependency for acceptance tests * Explicitly inject decoder into webhooks * Appease the linter * Use SetupWithManager pattern from controllers for webhook setup * Consume consistent version of k8s.io/client-go everywhere * Upgrade related dependencies for CLI, including helm/v3 * Consume latest release of helm/v3 * changelog * Inline function calls for testing * Consume controller-runtime v0.16.5 --------- Co-authored-by: Ronald Ekambi <ronekambi@gmail.com> * Fix a panic in connect-inject when the provided upstreams list is malformed (#3956) * Check if an upstream is malformed, if so ignore it. 
* support multiple upstreams separator (<space>, <comma>) add tests * add /n as a separator * add changelog * added log when upstream is skipped * [NET-9152] CRD for service registeration (#3943) * service is registering * add all the fields * health checks working * handle finalizers to clean up * Add status to registration CRD * Added initial unit test for reconcile * success paths for registration and deregistration * added failure tests, moved finalizer removal logic so it occurs after service is successfully deregistered * first test for to catalog registration type * maximal registration to catalog test * test all the things * deregistration tests * update some comments and fields, re-run generators * Added changelog * linting all the things * fixing test setup for new controller runtime * Handle errors for parsing duration * Add ReadOnlyRootFilesystem to Security Context (#2909) * Add readOnlyRootFilesystem to security context (#2771) * readOnlyRootFilesystem * Add mount for /tmp * Add /tmp mountpoint * Update ingress-gateways-deployment.yaml * Update terminating-gateways-deployment.yaml * Update helm unit tests * Create 2781.txt * rename changelog file * rename changelog file * Mount /tmp to volume for snapshots * rename changelog * changelog --------- Co-authored-by: mr-miles <miles.waller@gmail.com> Co-authored-by: Paul Glass <pglass@hashicorp.com> Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com> * activate tproxy mode even when a cluster IP is not assigned to pod (#3974) * activate tproxy mode even when a cluster IP is not assigned to pod. 
* add changelog * fix failing tests * security: Upgrade Go to 1.21.10 (#3980) * NET-9178-Consul-api-gateway-not-starting-after-restart (#3978) * don't error if role already exists on restart * changelog * lint * [NET-9153] Handle Terminating Gateway ACL Setup (#3975) * first pass at creating write policy for service and updating term gw acl role * handle deregistering, update tests for registering with acls * existing deregister tests passing * failures with term gw role not existing * clean up * reorg code * Move to own package * watch for terminating gateways * move files back, handle multiple terminating gateways * handle errors and ensure finalizer is set * Add tests for finalizers * remove unused file * fix import naming * linting * fix comment, extract constant * [NET-9201] Validating webhook for registrations (#3990) * Add validating webhook for registrations * cleaned up registration webhook setup * fix setup for webhook, updated docs * fix typo, remove debugging log, rename variables for readability * Updating GitHub action versions to the latest TSCCR approved version (#3979) * test: fix PeeringGateway acceptance (#3992) * Adds ability to set the imagePullPolicy for all Consul images (consul… (#3991) * Adds ability to set the imagePullPolicy for all Consul images (consul, consul-dataplane, consul-k8s, consul-telemetry-collector) * [NET-9155] Cache resources for Registrations (#3993) * Add set for adding and removing services * remove service add * first pass at populating cache * cache is working, need to fix how statuses are handled * move to new directory, fix up the status conditions (still todos on this), handle results * updated tests * unexport methods that don't need to be exported * handle consul deregistrations * clean up before code review * show ACLUpdate as false if consul deregistered service * fix issue with updating acl status on consul deregistration * fix linting errors * FLAKEY_TEST: Add retry to outbound request for 
ProxyLifecycleShutdownTest * increase retry count for TestAPIGateway_GatewayClassConfig test * backport of commit b7ecab4 * backport of commit 2fcccd2 --------- Signed-off-by: Ashwin Venkatesh <ashwin.what@gmail.com> Co-authored-by: John Maguire <john.maguire@hashicorp.com> Co-authored-by: Michael Wilkerson <62034708+wilkermichael@users.noreply.github.com> Co-authored-by: sarahalsmiller <100602640+sarahalsmiller@users.noreply.github.com> Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Anita Akaeze <anita.akaeze@hashicorp.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: Luke Kysow <1034429+lkysow@users.noreply.github.com> Co-authored-by: Ashwin Venkatesh <ashwin.what@gmail.com> Co-authored-by: Melisa Griffin <missylbytes@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: skpratt <sarah.pratt@hashicorp.com> Co-authored-by: David Yu <dyu@hashicorp.com> Co-authored-by: Semir Patel <semir.patel@hashicorp.com> Co-authored-by: natemollica-dev <57850649+natemollica-nm@users.noreply.github.com> Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> Co-authored-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com> Co-authored-by: Curt Bushko <cbushko@gmail.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: NicoletaPopoviciu <87660255+NicoletaPopoviciu@users.noreply.github.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> Co-authored-by: Isaac Wilson <10012479+jukie@users.noreply.github.com> Co-authored-by: John Murret <john.murret@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com> Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com> Co-authored-by: Alvin Huang <17609145+alvin-huang@users.noreply.github.com> Co-authored-by: Andrea 
Scarpino <andrea@scarpino.dev> Co-authored-by: Deniz Onur Duzgun <59659739+dduzgun-security@users.noreply.github.com> Co-authored-by: wangxinyi7 <xinyi.wang@hashicorp.com> Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com> Co-authored-by: Ronald Ekambi <ronekambi@gmail.com> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> Co-authored-by: mr-miles <miles.waller@gmail.com> Co-authored-by: Paul Glass <pglass@hashicorp.com> Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com>
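The "Add support for Nomad transparent proxy (#3795)" entry in the commit message above says that Nomad passes the iptables configuration JSON-serialized in the CNI arguments, and that the plugin switches behavior when it detects the `CONSUL_IPTABLES_CONFIG` argument. A minimal sketch of that detection step might look like the following. It is not the actual `consul-cni` code: the `iptablesConfig` struct, its two fields, and the helper names are hypothetical, and it assumes the CNI arguments arrive as a `KEY=VALUE;KEY=VALUE` string whose JSON value contains no semicolons.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// iptablesArgKey is the argument name mentioned in the commit message.
const iptablesArgKey = "CONSUL_IPTABLES_CONFIG"

// iptablesConfig is a hypothetical, trimmed-down stand-in for the real
// configuration type; the field set here is an assumption for illustration.
type iptablesConfig struct {
	ProxyUserID      string `json:"ProxyUserID"`
	ProxyInboundPort int    `json:"ProxyInboundPort"`
}

// parseCNIArgs splits a "K1=V1;K2=V2" string into a map.
// Pairs without an "=" are skipped. Only the first "=" separates the key
// from the value, so values may themselves contain "=".
func parseCNIArgs(raw string) map[string]string {
	out := make(map[string]string)
	for _, pair := range strings.Split(raw, ";") {
		key, value, ok := strings.Cut(pair, "=")
		if !ok {
			continue
		}
		out[key] = value
	}
	return out
}

// configFromArgs reports whether the iptables configuration was passed in
// the CNI arguments and, if so, decodes it. When the argument is absent the
// caller would fall back to its other code path (for example, reading pod
// metadata in Kubernetes).
func configFromArgs(raw string) (*iptablesConfig, bool, error) {
	value, found := parseCNIArgs(raw)[iptablesArgKey]
	if !found {
		return nil, false, nil
	}
	var cfg iptablesConfig
	if err := json.Unmarshal([]byte(value), &cfg); err != nil {
		return nil, true, fmt.Errorf("decoding %s: %w", iptablesArgKey, err)
	}
	return &cfg, true, nil
}

func main() {
	args := `IgnoreUnknown=true;CONSUL_IPTABLES_CONFIG={"ProxyUserID":"101","ProxyInboundPort":20000}`
	cfg, found, err := configFromArgs(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if !found {
		fmt.Println("no iptables config in CNI args; using the default path")
		return
	}
	fmt.Printf("config from CNI args: %+v\n", *cfg)
}
```

Keying the switch on the presence of a single argument is what would let non-Nomad callers reuse the same path, as the commit message suggests.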
Hello, are there any plans for detailed examples on how to use the transparent proxy in a Nomad job? |
@koder29406 the Service Mesh tutorial https://developer.hashicorp.com/nomad/tutorials/integrate-consul/consul-service-mesh has been updated to use transparent proxy now. (Small warning that there are a few minor issues in the setup in that tutorial, which I've got a PR up to fix... that should land later today.) |
Hi @tgross, I use Nomad. I'm also trying to validate my job, which requires consul-cni greater than |
Hi @rahadiangg! By documentation, you're talking about this section: https://developer.hashicorp.com/nomad/tutorials/integrate-consul/consul-service-mesh#verify-nomad-client-consul-configuration right? That's definitely just a typo, which I'll fix. But is there another place in the docs I'm missing? |
Hi, I know that https://www.consul.io/docs/connect/transparent-proxy is still beta, but it would be sooooo great if Nomad could support that in the next releases :)