
Connections bypass ACL security in multi-port #1606

Open
darkn3rd opened this issue Oct 9, 2022 · 10 comments
Labels
type/question — Question about product, ideally should be pointed to discuss.hashicorp.com
waiting-reply — Waiting on the issue creator for a response before taking further action

Comments

darkn3rd commented Oct 9, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

When using a multi-port service with ACLs and TLS verification, traffic to the upstream via localhost is protected, but traffic from outside to the service endpoint is not. I also tried from a container completely outside of the service mesh, and I was able to connect to the service endpoint. Essentially, the security mechanism is bypassed completely when using multi-port; only the explicit upstream access through localhost is secured. This defeats the purpose of using the service mesh in the first place.

If there is any solution or workaround to mitigate this until the behavior is fixed, that would be great.

Reproduction Steps

I followed the related guides and ran these steps:

  1. Deploy consul with these values:
    global:
      name: consul
      enabled: true
      datacenter: dc1
      gossipEncryption:
        autoGenerate: true
      tls:
        enabled: true
        enableAutoEncrypt: true
        verify: true
      acls:
        manageSystemACLs: true
    server:
      replicas: 1
      securityContext:
        runAsNonRoot: false
        runAsUser: 0
    connectInject:
      enabled: true
    controller:
      enabled: true
  2. Deploy server:
    # server.yaml
    ---
    apiVersion: consul.hashicorp.com/v1alpha1
    kind: ServiceDefaults
    metadata:
      name: static-server
    spec:
      protocol: 'http'
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      selector:
        app: web
      ports:
        - protocol: TCP
          port: 80
          targetPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: web-admin
    spec:
      selector:
        app: web
      ports:
        - protocol: TCP
          port: 80
          targetPort: 9090
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: web
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: web-admin
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          name: web
          labels:
            app: web
          annotations:
            'consul.hashicorp.com/connect-inject': 'true'
            'consul.hashicorp.com/transparent-proxy': 'false'
            'consul.hashicorp.com/connect-service': 'web,web-admin'
            'consul.hashicorp.com/connect-service-port': '8080,9090'
        spec:
          containers:
            - name: web
              image: hashicorp/http-echo:latest
              args:
                - -text="hello world"
                - -listen=:8080
              ports:
                - containerPort: 8080
                  name: http
            - name: web-admin
              image: hashicorp/http-echo:latest
              args:
                - -text="hello world from 9090"
                - -listen=:9090
              ports:
                - containerPort: 9090
                  name: http
          serviceAccountName: web
  3. Deploy client:
    # client.yaml
    ---
    apiVersion: consul.hashicorp.com/v1alpha1
    kind: ServiceDefaults
    metadata:
      name: static-client
    spec:
      protocol: 'http'
    ---
    apiVersion: v1
    kind: Service
    metadata:
      # This name will be the service name in Consul.
      name: static-client
    spec:
      selector:
        app: static-client
      ports:
        - port: 80
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: static-client
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: static-client
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: static-client
      template:
        metadata:
          name: static-client
          labels:
            app: static-client
          annotations:
            'consul.hashicorp.com/connect-inject': 'true'
            consul.hashicorp.com/connect-service-upstreams: "web:1234,web-admin:2234"
        spec:
          containers:
            - name: static-client
              image: curlimages/curl:latest
              # Just spin & wait forever, we'll use `kubectl exec` to demo
              command: ['/bin/sh', '-c', '--']
              args: ['while true; do sleep 30; done;']
          # If ACLs are enabled, the serviceAccountName must match the Consul service name.
          serviceAccountName: static-client
  4. Exec into client container:
    export NS=${NS:-"default"}
    POD=$(kubectl get pods --namespace $NS --selector app=static-client --output name)
    kubectl exec -ti --container "static-client" --namespace $NS ${POD} -- /bin/sh
  5. Connect through upstream (run from within client container):
    curl localhost:1234 # fails as expected
    # curl: (7) Failed to connect to localhost port 1234 after 0 ms: Connection refused
    curl localhost:2234 # fails as expected
    # curl: (7) Failed to connect to localhost port 2234 after 0 ms: Connection refused
  6. Connect through service endpoint:
    export NS=${NS:-"default"}
    curl web.$NS.svc.cluster.local # this SHOULD FAIL
    # "hello world"
    curl web-admin.$NS.svc.cluster.local # this SHOULD FAIL
    # "hello world from 9090"

Logs

n/a

Expected behavior

I expected that communicating through the service endpoints, e.g. web.default.svc.cluster.local, would be blocked. Instead, it works fine despite ACLs being enabled.

Environment details


  • Kubernetes version: v1.22
  • Cloud Provider: GKE
  • Networking CNI plugin in use: the GKE default

Additional Context

This works fine in single-port scenario.

@darkn3rd darkn3rd added the type/bug Something isn't working label Oct 9, 2022
@ishustava (Contributor)

Hey @darkn3rd

This is a known limitation of the multi-port workaround: transparent proxy is not supported (https://developer.hashicorp.com/consul/docs/k8s/connect#caveats-for-multi-port-pods). Transparent proxy is the feature that ensures all traffic goes through the proxy, so that the service mesh can't be bypassed.

One way to ensure this without transparent proxy is to have your services only bind to localhost instead of 0.0.0.0 or the pod IP.
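Applied to the reproduction above, that suggestion would mean changing the bind addresses of the two containers. A minimal sketch, assuming http-echo's -listen flag accepts a host:port pair:

```yaml
# Sketch only: bind each container to loopback so traffic must enter
# through the in-pod Envoy sidecar rather than the pod IP directly.
containers:
  - name: web
    image: hashicorp/http-echo:latest
    args:
      - -text="hello world"
      - -listen=127.0.0.1:8080   # was -listen=:8080 (all interfaces)
  - name: web-admin
    image: hashicorp/http-echo:latest
    args:
      - -text="hello world from 9090"
      - -listen=127.0.0.1:9090   # was -listen=:9090
```

Note that once the containers bind only to loopback, anything else that dials the pod IP on those ports (e.g. HTTP liveness/readiness probes) would also stop working and need adjusting.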

@ishustava ishustava added type/question Question about product, ideally should be pointed to discuss.hashicorp.com waiting-reply Waiting on the issue creator for a response before taking further action and removed type/bug Something isn't working labels Oct 11, 2022

darkn3rd commented Oct 11, 2022

How then can an ingress controller be integrated with Consul if the service only listens on localhost? Are there solutions where the ingress could communicate through the service mesh?

@ishustava (Contributor)

Yeah, we have some docs on that here: https://developer.hashicorp.com/consul/docs/k8s/connect/ingress-controllers. Those docs are for when you're using transparent proxy; with tproxy disabled you just wouldn't need that additional configuration. We haven't tested it with multi-port, though.

@darkn3rd (Author)

Conceptually, unless I am misunderstanding how this would work, I am not sure how it can work without transparent proxy, because the ingress automatically uses the internal service endpoint, which will not be secured. There would need to be some advanced hacks to route traffic to localhost instead of the normal service path, e.g. dgraph-alpha.dgraph.svc.cluster.local.

@ishustava (Contributor)

Yeah, you're right! Ingress controllers would not work without tproxy. Sorry, I was wrong in my response above.

In that case, a Consul API gateway would be a better choice for ingress instead of the ingress controllers because it can route to consul services.
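For orientation, Consul API Gateway is configured through the upstream Kubernetes Gateway API resources. A rough, untested sketch of what routing to the web service might look like (the apiVersion, gateway class name, and resource names here are assumptions, not taken from the Consul docs, and this has not been validated against multi-port):

```yaml
# Hypothetical sketch: a Gateway managed by Consul API Gateway plus an
# HTTPRoute forwarding to the "web" service from the reproduction above.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: api-gateway          # illustrative name
spec:
  gatewayClassName: consul-api-gateway   # assumed class name
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: web-route            # illustrative name
spec:
  parentRefs:
    - name: api-gateway
  rules:
    - backendRefs:
        - name: web
          port: 80
```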


darkn3rd commented Oct 12, 2022

@ishustava I looked at Consul API Gateway as a possibility, but I have to pass for the moment because (1) it does not support gRPC, and (2) the current documentation is in Terraform/Kustomize and requires installing EKS; it would take some time to distill the material and extract what I need.

It would be nice if someone tested multi-port with an ingress controller that can be configured to send traffic to localhost. The ingress-nginx controller may support this, but I have never had the need to go there. This certainly adds an extreme level of complexity.

The root cause of all these issues is that, out of the box, Consul service mesh should support the Kubernetes API's list of ports per service, as this is not an uncommon use case. ACLs are pretty much pointless without transparent proxy, since the service mesh can be bypassed completely. :'(

@mikemorris (Contributor)

> the current documentation is in Terraform/Kustomize and requires installing EKS

If you select the "Local" tab instead of "HashiCorp Cloud Platform (HCP)" in https://learn.hashicorp.com/tutorials/consul/kubernetes-api-gateway, the instructions are written for Kind but should be applicable to any generic Kubernetes environment - no Terraform or EKS needed, and the built-in kubectl apply --kustomize is only used for initially installing the CRDs.

Sorry to hear Consul API Gateway doesn't meet your needs at the moment, but hope you'll consider it in the future as we continue development!


darkn3rd commented Oct 14, 2022

Thank you, the Local tab is a lot easier to synthesize and derive a solution from; I will try it some time in the future when I get a chance.

Back to the original issue: it would be nice to have fully functional multi-port support, at parity with the Kubernetes Service API, which supports a list of ports where Consul service registration, if I understand correctly, supports only one port. There shouldn't have to be, for example, four Envoy sidecar containers for four ports, but rather a single Envoy sidecar supporting all four. The security guarantees provided by transparent proxy should extend to multi-port configurations: bypassing the service mesh shouldn't even be a possibility under strict mTLS, and ACLs, when enabled, shouldn't be bypassable. Shifting this left to the application service itself (e.g. an access or allow list), or requiring non-Consul workarounds (such as firewalls or network policy) to paper over the missing functionality, especially for security, shouldn't be an acceptable baseline.
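For completeness, the network-policy stopgap alluded to above could look roughly like this. A sketch only, not an endorsement: the pod labels come from the reproduction above, and the sidecar's inbound listener port (20000) is an assumption about the connect-inject defaults, not something verified for multi-port pods:

```yaml
# Illustrative stopgap: block direct pod-to-pod access to the app ports
# so only the in-pod Envoy listener is reachable from other pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-mesh-only        # illustrative name
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 20000        # assumed Envoy public listener; 8080/9090 stay blocked
```

This only helps if the cluster's CNI actually enforces NetworkPolicy; on GKE, network policy enforcement has to be enabled explicitly.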

On top of this, the current multi-port support, besides disabling a lot of the core functionality mentioned above and other things like metrics, adds layers of complexity to an already complex solution. In exchange for that complexity you get missing core functionality (such as observability), a security vulnerability, and more sidecar proxy containers, which increases the footprint of using the service mesh.

Many of the core Consul features essential for cloud-native solutions are baked into Kubernetes itself, so outside the service mesh the value proposition of Consul is low; a Consul service mesh missing metrics, security, and other features for multi-port makes the solution non-competitive, at least in the multi-port scope. This is a shame, given that many of the more advanced features available now and planned won't be realized if basic multi-port functionality isn't there.

So in conclusion, I would make this a feature request: all traffic should have strict mTLS enforced and ACLs applied, so that only permitted traffic reaches the application service. I would like to see this prioritized on the roadmap. In the interim, more documentation around ACLs with multi-port would be desirable, with these limitations noted early in the journey and reiterated in overview and Getting Started docs. Additionally, tested solutions and further documentation on such integrations, internal and external, would be nice; this would require collaboration with other OSS projects. There's not a lot of material in this area with multi-port; I don't know if others have gotten to a base level of success with multi-port to explore such areas yet.

I hope this is helpful. Thanks for the advice and solutions for the workarounds around the current limitation.

@ishustava (Contributor)

Thanks so much for this feedback @darkn3rd !!

This is definitely something we're looking to improve. I don't have any specifics yet, but we're looking into having better support for multi-port in the near future, so I hope we'll have something that works better soon!

@darkn3rd (Author)

I look forward to the roadmap and/or any docs on this, especially as this currently makes Consul a non-starter. I hope it can be addressed soon, rather than adding more advanced features while basic core features of the Kubernetes API (e.g. a list of ports per Service) are not supported.
