
Ingress disabled for non-mesh traffic once integrated #3039

Closed as not planned

Description

Describe the bug

After deploying nginx-service-mesh (NSM) with the integrated NGINX+ ingress controller, a VirtualServer configured for a service that is not in the mesh returns 502 Bad Gateway. This is a problem because I want to keep some workloads OUT OF THE MESH so they cannot access protected services.

NSM is configured with mTLS in strict mode so that traffic from outside the service mesh is dropped, because the cluster runs both services that are part of the mesh and services that are not.
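
To confirm which workloads are actually outside the mesh, one option is to look at the container names of each pod. The sketch below is only an illustration: `has_mesh_sidecar` is a hypothetical helper, and the default sidecar container name `nginx-mesh-sidecar` is an assumption that should be checked against the installed NSM version.

```shell
# has_mesh_sidecar: hypothetical helper, not part of NSM or kubectl.
# $1: space-separated container names of a pod
# $2: sidecar container name (ASSUMPTION: defaults to "nginx-mesh-sidecar")
# Returns 0 if the sidecar is present, 1 otherwise.
has_mesh_sidecar() {
  local containers="$1"
  local sidecar="${2:-nginx-mesh-sidecar}"
  local name
  for name in $containers; do
    if [ "$name" = "$sidecar" ]; then
      return 0
    fi
  done
  return 1
}

# Example usage against the Ratel pods from this report (requires cluster access):
#   names="$(kubectl get pods -n ratel -l app=dgraph,component=ratel \
#     -o jsonpath='{.items[*].spec.containers[*].name}')"
#   if has_mesh_sidecar "$names"; then echo "meshed"; else echo "not meshed"; fi
```

With `autoInjection.disable: true` as in the NSM values used here, the Ratel pods would be expected to report "not meshed".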

To Reproduce
Steps to reproduce the behavior:

I used helmfile to encapsulate and configure Helm charts.

  1. Install the observability (o11y) stack
    URLS=(https://docs.nginx.com/nginx-service-mesh/examples/{prometheus,grafana,otel-collector,jaeger}.yaml)
    for URL in ${URLS[*]}; do curl -sOL $URL; done
    for FILE in {prometheus,grafana,otel-collector,jaeger}.yaml; do kubectl apply -f $FILE; done
  2. Install NSM
    cat << EOF > nsm.yaml
    repositories:
      # https://artifacthub.io/packages/helm/nginx/nginx-service-mesh
      - name: nginx-stable
        url: https://helm.nginx.com/stable
    
    releases:
      - name: nsm
        namespace: nginx-mesh
        chart: nginx-stable/nginx-service-mesh
        values:
          - prometheusAddress: prometheus.nsm-monitoring.svc:9090
            telemetry:
              exporters:
                otlp:
                  host: otel-collector.nsm-monitoring.svc
                  port: 4317
              samplerRatio: 1
            tracing: null
            mtls:
              mode: strict
            autoInjection:
              disable: true
    EOF
    helmfile -f nsm.yaml apply
  3. Install NGINX+ IC
    # assumes the nginx-plus images are in a locally accessible GCR
    cat << EOF > nginx_ic.yaml
    repositories:
      # https://artifacthub.io/packages/helm/nginx/nginx-ingress
      - name: nginx-stable
        url: https://helm.nginx.com/stable
    
    releases:
      # NOTE: tutorial online uses 'nginx-ingress' for namespace
      - name: nginx-ingress
        namespace: kube-addons
        chart: nginx-stable/nginx-ingress
        version: 0.14.0
        values:
          - controller:
              nginxplus: true
              image:
                repository: gcr.io/{{ requiredEnv "GCR_PROJECT_ID" }}/nginx-plus-ingress
                tag: 2.3.0
              # NGINX Configmap
              config:
                entries:
                  ssl-redirect: "True"
                  http2: "True"
              ingressClass: nginx
              # NGINX IC CRDs
              enableCustomResources: true
              enableCertManager: true
              enableExternalDNS: true
              # Prometheus must be installed
              enableLatencyMetrics: true
            nginxServiceMesh:
              enable: true
              enableEgress: true
    EOF
    helmfile -f nginx_ic.yaml apply
  4. Install External DNS and Cert-Manager
    NOTE: For real DNS and the ACME DNS01 challenge to work, these services need read/write access to a DNS provider (Route 53, Cloud DNS, Azure DNS, etc.). The snippet below targets GKE with GCR and Cloud DNS.
    export DNS_PROJECT_ID="<your-cloud-dns-zone-project>"
    export DNS_SA_EMAIL="<your-gsa-with-access-to-cloud-dns-zone>"
    export DNS_DOMAIN="example.com" # replace me
    
    cat << EOF > kube_addons.yaml
    repositories:
      # https://artifacthub.io/packages/helm/cert-manager/cert-manager
      - name: jetstack
        url: https://charts.jetstack.io
      # https://artifacthub.io/packages/helm/bitnami/external-dns
      - name: bitnami
        url: https://charts.bitnami.com/bitnami
    
    releases:
      - name: external-dns
        namespace: kube-addons
        chart: bitnami/external-dns
        version: 6.8.1
        values:
          - provider: google
            google:
              zoneVisibility: public
              project: {{ env "DNS_PROJECT_ID" }}
            sources:
              - crd
              - service
              - ingress
            # use with NGINX VirtualServer CRD
            crd:
              create: false
              apiversion: externaldns.nginx.org/v1
              kind: DNSEndpoint
            serviceAccount:
              annotations:
                # Google Workload Identity annotation
                iam.gke.io/gcp-service-account: {{ requiredEnv "DNS_SA_EMAIL" }}
            nodeSelector:
              # deploy on nodes that support Workload Identity
              iam.gke.io/gke-metadata-server-enabled: "true"
            logLevel: {{ env "EXTERNALDNS_LOG_LEVEL" | default "debug" }}
            domainFilters:
              - {{ requiredEnv "DNS_DOMAIN" }}
            txtOwnerId: external-dns
            rbac:
              create: true
              apiVersion: v1
            policy: upsert-only
    
      - name: cert-manager
        namespace: kube-addons
        chart: jetstack/cert-manager
        version: 1.9.1
        values:
          - installCRDs: true
            extraArgs:
              - --cluster-resource-namespace=kube-addons
            global:
              logLevel: 2
            serviceAccount:
              annotations:
                # Google Workload Identity annotation
                iam.gke.io/gcp-service-account: {{ requiredEnv "DNS_SA_EMAIL" }}
            nodeSelector:
              # deploy on nodes that support Workload Identity
              iam.gke.io/gke-metadata-server-enabled: "true"
    EOF
    
    cat << EOF > issuers.yaml
    repositories:
      # https://artifacthub.io/packages/helm/itscontained/raw
      - name: itscontained
        url: https://charts.itscontained.io
    
    releases:
      - name: cert-manager-issuers
        chart: itscontained/raw
        namespace: kube-addons
        version:  0.2.5
        disableValidation: true
        values:
          - resources:
              - apiVersion: cert-manager.io/v1
                kind: ClusterIssuer
                metadata:
                  name: letsencrypt-staging
                spec:
                  acme:
                    server: https://acme-staging-v02.api.letsencrypt.org/directory
                    email: {{ requiredEnv "ACME_ISSUER_EMAIL" }}
                    privateKeySecretRef:
                      name: letsencrypt-staging
                    solvers:
                      - dns01:
                          cloudDNS:
                            project: {{ env "DNS_PROJECT_ID" }}
    
              - apiVersion: cert-manager.io/v1
                kind: ClusterIssuer
                metadata:
                  name: letsencrypt-prod
                spec:
                  acme:
                    server: https://acme-v02.api.letsencrypt.org/directory
                    email: {{ requiredEnv "ACME_ISSUER_EMAIL" }}
                    privateKeySecretRef:
                      name: letsencrypt-prod
                    solvers:
                      - dns01:
                          cloudDNS:
                            project: {{ env "DNS_PROJECT_ID" }}
    EOF
    
    helmfile -f kube_addons.yaml apply
    helmfile -f issuers.yaml apply
  5. Install Ratel outside of mesh
    cat << EOF > ratel.yaml
    repositories:
      # https://artifacthub.io/packages/helm/itscontained/raw
      - name: itscontained
        url: https://charts.itscontained.io
    
    releases:
      - name: ratel
        chart: itscontained/raw
        namespace: ratel
        version:  0.2.5
        disableValidation: true
        values:
          - resources:
              - apiVersion: apps/v1
                kind: Deployment
                metadata:
                  name: dgraph-ratel
                spec:
                  selector:
                    matchLabels:
                      app: dgraph
                      component: ratel
                  replicas: 1
                  template:
                    metadata:
                      labels:
                        app: dgraph
                        component: ratel
                    spec:
                      containers:
                        - name: dgraph-ratel
                          image: docker.io/dgraph/ratel:v21.03.2
                          imagePullPolicy:
                          command:
                            - dgraph-ratel
                          ports:
                            - name: http-ratel
                              containerPort: 8000
    
              - apiVersion: v1
                kind: Service
                metadata:
                  name: dgraph-ratel
                  labels:
                    app: dgraph
                    component: ratel
                spec:
                  type: ClusterIP
                  ports:
                    - port: 80
                      targetPort: 8000
                      name: http-ratel
                  selector:
                    app: dgraph
                    component: ratel
    EOF
    
    cat << EOF > ratel_vs.yaml
    repositories:
      # https://artifacthub.io/packages/helm/itscontained/raw
      - name: itscontained
        url: https://charts.itscontained.io
    
    releases:
      - name: ratel-virtualserver
        chart: itscontained/raw
        namespace: ratel
        version:  0.2.5
        disableValidation: true
        values:
          - resources:
              - apiVersion: k8s.nginx.org/v1
                kind: VirtualServer
                metadata:
                  name: dgraph-http
                spec:
                  host: ratel.{{ requiredEnv "DNS_DOMAIN" }}
                  tls:
                    secret: tls-secret
                    cert-manager:
                      cluster-issuer: {{ requiredEnv "ACME_ISSUER_NAME" }}
                  externalDNS:
                    enable: true
                  upstreams:
                    - name: ratel
                      service: dgraph-ratel
                      port: 80
                  routes:
                    - path: /
                      action:
                        pass: ratel
    EOF
    
    helmfile -f ratel.yaml apply
    helmfile -f ratel_vs.yaml apply
  6. Access the website, for example:
    curl https://ratel.$DNS_DOMAIN
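
The helmfiles above read several environment variables through `env` and `requiredEnv`, and only some of them are exported in the steps (for example, `ACME_ISSUER_EMAIL` and `ACME_ISSUER_NAME` are not). A minimal preflight sketch, assuming a POSIX shell with `printenv` available; `check_required_env` and `REQUIRED_VARS` are hypothetical names, not part of helmfile:

```shell
# Variable names collected from the env/requiredEnv calls in the helmfiles above.
REQUIRED_VARS="GCR_PROJECT_ID DNS_PROJECT_ID DNS_SA_EMAIL DNS_DOMAIN ACME_ISSUER_EMAIL ACME_ISSUER_NAME"

# check_required_env: hypothetical helper.
# Prints every named variable that is unset, unexported, or empty to stderr.
# Returns 0 if all are set, 1 otherwise.
check_required_env() {
  local missing=0
  local name
  for name in "$@"; do
    # printenv only sees exported variables, which is what helmfile will see too.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage before any of the apply steps:
#   check_required_env $REQUIRED_VARS && helmfile -f nsm.yaml apply
```

Running the check first reports all missing variables at once instead of failing on the first template that needs one.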

Expected behavior

I expected that the gateway (NGINX+ IC) would route traffic to back-end services that are not meshed in addition to services that are meshed. This is important because Ratel is only a client application, and should it ever be compromised, it should NOT be able to reach the private database cluster or any other services on the mesh.

Actual behavior
In the logs below, I globally replaced my registered domain with example.com.

2022/09/15 03:33:42 [error] 47#47: *378 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:42 +0000] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15" "-"
2022/09/15 03:33:42 [error] 47#47: *380 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET /apple-touch-icon-precomposed.png HTTP/2.0", upstream: "https://10.104.0.40:8000/apple-touch-icon-precomposed.png", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:42 +0000] "GET /apple-touch-icon-precomposed.png HTTP/2.0" 502 157 "-" "Safari/15608.4.9.1.3 CFNetwork/1121.1.2 Darwin/19.2.0 (x86_64)" "-"
2022/09/15 03:33:42 [error] 47#47: *380 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET /apple-touch-icon.png HTTP/2.0", upstream: "https://10.104.0.40:8000/apple-touch-icon.png", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:42 +0000] "GET /apple-touch-icon.png HTTP/2.0" 502 157 "-" "Safari/15608.4.9.1.3 CFNetwork/1121.1.2 Darwin/19.2.0 (x86_64)" "-"
2022/09/15 03:33:45 [error] 47#47: *383 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:45 +0000] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15" "-"
2022/09/15 03:33:47 [error] 47#47: *378 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:47 +0000] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15" "-"
2022/09/15 03:33:47 [error] 47#47: *383 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:47 +0000] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15" "-"
2022/09/15 03:33:51 [error] 47#47: *383 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:51 +0000] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15" "-"
2022/09/15 03:33:52 [error] 47#47: *383 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
135.180.100.148 - - [15/Sep/2022:03:33:52 +0000] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15" "-"
2022/09/15 03:33:55 [error] 47#47: *383 SSL_do_handshake() failed (SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number) while SSL handshaking to upstream, client: 135.180.100.148, server: ratel.example.com, request: "GET / HTTP/2.0", upstream: "https://10.104.0.40:8000/", host: "ratel.example.com"
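
Every error line above shows the ingress controller opening a TLS connection (`https://`) to the pod on port 8000, which the Ratel Deployment exposes as a plain HTTP port (`http-ratel`). An OpenSSL "wrong version number" error is commonly seen when a TLS client receives a non-TLS reply, so this is consistent with the controller applying mesh mTLS to a non-meshed upstream, though that is an inference from the log rather than a confirmed diagnosis. The sketch below summarizes the upstream targets found in such a log; `upstream_targets` is a hypothetical helper that assumes `grep -o` and `sed -E` are available.

```shell
# upstream_targets: hypothetical helper.
# Reads NGINX error-log lines on stdin and prints "<count> <scheme>://<host:port>"
# for each distinct upstream target found in an upstream: "..." field.
upstream_targets() {
  grep -o 'upstream: "[^"]*"' \
    | sed -e 's/^upstream: "//' -e 's/"$//' \
    | sed -E 's#^([a-z]+://[^/]+).*#\1#' \
    | sort | uniq -c | sed -E 's/^ +//'
}

# Example usage (the deployment name is a placeholder, not taken from this report):
#   kubectl logs -n kube-addons deploy/<ingress-controller-deployment> | upstream_targets
```

For the log excerpt in this report, all matches collapse to a single `https://10.104.0.40:8000` target.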

Your environment

  • Version of the Ingress Controller - nginx/1.21.6 (nginx-plus-r27)
  • Version of Kubernetes: 1.22.11
  • Kubernetes platform: GKE
  • Using NGINX Plus

Additional context

I can provide scripts to provision Cloud DNS, GKE, GCR, and configure access with Google Service Accounts and Workload Identity using gcloud and gsutil if needed.

I also deployed a backend distributed graph database, Dgraph, but since that was supposed to be in the mesh and works fine, I didn't include it here. Ratel is only a client application, so it shouldn't have access to the strict service mesh.

Labels: proposal (an issue that proposes a feature request)
