Failing to integrate with GCP CAS #87

Open
xunholy opened this issue Aug 23, 2021 · 8 comments

@xunholy

xunholy commented Aug 23, 2021

Description

We're using GCP CAS as our Cluster Issuer. It has been set up and works end to end when we use regular Certificate resources to request Ingress TLS certs. We'd now like to use the same issuer with istio-csr to issue certs to the services running in the mesh, so that they get mTLS from the same CA. That would also allow east <--> west traffic to be mTLS across our multi-cluster, multi-mesh topology.

I'm deploying the default httpbin service through the istio examples and I can see the following logs:

2021-08-23T05:58:52.349778Z	warn	ca	ca request failed, starting attempt 1 in 103.08759ms
2021-08-23T05:58:52.453167Z	warn	ca	ca request failed, starting attempt 2 in 205.253929ms
2021-08-23T05:58:52.658658Z	warn	ca	ca request failed, starting attempt 3 in 391.517219ms
2021-08-23T05:58:53.050399Z	warn	ca	ca request failed, starting attempt 4 in 720.225847ms
2021-08-23T05:58:53.770959Z	error	googleca	Failed to create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
2021-08-23T05:58:53.771020Z	warn	sds	failed to warm certificate: failed to generate workload certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
2021-08-23T05:58:55.449855Z	warn	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
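
For anyone reading similar output, here is a minimal Python sketch that pulls the retry delays and the last quoted error description out of log lines like the ones above. It is not part of Istio or istio-csr; the function name `summarize_ca_log` and both regexes are assumptions based only on the log format quoted in this issue.

```python
import re

# Assumed patterns, derived only from the log lines quoted in this issue.
RETRY_RE = re.compile(r"ca request failed, starting attempt (\d+) in ([\d.]+)(ms|s)\b")
DESC_RE = re.compile(r'desc = "([^"]+)"')


def summarize_ca_log(lines):
    """Return (retry delays in milliseconds, last quoted error description or None)."""
    delays_ms = []
    last_error = None
    for line in lines:
        retry = RETRY_RE.search(line)
        if retry:
            value = float(retry.group(2))
            # Normalise seconds to milliseconds so the delays are comparable.
            delays_ms.append(value if retry.group(3) == "ms" else value * 1000.0)
            continue
        desc = DESC_RE.search(line)
        if desc:
            last_error = desc.group(1)
    return delays_ms, last_error
```

On the lines above, the delays come out as roughly doubling each attempt, and the final description is the x509 "certificate signed by unknown authority" message. That points at a trust problem during the TLS handshake rather than at the network path.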

I can see my CA being loaded in istiod correctly as I'd expect. (DUMMY CA)

istiod-asm-196-1-75b87cc7f-bgzx4 discovery 2021-08-23T06:13:08.757739Z	info	Istiod certificates are reloaded
istiod-asm-196-1-75b87cc7f-bgzx4 discovery 2021-08-23T06:13:08.757902Z	info	x509 cert [0] - Issuer: "CN=my-root,OU=my-ou,O=my-ca", Subject: "", SN: <>, NotBefore: "2021-08-23T06:12:52Z", NotAfter: "2021-08-23T07:12:51Z"

This led me to believe that something on the istio-csr side may be causing the issue, but I've set verbosity to 5 and still don't have anything further to add for context. Any further insights would be greatly appreciated.

I've also drafted a diagram of the high-level integration of the services and how they work together at present. Maybe it will highlight something that is missing or incorrect.

[Image: Screen Shot 2021-08-23 at 4 48 19 pm]

@JoshVanL
Collaborator

JoshVanL commented Aug 23, 2021

Thanks for the detailed issue @xunholy!

I've drawn over your diagram a bit which hopefully should help things (apologies for my use of paint 😂).

[Image: istio-csr-diagram-labelled]

The flow for an Istio workload is: 1. mount the CA certificates from the config-map that istio-csr manages, 2. the workload requests its certificate from istio-csr, 3. the workload can talk to istiod for config etc.

I think what is happening in your case is one of:

  1. The istio workload is not pointing to istio-csr
  2. istio-csr itself has a wrong cert
  3. The CA mounted into the workload is wrong.

Would you be able to share the relevant bits of the istio config you used for setting up your mesh?

I would also always recommend setting the CA bundle to come from a static file on istio-csr (--root-ca-file). This will then populate the configmaps with exactly that file, rather than relying on istio-csr to use the ca field on its serving CertificateRequest, which is trust on first use (TOFU).

@xunholy
Author

xunholy commented Aug 23, 2021

Thanks @JoshVanL, that makes a lot more sense. I did observe that the istio-proxy was trying to connect to cert-manager-istio-csr when I had a deny-by-default network policy in place; that is now fixed.

My error from the istio-proxy is as follows:

2021-08-23T23:56:50.427579Z	warn	sds	failed to warm certificate: failed to generate workload certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"

This suggests to me that the istio-proxy sidecar isn't loaded with the CA used by the cert-manager-istio-csr workload. What would cause that? My istioOperator mounts are all correct, and I can see istiod running with the correct CA. What else could I be missing?

It sounds like your option 3 is the issue, but how can I correct that? I've followed the instructions for configuration. Is there something I've missed that needs to be configured specifically for the httpbin/istio-proxy pod?

@JoshVanL
Collaborator

@xunholy that does seem odd to me. Could you have a look at the root CA that is passed to the istio proxy to make sure it is the root CA for your Google CAS?

kubectl exec -n foo httpbin-74fb669cc6-jf67w -c istio-proxy -- cat /var/run/secrets/istio/root-cert.pem
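
Comparing two PEM blobs by eye is error-prone, so here is a minimal Python sketch of one way to do that check. It assumes you have saved the output of the command above and the root certificate of your CA pool as strings; `pem_fingerprint` and `same_root` are hypothetical helper names, not part of any Istio or cert-manager tooling. It only compares the first certificate in each blob byte for byte. It does not validate a chain.

```python
import hashlib
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def pem_fingerprint(pem_text):
    """SHA-256 hex digest of the DER bytes of the first certificate in pem_text."""
    # Slice out the first certificate so leading indentation or extra
    # certificates after it do not affect the result.
    start = pem_text.index(PEM_HEADER)
    end = pem_text.index(PEM_FOOTER, start) + len(PEM_FOOTER)
    der = ssl.PEM_cert_to_DER_cert(pem_text[start:end])
    return hashlib.sha256(der).hexdigest()


def same_root(mounted_pem, expected_pem):
    """True when both PEM blobs begin with the same certificate."""
    return pem_fingerprint(mounted_pem) == pem_fingerprint(expected_pem)
```

If `same_root` returns False for the mounted file and your CAS root, that would be consistent with option 3 above (the CA mounted into the workload is wrong).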

@xunholy
Author

xunholy commented Aug 24, 2021

@JoshVanL I do think this is the error: the token isn't being mounted.

root@httpbin-85ddb98ffd-4kshd:/# ls -la /var/run/secrets/
total 12
drwxr-xr-x 3 root root 4096 Aug 23 06:00 .
drwxr-xr-x 1 root root 4096 Aug 23 06:00 ..
drwxr-xr-x 3 root root 4096 Aug 23 06:00 kubernetes.io

I don't fully understand how the sidecar Envoy gets this root cert in this model. I would have assumed cert-manager-istio-csr would supply it. Or is that still supposed to happen through istiod?

@JoshVanL
Collaborator

@xunholy Indeed, it is istio-csr that populates this configmap in all namespaces.

This should be on the istio proxy container rather than the main httpbin one:

$ kubectl exec -n sandbox httpbin-74fb669cc6-6tm4x -c istio-proxy -- ls /var/run/secrets/istio
root-cert.pem

It should come from a mount which gets patched in on the pod by the istio mutating webhook:

  - configMap:
      defaultMode: 420
      name: istio-ca-root-cert
    name: istiod-ca-cert
....
    volumeMounts:
    - mountPath: /var/run/secrets/istio
      name: istiod-ca-cert
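
To check whether a given pod actually received that volume and mount, something like the following Python sketch could be run against the pod spec. It assumes `pod` is the dict you get from parsing `kubectl get pod <name> -o json`; the function name `find_root_cert_mount` is hypothetical, and the default configmap and container names are simply the ones shown in this thread.

```python
def find_root_cert_mount(pod, configmap="istio-ca-root-cert", container="istio-proxy"):
    """Return the mountPath where `configmap` is mounted in `container`, or None."""
    spec = pod.get("spec", {})
    # Names of all pod volumes that are backed by the given ConfigMap.
    volume_names = {
        volume["name"]
        for volume in spec.get("volumes", [])
        if volume.get("configMap", {}).get("name") == configmap
    }
    for c in spec.get("containers", []):
        if c.get("name") != container:
            continue
        for mount in c.get("volumeMounts", []):
            if mount.get("name") in volume_names:
                return mount.get("mountPath")
    return None
```

A result of None means the sidecar has no mount for the root cert configmap, which would suggest the injection template is not adding it.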

You can also check via the configmap itself too:

$ kc get cm istio-ca-root-cert -o yaml
apiVersion: v1
data:
  root-cert.pem: |
    -----BEGIN CERTIFICATE-----
    MIIDTDCCAjSgAwIBAgIQAPCeBptQRpatRpMVkHrhKzANBgkqhkiG9w0BAQsFADBA
...

@JoshVanL
Collaborator

@xunholy if it helps, these are the manifests which I used when setting up with google-cas if you want to compare notes 🙂

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
spec:
  profile: "demo"
  hub: gcr.io/istio-release
  meshConfig:
    trustDomain: foo.bar
  values:
    global:
      caAddress: cert-manager-istio-csr.cert-manager.svc:443
  components:
    pilot:
      k8s:
        env:
          # Disable istiod CA Server functionality
        - name: ENABLE_CA_SERVER
          value: "false"
        overlays:
        - apiVersion: apps/v1
          kind: Deployment
          name: istiod
          patches:

            # Mount istiod serving and webhook certificate from Secret mount
          - path: spec.template.spec.containers.[name:discovery].args[-1]
            value: "--tlsCertFile=/etc/cert-manager/tls/tls.crt"
          - path: spec.template.spec.containers.[name:discovery].args[-1]
            value: "--tlsKeyFile=/etc/cert-manager/tls/tls.key"
          - path: spec.template.spec.containers.[name:discovery].args[-1]
            value: "--caCertFile=/etc/cert-manager/ca/root-cert.pem"

          - path: spec.template.spec.containers.[name:discovery].volumeMounts[-1]
            value:
              name: cert-manager
              mountPath: "/etc/cert-manager/tls"
              readOnly: true
          - path: spec.template.spec.containers.[name:discovery].volumeMounts[-1]
            value:
              name: ca-root-cert
              mountPath: "/etc/cert-manager/ca"
              readOnly: true

          - path: spec.template.spec.volumes[-1]
            value:
              name: cert-manager
              secret:
                secretName: istiod-tls
          - path: spec.template.spec.volumes[-1]
            value:
              name: ca-root-cert
              configMap:
                defaultMode: 420
                name: istio-ca-root-cert
# istio-csr Helm values (a separate file from the IstioOperator above)
app:
  certmanager:
    preserveCertificateRequests: true
    issuer:
      name: istio-ca
      kind: GoogleCASIssuer
      group: cas-issuer.jetstack.io
  tls:
    trustDomain: foo.bar
    rootCAFile: /etc/tls/root-cert.pem

volumes:
- name: root-ca
  secret:
    secretName: istio-ca-root-cert

volumeMounts:
- name: root-ca
  mountPath: /etc/tls
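
When comparing notes between two sets of manifests like these, a small script can flag the obvious mismatches. The Python sketch below assumes both files have already been loaded into plain dicts (for example with a YAML parser of your choice); `compare_notes` is a hypothetical helper, and the three checks are only the ones suggested by this thread, not an exhaustive validation.

```python
def compare_notes(istio_operator, csr_values):
    """Return a list of human-readable mismatches between the two configs."""
    problems = []
    spec = istio_operator.get("spec", {})
    tls = csr_values.get("app", {}).get("tls", {})

    # The trust domain is set on both sides in the manifests above.
    mesh_td = spec.get("meshConfig", {}).get("trustDomain")
    csr_td = tls.get("trustDomain")
    if mesh_td != csr_td:
        problems.append(f"trustDomain differs: {mesh_td!r} vs {csr_td!r}")

    # Workloads are pointed at istio-csr through values.global.caAddress.
    if not spec.get("values", {}).get("global", {}).get("caAddress"):
        problems.append("values.global.caAddress is not set")

    # A static root CA file was recommended earlier in this thread.
    if not tls.get("rootCAFile"):
        problems.append("app.tls.rootCAFile is not set")

    return problems
```

An empty list only means these three settings line up. It says nothing about whether the sidecar injector mounts the root cert.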

@xunholy
Author

xunholy commented Aug 24, 2021

@JoshVanL Thanks for the tips.

➜  ~ k exec -it httpbin-85ddb98ffd-4kshd -c istio-proxy -- ls /var/run/secrets/istio
ls: cannot access '/var/run/secrets/istio': No such file or directory
command terminated with exit code 2

The configmap in the namespace is correct

➜  ~ k get cm istio-ca-root-cert -oyaml
apiVersion: v1
data:
  root-cert.pem: |
    -----BEGIN CERTIFICATE-----
    MIICCDCCAY6gAwIBAgITF9Zy+X50nLueIn2I4PB+e6jHQTAKBggqhkjOPQQDAzAy

I'll take a look and compare notes against your configuration. I should probably mention that we're using ASM rather than OSS Istio, but essentially the same config is mirrored through the istioOperator.

I can mention that I didn't have:

volumes:
- name: root-ca
  secret:
    secretName: istio-ca-root-cert

volumeMounts:
- name: root-ca
  mountPath: /etc/tls

I'll add these values and check, but I don't know if this would impact the istio-proxy per se.

@JoshVanL
Collaborator

@xunholy aha, this sounds like the culprit. I'm not very familiar with ASM myself, so I'm not entirely sure how it propagates the CA certificate, but I would expect it to be possible to re-enable the sidecar injector so that those mounts are there.
