Using workload identity instead of exporting service account keys on GKE #3009

Closed
akhamar opened this issue Jun 15, 2020 · 25 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@akhamar

akhamar commented Jun 15, 2020

Is your feature request related to a problem? Please describe.

For security and ease of use, it would be greatly preferable to be able to use workload identity.
It links a Kubernetes service account (ksa) to a GCP service account (gsa).

Describe the solution you'd like

Instead of creating an Issuer or ClusterIssuer by supplying the GCP service account key as JSON (key.json), it would be greatly appreciated to be able to give either a serviceAccountSecretRef or simply a Kubernetes service account (ksa). Since GCP service accounts are limited to 10 keys, it would be preferable, for both convenience and security, not to use a JSON file containing the credentials.

Currently, this is how you create an Issuer (ClusterIssuer):

clouddns:
  # The ID of the GCP project
  project: $PROJECT_ID
  # This is the secret used to access the service account
  serviceAccountSecretRef:
    name: clouddns-dns01-solver-svc-acct
    key: key.json

cf. https://cert-manager.io/docs/configuration/acme/dns01/google/

With a kubernetes serviceaccount (ksa) it could be

Issuer or ClusterIssuer

clouddns:
  # The ID of the GCP project
  project: $PROJECT_ID
  # The Kubernetes service account (ksa) linked to the GCP service account
  serviceAccount:
    name: dns01_ksa

Kubernetes service account (ksa)

apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: cert-manager
  name: dns01_ksa
  annotations:
    iam.gke.io/gcp-service-account: dns01_gsa@$PROJECT_ID.iam.gserviceaccount.com

GCP service account

gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$PROJECT_ID.svc.id.goog[cert-manager/dns01_ksa]" \
  dns01_gsa@$PROJECT_ID.iam.gserviceaccount.com
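
The two identifiers that tie the ksa and the gsa together follow a fixed pattern, so they can be derived from three inputs. Below is a minimal sketch of that derivation. The function names `wi_member` and `gsa_email` and the sample values are hypothetical, and the sample names use hyphens because Kubernetes object names may not contain underscores.

```shell
#!/bin/sh
# Build the IAM member string that Workload Identity expects for a
# Kubernetes service account: serviceAccount:PROJECT.svc.id.goog[NAMESPACE/KSA]
wi_member() {
  # $1 = GCP project ID, $2 = Kubernetes namespace, $3 = ksa name
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Build the email address of a GCP service account (gsa).
gsa_email() {
  # $1 = gsa name, $2 = GCP project ID
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Hypothetical values, for illustration only.
wi_member my-project cert-manager dns01-ksa
gsa_email dns01-gsa my-project
```

The first output is what goes into `--member`, and the second is both the annotation value on the ksa and the last argument of the gcloud command.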

Describe alternatives you've considered
N/A so far

Additional context
N/A

Environment details (if applicable):

  • Kubernetes version (e.g. v1.10.2): 1.16.8-gke.15
  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): GKE
  • cert-manager version (e.g. v0.4.0): 0.15.1
  • Install method (e.g. helm or static manifests): static manifests

/kind feature

@jetstack-bot jetstack-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 15, 2020
@akhamar akhamar changed the title Using workload identity instead of exporting service account keys on GCP Using workload identity instead of exporting service account keys on GKE Jun 15, 2020
@akhamar
Author

akhamar commented Jun 15, 2020

The pod issuing the API requests to Cloud DNS (GCP) just needs to be deployed with the service account named dns01_ksa.

When you do that, the GCP API library should be able to use the provided token. You can start a pod and check with gcloud / gsutil that it's working properly:
kubectl run -i --tty --image SOME_DOCKER_IMG_WITH_GCLOUD i_run_with_workload_Yay --restart=Never --rm --serviceaccount dns01_ksa /bin/zsh

gcloud auth list

Credentialed Accounts
    ACTIVE  ACCOUNT
    *       dns01_gsa@$PROJECT_ID.iam.gserviceaccount.com

You should see the GCP service account (gsa)
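
If the image has no gcloud, the same check can probably be done against the metadata server, which reports the email of the active service account. This is only a sketch: `active_gsa`, `check_gsa` and `EXPECTED` are hypothetical names, and the curl call only makes sense from inside a pod on GKE.

```shell
#!/bin/sh
# Metadata endpoint that reports the email of the active service account.
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"

# Ask the metadata server which service account the pod runs as.
# Only meaningful from inside a pod on GKE.
active_gsa() {
  curl -s -H "Metadata-Flavor: Google" "$METADATA_URL"
}

# Compare an expected gsa email ($1) with the one actually observed ($2).
check_gsa() {
  if [ "$1" = "$2" ]; then
    echo "ok: running as $2"
  else
    echo "mismatch: expected $1, got ${2:-<none>}"
    return 1
  fi
}

# Usage inside the pod (EXPECTED is a hypothetical variable):
#   check_gsa "$EXPECTED" "$(active_gsa)"
```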

@bergemalm

Hi @akhamar ,
You can achieve this by doing the IAM preparations, then annotating the k8s service account for the cert-manager-controller pod and just providing:

      solvers:
      - dns01:
          clouddns:
            project: my-project-id

Leaving out the serviceAccountSecretRef: part.

@akhamar
Author

akhamar commented Jul 10, 2020

Hi @bergemalm
Won't that fail, though? I believe I saw some checks similar to serviceAccountSecretRef != null.

I'll have a go at your solution though, to try it out ^^
I suppose the process is :

  1. create the iam service account (gsa)
  2. create the k8s service account (ksa)
  3. link both gsa and ksa (workload identity)
  4. install cert-manager
  5. annotate cert-manager-controller pods by adding the service account (ksa) to the pods
  6. create cert-manager issuer
  7. create a certificate using the issuer

@bergemalm

I didn't dig in too deep, just saw this if statement and left it out.
https://github.com/jetstack/cert-manager/blob/d4a743f91ab1fdfefd21354c82aaa311888afa02/pkg/internal/apis/certmanager/validation/issuer.go#L292

Sounds like a plan!
(I used the helm chart for your steps 2, 4 and 5 after having Terraform set up 1 and 3. Step 7 happens magically in my case (ingress-gce).)

:)

@akhamar
Author

akhamar commented Jul 10, 2020

Yea cool. Indeed if you already have a service account available the next checks won't be done, so it should work properly.

It would be great, though, to have a clean solution that accepts either the JSON file or the service account name in the issuer configuration. That would be much cleaner than doing it in multiple steps.
Also the concern I have is that if the pod is deleted, the service account will have to be annotated again. If this solution works for me, I'll have a look at patching the deployment to automatically inject the service account on pod creation.

@akhamar
Author

akhamar commented Jul 10, 2020

@bergemalm I just saw that I don't have any pods named cert-manager-controller
Only :
cert-manager <= judging by the docker img it's probably this one
cert-manager-cainjector
cert-manager-webhook

@akhamar
Author

akhamar commented Jul 10, 2020

So to confirm what @bergemalm said, the solution of patching the service account for cert-manager pods works great.

I've patched the serviceaccount by doing so :
kubectl patch serviceaccount cert-manager -p '{"metadata": {"annotations":{"iam.gke.io/gcp-service-account":"YOUR_GOOGLE_SERVICE_ACCOUNT"}}}' -n cert-manager

Thanks @bergemalm for the tips and the good direction.

@meyskens maybe you could add a temporary explanation/solution (as provided here) on the main DNS01 google page (https://cert-manager.io/docs/configuration/acme/dns01/google/) for supporting workload identity.


Set up a Service Account (supporting workload identity)

$ export PROJECT_ID=myproject-id
$ export GOOGLE_SERVICE_ACCOUNT_NAME=myserviceaccount-name

Create service account
$ gcloud iam service-accounts create $GOOGLE_SERVICE_ACCOUNT_NAME --display-name "$GOOGLE_SERVICE_ACCOUNT_NAME"

DNS role
$ gcloud projects add-iam-policy-binding $PROJECT_ID \
   --member serviceAccount:$GOOGLE_SERVICE_ACCOUNT_NAME@$PROJECT_ID.iam.gserviceaccount.com \
   --role roles/dns.admin

Workload identity
$ gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$PROJECT_ID.svc.id.goog[cert-manager/cert-manager]" \
  $GOOGLE_SERVICE_ACCOUNT_NAME@$PROJECT_ID.iam.gserviceaccount.com

Patching cert-manager service account (enabling workload identity)

kubectl patch serviceaccount cert-manager -n cert-manager -p "{\"metadata\": {\"annotations\": {\"iam.gke.io/gcp-service-account\": \"${GOOGLE_SERVICE_ACCOUNT_NAME}@${PROJECT_ID}.iam.gserviceaccount.com\"}}}"
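
The shell does not expand variables inside single quotes, so the patch JSON has to be assembled with the values already substituted. One way to keep it readable is to build it with printf. This is a sketch: `wi_patch` is a hypothetical helper and the sample values are placeholders.

```shell
#!/bin/sh
# Build the JSON patch that adds the Workload Identity annotation.
# printf does the substitution, so the values are filled in even though
# the JSON template itself is single-quoted.
wi_patch() {
  # $1 = gsa name, $2 = GCP project ID
  printf '{"metadata":{"annotations":{"iam.gke.io/gcp-service-account":"%s@%s.iam.gserviceaccount.com"}}}\n' "$1" "$2"
}

# Hypothetical values, for illustration only.
wi_patch myserviceaccount-name myproject-id

# Applying it would then look like:
#   kubectl patch serviceaccount cert-manager -n cert-manager \
#     -p "$(wi_patch "$GOOGLE_SERVICE_ACCOUNT_NAME" "$PROJECT_ID")"
```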

Create a Service Account Secret
Not needed anymore !!!

Create an Issuer That Uses CloudDNS

apiVersion: cert-manager.io/v1alpha2
kind: (Cluster)Issuer
metadata:
  name: letsencrypt-issuer-production
spec:
  acme:
    ...
    solvers:
      - dns01:
          clouddns:
            project: $PROJECT_ID

@meyskens
Contributor

@akhamar looking good, PRs welcome at https://github.com/cert-manager/website/edit/master/content/en/docs/configuration/acme/dns01/google.md (gets you some credit for it in the Git history 😉)
However it should come with a warning that it will be overwritten in an upgrade. Maybe better to use the Helm chart values for this instead of a kubectl patch: https://github.com/jetstack/cert-manager/blob/master/deploy/charts/cert-manager/values.yaml#L60
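
A rough sketch of the Helm route suggested here: put the annotation into a values file rather than patching the live object. It assumes the chart exposes a `serviceAccount.annotations` value and that the chart is referenced as `jetstack/cert-manager`. The file path and sample values are hypothetical.

```shell
#!/bin/sh
# Hypothetical values; replace with your own.
PROJECT_ID=myproject-id
GOOGLE_SERVICE_ACCOUNT_NAME=myserviceaccount-name

# Write a Helm values file that annotates the chart-managed service account,
# so the annotation is part of the release instead of a manual patch.
VALUES_FILE="${TMPDIR:-/tmp}/cert-manager-wi-values.yaml"
cat > "$VALUES_FILE" <<EOF
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: ${GOOGLE_SERVICE_ACCOUNT_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
EOF

cat "$VALUES_FILE"

# Then, for example (chart reference is an assumption):
#   helm upgrade cert-manager jetstack/cert-manager -n cert-manager -f "$VALUES_FILE"
```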

@meyskens
Contributor

meyskens commented Aug 4, 2020

/help
/priority important-longterm

@jetstack-bot
Contributor

@meyskens:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help
/priority important-longterm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jetstack-bot jetstack-bot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Aug 4, 2020
@LeoVerto

I'm trying to use workload identity with the Helm chart (by overriding the serviceAccount settings) and have gotten to the point where my KSA has the annotation and should be linked to the GSA (the same config works for external-dns), but the order is failing with: Error presenting challenge: error instantiating google clouddns challenge solver: unable to construct clouddns provider: empty credentials; perhaps you meant to enable ambient credentials?

This error message seems to be caused by the ambient flag not being set when it is checked here.

Is there anything I'm doing obviously wrong? Do I somehow need to set the ambient flag?

@mansona

mansona commented Aug 29, 2020

I have come across this recently too 🤔 I was looking at some of the other dns01 providers and they have the idea of adding ambient: true to the config something like this:

    solvers:
    - dns01:
        clouddns:
          project: my-project
          ambient: true

but if you do that you get the following error:

error: error validating "issuer.yml": error validating data: ValidationError(Issuer.spec.acme.solvers[0].dns01.clouddns): unknown field "ambient" in io.cert-manager.v1alpha2.Issuer.spec.acme.solvers.dns01.clouddns; if you choose to ignore these errors, turn validation off with --validate=false

I tried with --validate=false but then when you describe the issuer it doesn't seem to have the metadata saved.

Could this be as simple as updating the schema for the issuer type for clouddns? I ask if this is "simple" but I have no idea where the schema that the request is being validated against is actually created 😂 🙈

@ddgenome
Contributor

Here are detailed instructions for setting up cert-manager with Google Cloud DNS on GKE using workload identity: https://blog.atomist.com/kubernetes-ingress-nginx-cert-manager-external-dns/

@mansona

mansona commented Aug 31, 2020

yikes @ddgenome that seems very complex 🙈

I was trying to get a simpler solution using "ambient permissions" because I happen to be using this cluster to do other things in my gcloud account so the permissions are setup correctly already, but I would say that following the cert-manager docs to create a new service account and save that json in a secret is much simpler than setting up a whole new subsystem to side-step this small missing feature

@ddgenome
Contributor

I'm about to create a PR updating the cert-manager docs with workload identity details.

ddgenome added a commit to ddgenome/website-1 that referenced this issue Aug 31, 2020
Add documentation and link to blog post for using GKE workload
identity to authenticate against Google Cloud DNS.

Per request in cert-manager/cert-manager#3009

Signed-off-by: David Dooling <dooling@gmail.com>
@ddgenome
Contributor

@mansona I'm not sure what you mean by "setting up a whole new subsystem". To use "ambient" permissions, i.e., the default credentials, just do not put a serviceAccountSecretRef in the issuer spec. How you provide the default credentials is where some complexity comes in. If this is a personal cluster where you have given all instances/nodes permissions to do everything, then nothing more need be done. If this is a cluster where security is a concern, you can use workload identity to confine the elevated privileges to the workloads that need them.

@txomon

txomon commented Nov 10, 2020

Not sure why this isn't mentioned anywhere else, but in 1.0.4 at least, cert-manager needs to be started with --issuer-ambient-credentials=true and/or --cluster-issuer-ambient-credentials=true to enable the creation of the GCP CloudDNS client without explicit credentials.

https://github.com/jetstack/cert-manager/blob/v1.0.4/pkg/issuer/acme/dns/clouddns/clouddns.go#L48-L50 <= Ambient flag needs to be enabled
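
To make the behaviour described in this thread concrete, here is a small model of the decision as I understand it. It is an illustration only, not cert-manager's actual code, and `clouddns_credential_mode` is a hypothetical name.

```shell
#!/bin/sh
# Illustrative model only, not cert-manager's actual code.
# $1 = service account key read via serviceAccountSecretRef (empty if unset)
# $2 = "true" if the relevant ambient-credentials flag is enabled
clouddns_credential_mode() {
  if [ -n "$1" ]; then
    echo "explicit"   # use the key from the secret
  elif [ "$2" = "true" ]; then
    echo "ambient"    # fall back to default credentials (e.g. workload identity)
  else
    echo "empty credentials; perhaps you meant to enable ambient credentials?" >&2
    return 1
  fi
}
```

In this model, leaving out serviceAccountSecretRef only helps when the ambient flag is on, which would explain the error reported above.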

@ekimia

ekimia commented Jan 25, 2021

@txomon did you manage to figure out how to enable issuer-ambient-credentials with the helm chart?

@txomon

txomon commented Jan 25, 2021

@ekimia yes. Although it is redundant, I found it difficult to understand how exactly things are connected, so I will write down all the steps to get it working here. You need:

  • GCP Service account (gsa)
  • Kubernetes service account (ksa)

I use the default ksa of the namespace to set up the permissions. The regular steps to use Workload Identity:

  1. Create the gsa
  2. Give GCP IAM role roles/iam.workloadIdentityUser to service account serviceAccount:GCP_PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME] => serviceAccount:GCP_PROJECT_ID.svc.id.goog[K8S_NAMESPACE/default] (fill in GCP_PROJECT_ID and K8S_NAMESPACE) in the gsa you created
  3. Annotate the ksa with "iam.gke.io/gcp-service-account" : <your-gsa-email-here>

And now, the stuff specific to cert-manager:

  • Add extraArgs config with both options
  • Override the chart's service account creation to make sure it uses the annotated one

Config looks like this (using default service account):

extraArgs:
  - --issuer-ambient-credentials=true
  - --cluster-issuer-ambient-credentials=true
serviceAccount:
  create: false
  name: default

@ryo-egch

I also hit this issue when using Workload Identity on GKE with DNS-01:

error instantiating google clouddns challenge solver: unable to construct clouddns provider: empty credentials; perhaps you meant to enable ambient credentials?

version

  • cert-manager-v1.1.0

but worked fine with @txomon 's step
#3009 (comment)

change all serviceAccount to default

serviceAccount:
  # Specifies whether a service account should be created
  create: false
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  # name: ""
  name: default  

add extraArgs


# extraArgs: []
extraArgs: 
  - --issuer-ambient-credentials=true
  - --cluster-issuer-ambient-credentials=true

And binding GoogleServiceAccount & default (kubernetes serviceaccount), add annotation

gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$GCP_PROJECT_ID.svc.id.goog[default/default]" \
  gsa@$GCP_PROJECT_ID.iam.gserviceaccount.com

kubectl annotate serviceaccount \
  --namespace default \
  default \
  "iam.gke.io/gcp-service-account=gsa@$GCP_PROJECT_ID.iam.gserviceaccount.com"

@ejose19

ejose19 commented Mar 1, 2021

For those using Helm, just adding --set extraArgs={--issuer-ambient-credentials=true} to the install script made the Issuer work correctly (besides following the docs, of course).
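
When both flags are needed, the braces contain a comma, and shells that perform brace expansion (bash, for example) will split an unquoted `{a,b}` into separate words. A small sketch that builds the value and passes it quoted. `helm_extra_args` is a hypothetical helper.

```shell
#!/bin/sh
# Join flags into Helm's --set list syntax: extraArgs={flag1,flag2}
helm_extra_args() {
  list=""
  for flag in "$@"; do
    # Append with a comma separator, except before the first item.
    list="${list:+$list,}$flag"
  done
  printf 'extraArgs={%s}\n' "$list"
}

helm_extra_args --issuer-ambient-credentials=true --cluster-issuer-ambient-credentials=true

# Quote the result so the shell passes it to helm as a single argument:
#   helm install ... --set "$(helm_extra_args --issuer-ambient-credentials=true --cluster-issuer-ambient-credentials=true)"
```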

rossigee added a commit to rossigee/website-1 that referenced this issue Sep 13, 2021
This should help reduce the amount of time people might waste trying to figure out how to resolve the following error:

```
error instantiating route53 challenge solver: unable to construct route53 provider: empty credentials; perhaps you meant to enable ambient credentials?
```

A couple of related bug reports:

* cert-manager/cert-manager#3009
* cert-manager/cert-manager#3079
@jetstack-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

@jetstack-bot jetstack-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 16, 2021
@jetstack-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale

@jetstack-bot jetstack-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 16, 2021
@jetstack-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

@jetstack-bot
Contributor

@jetstack-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

rossigee added a commit to rossigee/website-1 that referenced this issue Apr 13, 2022
This should help reduce the amount of time people might waste trying to figure out how to resolve the following error:

```
error instantiating route53 challenge solver: unable to construct route53 provider: empty credentials; perhaps you meant to enable ambient credentials?
```

A couple of related bug reports:

* cert-manager/cert-manager#3009
* cert-manager/cert-manager#3079