
Automatically provision ACM certificates and attach to ALB based on spec #2509

Open
ohookins opened this issue Feb 15, 2022 · 23 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@ohookins

Is your feature request related to a problem?

Right now, if you want a dynamic environment based on configured Ingresses that also includes TLS protection (for example, development or demo environments provisioned on demand), it seems the only option is a combination of the Nginx Ingress Controller, cert-manager (using Let's Encrypt), and external-dns. This works but is clunky, and it would be nicer to use the AWS Load Balancer Controller, ALBs, and ACM. This controller seems most of the way there; it just needs the ability to provision a certificate in ACM based on the configured host names, since attaching a certificate to the ALB is already possible.

Describe the solution you'd like

When an Ingress object is created, the controller inspects the configured hostnames and requests a certificate be provisioned in ACM. The certificate is then attached to the configured load balancer.
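
For illustration, here is a sketch of what that could look like. The `alb.ingress.kubernetes.io/certificate-arn` annotation shown is the controller's existing mechanism for attaching a pre-provisioned certificate (the ARN is a placeholder); the proposed feature would make it unnecessary by provisioning a certificate for the `host` fields automatically:

```yaml
# Today: the certificate must already exist in ACM and its ARN is supplied explicitly.
# Proposed: the controller requests an ACM certificate for spec.rules[].host instead.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  annotations:
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    # placeholder ARN; under the proposal this annotation would no longer be required
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:111122223333:certificate/example
spec:
  ingressClassName: alb
  rules:
    - host: demo.example.com   # proposed: controller provisions an ACM cert for this host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo
                port:
                  number: 80
```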

Describe alternatives you've considered

As mentioned, the only workable alternative is using another Ingress controller altogether. In our case Ingresses are created by some other tooling that doesn't have the capability of creating ACM certificates and providing the ARN to this controller. This functionality is already present when using the combination of nginx-ingress/cert-manager/external-dns so there is clearly a use case that has been solved by other projects.

@olemarkus
Contributor

I am also looking for this (also for use with NLBs). Also see aws-controllers-k8s/community#482 which may be a better separation of concerns.

@ohookins
Author

That does sound interesting, if it gets off the ground.

@kishorj
Collaborator

kishorj commented Feb 16, 2022

We had a PR for a similar feature in the past. The reason we didn't include it was the AWS certificate quota, which at the time was about 20/year; it counted both imports and deletes, not the number of active certificates. I see the new quota is around 1k. However, the original concern still stands, since the feature would stop working once the quota is hit.

@olemarkus
Contributor

Importing a certificate is not an option for us. We need ACM to provision the certificate.

@erhudy

erhudy commented Mar 16, 2022

This is also of interest to me, in particular using it with ACM Private CA. I'm unclear on how import quotas matter for this feature, since the certificates would be created on-demand through the controller making the request to ACM to provision the certificate; I don't think anyone would have to import a certificate to make this work.

@matthewhembree

I don't think anyone is asking for import. The quotas are the same for import and create, so maybe that's where the two words got conflated.

I see the ACM quotas have been further increased to 2,500 active certificates (5,000/yr). IMO, it seems a bit more viable when you consider that's an account-level quota, and the blast radius could be further reduced with multiple accounts (e.g. dev and prod).

An additional mitigation for the quota risk would be a max-certificates parameter. For instance, a Helm value of maxConcurrentAcmCertificates: 100 would cap that cluster at 100 concurrent ACM certificates. You'd still have to be mindful of revoked and expired certificates counting toward the 5,000/yr number.
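
As a sketch, such a guardrail might look like the following in the chart's values file. Note that `maxConcurrentAcmCertificates` is hypothetical; it is not an existing value in the aws-load-balancer-controller chart:

```yaml
# values.yaml (hypothetical key, shown only to illustrate the proposed guardrail)
# Caps how many ACM certificates this cluster's controller may hold at once,
# leaving headroom under the account-level ACM quota.
maxConcurrentAcmCertificates: 100
```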

As for that historical PR, 20 certificates is extremely different from 5,000. I think it's time to reconsider past decisions. If you add a mitigation limit like I suggested above and provide ample documentation, organizations can make their own informed decision about adopting this controller.

Limits are also a concern for orgs that have adopted ingress-nginx/cert-manager/Let's Encrypt/external-dns, and they operate comfortably within those limits. LE caps you at 50 certificates per registered domain per week (2,600/yr).

@matthewhembree

I see the new quota is around 1k. However, the original concern is still there since the feature would stop working after the quota is hit.

We have to ask ourselves whether we'd keep referring to the original concern. If the limit were 10k, would we still cite it? What about 10M?

I think the current limit of 2,500 active / 5,000 per year nullifies the original concern. Organizations can make their own decisions at this point.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 29, 2022
@dudicoco

It can take time for the ACM certificate validation to complete (sometimes even hours).
Should be taken into consideration.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 13, 2022
@JanGe

JanGe commented Oct 13, 2022

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 13, 2022
@tpzumezawa

It would be interesting to have this. It would make it easier to configure a new site with DNS and certificates, without external and complicated tools, and without having to move part of the flow out of Kubernetes annotations into Terraform or the AWS console.

@kishorj
Collaborator

kishorj commented Dec 29, 2022

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 29, 2022
@matthewhembree

It can take time for the ACM certificate validation to complete (sometimes even hours). Should be taken into consideration.

I've never seen this. I would say that hours is way outside a reasonable SLO for ACM DNS certificate validation. Even tools like AWS Amplify and AWS Copilot will time out within a few minutes waiting for DNS certificate validation. The controller should have a timeout and surface errors, just like every controller should.

@migueleliasweb

Hi all, just wanted to say I think this feature would be very welcomed by the community.

Sadly, I see quite a few things going on in this thread at the same time, and that seems to slow down the process of getting anything going. From my point of view, it comes down to trying to make too much of something that shouldn't be this complicated.

Instead of talking about attaching to LBs, importing custom certs, ACM PCA, and other (although valid) advanced use cases, can we just agree on starting with something simple that would already yield a LOT OF VALUE to the community?

My two cents here.

The initial release of this feature should only cover creating public certificates with DNS validation.

This thread has been going for almost a year. That would have been more than enough time to implement the aforementioned initial release, gather lots of extra feedback on how it's going, and improve on it.

@migueleliasweb

A couple more points on things already mentioned:

It can take time for the ACM certificate validation to complete (sometimes even hours). Should be taken into consideration.

Actually, I don't think so. Not a lot of consideration is necessary here.

From the client side, all one would need is a status on the Kubernetes resource saying the cert is ready or still being provisioned. Until the cert is marked ready, just keep waiting. That's how many other implementations handle resources with some background provisioning time. Just saying this here to avoid even more overthinking of a simple solution.
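
As a sketch of that idea, a provisioning status could be surfaced via a condition on the resource. The condition type and reason below are hypothetical, modeled on the standard Kubernetes condition pattern used by other controllers:

```yaml
# Hypothetical status surfaced by the controller while ACM validation is pending.
status:
  conditions:
    - type: CertificateReady        # hypothetical condition type
      status: "False"
      reason: PendingValidation
      message: "Waiting for ACM DNS validation of demo.example.com"
```

Clients would simply watch the resource and proceed once the condition flips to `"True"`.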

Regarding cert limits in ACM, I don't think it's up to this repo (or anyone other than the client itself) to keep an eye on what the numbers look like. Having said that, the current limits are very relaxed and shouldn't be a problem for the majority of use cases.

[Screenshot: ACM service quotas console showing the current certificate limits]

@ArjunDandagi

We need this feature too.

@Timtech4u

+1

@selfisch

selfisch commented Apr 4, 2023

Good day,
take a look at https://github.com/aws-controllers-k8s/acm-controller.
In combination with the ALB controller, it works almost flawlessly for us. You only have to define one extra resource in your Kubernetes deployment for the desired certificate.
The only missing piece at this time is that the ACM controller will not auto-generate the Route 53 entries for validation; one has to go to the AWS console and click a button for that. That should be an improvement in the ACM controller, but I also don't know whether the AWS API is far enough along to support those requests yet.
Have a nice one
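
For reference, a minimal Certificate resource for the ACK acm-controller looks roughly like this. The field names follow the ACK documentation, but verify them against your controller version; the domain names are placeholders:

```yaml
# Minimal ACK acm-controller Certificate (sketch; check field names against your version)
apiVersion: acm.services.k8s.aws/v1alpha1
kind: Certificate
metadata:
  name: demo-cert
spec:
  domainName: demo.example.com          # placeholder domain
  domainValidationOptions:
    - domainName: demo.example.com
      validationDomain: example.com
```

The resulting certificate ARN can then be referenced from the Ingress via the `alb.ingress.kubernetes.io/certificate-arn` annotation.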

@chacocr

chacocr commented Apr 25, 2023

I had this issue.

I uninstalled the helm chart.
Manually removed old secrets.

I reinstalled the helm chart and it is now working.

I also removed this from my yaml file:
enableCertManager: true

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 18, 2024
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 18, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 18, 2024