Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support to prepare CoreDNS for Helm management, enable cluster proportional autoscaler for CoreDNS #1033

Merged
merged 8 commits into from
Oct 11, 2022

Conversation

bryantbiggs
Copy link
Contributor

@bryantbiggs bryantbiggs commented Oct 7, 2022

What does this PR do?

  • Add support to prep CoreDNS for Helm management within the CoreDNS addon. This improves the usability experience when deploying self-managed CoreDNS via the addon
  • Add support for CoreDNS cluster proportional autoscaler. This is now enabled by default with some sane default settings when using either EKS managed CoreDNS or self-managed via the addon
  • Add time_sleep resource to CoreDNS addon to ensure proper sequencing of events within the addon (all of the following are subject to being enabled, but regardless the order still stands now):
    1. Remove the default deployment of CoreDNS
    2. Deploy the self-managed (Helm addon) CoreDNS deployment
    3. Enable cluster proportional autoscaler
  • Add new variable data_plane_wait_arn which can be used to coordinate the dependency ordering between the EKS cluster creation and addon creation. By passing in an ARN from a node group or Fargate profile (or an array of ARNs using join(), the time_sleep resource will wait for the ARN(s) passed before starting a brief sleep duration and then allowing the deployment of the addons. This makes it possible to do a single terraform apply with the proper ordering and at a later date we can pass additional information through time_sleep to ensure proper ordering when performing cluster upgrades without the need for additional information from users (*I believe - TBD).
  • Add support for the 2048 sample/demo app as an "addon" to be able to re-use in other examples. Removing the use of the kubectl provider for this and only using the Kubernetes provider reduces the number of providers needed in the examples for this demo app as well
  • Update Fargate example blueprint
    • Enable 2048 demo app via addon
    • Update CoreDNS to be self-managed to run on Fargate
    • Show example of Fargate wildcard profile usage for the demo app
    • Enable fargate-fluentbit addon for Fargate logging (removes warnings from pods + we should have logging)

Motivation

More

  • Yes, I have tested the PR using my local account setup (Provide any test evidence report under Additional Notes)
  • Yes, I have added a new example under examples to support my PR
  • Yes, I have created another PR for add-ons under add-ons repo (if applicable)
  • Yes, I have updated the docs for this feature
  • Yes, I ran pre-commit run -a with this PR

fargate_profile_namespaces = [
{
namespace = local.sample_app_namespace
namespace = "app-*"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolves #891

@@ -103,6 +94,9 @@ module "eks_blueprints_kubernetes_addons" {
eks_oidc_provider = module.eks_blueprints.oidc_provider
eks_cluster_version = module.eks_blueprints.eks_cluster_version

# Wait on the `kube-system` profile before provisioning addons
data_plane_wait_arn = module.eks_blueprints.fargate_profiles["kube_system"].eks_fargate_profile_arn
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty nifty - addons don't start provisioning until the profile has provisioned

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you find this approach better than the use of depends_on ?
I assume it replaces it, corrrect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - because

  1. You can specify for create or destroy operations separately
  2. You can specify individual triggers which will cause the resource to re-evaluate https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/sleep#triggers-usage
  3. By using the triggers, you use their outputs downstream to better control what is updated whereas with depends_on its applied to all downstream computed values when used with modules Using depends_on in a module definition fails terraform plan when using data source aws_partition in the module hashicorp/terraform#30340
  4. I can variablize what the dependent arguments are - whereas, with depends_on I cannot

In short, it has much better control than what depends_on offers, especially from a module usage perspective.

# Modifying CoreDNS for Fargate
#---------------------------------------------------------------

locals {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this lives in the CoreDNS addon now so users do not need to re-create if they want to use this method of assuming control of CoreDNS with Helm

@@ -252,31 +177,12 @@ module "vpc" {
default_security_group_tags = { Name = "${local.name}-default" }

public_subnet_tags = {
"kubernetes.io/cluster/${local.name}" = "shared"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer required with recent versions of ALB controller

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting. where is this documented?
we are on the latest release of AWS LB Controller, and still tag out SGs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security group tags is separate, and that is still required. This is just subnet tags

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right right right!

@@ -22,7 +22,5 @@ variable "addon_context" {
eks_oidc_issuer_url = string
eks_oidc_provider_arn = string
tags = map(string)
irsa_iam_role_path = string
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used

@bryantbiggs bryantbiggs temporarily deployed to EKS Blueprints Test October 7, 2022 00:10 Inactive
Copy link
Contributor

@askulkarni2 askulkarni2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, other wise the PR is rock solid.

# This simply maps a dependency for Terraform to ensure the default deployment is
# removed first (if enabled) before provisioning self-managed version (if enabled)
name = "blueprints.connection"
value = try(null_resource.remove_default_coredns_deployment[0].id, "none")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@@ -15,6 +15,12 @@ variable "eks_worker_security_group_id" {
default = ""
}

variable "data_plane_wait_arn" {
description = "Addon deployment will not proceed until this value is known. Set to node group/Fargate profile ARN to wait for data plane to be ready before provisioning addons"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a list? What if I have both node groups and fargate profiles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A list can still be used if you wrap it in a join() function. Ideally, you wouldn't want to wait for all of the node groups and Fargate profiles to come up, you just need one where addons can start deploying. Also, using one over using many will keep the potential for updates to a minimum (the more that are provided in the trigger, the more times there could be a re-trigger when its not necessary)

@bryantbiggs bryantbiggs temporarily deployed to EKS Blueprints Test October 10, 2022 16:55 Inactive
@askulkarni2 askulkarni2 temporarily deployed to EKS Blueprints Test October 11, 2022 04:58 Inactive
Copy link
Contributor

@askulkarni2 askulkarni2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@askulkarni2 askulkarni2 merged commit 37be09b into main Oct 11, 2022
@askulkarni2 askulkarni2 deleted the feat/coredns-opinions branch October 11, 2022 05:30
allamand pushed a commit to allamand/terraform-aws-eks-blueprints that referenced this pull request Dec 15, 2022
…ter proportional autoscaler for CoreDNS (aws-ia#1033)

Co-authored-by: Apoorva Kulkarni <kuapoorv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Opinionated defaults for CoreDNS Fargate wildcard profile support
3 participants