gitops apply fails to get cluster autoscaler working (#1237)
I'd say this should be an issue in the profile repo.

Yep, I agree with @errordeveloper.
FWIW, it is a matter of having the right IAM policies in place when creating the cluster, namely:

```diff
 nodeGroups:
   - name: ng-1
     instanceType: m5.large
-    desiredCapacity: 1
+    minSize: 1
+    maxSize: 2
+    iam:
+      withAddonPolicies:
+        albIngress: true
+        autoScaler: true
+        cloudWatch: true
 cloudWatch:
   clusterLogging:
```

How do we want to proceed there?
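Put together, a minimal `ClusterConfig` with those add-on policies might look like the sketch below. The cluster name, region, and logging types are placeholders, not values from this thread:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster       # placeholder
  region: eu-west-1      # placeholder
nodeGroups:
  - name: ng-1
    instanceType: m5.large
    minSize: 1
    maxSize: 2
    iam:
      # grants the node instance role the IAM permissions the
      # app-dev profile's add-ons (ALB ingress controller,
      # cluster autoscaler, CloudWatch agents) need
      withAddonPolicies:
        albIngress: true
        autoScaler: true
        cloudWatch: true
```

The key point is that `withAddonPolicies` must be set at cluster (nodegroup) creation time, since eksctl bakes the resulting policies into the nodegroup's CloudFormation stack.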
We should add this to the quickstart guide for now. We'll fix it as part of a different issue later on.

I'll document this in the quickstart profile's repository, but I thought I'd also provide an example.

@marccarre, should this issue actually be closed?

Not yet, only once we've merged weaveworks/eks-quickstart-app-dev#22.

Fixed by weaveworks/eks-quickstart-app-dev#22.

This happened to me today when applying the app-dev profile while creating a new cluster from gitops. How do I fix it?
@ilanpillemer, did you have the required IAM roles in place in your cluster? If you have them, then this should work fine. See also the steps below. (I just re-ran this myself to be sure it still works as expected. It does.)

$ git diff
diff --git a/examples/eks-quickstart-app-dev.yaml b/examples/eks-quickstart-app-dev.yaml
index 487cb46b..5783c605 100644
--- a/examples/eks-quickstart-app-dev.yaml
+++ b/examples/eks-quickstart-app-dev.yaml
@@ -5,8 +5,8 @@ apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
- name: cluster-12
- region: eu-north-1
+ name: mc-1237-testing-with-iam
+ region: ap-northeast-1
nodeGroups:
- name: ng-1
$ eksctl create cluster -f examples/eks-quickstart-app-dev.yaml
[ℹ] eksctl version 0.11.1
[ℹ] using region ap-northeast-1
[ℹ] setting availability zones to [ap-northeast-1c ap-northeast-1d ap-northeast-1a]
[ℹ] subnets for ap-northeast-1c - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ] subnets for ap-northeast-1d - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ] subnets for ap-northeast-1a - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ] nodegroup "ng-1" will use "ami-02e124a380df41614" [AmazonLinux2/1.14]
[ℹ] using Kubernetes version 1.14
[ℹ] creating EKS cluster "mc-1237-testing-with-iam" in "ap-northeast-1" region with un-managed nodes
[ℹ] 1 nodegroup (ng-1) was included (based on the include/exclude rules)
[ℹ] will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
[ℹ] will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-northeast-1 --cluster=mc-1237-testing-with-iam'
[ℹ] CloudWatch logging will not be enabled for cluster "mc-1237-testing-with-iam" in "ap-northeast-1"
[ℹ] you can enable it with 'eksctl utils update-cluster-logging --region=ap-northeast-1 --cluster=mc-1237-testing-with-iam'
[ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "mc-1237-testing-with-iam" in "ap-northeast-1"
[ℹ] 2 sequential tasks: { create cluster control plane "mc-1237-testing-with-iam", create nodegroup "ng-1" }
[ℹ] building cluster stack "eksctl-mc-1237-testing-with-iam-cluster"
[ℹ] deploying stack "eksctl-mc-1237-testing-with-iam-cluster"
[ℹ] building nodegroup stack "eksctl-mc-1237-testing-with-iam-nodegroup-ng-1"
[ℹ] deploying stack "eksctl-mc-1237-testing-with-iam-nodegroup-ng-1"
[✔] all EKS cluster resources for "mc-1237-testing-with-iam" have been created
[✔] saved kubeconfig as "${HOME}/.kube/config"
[ℹ] adding identity "arn:aws:iam::083751696308:role/eksctl-mc-1237-testing-with-iam-n-NodeInstanceRole-1M7OF6KB2D8RV" to auth ConfigMap
[ℹ] nodegroup "ng-1" has 0 node(s)
[ℹ] waiting for at least 1 node(s) to become ready in "ng-1"
[ℹ] nodegroup "ng-1" has 1 node(s)
[ℹ] node "ip-192-168-13-77.ap-northeast-1.compute.internal" is ready
[ℹ] kubectl command should work with "${HOME}/.kube/config", try 'kubectl get nodes'
[✔] EKS cluster "mc-1237-testing-with-iam" in "ap-northeast-1" region is ready
$ EKSCTL_EXPERIMENTAL=true eksctl enable repo \
> -f examples/eks-quickstart-app-dev.yaml \
> --git-email carre.marc+flux@gmail.com \
> --git-url git@github.com:marccarre/my-gitops-repo.git
[ℹ] Generating public key infrastructure for the Helm Operator and Tiller
[ℹ] this may take up to a minute, please be patient
[!] Public key infrastructure files were written into directory "/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/eksctl-helm-pki431635447"
[!] please move the files into a safe place or delete them
[ℹ] Generating manifests
[ℹ] Cloning git@github.com:marccarre/my-gitops-repo.git
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/eksctl-install-flux-clone-956113642'...
remote: Enumerating objects: 59, done.
remote: Counting objects: 100% (59/59), done.
remote: Compressing objects: 100% (55/55), done.
remote: Total 447 (delta 11), reused 50 (delta 3), pack-reused 388
Receiving objects: 100% (447/447), 183.32 KiB | 514.00 KiB/s, done.
Resolving deltas: 100% (157/157), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] Writing Flux manifests
[ℹ] created "Namespace/flux"
[ℹ] Applying Helm TLS Secret(s)
[ℹ] created "flux:Secret/flux-helm-tls-cert"
[ℹ] created "flux:Secret/tiller-secret"
[!] Note: certificate secrets aren't added to the Git repository for security reasons
[ℹ] Applying manifests
[ℹ] created "flux:Deployment.apps/flux"
[ℹ] created "flux:ServiceAccount/flux-helm-operator"
[ℹ] created "ClusterRole.rbac.authorization.k8s.io/flux-helm-operator"
[ℹ] created "ClusterRoleBinding.rbac.authorization.k8s.io/flux-helm-operator"
[ℹ] created "CustomResourceDefinition.apiextensions.k8s.io/helmreleases.helm.fluxcd.io"
[ℹ] created "flux:Secret/flux-git-deploy"
[ℹ] created "flux:Deployment.apps/memcached"
[ℹ] created "flux:Deployment.apps/flux-helm-operator"
[ℹ] created "flux:Deployment.extensions/tiller-deploy"
[ℹ] created "flux:Service/tiller-deploy"
[ℹ] created "flux:Service/memcached"
[ℹ] created "flux:ServiceAccount/flux"
[ℹ] created "ClusterRole.rbac.authorization.k8s.io/flux"
[ℹ] created "ClusterRoleBinding.rbac.authorization.k8s.io/flux"
[ℹ] created "flux:ConfigMap/flux-helm-tls-ca-config"
[ℹ] created "flux:ServiceAccount/tiller"
[ℹ] created "ClusterRoleBinding.rbac.authorization.k8s.io/tiller"
[ℹ] created "flux:ServiceAccount/helm"
[ℹ] created "flux:Role.rbac.authorization.k8s.io/tiller-user"
[ℹ] created "kube-system:RoleBinding.rbac.authorization.k8s.io/tiller-user-binding"
[ℹ] Waiting for Helm Operator to start
ERROR: logging before flag.Parse: E1210 18:44:24.787197 4822 portforward.go:331] an error occurred forwarding 50846 -> 3030: error forwarding port 3030 to pod 76da29d57382ad29d0d4b67fe633dec4222c084530a43eb7c7f1719ba50b10a0, uid : exit status 1: 2019/12/10 09:44:24 socat[6735] E connect(5, AF=2 127.0.0.1:3030, 16): Connection refused
[!] Helm Operator is not ready yet (Get http://127.0.0.1:50846/healthz: EOF), retrying ...
[... the same port-forward error and "Helm Operator is not ready yet" retry message repeated roughly every 2 seconds for about 20 more seconds ...]
[ℹ] Helm Operator started successfully
[ℹ] see https://docs.fluxcd.io/projects/helm-operator for details on how to use the Helm Operator
[ℹ] Waiting for Flux to start
[ℹ] Flux started successfully
[ℹ] see https://docs.fluxcd.io/projects/flux for details on how to use Flux
[ℹ] Committing and pushing manifests to git@github.com:marccarre/my-gitops-repo.git
[master 15b0aad] Add Initial Flux configuration
13 files changed, 803 insertions(+)
create mode 100644 flux/flux-account.yaml
create mode 100644 flux/flux-deployment.yaml
create mode 100644 flux/flux-helm-operator-account.yaml
create mode 100644 flux/flux-helm-release-crd.yaml
create mode 100644 flux/flux-namespace.yaml
create mode 100644 flux/flux-secret.yaml
create mode 100644 flux/helm-operator-deployment.yaml
create mode 100644 flux/memcache-dep.yaml
create mode 100644 flux/memcache-svc.yaml
create mode 100644 flux/tiller-ca-cert-configmap.yaml
create mode 100644 flux/tiller-dep.yaml
create mode 100644 flux/tiller-rbac.yaml
create mode 100644 flux/tiller-svc.yaml
Enumerating objects: 17, done.
Counting objects: 100% (17/17), done.
Delta compression using up to 8 threads
Compressing objects: 100% (15/15), done.
Writing objects: 100% (16/16), 9.33 KiB | 9.33 MiB/s, done.
Total 16 (delta 1), reused 12 (delta 1)
remote: Resolving deltas: 100% (1/1), done.
To github.com:marccarre/my-gitops-repo.git
e54ab6f..15b0aad master -> master
[ℹ] Flux will only operate properly once it has write-access to the Git repository
[ℹ] please configure git@github.com:marccarre/my-gitops-repo.git so that the following Flux SSH public key has write access to it
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDFgi4LH0m5lCSUf/qmBTTZIz3MASZOQMepyDUYxtmAycwC0158op7ykTvHgmAqfXMxS90LzDQ4qPUxWKgExfjnWv3u7gWJBhDJhhDyLEodJLO6/IljgC1rUPTj5QJ1AwcPM7cvoB5sIBVq1iU6Jmf0Hp/BL2QEiLdiBdpA4HkPGKOMvzB+nNiLg4iJbCdAKAefHJWqWvf2k+PPTkVgpQ9ujcyQ+KHczY8Aj4HPu9he8C8S9Sqj2Vxq/qKZVbAuxllINy/WXlCB9SdbPx1b66g9Hiw6meoXiYJPaLft78SVXLQBx7l1anDabmcRnNHSChwMY8AAVFBssm537DyAHuG5
### Then added the above SSH key to https://github.com/marccarre/my-gitops-repo/deploy_keys
$ kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
flux flux-7696dbc4cd-sjbv7 1/1 Running 0 17m
flux flux-helm-operator-8687676b89-qw7kq 1/1 Running 0 17m
flux memcached-5dcd7579-7bn6l 1/1 Running 0 17m
flux tiller-deploy-69547b56b4-p6zxd 1/1 Running 0 17m
kube-system aws-node-f8g7z 1/1 Running 0 20m
kube-system coredns-699bb99bf8-gptx4 1/1 Running 0 27m
kube-system coredns-699bb99bf8-smzch 1/1 Running 0 27m
kube-system kube-proxy-28xqt 1/1 Running 0 20m
$ EKSCTL_EXPERIMENTAL=true eksctl enable profile app-dev \
> -f examples/eks-quickstart-app-dev.yaml \
> --git-email carre.marc+flux@gmail.com \
> --git-url git@github.com:marccarre/my-gitops-repo.git
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/my-gitops-repo-547778386'...
remote: Enumerating objects: 63, done.
remote: Counting objects: 100% (63/63), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 451 (delta 12), reused 53 (delta 3), pack-reused 388
Receiving objects: 100% (451/451), 185.04 KiB | 104.00 KiB/s, done.
Resolving deltas: 100% (158/158), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] cloning repository "https://github.com/weaveworks/eks-quickstart-app-dev":master
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/quickstart-008692361'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 214 (delta 0), reused 0 (delta 0), pack-reused 209
Receiving objects: 100% (214/214), 57.27 KiB | 335.00 KiB/s, done.
Resolving deltas: 100% (92/92), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] processing template files in repository
[ℹ] writing new manifests to "/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/my-gitops-repo-547778386/base"
[master b7070d5] Add app-dev quickstart components
27 files changed, 1380 insertions(+)
create mode 100644 base/LICENSE
create mode 100644 base/README.md
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-configmap.yaml
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-daemonset.yaml
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-rbac.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-configmap-cluster-info.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-configmap-fluentd-config.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-daemonset.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-rbac.yaml
create mode 100644 base/demo/helm-release.yaml
create mode 100644 base/kube-system/alb-ingress-controller-deployment.yaml
create mode 100644 base/kube-system/alb-ingress-controller-rbac.yaml
create mode 100644 base/kube-system/cluster-autoscaler-deployment.yaml
create mode 100644 base/kube-system/cluster-autoscaler-rbac.yaml
create mode 100644 base/kubernetes-dashboard/dashboard-metrics-scraper-deployment.yaml
create mode 100644 base/kubernetes-dashboard/dashboard-metrics-scraper-service.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-configmap.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-deployment.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-rbac.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-secrets.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-service.yaml
create mode 100644 base/monitoring/metrics-server.yaml
create mode 100644 base/monitoring/prometheus-operator.yaml
create mode 100644 base/namespaces/amazon-cloudwatch.yaml
create mode 100644 base/namespaces/demo.yaml
create mode 100644 base/namespaces/kubernetes-dashboard.yaml
create mode 100644 base/namespaces/monitoring.yaml
Enumerating objects: 37, done.
Counting objects: 100% (37/37), done.
Delta compression using up to 8 threads
Compressing objects: 100% (28/28), done.
Writing objects: 100% (36/36), 13.54 KiB | 13.54 MiB/s, done.
Total 36 (delta 7), reused 27 (delta 7)
remote: Resolving deltas: 100% (7/7), done.
To github.com:marccarre/my-gitops-repo.git
15b0aad..b7070d5 master -> master
$ kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
amazon-cloudwatch cloudwatch-agent-h9wr7 1/1 Running 0 15m
amazon-cloudwatch fluentd-cloudwatch-8r5f6 1/1 Running 0 15m
demo podinfo-67b7886b6c-bvdtm 1/1 Running 0 15m
flux flux-7696dbc4cd-sjbv7 1/1 Running 0 36m
flux flux-helm-operator-8687676b89-qw7kq 1/1 Running 0 36m
flux memcached-5dcd7579-7bn6l 1/1 Running 0 36m
flux tiller-deploy-69547b56b4-p6zxd 1/1 Running 0 36m
kube-system alb-ingress-controller-8df75bc98-gssb9 1/1 Running 0 15m
kube-system aws-node-f8g7z 1/1 Running 0 39m
kube-system cluster-autoscaler-86d68b66cb-b9xqv 1/1 Running 0 15m
kube-system coredns-699bb99bf8-gptx4 1/1 Running 0 46m
kube-system coredns-699bb99bf8-smzch 1/1 Running 0 46m
kube-system kube-proxy-28xqt 1/1 Running 0 39m
kubernetes-dashboard dashboard-metrics-scraper-65785bfbc-s8tq6 1/1 Running 0 15m
kubernetes-dashboard kubernetes-dashboard-76b969b44b-rwgk5 1/1 Running 0 15m
monitoring alertmanager-prometheus-operator-alertmanager-0 2/2 Running 0 14m
monitoring metrics-server-5df4599bd7-cgh79 1/1 Running 0 15m
monitoring prometheus-operator-grafana-dd95fb7d4-n9ddh 2/2 Running 0 15m
monitoring prometheus-operator-kube-state-metrics-5d7558d7cc-h8xgg 1/1 Running 0 15m
monitoring prometheus-operator-operator-67895dd7c5-nqj7w 1/1 Running 0 15m
monitoring prometheus-operator-prometheus-node-exporter-qp8gp 1/1 Running 0 15m
monitoring prometheus-prometheus-operator-prometheus-0 3/3 Running 1 14m
If, however, you do NOT have the IAM roles in place, then the cluster autoscaler will fail to work, as reported in this issue.
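One way to check whether the nodegroup's instance role actually carries the add-on permissions is to inspect the role's policies with the AWS CLI. The role name below is illustrative — take the real `NodeInstanceRole` name from the `adding identity "arn:aws:iam::…"` line in the `eksctl create cluster` output, or from the nodegroup's CloudFormation stack:

```shell
# Role name is a placeholder; substitute your nodegroup's NodeInstanceRole.
ROLE="eksctl-my-cluster-nodegroup-ng-1-NodeInstanceRole-XXXXXXXXXXXX"

# Managed policies attached to the node instance role:
aws iam list-attached-role-policies --role-name "$ROLE"

# eksctl generally creates the withAddonPolicies permissions as inline
# policies on the role, so check those too (expect names like
# PolicyAutoScaling when autoScaler: true was set):
aws iam list-role-policies --role-name "$ROLE"
```

If the autoscaling policy is missing from both lists, the nodegroup was created without `withAddonPolicies.autoScaler: true`, which matches the failure mode described here.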
Yes, I have resolved the issue. If you follow the instructions word for word, it fails. You need to add the roles with the necessary config when creating the cluster.
Resolving deltas: 100% (158/158), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] cloning repository "https://github.com/weaveworks/eks-quickstart-app-dev":master
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/quickstart-008692361'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 214 (delta 0), reused 0 (delta 0), pack-reused 209
Receiving objects: 100% (214/214), 57.27 KiB | 335.00 KiB/s, done.
Resolving deltas: 100% (92/92), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] processing template files in repository
[ℹ] writing new manifests to "/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/my-gitops-repo-547778386/base"
[master b7070d5] Add app-dev quickstart components
27 files changed, 1380 insertions(+)
create mode 100644 base/LICENSE
create mode 100644 base/README.md
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-configmap.yaml
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-daemonset.yaml
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-rbac.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-configmap-cluster-info.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-configmap-fluentd-config.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-daemonset.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-rbac.yaml
create mode 100644 base/demo/helm-release.yaml
create mode 100644 base/kube-system/alb-ingress-controller-deployment.yaml
create mode 100644 base/kube-system/alb-ingress-controller-rbac.yaml
create mode 100644 base/kube-system/cluster-autoscaler-deployment.yaml
create mode 100644 base/kube-system/cluster-autoscaler-rbac.yaml
create mode 100644 base/kubernetes-dashboard/dashboard-metrics-scraper-deployment.yaml
create mode 100644 base/kubernetes-dashboard/dashboard-metrics-scraper-service.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-configmap.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-deployment.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-rbac.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-secrets.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-service.yaml
create mode 100644 base/monitoring/metrics-server.yaml
create mode 100644 base/monitoring/prometheus-operator.yaml
create mode 100644 base/namespaces/amazon-cloudwatch.yaml
create mode 100644 base/namespaces/demo.yaml
create mode 100644 base/namespaces/kubernetes-dashboard.yaml
create mode 100644 base/namespaces/monitoring.yaml
Enumerating objects: 37, done.
Counting objects: 100% (37/37), done.
Delta compression using up to 8 threads
Compressing objects: 100% (28/28), done.
Writing objects: 100% (36/36), 13.54 KiB | 13.54 MiB/s, done.
Total 36 (delta 7), reused 27 (delta 7)
remote: Resolving deltas: 100% (7/7), done.
To github.com:marccarre/my-gitops-repo.git
15b0aad..b7070d5 master -> master
$ kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
amazon-cloudwatch cloudwatch-agent-h9wr7 1/1 Running 0 15m
amazon-cloudwatch fluentd-cloudwatch-8r5f6 1/1 Running 0 15m
demo podinfo-67b7886b6c-bvdtm 1/1 Running 0 15m
flux flux-7696dbc4cd-sjbv7 1/1 Running 0 36m
flux flux-helm-operator-8687676b89-qw7kq 1/1 Running 0 36m
flux memcached-5dcd7579-7bn6l 1/1 Running 0 36m
flux tiller-deploy-69547b56b4-p6zxd 1/1 Running 0 36m
kube-system alb-ingress-controller-8df75bc98-gssb9 1/1 Running 0 15m
kube-system aws-node-f8g7z 1/1 Running 0 39m
kube-system cluster-autoscaler-86d68b66cb-b9xqv 1/1 Running 0 15m
kube-system coredns-699bb99bf8-gptx4 1/1 Running 0 46m
kube-system coredns-699bb99bf8-smzch 1/1 Running 0 46m
kube-system kube-proxy-28xqt 1/1 Running 0 39m
kubernetes-dashboard dashboard-metrics-scraper-65785bfbc-s8tq6 1/1 Running 0 15m
kubernetes-dashboard kubernetes-dashboard-76b969b44b-rwgk5 1/1 Running 0 15m
monitoring alertmanager-prometheus-operator-alertmanager-0 2/2 Running 0 14m
monitoring metrics-server-5df4599bd7-cgh79 1/1 Running 0 15m
monitoring prometheus-operator-grafana-dd95fb7d4-n9ddh 2/2 Running 0 15m
monitoring prometheus-operator-kube-state-metrics-5d7558d7cc-h8xgg 1/1 Running 0 15m
monitoring prometheus-operator-operator-67895dd7c5-nqj7w 1/1 Running 0 15m
monitoring prometheus-operator-prometheus-node-exporter-qp8gp 1/1 Running 0 15m
monitoring prometheus-prometheus-operator-prometheus-0 3/3 Running 1 14m
If, however, you do *NOT* have the required IAM policies in place, then the
cluster-autoscaler will CrashLoopBackOff. The steps below reproduce the
issue. (I ran them myself as well, to double-check that the failure is
actually reproducible.)
$ eksctl create cluster --name mc-1237-testing
[ℹ] eksctl version 0.11.1
[ℹ] using region ap-northeast-1
[ℹ] setting availability zones to [ap-northeast-1d ap-northeast-1a ap-northeast-1c]
[ℹ] subnets for ap-northeast-1d - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ] subnets for ap-northeast-1a - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ] subnets for ap-northeast-1c - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ] nodegroup "ng-7bfc0f1f" will use "ami-02e124a380df41614" [AmazonLinux2/1.14]
[ℹ] using Kubernetes version 1.14
[ℹ] creating EKS cluster "mc-1237-testing" in "ap-northeast-1" region with un-managed nodes
[ℹ] will create 2 separate CloudFormation stacks for cluster itself and the initial nodegroup
[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-northeast-1 --cluster=mc-1237-testing'
[ℹ] CloudWatch logging will not be enabled for cluster "mc-1237-testing" in "ap-northeast-1"
[ℹ] you can enable it with 'eksctl utils update-cluster-logging --region=ap-northeast-1 --cluster=mc-1237-testing'
[ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "mc-1237-testing" in "ap-northeast-1"
[ℹ] 2 sequential tasks: { create cluster control plane "mc-1237-testing", create nodegroup "ng-7bfc0f1f" }
[ℹ] building cluster stack "eksctl-mc-1237-testing-cluster"
[ℹ] deploying stack "eksctl-mc-1237-testing-cluster"
[ℹ] building nodegroup stack "eksctl-mc-1237-testing-nodegroup-ng-7bfc0f1f"
[ℹ] --nodes-min=2 was set automatically for nodegroup ng-7bfc0f1f
[ℹ] --nodes-max=2 was set automatically for nodegroup ng-7bfc0f1f
[ℹ] deploying stack "eksctl-mc-1237-testing-nodegroup-ng-7bfc0f1f"
[✔] all EKS cluster resources for "mc-1237-testing" have been created
[✔] saved kubeconfig as "${HOME}/.kube/config"
[ℹ] adding identity "arn:aws:iam::083751696308:role/eksctl-mc-1237-testing-nodegroup-NodeInstanceRole-KGOKLPVNIK10" to auth ConfigMap
[ℹ] nodegroup "ng-7bfc0f1f" has 0 node(s)
[ℹ] waiting for at least 2 node(s) to become ready in "ng-7bfc0f1f"
[ℹ] nodegroup "ng-7bfc0f1f" has 2 node(s)
[ℹ] node "ip-192-168-2-23.ap-northeast-1.compute.internal" is ready
[ℹ] node "ip-192-168-48-84.ap-northeast-1.compute.internal" is ready
[ℹ] kubectl command should work with "${HOME}/.kube/config", try 'kubectl get nodes'
[✔] EKS cluster "mc-1237-testing" in "ap-northeast-1" region is ready
$ EKSCTL_EXPERIMENTAL=true eksctl enable repo \
> --cluster mc-1237-testing \
> --region ap-northeast-1 \
> --git-email ***@***.*** \
> --git-url ***@***.***:marccarre/my-gitops-repo.git
[ℹ] Generating public key infrastructure for the Helm Operator and Tiller
[ℹ] this may take up to a minute, please be patient
[!] Public key infrastructure files were written into directory "/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/eksctl-helm-pki563648596"
[!] please move the files into a safe place or delete them
[ℹ] Generating manifests
[ℹ] Cloning ***@***.***:marccarre/my-gitops-repo.git
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/eksctl-install-flux-clone-026154915'...
remote: Enumerating objects: 43, done.
remote: Counting objects: 100% (43/43), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 431 (delta 9), reused 35 (delta 3), pack-reused 388
Receiving objects: 100% (431/431), 177.90 KiB | 497.00 KiB/s, done.
Resolving deltas: 100% (155/155), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] Writing Flux manifests
[ℹ] created "Namespace/flux"
[ℹ] Applying Helm TLS Secret(s)
[ℹ] created "flux:Secret/flux-helm-tls-cert"
[ℹ] created "flux:Secret/tiller-secret"
[!] Note: certificate secrets aren't added to the Git repository for security reasons
[ℹ] Applying manifests
[ℹ] created "flux:ServiceAccount/flux"
[ℹ] created "ClusterRole.rbac.authorization.k8s.io/flux"
[ℹ] created "ClusterRoleBinding.rbac.authorization.k8s.io/flux"
[ℹ] created "CustomResourceDefinition.apiextensions.k8s.io/helmreleases.helm.fluxcd.io"
[ℹ] created "flux:Service/memcached"
[ℹ] created "flux:ServiceAccount/tiller"
[ℹ] created "ClusterRoleBinding.rbac.authorization.k8s.io/tiller"
[ℹ] created "flux:ServiceAccount/helm"
[ℹ] created "flux:Role.rbac.authorization.k8s.io/tiller-user"
[ℹ] created "kube-system:RoleBinding.rbac.authorization.k8s.io/tiller-user-binding"
[ℹ] created "flux:Deployment.extensions/tiller-deploy"
[ℹ] created "flux:Deployment.apps/flux"
[ℹ] created "flux:ConfigMap/flux-helm-tls-ca-config"
[ℹ] created "flux:Deployment.apps/flux-helm-operator"
[ℹ] created "flux:Deployment.apps/memcached"
[ℹ] created "flux:Secret/flux-git-deploy"
[ℹ] created "flux:ServiceAccount/flux-helm-operator"
[ℹ] created "ClusterRole.rbac.authorization.k8s.io/flux-helm-operator"
[ℹ] created "ClusterRoleBinding.rbac.authorization.k8s.io/flux-helm-operator"
[ℹ] created "flux:Service/tiller-deploy"
[ℹ] Waiting for Helm Operator to start
[ℹ] Helm Operator started successfully
[ℹ] see https://docs.fluxcd.io/projects/helm-operator for details on how to use the Helm Operator
[ℹ] Waiting for Flux to start
[ℹ] Flux started successfully
[ℹ] see https://docs.fluxcd.io/projects/flux for details on how to use Flux
[ℹ] Committing and pushing manifests to ***@***.***:marccarre/my-gitops-repo.git
[master f8e0c52] Add Initial Flux configuration
13 files changed, 803 insertions(+)
create mode 100644 flux/flux-account.yaml
create mode 100644 flux/flux-deployment.yaml
create mode 100644 flux/flux-helm-operator-account.yaml
create mode 100644 flux/flux-helm-release-crd.yaml
create mode 100644 flux/flux-namespace.yaml
create mode 100644 flux/flux-secret.yaml
create mode 100644 flux/helm-operator-deployment.yaml
create mode 100644 flux/memcache-dep.yaml
create mode 100644 flux/memcache-svc.yaml
create mode 100644 flux/tiller-ca-cert-configmap.yaml
create mode 100644 flux/tiller-dep.yaml
create mode 100644 flux/tiller-rbac.yaml
create mode 100644 flux/tiller-svc.yaml
Enumerating objects: 17, done.
Counting objects: 100% (17/17), done.
Delta compression using up to 8 threads
Compressing objects: 100% (15/15), done.
Writing objects: 100% (16/16), 9.33 KiB | 9.33 MiB/s, done.
Total 16 (delta 1), reused 12 (delta 1)
remote: Resolving deltas: 100% (1/1), done.
To github.com:marccarre/my-gitops-repo.git
4b9a79d..f8e0c52 master -> master
[ℹ] Flux will only operate properly once it has write-access to the Git repository
[ℹ] please configure ***@***.***:marccarre/my-gitops-repo.git so that the following Flux SSH public key has write access to it
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCxoYrh1xqsHGQuJZnsY2hiOyplanBS/wmLQaxyPu2eMexmG1uy4Vq+e1qHQ6ukTlPSV92N2diz7Mml/VnfMIu6/S6WpcEa8s8cX+4X2w4DN5VGcOdMbRa76Td6me1Kp7X4BvQSpmtfj380+7dY+yxywTVf97ZFYq1atitxvjgVHIUCDLAXxqmM2t7OnH5nYEJFS+32BRmENMpzEfB+31PiOAgsUHENA4BCr0sbxDpKt3j4hzJbntgYQVyhaNLBH8S34Ogz1V0i8H5iplJ6YjsNXpeUhmRYFH4rKOTi0EJv7wEWMEH1gttQvLxhHAd6s4qDMB27aQSJFMh55/DW/r6Z
### Then added the above SSH key to https://github.com/marccarre/my-gitops-repo/deploy_keys
$ kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
flux flux-7696dbc4cd-4h927 1/1 Running 0 69s
flux flux-helm-operator-8687676b89-hskbj 1/1 Running 0 68s
flux memcached-5dcd7579-tpkvd 1/1 Running 0 69s
flux tiller-deploy-69547b56b4-sp9md 1/1 Running 0 69s
kube-system aws-node-97px5 1/1 Running 0 7m5s
kube-system aws-node-kxbzd 1/1 Running 0 7m5s
kube-system coredns-699bb99bf8-sn7ws 1/1 Running 0 13m
kube-system coredns-699bb99bf8-zx26g 1/1 Running 0 13m
kube-system kube-proxy-t2rvs 1/1 Running 0 7m5s
kube-system kube-proxy-tkncf 1/1 Running 0 7m5s
$ EKSCTL_EXPERIMENTAL=true eksctl enable profile app-dev \
> --cluster mc-1237-testing \
> --region ap-northeast-1 \
> --git-email ***@***.*** \
> --git-url ***@***.***:marccarre/my-gitops-repo.git
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/my-gitops-repo-130038557'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 435 (delta 10), reused 38 (delta 3), pack-reused 388
Receiving objects: 100% (435/435), 179.62 KiB | 494.00 KiB/s, done.
Resolving deltas: 100% (156/156), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] cloning repository "https://github.com/weaveworks/eks-quickstart-app-dev":master
Cloning into '/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/quickstart-019213272'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 214 (delta 0), reused 0 (delta 0), pack-reused 209
Receiving objects: 100% (214/214), 57.27 KiB | 322.00 KiB/s, done.
Resolving deltas: 100% (92/92), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
[ℹ] processing template files in repository
[ℹ] writing new manifests to "/var/folders/24/d3mml6bn20nftpt91cfldq1h0000gn/T/my-gitops-repo-130038557/base"
[master 5e6bcf5] Add app-dev quickstart components
27 files changed, 1380 insertions(+)
create mode 100644 base/LICENSE
create mode 100644 base/README.md
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-configmap.yaml
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-daemonset.yaml
create mode 100644 base/amazon-cloudwatch/cloudwatch-agent-rbac.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-configmap-cluster-info.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-configmap-fluentd-config.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-daemonset.yaml
create mode 100644 base/amazon-cloudwatch/fluentd-rbac.yaml
create mode 100644 base/demo/helm-release.yaml
create mode 100644 base/kube-system/alb-ingress-controller-deployment.yaml
create mode 100644 base/kube-system/alb-ingress-controller-rbac.yaml
create mode 100644 base/kube-system/cluster-autoscaler-deployment.yaml
create mode 100644 base/kube-system/cluster-autoscaler-rbac.yaml
create mode 100644 base/kubernetes-dashboard/dashboard-metrics-scraper-deployment.yaml
create mode 100644 base/kubernetes-dashboard/dashboard-metrics-scraper-service.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-configmap.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-deployment.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-rbac.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-secrets.yaml
create mode 100644 base/kubernetes-dashboard/kubernetes-dashboard-service.yaml
create mode 100644 base/monitoring/metrics-server.yaml
create mode 100644 base/monitoring/prometheus-operator.yaml
create mode 100644 base/namespaces/amazon-cloudwatch.yaml
create mode 100644 base/namespaces/demo.yaml
create mode 100644 base/namespaces/kubernetes-dashboard.yaml
create mode 100644 base/namespaces/monitoring.yaml
Enumerating objects: 37, done.
Counting objects: 100% (37/37), done.
Delta compression using up to 8 threads
Compressing objects: 100% (28/28), done.
Writing objects: 100% (36/36), 13.52 KiB | 13.52 MiB/s, done.
Total 36 (delta 7), reused 25 (delta 7)
remote: Resolving deltas: 100% (7/7), done.
To github.com:marccarre/my-gitops-repo.git
f8e0c52..5e6bcf5 master -> master
$ kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
amazon-cloudwatch cloudwatch-agent-6km5b 1/1 Running 0 109m
amazon-cloudwatch cloudwatch-agent-kcpb9 1/1 Running 0 109m
amazon-cloudwatch fluentd-cloudwatch-8wxxn 1/1 Running 0 109m
amazon-cloudwatch fluentd-cloudwatch-nst52 1/1 Running 0 109m
demo podinfo-67b7886b6c-pjws4 1/1 Running 0 109m
flux flux-7696dbc4cd-4h927 1/1 Running 0 116m
flux flux-helm-operator-8687676b89-hskbj 1/1 Running 0 115m
flux memcached-5dcd7579-tpkvd 1/1 Running 0 116m
flux tiller-deploy-69547b56b4-sp9md 1/1 Running 0 116m
kube-system alb-ingress-controller-776b5b58c9-bbt7t 1/1 Running 0 109m
kube-system aws-node-97px5 1/1 Running 0 121m
kube-system aws-node-kxbzd 1/1 Running 0 121m
kube-system cluster-autoscaler-55d556f787-rm7cc 0/1 CrashLoopBackOff 25 109m
kube-system coredns-699bb99bf8-sn7ws 1/1 Running 0 128m
kube-system coredns-699bb99bf8-zx26g 1/1 Running 0 128m
kube-system kube-proxy-t2rvs 1/1 Running 0 121m
kube-system kube-proxy-tkncf 1/1 Running 0 121m
kubernetes-dashboard dashboard-metrics-scraper-65785bfbc-52952 1/1 Running 0 109m
kubernetes-dashboard kubernetes-dashboard-76b969b44b-hf9kd 1/1 Running 0 109m
monitoring alertmanager-prometheus-operator-alertmanager-0 2/2 Running 0 108m
monitoring metrics-server-5df4599bd7-l5b8q 1/1 Running 0 109m
monitoring prometheus-operator-grafana-dd95fb7d4-gzqxn 2/2 Running 0 109m
monitoring prometheus-operator-kube-state-metrics-5d7558d7cc-qx4tl 1/1 Running 0 109m
monitoring prometheus-operator-operator-67895dd7c5-nhbbv 1/1 Running 0 109m
monitoring prometheus-operator-prometheus-node-exporter-77nb6 1/1 Running 0 109m
monitoring prometheus-operator-prometheus-node-exporter-hfdv9 1/1 Running 0 109m
monitoring prometheus-prometheus-operator-prometheus-0 3/3 Running 1 108m
|
Which instructions were you following exactly @ilanpillemer? (Could you please share a link to them, to ensure we are on the same page, and/or so that we know whether we need to update/correct anything published elsewhere? 🙇 ) If you are talking about something other than this, would you have any suggestion for making these instructions clearer? Note that the pre-requisites for the
Yes, this is what the first two commands I shared here were meant to show, i.e.:
|
Yes. With hindsight, it now seems completely obvious what I had to do. I think a very minor tweak would help: I used the gitops quickstart guide at eksctl.io, and when I look now, it only says some variant of the command should be used. Perhaps a few more words would help, for example: "if you need the autoscaler or ALB ingress, use the necessary switches, which you can find in the documentation". Or something similar. Great work with eksctl and flux, they are game changing. |
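For readers who prefer CLI flags over a config file, a sketch of what those "switches" might look like. The flag names here are assumptions based on eksctl ~0.11 (verify them with `eksctl create cluster --help` before relying on this):

```shell
# Illustrative cluster name and region. The --asg-access flag is assumed
# to attach the IAM policy the cluster autoscaler needs, and
# --alb-ingress-access the policy the ALB ingress controller needs.
eksctl create cluster \
  --name my-cluster \
  --region ap-northeast-1 \
  --asg-access \
  --alb-ingress-access
```

The config-file form (`iam.withAddonPolicies` in a `ClusterConfig`) shown earlier in this thread is equivalent and easier to keep in version control alongside the gitops repo.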
What happened?
Deployed with eksctl gitops apply, and after deployment and adding Flux's SSH key to my gitops repo, the cluster autoscaler doesn't start.
My cluster looks as follows:
The error in the logs of the cluster-autoscaler container is: