Error deploying the Cluster-Monitoring #94

Closed
CHoldinghausen opened this issue May 18, 2021 · 1 comment
Comments

@CHoldinghausen

Hello Experts,

I used viya4-iac to deploy the infrastructure and am now using viya4-deployment to deploy the new SAS version 2020.1.5. Unfortunately, I got an error during the deployment of the monitoring components.

I think this task ran into a timeout (it ran for about 20 minutes and Helm reports "timed out waiting for the condition"), but I'm not 100% sure. Can you please help me out here?

`TASK [monitoring : cluster-monitoring - deploy] ********************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "/tmp/ansible._q1dqpdw/viya4-monitoring-kubernetes/monitoring/bin/deploy_monitoring_cluster.sh", "delta": "0:20:39.446237", "end": "2021-05-18 13:33:11.933409", "msg": "non-zero return code", "rc": 1, "start": "2021-05-18 13:12:32.487172", "stderr": "Error: release v4m-prometheus-operator failed, and has been uninstalled due to atomic being set: timed out waiting for the condition", "stderr_lines": ["Error: release v4m-prometheus-operator failed, and has been uninstalled due to atomic being set: timed out waiting for the condition"], "stdout": "Helm client version: 3.5.4\nKubernetes client version: v1.18.8\nKubernetes server version: v1.18.14\n\nDeploying monitoring to the [monitoring] namespace...\nAdding [stable] helm repository\n"stable" has been added to your repositories\nAdding [prometheus-community] helm repository\n"prometheus-community" has been added to your repositories\nUpdating helm repositories...\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the "prometheus-community" chart repository\n...Successfully got an update from the "stable" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\nUpdating Prometheus Operator custom resource definitions\ncustomresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com configured\ncustomresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com configured\nProvisioning TLS-enabled Prometheus datasource for Grafana...\nconfigmap "grafana-datasource-prom-https" deleted\nconfigmap/grafana-datasource-prom-https created\nconfigmap/grafana-datasource-prom-https labeled\nEnabling Prometheus node-exporter for TLS...\nconfigmap "node-exporter-tls-web-config" deleted\nconfigmap/node-exporter-tls-web-config created\nconfigmap/node-exporter-tls-web-config labeled\nUser response file: [/tmp/ansible._q1dqpdw/monitoring/user-values-prom-operator.yaml]\nDeploying the Kube Prometheus Stack. This may take a few minutes...\nInstalling via Helm...(Tue May 18 13:12:48 UTC 2021 - timeout 20m)\nRelease "v4m-prometheus-operator" does not exist. 
Installing it now.", "stdout_lines": ["Helm client version: 3.5.4", "Kubernetes client version: v1.18.8", "Kubernetes server version: v1.18.14", "", "Deploying monitoring to the [monitoring] namespace...", "Adding [stable] helm repository", ""stable" has been added to your repositories", "Adding [prometheus-community] helm repository", ""prometheus-community" has been added to your repositories", "Updating helm repositories...", "Hang tight while we grab the latest from your chart repositories...", "...Successfully got an update from the "prometheus-community" chart repository", "...Successfully got an update from the "stable" chart repository", "Update Complete. ⎈Happy Helming!⎈", "Updating Prometheus Operator custom resource definitions", "customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com configured", "customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com configured", "Provisioning TLS-enabled Prometheus datasource for Grafana...", "configmap "grafana-datasource-prom-https" deleted", "configmap/grafana-datasource-prom-https created", "configmap/grafana-datasource-prom-https labeled", "Enabling Prometheus node-exporter for TLS...", "configmap "node-exporter-tls-web-config" deleted", "configmap/node-exporter-tls-web-config created", "configmap/node-exporter-tls-web-config labeled", "User response file: 
[/tmp/ansible._q1dqpdw/monitoring/user-values-prom-operator.yaml]", "Deploying the Kube Prometheus Stack. This may take a few minutes...", "Installing via Helm...(Tue May 18 13:12:48 UTC 2021 - timeout 20m)", "Release "v4m-prometheus-operator" does not exist. Installing it now."]}

PLAY RECAP *********************************************************************
localhost : ok=89 changed=24 unreachable=0 failed=1 skipped=45 rescued=0 ignored=0

Tuesday 18 May 2021 13:33:11 +0000 (0:20:39.599) 0:23:05.642 ***********

monitoring : cluster-monitoring - deploy ----------------------------- 1239.60s
vdm : manifest - deploy ------------------------------------------------ 75.55s
vdm : kustomize - Generate deployment manifest ------------------------- 28.93s
vdm : prereqs - cluster-local deploy ------------------------------------ 4.82s
vdm : prereqs - cluster-wide -------------------------------------------- 4.21s
vdm : copy - VDM generators --------------------------------------------- 3.09s
vdm : assets - Download ------------------------------------------------- 2.06s
vdm : assets - Get License ---------------------------------------------- 1.97s
vdm : copy - VDM transformers ------------------------------------------- 1.91s
monitoring : v4m - download --------------------------------------------- 1.80s
vdm : Download viya4-orders-cli ----------------------------------------- 1.25s
nfs-subdir-external-provisioner : Deploy nfs-subdir-external-provisioner --- 1.18s
cert-manager : Deploy cert-manager -------------------------------------- 1.00s
vdm : assets - Extract downloaded assets -------------------------------- 0.86s
nfs-subdir-external-provisioner : Remove deprecated efs-provisioner namespace --- 0.82s
metrics-server : Check for metrics service ------------------------------ 0.80s
jump-server : jump-server - lookup groups ------------------------------- 0.76s
vdm : Create namespace -------------------------------------------------- 0.73s
monitoring : cluster-monitoring - lookup creds -------------------------- 0.71s
Gathering Facts --------------------------------------------------------- 0.71s
`
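One way to check the timeout theory is to compare the timestamps in the output above with the `timeout 20m` that the Helm install step reports. This is only a minimal sketch in Python: the timestamps are copied from the log, the variable names are made up for illustration, and it assumes the Ansible `start`/`end` stamps and the Helm start stamp come from the same UTC clock (the play recap shows `+0000`).

```python
from datetime import datetime, timedelta

# Values copied from the Ansible output above.
TASK_START = "2021-05-18 13:12:32.487172"  # "start" field of the failed task
TASK_END = "2021-05-18 13:33:11.933409"    # "end" field of the failed task
HELM_START = "2021-05-18 13:12:48"         # "Installing via Helm...(... 13:12:48 UTC 2021 - timeout 20m)"
HELM_TIMEOUT = timedelta(minutes=20)       # "timeout 20m" from the same line


def parse(ts: str) -> datetime:
    """Parse a timestamp with or without fractional seconds."""
    fmt = "%Y-%m-%d %H:%M:%S.%f" if "." in ts else "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(ts, fmt)


# Total runtime of the Ansible task (should match the "delta" field).
task_elapsed = parse(TASK_END) - parse(TASK_START)

# Time between the Helm install starting and the task failing.
helm_elapsed = parse(TASK_END) - parse(HELM_START)

# If Helm ran for at least the configured timeout, the failure is
# consistent with "timed out waiting for the condition".
timed_out = helm_elapsed >= HELM_TIMEOUT

print("task elapsed:", task_elapsed)   # 0:20:39.446237
print("helm elapsed:", helm_elapsed)   # 0:20:23.933409
print("consistent with 20m timeout:", timed_out)
```

The Helm step ran for slightly more than 20 minutes before the task failed, which fits the `timed out waiting for the condition` message. It does not say why the release did not become ready within that window.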
Best Regards
Carsten

@CHoldinghausen
Author

After rerunning it a few times, it just finished.
