Re-introduce DPCR Loki logging for GCP and Azure clusters #39064
Initial tests show about 1.9 million log lines sent to Loki for a single job run. About 1 million of them were audit logs, so this will eliminate roughly half our logging load by itself. Remove unused mounts for the audit logs.
Will fail a test without this:
[sig-arch] Managed cluster should set requests but not limits [Suite:openshift/conformance/parallel]
Run #0: Failed 6s
{ fail [github.com/openshift/origin/test/extended/operators/resources.go:196]: May 5 09:10:34.626: Pods in platform namespaces are not following resource request/limit rules or do not have an exception granted:
apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token does not have a cpu request (rule: "apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token/request[cpu]")
apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token does not have a memory request (rule: "apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token/request[memory]")
/pj-rehearse periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade
/pj-rehearse periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-ovn-rt-upgrade
/pj-rehearse periodic-ci-openshift-release-master-ci-4.14-e2e-azure-ovn-upgrade
/pj-rehearse periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-sdn-upgrade
@dgoodwin: The following test failed.
[REHEARSALNOTIFIER]
A total of 12431 jobs have been affected by this change. The above listing is non-exhaustive and limited to 35 jobs. Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dgoodwin, stbenjam
…39064)
* Revert "Revert "Enable DPCR Loki for specific set of jobs (openshift#38914)""
  This reverts commit 2b7a44f.
* Stop sending audit logs to Loki
  Initial tests show about 1.9 million log lines sent to Loki for a single job run. About 1 million of them were audit logs, so this will eliminate roughly half our logging load by itself. Remove unused mounts for the audit logs.
* Set resource requests on new promtail prod-bearer-token container
  Will fail a test without this:
  [sig-arch] Managed cluster should set requests but not limits [Suite:openshift/conformance/parallel]
  Run #0: Failed 6s
  { fail [github.com/openshift/origin/test/extended/operators/resources.go:196]: May 5 09:10:34.626: Pods in platform namespaces are not following resource request/limit rules or do not have an exception granted:
  apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token does not have a cpu request (rule: "apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token/request[cpu]")
  apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token does not have a memory request (rule: "apps/v1/DaemonSet/openshift-e2e-loki/loki-promtail/container/prod-bearer-token/request[memory]")
* Enable Loki logging for all Azure jobs
In Makefile, ensure ADDITIONAL_MANIFEST_DIR exists before trying to move its content. Needed for trying to resolve the issue in e2e-metal-single-node-live-iso job[*]
```
mv /root/sno-additional-manifests/* /home/sno/sno-additional-manifests/
mv: cannot stat '/root/sno-additional-manifests/*': No such file or directory
make: *** [Makefile:280: deploy_ibip] Error 1
```
Note: could be related to openshift/release#39064
[*] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_assisted-test-infra/2143/pull-ci-openshift-assisted-test-infra-master-e2e-metal-single-node-live-iso/1655682994788110336/build-log.txt
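The `mv` failure quoted above happens because the glob matches nothing when the source directory is missing or empty. One way to guard such a move is sketched below. This is an illustration under assumptions, not the actual Makefile change: the function name `move_manifests` is hypothetical, and the paths in the usage comment are the ones from the error output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the referenced Makefile): move every entry of
# a source directory into a destination directory, and do nothing if the
# source directory does not exist.
move_manifests() {
  local src="$1" dst="$2"

  # Guard: a missing source directory is treated as "nothing to move".
  if [ ! -d "$src" ]; then
    echo "skip: $src does not exist"
    return 0
  fi

  mkdir -p "$dst"

  # Use find instead of a shell glob so that an empty directory is a no-op
  # rather than an error. Unlike "src/*", this also moves dotfiles.
  find "$src" -mindepth 1 -maxdepth 1 -exec mv {} "$dst"/ \;
}

# Example call with the paths from the log (commented out so that sourcing
# this file has no side effects):
# move_manifests /root/sno-additional-manifests /home/sno/sno-additional-manifests
```

Returning success when the source is absent keeps the calling `make` target from aborting. Whether that is the desired behavior depends on whether the additional manifests are optional for the job.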
TRT-968