add monitoring to kind #2190

Closed
wants to merge 1 commit into from

aojea
Contributor

@aojea aojea commented Apr 9, 2021

Install a Prometheus deployment that collects metrics from all
the Kubernetes components.
Modify kube-proxy to listen for metrics on all addresses; by default
it listens only on localhost.
When the cluster is destroyed, the Prometheus snapshot is dumped
as a tarball, so it can be analyzed locally.
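
The kube-proxy change could be sketched roughly as follows. This is an illustration only, not the patch from this PR: it assumes the `metricsBindAddress` field of `KubeProxyConfiguration` (whose default is `127.0.0.1:10249`), and the helper name `kube_proxy_metrics_patch` is hypothetical. The output is JSON, which is also valid YAML.

```python
import json

# Assumption: kube-proxy serves metrics on 127.0.0.1:10249 by default,
# which a Prometheus pod elsewhere in the cluster cannot reach.
DEFAULT_METRICS_PORT = 10249


def kube_proxy_metrics_patch(port: int = DEFAULT_METRICS_PORT) -> dict:
    """Build a KubeProxyConfiguration fragment exposing metrics on all addresses.

    Hypothetical helper for illustration; not taken from the PR.
    """
    return {
        "apiVersion": "kubeproxy.config.k8s.io/v1alpha1",
        "kind": "KubeProxyConfiguration",
        "metricsBindAddress": f"0.0.0.0:{port}",
    }


if __name__ == "__main__":
    # Print the fragment so it can be pasted into a config patch.
    print(json.dumps(kube_proxy_metrics_patch(), indent=2))
```

Binding to `0.0.0.0` exposes the endpoint on every interface of the node, which is acceptable for a throwaway test cluster but worth reconsidering anywhere else.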

https://suraj.io/post/how-to-backup-and-restore-prometheus/
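
For reference, the snapshot-and-tarball step might look roughly like this. It is a sketch under stated assumptions, not the script in this PR: it assumes Prometheus was started with `--web.enable-admin-api` so that `POST /api/v1/admin/tsdb/snapshot` is available and that snapshots land under `<data-dir>/snapshots/<name>`. The function names and arguments are hypothetical.

```python
import json
import tarfile
import urllib.request
from pathlib import Path


def parse_snapshot_name(body: str) -> str:
    """Extract the snapshot directory name from the admin API response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"snapshot failed: {payload}")
    return payload["data"]["name"]


def take_snapshot(prometheus_url: str) -> str:
    """Ask Prometheus to snapshot its TSDB and return the snapshot name.

    Assumes the admin API is enabled on the target server.
    """
    req = urllib.request.Request(
        f"{prometheus_url}/api/v1/admin/tsdb/snapshot", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return parse_snapshot_name(resp.read().decode("utf-8"))


def pack_snapshot(data_dir: Path, name: str, out: Path) -> Path:
    """Archive <data_dir>/snapshots/<name> into a gzip-compressed tarball."""
    src = data_dir / "snapshots" / name
    with tarfile.open(out, "w:gz") as tar:
        # arcname keeps the archive rooted at the snapshot name
        # instead of embedding the full path on the node.
        tar.add(src, arcname=name)
    return out
```

To inspect the result locally, one option is to extract the archive into the storage directory of a local Prometheus instance and query it there.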

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 9, 2021
@aojea
Contributor Author

aojea commented Apr 9, 2021

/hold
I want to investigate the CI performance further from the cluster's point of view

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2021
@aojea aojea force-pushed the monitoring branch 2 times, most recently from a86fd6e to 44dfb51 Compare April 9, 2021 22:13
@aojea
Contributor Author

aojea commented Apr 9, 2021

niiiiiiiiiiiiiiiiiiiiiiiice

image

@aojea aojea force-pushed the monitoring branch 2 times, most recently from ea4741b to 7cf75c9 Compare April 15, 2021 11:15
@k8s-ci-robot
Contributor

@aojea: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-kind-conformance-parallel-ga-only | af9fdfd | link | /test pull-kind-conformance-parallel-ga-only |
| pull-kind-e2e-kubernetes-1-20 | af9fdfd | link | /test pull-kind-e2e-kubernetes-1-20 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@BenTheElder
Member

I think we should discuss having a mode where e2e.test in k/k is responsible for installing monitoring and snapshotting it 🤔

@aojea
Contributor Author

aojea commented Jun 24, 2021

> I think we should discuss having a mode where e2e.test in k/k is responsible for installing monitoring and snapshotting it

let's fork this to k/k then. I find it really useful in the OpenShift CI, where they do this kind of metrics snapshotting

> I think we should discuss having a mode where e2e.test in k/k is responsible for installing monitoring and snapshotting it

yeah, let's discuss it there:
kubernetes/kubernetes#103145

@aojea aojea closed this Jun 24, 2021
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.