Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add crio metrics as an endpoint of nodes #228

Merged
merged 2 commits into from
Feb 8, 2019

Conversation

brancz
Copy link
Contributor

@brancz brancz commented Jan 31, 2019

As this needs a bit of a hack, I went ahead and did the relabelling rules to make this work.

@squat @s-urbaniak @mxinden @metalmatze

cc @runcom

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 31, 2019
@brancz
Copy link
Contributor Author

brancz commented Jan 31, 2019

Still testing this so putting a hold on this for now.

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 31, 2019
@brancz
Copy link
Contributor Author

brancz commented Jan 31, 2019

I am currently seeing a problem where consistently there are nodes that cannot be scraped. @runcom is investigating.

screenshot from 2019-01-31 17-00-28

@runcom
Copy link
Member

runcom commented Feb 5, 2019

so, I've been testing this on a brand new cluster from today and I'm perfectly able to grab metrics from each node

@brancz
Copy link
Contributor Author

brancz commented Feb 5, 2019

Got it, in that case I will give it another try.

@brancz
Copy link
Contributor Author

brancz commented Feb 6, 2019

I think I have an idea what's happening. This seems to only happen to workers, I think the security group rules are probably off.

@runcom
Copy link
Member

runcom commented Feb 7, 2019

latest installer now has openshift/installer#1201 which should open ports for worker<->worker communication for metrics

@brancz
Copy link
Contributor Author

brancz commented Feb 7, 2019

Awesome. I'm going to give this a try soon.

@brancz
Copy link
Contributor Author

brancz commented Feb 7, 2019

Looking good! This is all the successful scrapes of crio of this e2e run:

screenshot from 2019-02-07 22-50-24

Ready for a review now!

/hold cancel

@runcom
Copy link
Member

runcom commented Feb 7, 2019

lgtm (fwiw)

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 7, 2019
@mrunalp
Copy link
Member

mrunalp commented Feb 7, 2019

👍 lgtm

@s-urbaniak
Copy link
Contributor

@brancz can we add the up query to the existing e2e tests too?

queries := []struct {
query string
expectN int
}{
{
query: `up{job="node-exporter"} == 1`,
expectN: 1,
}, {
query: `up{job="kubelet"} == 1`,
expectN: 1,
}, {
query: `up{job="scheduler"} == 1`,
expectN: 1,
}, {
query: `up{job="kube-controller-manager"} == 1`,
expectN: 1,
}, {
query: `up{job="apiserver"} == 1`,
expectN: 1,
}, {
query: `up{job="kube-state-metrics"} == 1`,
expectN: 1,
}, {
query: `up{job="prometheus-k8s"} == 1`,
expectN: 1,
}, {
query: `up{job="prometheus-operator"} == 1`,
expectN: 1,
}, {
query: `up{job="alertmanager-main"} == 1`,
expectN: 2,
}, {
query: `namespace:container_memory_usage_bytes:sum`,
expectN: 1,
},
}

@brancz
Copy link
Contributor Author

brancz commented Feb 8, 2019

/retest

1 similar comment
@brancz
Copy link
Contributor Author

brancz commented Feb 8, 2019

/retest

@s-urbaniak
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 8, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, s-urbaniak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@brancz
Copy link
Contributor Author

brancz commented Feb 8, 2019

/retest

1 similar comment
@brancz
Copy link
Contributor Author

brancz commented Feb 8, 2019

/retest

@openshift-merge-robot openshift-merge-robot merged commit 8943726 into openshift:master Feb 8, 2019
@brancz brancz deleted the add-crio branch February 8, 2019 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants