Bug 1993854: add metrics for multi interface #50

Merged
merged 3 commits into from Aug 23, 2021

Conversation

aneeshkp
Contributor

@aneeshkp aneeshkp commented Aug 12, 2021

This PR fixes the Prometheus metrics for both the multi-interface and the single-interface case.
The previous PR added the config filename to the log in place of the interface name, and the signature of the extract-metrics method was changed to take the list of interfaces to parse, matching the port ID from the role log line to determine each interface's role.
This PR builds on those changes to update the Prometheus metrics, adding new metrics and updating the existing ones.
The following metrics are introduced:
openshift_ptp_clock_state{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"}
openshift_ptp_ptp_interface_role{iface="ens5f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
openshift_ptp_delay_from_system{iface="ens7f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 2720
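
For readers unfamiliar with the format, each sample above is a metric name, a label set (`iface`, `node`, `process`) and a value. A minimal sketch of how one such line is put together, using only the standard library; `sampleLine` is a hypothetical helper for illustration and not code from this PR, which registers its gauges through the daemon's own metrics code:

```go
package main

import (
	"fmt"
	"strconv"
)

// sampleLine is a hypothetical helper (not part of the PR). It renders one
// gauge sample in the Prometheus text exposition format, using the three
// labels that appear on every metric in this PR.
func sampleLine(name, iface, node, process string, value float64) string {
	return fmt.Sprintf("%s{iface=%q,node=%q,process=%q} %s",
		name, iface, node, process,
		strconv.FormatFloat(value, 'g', -1, 64))
}

func main() {
	fmt.Println(sampleLine("openshift_ptp_ptp_interface_role",
		"ens5f1", "cnfde7.ptp.lab.eng.bos.redhat.com", "ptp4l", 1))
}
```

Because `iface` is part of the label set, each interface gets its own time series, which is what makes the multi-interface case distinguishable.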

Single interface Metrics

# HELP openshift_ptp_clock_state 0 = FREERUN, 1 = LOCKED, 2 = HOLDOVER
# TYPE openshift_ptp_clock_state gauge
openshift_ptp_clock_state{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
// The Linux PTP daemon uses summary updates (rms) in its logs, so there is no lock state for CLOCK_REALTIME. The sidecar discourages -u, so the lock state is available in the sidecar.
# HELP openshift_ptp_delay_from_master 
# TYPE openshift_ptp_delay_from_master gauge
openshift_ptp_delay_from_master{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 1234
openshift_ptp_delay_from_master{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 80
# HELP openshift_ptp_frequency_adjustment 
# TYPE openshift_ptp_frequency_adjustment gauge
openshift_ptp_frequency_adjustment{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} -77549
openshift_ptp_frequency_adjustment{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} -2011
# HELP openshift_ptp_max_offset_from_master 
# TYPE openshift_ptp_max_offset_from_master gauge
openshift_ptp_max_offset_from_master{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 0
openshift_ptp_max_offset_from_master{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 3
# HELP openshift_ptp_offset_from_master 
# TYPE openshift_ptp_offset_from_master gauge
openshift_ptp_offset_from_master{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 0
openshift_ptp_offset_from_master{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 3
# HELP openshift_ptp_ptp_interface_role 0 = PASSIVE 1 = SLAVE 2 = MASTER 3 = FAULTY
# TYPE openshift_ptp_ptp_interface_role gauge
openshift_ptp_ptp_interface_role{iface="ens5f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1

Faulty interface

# HELP openshift_ptp_ptp_interface_role 0 = PASSIVE 1 = SLAVE 2 = MASTER 3 = FAULTY
# TYPE openshift_ptp_ptp_interface_role gauge
openshift_ptp_ptp_interface_role{iface="ens5f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 3

However, the clock state will still report LOCKED, because that is the last known state (this will be more accurate with the event frame, aka the sidecar):

openshift_ptp_clock_state{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1

Now for multi interface

# HELP openshift_ptp_clock_state 0 = FREERUN, 1 = LOCKED, 2 = HOLDOVER
# TYPE openshift_ptp_clock_state gauge
openshift_ptp_clock_state{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
# HELP openshift_ptp_delay_from_master 
# TYPE openshift_ptp_delay_from_master gauge
openshift_ptp_delay_from_master{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 1400
openshift_ptp_delay_from_master{iface="ens5f0",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 2771
openshift_ptp_delay_from_master{iface="ens7f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 2720
openshift_ptp_delay_from_master{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 81
# HELP openshift_ptp_frequency_adjustment 
# TYPE openshift_ptp_frequency_adjustment gauge
openshift_ptp_frequency_adjustment{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} -2.034256e+06
openshift_ptp_frequency_adjustment{iface="ens5f0",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} -1948
openshift_ptp_frequency_adjustment{iface="ens7f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} -3972
openshift_ptp_frequency_adjustment{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} -1952
# HELP openshift_ptp_max_offset_from_master 
# TYPE openshift_ptp_max_offset_from_master gauge
openshift_ptp_max_offset_from_master{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 2.115298e+06
openshift_ptp_max_offset_from_master{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 12
# HELP openshift_ptp_max_offset_from_system 
# TYPE openshift_ptp_max_offset_from_system gauge
openshift_ptp_max_offset_from_system{iface="ens5f0",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 15
openshift_ptp_max_offset_from_system{iface="ens7f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 3
# HELP openshift_ptp_offset_from_master 
# TYPE openshift_ptp_offset_from_master gauge
openshift_ptp_offset_from_master{iface="CLOCK_REALTIME",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 2.115298e+06
openshift_ptp_offset_from_master{iface="master",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 12
# HELP openshift_ptp_offset_from_system 
# TYPE openshift_ptp_offset_from_system gauge
openshift_ptp_offset_from_system{iface="ens5f0",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 15
openshift_ptp_offset_from_system{iface="ens7f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 3
# HELP openshift_ptp_ptp_interface_role 0 = PASSIVE 1 = SLAVE 2 = MASTER 3 = FAULTY
# TYPE openshift_ptp_ptp_interface_role gauge
openshift_ptp_ptp_interface_role{iface="ens5f0",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 2
openshift_ptp_ptp_interface_role{iface="ens5f1",node="cnfde7.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
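
The port-ID matching described at the top of this PR can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `pkg/daemon/metrics.go`: `interfaceRole` is a hypothetical name, the log line shape (`port N: <OLD> to <NEW> on <EVENT>`) is assumed from typical ptp4l state-change output, and the interface order is assumed to follow the config. The gauge values come from the HELP text of `openshift_ptp_ptp_interface_role`.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Gauge values as documented in the HELP line:
// 0 = PASSIVE, 1 = SLAVE, 2 = MASTER, 3 = FAULTY.
var roleValues = map[string]int{"PASSIVE": 0, "SLAVE": 1, "MASTER": 2, "FAULTY": 3}

// portRoleRe matches an assumed ptp4l state-change line such as
// "port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED".
var portRoleRe = regexp.MustCompile(`port (\d+): \S+ to (PASSIVE|SLAVE|MASTER|FAULTY)\b`)

// interfaceRole is a hypothetical helper. It maps the 1-based port id found
// in the log line onto the list of configured interfaces and returns the
// interface name together with its role gauge value.
func interfaceRole(ifaces []string, line string) (iface string, role int, ok bool) {
	m := portRoleRe.FindStringSubmatch(line)
	if m == nil {
		return "", 0, false // not a transition into a reported role
	}
	port, err := strconv.Atoi(m[1])
	if err != nil || port < 1 || port > len(ifaces) {
		return "", 0, false // port id does not match a configured interface
	}
	return ifaces[port-1], roleValues[m[2]], true
}

func main() {
	ifaces := []string{"ens5f0", "ens5f1"} // assumed order from the config
	line := "ptp4l[1.0]: [ptp4l.0.config] port 2: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED"
	fmt.Println(interfaceRole(ifaces, line)) // prints "ens5f1 1 true"
}
```

Transitions into states outside the four reported roles (for example LISTENING or UNCALIBRATED) are ignored in this sketch, so the gauge would keep its previous value.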

@SchSeba SchSeba changed the title add metrics for multi interface Bug 1993854: add metrics for multi interface Aug 16, 2021
@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Aug 16, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 16, 2021

@aneeshkp: This pull request references Bugzilla bug 1993854, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (omarzian@redhat.com), skipping review request.

In response to this:

Bug 1993854: add metrics for multi interface

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@SchSeba SchSeba left a comment

Thanks for the PR!

just a few small comments

pkg/daemon/metrics.go
}
}

-func extractSummaryMetrics(processName, output string) (offsetFromMaster, maxOffsetFromMaster, frequencyAdjustment, delayFromMaster float64) {
+func extractSummaryMetrics(configName, processName, output string) (iface string, offsetFromMaster, maxOffsetFromMaster, frequencyAdjustment, delayFromMaster float64) {
// remove everything before the rms string
Contributor

Please remove this comment; it is no longer accurate.

Contributor Author

Done
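
For context on the signature change discussed in this thread, here is a minimal sketch of parsing one summary line into the values named in the new return list. It is not the PR's implementation: `parseSummary` and the `summary` struct are hypothetical, and the line shape (`<process>[ts]: [<config>] [<iface>] rms N max N freq N +/- N delay N +/- N`) is an assumption based on linuxptp summary output. Mapping `rms` to the offset gauge and labelling ptp4l lines as `master` are likewise assumptions, the latter taken from the `iface="master"` label in the metrics in the PR description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// summary mirrors the values in the extractSummaryMetrics return list.
type summary struct {
	iface                             string
	offset, maxOffset, freqAdj, delay float64
}

// parseSummary is a hypothetical stand-in for illustration only.
func parseSummary(configName, output string) (summary, error) {
	s := summary{iface: "master"} // assumed default for ptp4l lines
	tag := "[" + configName + "]"
	idx := strings.Index(output, tag)
	if idx < 0 {
		return s, fmt.Errorf("config tag %s not found", tag)
	}
	fields := strings.Fields(output[idx+len(tag):])
	// phc2sys lines name the clock or interface before "rms".
	if len(fields) > 0 && fields[0] != "rms" {
		s.iface, fields = fields[0], fields[1:]
	}
	// The rest is "key value" pairs; the "+/-" spreads are parsed but unused.
	vals := map[string]float64{}
	for i := 0; i+1 < len(fields); i += 2 {
		v, err := strconv.ParseFloat(fields[i+1], 64)
		if err != nil {
			return s, fmt.Errorf("bad value after %q: %w", fields[i], err)
		}
		vals[fields[i]] = v
	}
	if _, ok := vals["rms"]; !ok {
		return s, fmt.Errorf("no rms field in %q", output)
	}
	s.offset, s.maxOffset = vals["rms"], vals["max"]
	s.freqAdj, s.delay = vals["freq"], vals["delay"]
	return s, nil
}

func main() {
	line := "phc2sys[1.0]: [ptp4l.0.config] CLOCK_REALTIME rms 4 max 4 freq -76829 +/- 0 delay 1085 +/- 0"
	s, err := parseSummary("ptp4l.0.config", line)
	fmt.Printf("%+v %v\n", s, err)
}
```

Locating the config tag first, rather than cutting at the `rms` string, is what lets the interface name in front of `rms` survive parsing.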

@SchSeba
Contributor

SchSeba commented Aug 16, 2021

/cc @josephdrichard

please have a look

@openshift-ci
Contributor

openshift-ci bot commented Aug 16, 2021

@SchSeba: GitHub didn't allow me to request PR reviews from the following users: josephdrichard.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @josephdrichard

please have a look


@josephdrichard
Contributor

/lgtm

@openshift-ci
Contributor

openshift-ci bot commented Aug 16, 2021

@josephdrichard: changing LGTM is restricted to collaborators

In response to this:

/lgtm


@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 17, 2021
@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 17, 2021
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 18, 2021
@aneeshkp aneeshkp requested a review from SchSeba August 18, 2021 23:33
@aneeshkp
Contributor Author

@SchSeba Added frequency and delay from system metrics.

@aneeshkp
Contributor Author

Ran cnf-test; it passed for the PTP metrics, and the log is intact.

ptp4l[6004331.058]: [ptp4l.0.config] selected best master clock 001747.fffe.701560
 port 1: new foreign master

Ran 9 of 138 Specs in 376.831 seconds
FAIL! -- 7 Passed | 2 Failed | 0 Pending | 129 Skipped

Summarizing 2 Failures:

[Fail] ptp Test Offset PTP configuration verifications [It] PTP time diff between Grandmaster and Slave should be in range -100ms and 100ms
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/ptp/ptp.go:403

[Fail] [ptp] PTP e2e tests PTP Interfaces discovery [It] PTP daemon apply match rule based on nodeLabel
/go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift/ptp-operator/test/ptp/ptp.go:223


@SchSeba
Contributor

SchSeba commented Aug 19, 2021

/retest

1 similar comment
@SchSeba
Contributor

SchSeba commented Aug 19, 2021

/retest

@aneeshkp
Contributor Author

/test e2e-aws

@aneeshkp
Contributor Author

All cnf tests passed for ptp-operator.
JUnit report was created: /junit.xml/cnftests-junit.xml

Ran 12 of 138 Specs in 460.530 seconds
SUCCESS! -- 12 Passed | 0 Failed | 0 Pending | 126 Skipped
You're using deprecated Ginkgo functionality:

@SchSeba
Contributor

SchSeba commented Aug 23, 2021

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 23, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 23, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aneeshkp, josephdrichard, SchSeba

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@SchSeba
Contributor

SchSeba commented Aug 23, 2021

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 23, 2021
@openshift-merge-robot openshift-merge-robot merged commit 6735459 into openshift:master Aug 23, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 23, 2021

@aneeshkp: All pull requests linked via external trackers have merged:

Bugzilla bug 1993854 has been moved to the MODIFIED state.

In response to this:

Bug 1993854: add metrics for multi interface


Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
4 participants