OPCT-226: Added check rules for etcd #80

Merged


mtulio
Contributor

@mtulio mtulio commented Sep 25, 2023

Introducing etcd check rules based on the parsed etcd logs (etcd request took too long).

The acceptance values are calibrated from the baseline providers:

  • AWS (ocp414rc0_AWS_None_202309222127_sonobuoy_47efe9ef-06e4-48f3-a190-4e3523ff1ae0.tar.gz)

Screenshot from 2023-09-25 13-41-29

  • AWS (4.13.9-20230925-HighlyAvailable-aws-None-202309250459_sonobuoy_f4e06587-e7b3-4cbd-bcf4-d1350ec35d9a.tar.gz):

Screenshot from 2023-09-25 13-38-17

  • vSphere (4.13.9-20230821-HighlyAvailable-vsphere-None.tar.gz):

Screenshot from 2023-09-25 13-34-52

The check rules are implemented in the feature #76

checkSum.Checks = append(checkSum.Checks, &Check{
	ID:   "OPCT-010",
	Name: "etcd logs: slow requests: average should be under 500ms",
	Test: func() CheckResult {
		prefix := "Check OPCT-010 Failed"
		// Fail when the must-gather data is not available.
		if re.Provider.MustGatherInfo == nil {
			log.Debugf("%s: unable to read must-gather information.", prefix)
			return CheckResultFail
		}
		// Fail when the aggregated ("all") slow-request statistics are missing.
		if re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"] == nil {
			log.Debugf("%s: unable to read statistics from parsed etcd logs.", prefix)
			return CheckResultFail
		}
		if re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"].StatMean == "" {
			log.Debugf("%s: unable to get p50/mean statistics from parsed data: %v", prefix, re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"])
			return CheckResultFail
		}
		// The first space-separated token of the statistic is the numeric value.
		values := strings.Split(re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"].StatMean, " ")
		if values[0] == "" {
			log.Debugf("%s: unable to parse p50/mean: %v", prefix, values)
			return CheckResultFail
		}
		value, err := strconv.ParseFloat(values[0], 64)
		if err != nil {
			log.Debugf("%s: unable to convert p50/mean to float: %v", prefix, err)
			return CheckResultFail
		}
		// Acceptance criteria: the value must be strictly below 500.
		if value >= 500 {
			log.Debugf("%s acceptance criteria: want=[%v] got=[%v]", prefix, "<500", value)
			return CheckResultFail
		}
		return CheckResultPass
	},
})
checkSum.Checks = append(checkSum.Checks, &Check{
	ID:   "OPCT-011",
	Name: "etcd logs: slow requests: maximum should be under 1500ms",
	Test: func() CheckResult {
		prefix := "Check OPCT-011 Failed"
		// Fail when the must-gather data is not available.
		if re.Provider.MustGatherInfo == nil {
			log.Debugf("%s: unable to read must-gather information.", prefix)
			return CheckResultFail
		}
		// Fail when the aggregated ("all") slow-request statistics are missing.
		if re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"] == nil {
			log.Debugf("%s: unable to read statistics from parsed etcd logs.", prefix)
			return CheckResultFail
		}
		if re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"].StatMax == "" {
			log.Debugf("%s: unable to get max statistics from parsed data: %v", prefix, re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"])
			return CheckResultFail
		}
		// The first space-separated token of the statistic is the numeric value.
		values := strings.Split(re.Provider.MustGatherInfo.ErrorEtcdLogs.FilterRequestSlowAll["all"].StatMax, " ")
		if values[0] == "" {
			log.Debugf("%s: unable to parse max: %v", prefix, values)
			return CheckResultFail
		}
		value, err := strconv.ParseFloat(values[0], 64)
		if err != nil {
			log.Debugf("%s: unable to convert max to float: %v", prefix, err)
			return CheckResultFail
		}
		// Acceptance criteria: the value must be strictly below 1500.
		if value >= 1500 {
			log.Debugf("%s acceptance criteria: want=[%v] got=[%v]", prefix, "<1500", value)
			return CheckResultFail
		}
		return CheckResultPass
	},
})
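
Both checks above repeat the same parse-and-compare steps, so the shared logic could be factored into a small helper. The following is only a minimal sketch and is not part of this PR: the names parseStatMs and underThreshold are hypothetical, and it assumes the statistic strings carry the numeric value as their first whitespace-separated token (for example "250.5 (ms)"), which is what the strings.Split and strconv.ParseFloat calls in the checks imply. The sample values in main are illustrative and are not taken from the baseline results.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatMs extracts the leading numeric token from a statistic string.
// Assumption: the string looks like "250.5 (ms)", number first.
// parseStatMs is a hypothetical helper, not an identifier from the PR.
func parseStatMs(stat string) (float64, error) {
	// strings.Fields also tolerates leading or repeated whitespace,
	// which a plain strings.Split(stat, " ") does not.
	fields := strings.Fields(stat)
	if len(fields) == 0 {
		return 0, fmt.Errorf("empty statistic value")
	}
	v, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, fmt.Errorf("unable to convert %q to float: %w", fields[0], err)
	}
	return v, nil
}

// underThreshold reports whether the statistic is strictly below limitMs,
// mirroring the "value >= limit means fail" rule used by the checks.
func underThreshold(stat string, limitMs float64) (bool, error) {
	v, err := parseStatMs(stat)
	if err != nil {
		return false, err
	}
	return v < limitMs, nil
}

func main() {
	// Illustrative inputs only.
	cases := []struct {
		name  string
		stat  string
		limit float64
	}{
		{"mean", "250.5 (ms)", 500},
		{"max", "1800 (ms)", 1500},
	}
	for _, c := range cases {
		ok, err := underThreshold(c.stat, c.limit)
		if err != nil {
			fmt.Printf("%s: error: %v\n", c.name, err)
			continue
		}
		fmt.Printf("%s: under %.0fms: %v\n", c.name, c.limit, ok)
	}
}
```

With a helper like this, each Test function would only need the nil checks plus one call per statistic, and a parse error would map to CheckResultFail in the same way the inline code does today.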

@mtulio
Contributor Author

mtulio commented Sep 25, 2023

Needs to be merged after the broken main branch is fixed: https://github.com/redhat-openshift-ecosystem/provider-certification-tool/commits/main
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 25, 2023
@mtulio
Contributor Author

mtulio commented Sep 25, 2023

/assign @rvanderp3

Two review comments on docs/review/rules.md (outdated, resolved)
@mtulio mtulio added the kind/documentation Categorizes issue or PR as related to documentation. label Sep 25, 2023
Co-authored-by: Richard Vanderpool <49568690+rvanderp3@users.noreply.github.com>
@mtulio
Contributor Author

mtulio commented Sep 25, 2023

Thanks @rvanderp3 ! Fixed.
Also fixed the main branch doc build (CI):
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 25, 2023
@rvanderp3
Contributor

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 25, 2023
@openshift-ci

openshift-ci bot commented Sep 25, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rvanderp3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 25, 2023
@openshift-merge-robot openshift-merge-robot merged commit 32853e2 into redhat-openshift-ecosystem:main Sep 25, 2023
6 checks passed
@mtulio mtulio deleted the feat-checks-etcd branch September 25, 2023 19:24
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/documentation Categorizes issue or PR as related to documentation. lgtm Indicates that a PR is ready to be merged.
3 participants