OPCT-9: introduce the report cmd to automate the review steps (#22)
This PR introduces the `report` command, intended for the result review
stage (after the results are submitted to Red Hat). It applies filters to
the failed tests based on the Baseline results and on CI flake data
(using the Sippy API as a data source).
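
The filter flow described above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code in `pkg/report`: the names `toSet`, `keepIf` and `filterFailures`, the in-memory string slices, and the sample test names are all hypothetical. It keeps only failures that belong to the suite, then drops those that also failed in the baseline, then drops known CI flakes.

```go
package main

import (
	"fmt"
	"sort"
)

// toSet builds a lookup set from a list of test names.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// keepIf returns the tests for which keep reports true, preserving order.
func keepIf(tests []string, keep func(string) bool) []string {
	var out []string
	for _, t := range tests {
		if keep(t) {
			out = append(out, t)
		}
	}
	return out
}

// filterFailures is a hypothetical helper that narrows the provider's
// failed tests in three steps: suite only, not in baseline, not a CI flake.
func filterFailures(failed, suite, baselineFailed, ciFlakes []string) []string {
	inSuite := toSet(suite)
	inBaseline := toSet(baselineFailed)
	isFlake := toSet(ciFlakes)

	out := keepIf(failed, func(t string) bool { return inSuite[t] })
	out = keepIf(out, func(t string) bool { return !inBaseline[t] })
	out = keepIf(out, func(t string) bool { return !isFlake[t] })
	sort.Strings(out)
	return out
}

func main() {
	// Sample data, invented for the example.
	failed := []string{"test-a", "test-b", "test-c", "test-d"}
	suite := []string{"test-a", "test-b", "test-c"}
	baseline := []string{"test-b"}
	flakes := []string{"test-c"}

	fmt.Println(filterFailures(failed, suite, baseline, flakes)) // prints [test-a]
}
```

Applying the filters as successive passes keeps each step easy to report on separately, which matches the per-filter failure counts shown in the support guide output.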

The command also dumps the filtered failed tests into text files and
extracts the details of each failed test, to be used when
troubleshooting after the execution.

The following file is optional and can be used alongside the archive:

- baseline results: reference artifacts from an execution in the OPCT CI
  pipeline, used to help review the provider's artifacts

Example CLI usage:
```bash
./openshift-provider-cert-linux-amd64 report \
  --baseline ./opct_baseline-ocp_4.11.4-platform_none-aws-202210102258_sonobuoy_7a895e01-3d3e-44cc-a5d3-ac4f2ed678fd.tar.gz \
  202210132151_sonobuoy_6af99324-2dc6-4de4-938c-200b84111481.tar.gz  \
  --save-to ./processed-results
```

There are spikes to understand whether this data can be included or
collected in the certification runtime, so that end users can run this
command before submitting their results.

https://issues.redhat.com/browse/OPCT-9
https://issues.redhat.com/browse/OPCT-10
mtulio committed Feb 16, 2023
1 parent dabe858 commit f3df29e
Showing 14 changed files with 1,741 additions and 67 deletions.
2 changes: 2 additions & 0 deletions cmd/root.go
@@ -12,6 +12,7 @@ import (

"github.com/redhat-openshift-ecosystem/provider-certification-tool/pkg/assets"
"github.com/redhat-openshift-ecosystem/provider-certification-tool/pkg/destroy"
"github.com/redhat-openshift-ecosystem/provider-certification-tool/pkg/report"
"github.com/redhat-openshift-ecosystem/provider-certification-tool/pkg/retrieve"
"github.com/redhat-openshift-ecosystem/provider-certification-tool/pkg/run"
"github.com/redhat-openshift-ecosystem/provider-certification-tool/pkg/status"
@@ -65,6 +66,7 @@ func init() {
rootCmd.AddCommand(run.NewCmdRun())
rootCmd.AddCommand(status.NewCmdStatus())
rootCmd.AddCommand(version.NewCmdVersion())
rootCmd.AddCommand(report.NewCmdReport())

// Link in child commands direct from Sonobuoy
rootCmd.AddCommand(app.NewSonobuoyCommand())
135 changes: 76 additions & 59 deletions docs/support-guide.md
@@ -54,7 +54,7 @@ pip3 install o-must-gather --user
### Download Baseline CI results <a name="setup-download-baseline"></a>

The Openshift provider certification tool is run periodically ([source code](https://github.com/openshift/release/blob/master/ci-operator/jobs/redhat-openshift-ecosystem/provider-certification-tool/redhat-openshift-ecosystem-provider-certification-tool-main-periodics.yaml)) in OpenShift CI using the latest stable release of OpenShift.
These baseline results are stored long-term in an AWS S3 bucket (`s3://openshift-provider-certification/baseline-results`). An HTML listing can be found here: https://openshift-provider-certification.s3.us-west-2.amazonaws.com/index.html.
These baseline results are stored long-term in an AWS S3 bucket (`s3://openshift-provider-certification/baseline-results`). An HTML listing can be found [here](https://openshift-provider-certification.s3.us-west-2.amazonaws.com/index.html).
These baseline results should be used as a reference when reviewing a partner's certification results.

1. Identify cluster version in the partner's must gather:
@@ -71,26 +71,14 @@ $ file 4.11.13-20221125.tar.gz
4.11.13-20221125.tar.gz: gzip compressed data, original size modulo 2^32 430269440
```

4. Proceed with comparing baseline results with actual provider results.
- Download the suite test list for the version used by the partner

```bash
RELEASE_VERSION="4.11.4->CHANGE_ME"
TESTS_IMG=$(oc adm release info ${RELEASE_VERSION} --image-for='tests')
oc image extract ${TESTS_IMG} --file="/usr/bin/openshift-tests"
chmod u+x ./openshift-tests
./openshift-tests run --dry-run kubernetes/conformance > ./test-list_openshift-tests_kubernetes-conformance.txt
./openshift-tests run --dry-run openshift/conformance > ./test-list_openshift-tests_openshift-validated.txt
```

### Download Partner Results <a name="setup-download-results"></a>

- Download the Provider certification archive from the Support Case. Example file name: `retrieved-archive.tar.gz`
- Download the Must-gather from the Support Case. Example file name: `must-gather.tar.gz`

## Review guide: exploring the failed tests <a name="review-process"></a>

The steps below use the subcommand `process` to apply filters on the failed tests and help to keep the initial focus of the investigation on the failures exclusively on the partner's results.
The steps below use the subcommand `report` to apply filters on the failed tests and help to keep the initial focus of the investigation on the failures exclusively on the partner's results.

The filters use only tests included in the respective suite, isolating from common failures identified on the Baseline results or Flakes from CI. To see more details about the filters, read the [dev documentation describing filters flow](./dev.md#dev-diagram-filters).

@@ -106,11 +94,11 @@ Required to use this section:

Compare the provider results with the baseline:

> `--baseline` is optional. You must use a trusted baseline results to apply the filters. Otherwise leave it unset.
```bash
./openshift-provider-cert-linux-amd64 process \
./openshift-provider-cert-linux-amd64 report \
--baseline ./opct_baseline-ocp_4.11.4-platform_none-provider-date_uuid.tar.gz \
--base-suite-ocp ./test-list_openshift-tests_openshift-validated.txt \
--base-suite-k8s ./test-list_openshift-tests_kubernetes-conformance.txt \
./<timestamp>_sonobuoy_<uuid>.tar.gz
```

@@ -119,11 +107,9 @@ Compare the provider results with the baseline:
Compare the results and extract the files (option `--save-to`) to the local directory `./results-provider-processed`:

```bash
./openshift-provider-cert-linux-amd64 process \
./openshift-provider-cert-linux-amd64 report \
--baseline ./opct_baseline-ocp_4.11.4-platform_none-provider-date_uuid.tar.gz \
--base-suite-ocp ./test-list_openshift-tests_openshift-validated.txt \
--base-suite-k8s ./test-list_openshift-tests_kubernetes-conformance.txt \
--save-to processed \
--save-to ./results-provider-processed \
./<timestamp>_sonobuoy_<uuid>.tar.gz
```

@@ -134,69 +120,100 @@ This is the expected output:
```bash
(...Header...)

$ $CLI_PATH/openshift-provider-cert-linux-amd64-process0 report 4.12.1-20230131.tar.gz --save-to ./results-provider-processed
INFO[2023-02-01T01:26:25-03:00] Processing Plugin 05-openshift-cluster-upgrade...
INFO[2023-02-01T01:26:25-03:00] Ignoring Plugin 05-openshift-cluster-upgrade
INFO[2023-02-01T01:26:25-03:00] Processing Plugin 10-openshift-kube-conformance...
INFO[2023-02-01T01:26:25-03:00] Processing Plugin 20-openshift-conformance-validated...
INFO[2023-02-01T01:26:26-03:00] Processing Plugin 99-openshift-artifacts-collector...
INFO[2023-02-01T01:26:26-03:00] Ignoring Plugin 99-openshift-artifacts-collector
WARN[2023-02-01T01:26:27-03:00] Ignoring to populate source 'baseline'. Missing or invalid baseline artifact (-b):

> OpenShift Provider Certification Summary <

Kubernetes API Server version : v1.25.4+a34b9e9
OpenShift Container Platform version : 4.12.1
- Cluster Update Progressing : False
- Cluster Target Version : Cluster version is 4.12.1

OCP Infrastructure:
- PlatformType : None
- Name : ci-op-nykh40v7-7280e-bsghd
- Topology : HighlyAvailable
- ControlPlaneTopology : HighlyAvailable
- API Server URL : https://api.ci-op-nykh40v7-7280e.vmc-ci.devcluster.openshift.com:6443
- API Server URL (internal) : https://api-int.ci-op-nykh40v7-7280e.vmc-ci.devcluster.openshift.com:6443

Plugins summary by name: Status [Total/Passed/Failed/Skipped] (timeout)
- 10-openshift-kube-conformance : failed [691/669/22/0] (0)
- 20-openshift-conformance-validated : failed [3793/1627/52/2114] (0)

Health summary: [A=True/P=True/D=True]
- Cluster Operators : [33/0/0]
- Node health : 6/6 (100%)
- Pods health : 250/258 (96%)

> Processed Summary <

Total Tests suites:
- kubernetes/conformance: 353
- openshift/conformance: 3488
Total tests by conformance suites:
- kubernetes/conformance: 359
- openshift/conformance: 3454

Total Tests by Certification Layer:
- openshift-kube-conformance:
Result Summary by conformance plugins:
- 10-openshift-kube-conformance:
- Status: failed
- Total: 675
- Passed: 654
- Failed: 21
- Total: 691
- Passed: 669
- Failed: 22
- Timeout: 0
- Skipped: 0
- Failed (without filters) : 21
- Failed (Filter SuiteOnly): 2
- Failed (Filter Baseline : 2
- Failed (without filters) : 22
- Failed (Filter SuiteOnly): 0
- Failed (Filter CI Flakes): 0
- Status After Filters : pass
- openshift-conformance-validated:
- 20-openshift-conformance-validated:
- Status: failed
- Total: 3818
- Passed: 1708
- Failed: 61
- Total: 3793
- Passed: 1627
- Failed: 52
- Timeout: 0
- Skipped: 2049
- Failed (without filters) : 61
- Failed (Filter SuiteOnly): 32
- Failed (Filter Baseline : 7
- Failed (Filter CI Flakes): 2
- Skipped: 2114
- Failed (without filters) : 52
- Failed (Filter SuiteOnly): 22
- Failed (Filter CI Flakes): 3
- Status After Filters : failed

Total Tests by Certification Layer:
Result details by conformance plugins:


=> openshift-kube-conformance: (2 failures, 2 flakes)
=> 10-openshift-kube-conformance: (0 failures, 0 flakes)

--> Failed tests to Review (without flakes) - Immediate action:
<empty>

--> Failed flake tests - Statistic from OpenShift CI
Flakes Perc TestName
1 0.138% [sig-api-machinery] CustomResourcePublishOpenAPI [Privileged:ClusterAdmin] works for multiple CRDs of same group and version but different kinds [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]
2 0.275% [sig-api-machinery] ResourceQuota should create a ResourceQuota and capture the life of a secret. [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]
<empty>


=> openshift-conformance-validated: (7 failures, 5 flakes)
=> 20-openshift-conformance-validated: (22 failures, 19 flakes)

--> Failed tests to Review (without flakes) - Immediate action:
[sig-network-edge][Feature:Idling] Unidling should handle many TCP connections by possibly dropping those over a certain bound [Serial] [Skipped:Network/OVNKubernetes] [Suite:openshift/conformance/serial]
[sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with pvc data source [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-arch] Managed cluster should set requests but not limits [Suite:openshift/conformance/parallel]
[sig-cli] oc basics can get version information from API [Suite:openshift/conformance/parallel]
[sig-scheduling] SchedulerPriorities [Serial] PodTopologySpread Scoring validates pod should be preferably scheduled to node which makes the matching pods more evenly distributed [Suite:openshift/conformance/serial] [Suite:k8s]

--> Failed flake tests - Statistic from OpenShift CI
Flakes Perc TestName
101 10.576% [sig-arch][bz-DNS][Late] Alerts alert/KubePodNotReady should not be at or above pending in ns/openshift-dns [Suite:openshift/conformance/parallel]
67 7.016% [sig-arch][bz-Routing][Late] Alerts alert/KubePodNotReady should not be at or above pending in ns/openshift-ingress [Suite:openshift/conformance/parallel]
2 0.386% [sig-imageregistry] Image registry should redirect on blob pull [Suite:openshift/conformance/parallel]
32 4.848% [sig-network][Feature:EgressFirewall] egressFirewall should have no impact outside its namespace [Suite:openshift/conformance/parallel]
11 2.402% [sig-network][Feature:EgressFirewall] when using openshift-sdn should ensure egressnetworkpolicy is created [Suite:openshift/conformance/parallel]

Data Saved to directory './processed/'
Flakes Perc TestName
1 0.134% [sig-api-machinery][Feature:APIServer] anonymous browsers should get a 403 from / [Suite:openshift/conformance/parallel]
1 0.134% [sig-arch] Managed cluster should ensure control plane pods do not run in best-effort QoS [Suite:openshift/conformance/parallel]
748 100.000% [sig-arch] Managed cluster should ensure platform components have system-* priority class associated [Suite:openshift/conformance/parallel]
-- -- [sig-arch][Late] clients should not use APIs that are removed in upcoming releases [apigroup:config.openshift.io] [Suite:openshift/conformance/parallel]
(...)

Data Saved to directory './results-provider-processed/'

```

> TODO: create the index with a legend with references to the output.

### Understanding the extracted results <a name="review-process-explain"></a>

@@ -221,7 +238,7 @@ Considerations:
Example of files on the extracted directory:

```bash
$ tree processed/
$ tree ./results-provider-processed
processed/
├── failures-baseline
[redacted]
14 changes: 10 additions & 4 deletions go.mod
@@ -11,6 +11,7 @@ require (
github.com/spf13/viper v1.11.0
github.com/stretchr/testify v1.7.1
github.com/vmware-tanzu/sonobuoy v0.56.10
github.com/xuri/excelize/v2 v2.6.1
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c
k8s.io/api v0.23.6
k8s.io/apimachinery v0.23.6
@@ -52,9 +53,12 @@ require (
github.com/moby/spdystream v0.2.0 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/mohae/deepcopy v0.0.0-20170929034955-c48cc78d4826 // indirect
github.com/pelletier/go-toml v1.9.4 // indirect
github.com/pelletier/go-toml/v2 v2.0.0-beta.8 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/richardlehane/mscfb v1.0.4 // indirect
github.com/richardlehane/msoleps v1.0.3 // indirect
github.com/rifflock/lfshook v0.0.0-20180920164130-b9218ef580f5 // indirect
github.com/russross/blackfriday/v2 v2.1.0 // indirect
github.com/satori/go.uuid v1.2.1-0.20181028125025-b2ce2384e17b // indirect
@@ -64,10 +68,12 @@ require (
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/subosito/gotenv v1.2.0 // indirect
golang.org/x/crypto v0.0.0-20220411220226-7b82a4e95df4 // indirect
golang.org/x/net v0.0.0-20220726230323-06994584191e // indirect
github.com/xuri/efp v0.0.0-20220603152613-6918739fd470 // indirect
github.com/xuri/nfp v0.0.0-20220409054826-5e722a1d9e22 // indirect
golang.org/x/crypto v0.0.0-20220817201139-bc19a97f63c8 // indirect
golang.org/x/net v0.0.0-20220812174116-3211cb980234 // indirect
golang.org/x/oauth2 v0.0.0-20220411215720-9780585627b5 // indirect
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f // indirect
golang.org/x/sys v0.0.0-20220728004956-3c1f35247d10 // indirect
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211 // indirect
golang.org/x/text v0.3.7 // indirect
golang.org/x/time v0.0.0-20210723032227-1f47c861a9ac // indirect
@@ -76,7 +82,7 @@ require (
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/ini.v1 v1.66.4 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b // indirect
gopkg.in/yaml.v3 v3.0.0 // indirect
k8s.io/klog v1.0.0 // indirect
k8s.io/klog/v2 v2.30.0 // indirect
k8s.io/kube-openapi v0.0.0-20211115234752-e816edb12b65 // indirect