Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-19716: Enable proxy cert test to run in CI #1823

Merged
merged 4 commits into from
Oct 3, 2023

Conversation

saifshaikh48
Copy link
Contributor

@saifshaikh48 saifshaikh48 commented Sep 22, 2023

Proxy tests were not actually running due to an issue with the proxy enabled check.
When enabled, a few of the tests were failing so one was patched and the other was
disabled with the fix coming in another PR (#1800).

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 22, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 22, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@saifshaikh48 saifshaikh48 changed the title Fix proxy tests being skipped WIP: Fix proxy tests being skipped Sep 22, 2023
@saifshaikh48
Copy link
Contributor Author

/test vsphere-proxy-e2e-operator

@saifshaikh48 saifshaikh48 changed the title WIP: Fix proxy tests being skipped [WIP] OCPBUGS-19716: Enable proxy cert test to run in CI Sep 26, 2023
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Sep 26, 2023
@openshift-ci-robot
Copy link

@saifshaikh48: This pull request references Jira Issue OCPBUGS-19716, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @rrasouli

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Proxy tests where not actually running due to an issue with the proxy enabled check.
When enabled, a few of the tests were failing so those are patched as well.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@saifshaikh48 saifshaikh48 force-pushed the fixproxycerttest2 branch 3 times, most recently from b307bcd to 0dc1a99 Compare September 26, 2023 18:56
@saifshaikh48
Copy link
Contributor Author

/test vsphere-proxy-e2e-operator

cherry-picked from #1800 which had green CI. let's see if there is a timing issue in the tests

@saifshaikh48 saifshaikh48 force-pushed the fixproxycerttest2 branch 2 times, most recently from 2cb0f8d to d145394 Compare September 27, 2023 14:06
@saifshaikh48
Copy link
Contributor Author

/test vsphere-proxy-e2e-operator

@openshift-ci-robot
Copy link

@saifshaikh48: This pull request references Jira Issue OCPBUGS-19716, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @rrasouli

In response to this:

Proxy tests where not actually running due to an issue with the proxy enabled check.
When enabled, a few of the tests were failing so one was patched and the other was
disabled with the fix coming in another PR (#1800).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@saifshaikh48 saifshaikh48 changed the title [WIP] OCPBUGS-19716: Enable proxy cert test to run in CI OCPBUGS-19716: Enable proxy cert test to run in CI Sep 27, 2023
Comment on lines +73 to +92
// First requires data to be written to a file and then provide the file path the cert constructor
"Set-Content C:\\Temp\\cert.pem $certString;"+
"$expectedCert=[System.Security.Cryptography.X509Certificates.X509Certificate2]::new(\\\"C:\\Temp\\cert.pem\\\");"+
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tried with passing the cert byte data directly to the X509Certificate2 constructor and, interestingly enough, reading from a file was consistently speedier.

Copy link
Contributor

@jrvaldes jrvaldes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saifshaikh48 thanks for working on this, PTAL at my comments.

@@ -160,7 +160,7 @@ if [[ "$TEST" = "all" || "$TEST" = "basic" ]]; then
printf "\n####### Testing service reconciliation #######\n" >> "$ARTIFACT_DIR"/wmco.log
go test ./test/e2e/... -run=TestWMCO/service_reconciliation -v -timeout=20m -args $GO_TEST_ARGS
printf "\n####### Testing cluster-wide proxy #######\n" >> "$ARTIFACT_DIR"/wmco.log
go test ./test/e2e/... -run=TestWMCO/cluster-wide_proxy -v -timeout=10m -args $GO_TEST_ARGS
go test ./test/e2e/... -run=TestWMCO/cluster-wide_proxy -v -timeout=45m -args $GO_TEST_ARGS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a 350% timeout increase. Did you run this with a dev cluster? Wondering what is the average run time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what part of the test is taking a long time to run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

average runtime is 25-30min, cert test is the one that takes a long time as CNO injects 100+ certs into the proxy certs CM. Checking each one on each Win node takes a long while

test/e2e/proxy_test.go Show resolved Hide resolved
@@ -160,7 +160,7 @@ if [[ "$TEST" = "all" || "$TEST" = "basic" ]]; then
printf "\n####### Testing service reconciliation #######\n" >> "$ARTIFACT_DIR"/wmco.log
go test ./test/e2e/... -run=TestWMCO/service_reconciliation -v -timeout=20m -args $GO_TEST_ARGS
printf "\n####### Testing cluster-wide proxy #######\n" >> "$ARTIFACT_DIR"/wmco.log
go test ./test/e2e/... -run=TestWMCO/cluster-wide_proxy -v -timeout=10m -args $GO_TEST_ARGS
go test ./test/e2e/... -run=TestWMCO/cluster-wide_proxy -v -timeout=45m -args $GO_TEST_ARGS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what part of the test is taking a long time to run?

test/e2e/proxy_test.go Show resolved Hide resolved
test/e2e/proxy_test.go Show resolved Hide resolved
@saifshaikh48 saifshaikh48 force-pushed the fixproxycerttest2 branch 6 times, most recently from 1ae9961 to f53040d Compare September 28, 2023 20:14
@openshift-ci-robot
Copy link

@saifshaikh48: This pull request references Jira Issue OCPBUGS-19716, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @rrasouli

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Proxy tests were not actually running due to an issue with the proxy enabled check.
When enabled, a few of the tests were failing so one was patched and the other was
disabled with the fix coming in another PR (#1800).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@saifshaikh48
Copy link
Contributor Author

/test vsphere-proxy-e2e-operator

Green locally 🚀

Copy link
Contributor

@aravindhp aravindhp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[proxy_test] Disable env var removal test

Lets be consistent with commit naming. [tests] Disable proxy env var removal

test/e2e/main_test.go Outdated Show resolved Hide resolved
test/e2e/proxy_test.go Show resolved Hide resolved
This test is flaking consistently after we fix the way we check for proxy
enabled in the e2e tests. This test has been temporarily disabled until we
address it (timing issue).
Comment on lines 91 to 92
// proxyEnabled indicates whether a cluster-wide proxy is in use in the test environment
proxyEnabled bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this be added to the testContext struct?
Based on the way this is used I suggest just calling testContext.client.IsProxyEnabled() instead.
Caching it in this way doesn't seem useful enough to justify, and could cause problems if the proxy status is changed after testContext is initialized.

@@ -93,3 +93,12 @@ func (o *OpenShift) GetInfrastructure() (*config.Infrastructure, error) {
}
return infra, nil
}

// IsProxyEnabled queries the Proxy resource to see if a cluster-wide proxy is enabled in this environment
func (o *OpenShift) IsProxyEnabled() (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider renaming this to ProxyEnabled, just personal preference as it seems more natural to me:

if o.ProxyEnabled{}
vs
if o.IsProxyEnabled{}

Comment on lines 296 to 297
var patches []*patch.JSONPatch
patches = append(patches, patch.NewJSONPatch("replace", "/spec/trustedCA/name", userCABundleName))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

patches := []*patch.JSONPatch{patch.NewJSONPatch("replace", "/spec/trustedCA/name", userCABundleName)}

"Where-Object {$_.Subject -eq '%s'}).Count", string(subjectBytes))
out, err := tc.runPowerShellSSHJob(fmt.Sprintf("get-cert-%d", i), command, addr)
// Read in one cert at a time and test it exists in the Windows instance's system store
scanner := bufio.NewScanner(strings.NewReader(trustedCABundle))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked at https://pkg.go.dev/encoding/pem#Decode?
It looks like this functionality might be built in.

    for block, rest := pem.Decode(data); block != nil; block, rest = pem.Decode(rest) {
        if  block.Type == "CERTIFICATE"{

Bringing this up because this code + splitAtPEMCert() seems to be duplicated from the WICD code. Keeping things simpler here and using stdlib functions might make things easier to maintain.

Not sure if decoding the actual data causes a problem for this. Re-encoding the individual blocks with pem.Encode before doing the comparison is always an option :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I initially was using pem.Decode in the source code but had to move away from that approach to avoid reading an arbitrary number of certs into memory. I don't see any issue using it in the test code (where we are controlling the # of certs)

saifshaikh48 and others added 3 commits October 2, 2023 13:44
This commit updates the way we are retrieving the proxy
object in test cluster.
The previous check cluster.isProxyEnabled() looks at the env
vars in the environment from where it is called. The test pod
doesn't run on the e2e cluster where proxy is enabled and WMCO
is running. It runs in an ephemeral namespace on the CI cluster
where the check will always return false.

Co-authored-by: Mansi Kulkarni <mankulka@redhat.com>
In proxied clusters, this commit configures the proxy to use
user-provided additional certs by creating a ConfigMap (that CNO merges
with other proxy certs it deems necessary).
This commit fixes the proxy certificate import test. Ensuring certs
where present in the Windows store by comparing subjects was not working
properly since Windows and Golang extract the subject from the cert data
differently, in incompatible formats. Instead, we now compare expected
and actual certificates directly through powershell.

It moves the cert validation test before the environment variable tests
occur since the env var removal test removes the cluster-wide proxy/makes
WMCO delete the TrustedCAConfigMap which the cert test relies on.
This commit also increases the time limit for running proxy test
suite to avoid timeouts now that the fixed tests run.
Copy link
Contributor

@aravindhp aravindhp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from me. I will let @sebsoto give the approval given he had requested changes.

test/e2e/proxy_test.go Show resolved Hide resolved
@saifshaikh48
Copy link
Contributor Author

/test vsphere-proxy-e2e-operator

@openshift-merge-robot
Copy link
Contributor

@saifshaikh48: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mansikulkarni96
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 3, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 3, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: saifshaikh48, sebsoto

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 3, 2023
@aravindhp aravindhp marked this pull request as ready for review October 3, 2023 16:13
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 3, 2023
@saifshaikh48
Copy link
Contributor Author

/cherry-pick release-4.14

@openshift-cherrypick-robot

@saifshaikh48: once the present PR merges, I will cherry-pick it on top of release-4.14 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 3, 2023

@saifshaikh48: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot merged commit 6d7b8c0 into openshift:master Oct 3, 2023
15 checks passed
@openshift-ci-robot
Copy link

@saifshaikh48: Jira Issue OCPBUGS-19716: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-19716 has been moved to the MODIFIED state.

In response to this:

Proxy tests were not actually running due to an issue with the proxy enabled check.
When enabled, a few of the tests were failing so one was patched and the other was
disabled with the fix coming in another PR (#1800).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@saifshaikh48: new pull request created: #1860

In response to this:

/cherry-pick release-4.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants