Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve certificate condition checking and error logging #4298

Merged
merged 3 commits into from
Aug 5, 2021

Conversation

inteon
Copy link
Member

@inteon inteon commented Aug 3, 2021

What this PR does / why we need it:

In #4234 there were intermittent test failures in the cainjector E2E tests:

https://prow.build-infra.jetstack.net/view/gs/jetstack-logs/pr-logs/pull/jetstack_cert-manager/4234/pull-cert-manager-e2e-v1-21/1422524737686343681
https://prow.build-infra.jetstack.net/view/gs/jetstack-logs/pr-logs/pull/jetstack_cert-manager/4234/pull-cert-manager-e2e-v1-21/1421109702468571136

In #4239, these flakes were partially resolved, although there is still an uncovered edge case where the flakes still persist.
This PR introduces more robust checking for both conditions Ready=true and Issuing not set.
Additionally it improves the logging in case of an error.
The logic for doing this is now aggregated in test/e2e/framework/helper/certificates.go.
The certificate functions in test/e2e/util/util.go are now not used anymore and deprecation comments have been added.

Release note:

/kind flake

@jetstack-bot
Copy link
Contributor

@inteon: The label(s) kind/bugfix cannot be applied, because the repository doesn't have them.

In response to this:

What this PR does / why we need it:

In #4234 there were intermittent test failures in the cainjector E2E tests:

https://prow.build-infra.jetstack.net/view/gs/jetstack-logs/pr-logs/pull/jetstack_cert-manager/4234/pull-cert-manager-e2e-v1-21/1422524737686343681
https://prow.build-infra.jetstack.net/view/gs/jetstack-logs/pr-logs/pull/jetstack_cert-manager/4234/pull-cert-manager-e2e-v1-21/1421109702468571136

In #4239, these flakes were partially resolved, although there is still an uncovered edge case where the flakes still persist.
This PR introduces more robust checking for both conditions Ready=true and Issuing not set.
Additionally it improves the logging in case of an error.
The logic for doing this is now aggregated in test/e2e/framework/helper/certificates.go.
The certificate functions in test/e2e/util/util.go are now not used anymore and deprecation comments have been added.

Release note:

/kind bugfix

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jetstack-bot jetstack-bot added dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. area/testing Issues relating to testing needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Aug 3, 2021
@jetstack-bot
Copy link
Contributor

Hi @inteon. Thanks for your PR.

I'm waiting for a jetstack or cert-manager member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jetstack-bot jetstack-bot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 3, 2021
@irbekrm
Copy link
Contributor

irbekrm commented Aug 3, 2021

/ok-to-test

@jetstack-bot jetstack-bot added ok-to-test release-note-none Denotes a PR that doesn't merit a release note. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 3, 2021
@inteon inteon force-pushed the fix_test_flake branch 3 times, most recently from 180815f to 6dd0c94 Compare August 3, 2021 19:03
Signed-off-by: Inteon <42113979+inteon@users.noreply.github.com>
Copy link
Member

@wallrj wallrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @inteon

I'm struggling to get my head round the changes and I've asked a bunch of (potentially stupid) questions below.

Please expand the comments for the helper functions so that it's clear to users like me how they work and when they should be used.

Also please write a bit more about exactly what is wrong with the existing code that causes the tests to still be flaky.

test/e2e/framework/helper/certificates.go Show resolved Hide resolved
test/e2e/framework/helper/certificates.go Outdated Show resolved Hide resolved
test/e2e/framework/helper/certificates.go Outdated Show resolved Hide resolved
test/e2e/framework/helper/certificates.go Outdated Show resolved Hide resolved
test/e2e/framework/helper/certificates.go Outdated Show resolved Hide resolved
test/e2e/util/util.go Outdated Show resolved Hide resolved
@wallrj
Copy link
Member

wallrj commented Aug 4, 2021

/kind cleanup

@jetstack-bot jetstack-bot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Aug 4, 2021
Signed-off-by: Inteon <42113979+inteon@users.noreply.github.com>
@jetstack-bot jetstack-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 4, 2021
@inteon inteon force-pushed the fix_test_flake branch 2 times, most recently from 4d27e2c to e388d68 Compare August 4, 2021 15:08
Copy link
Member

@wallrj wallrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @inteon

There's one thing that looks like a mistake (waiting for Certificate with name of Secret)
But all my other comments are optional.

Please address the points or answer them and unhold this if you think it's ok to merge.

/hold
/lgtm

test/e2e/framework/helper/certificates.go Outdated Show resolved Hide resolved
test/e2e/suite/issuers/acme/certificate/http01.go Outdated Show resolved Hide resolved
test/e2e/suite/issuers/ca/certificate.go Show resolved Hide resolved

By("changing the name of the corresponding secret in the cert")
secretName := types.NamespacedName{Name: cert.Spec.SecretName, Namespace: f.Namespace.Name}
cert = cert.DeepCopy() // DeepCopy before updating
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, was that the cause of the intermittent failure?
Did you consider doing the DeepCopy in the WaitForCertificate... helper functions, to help avoid this same bug happening elsewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be related, but I fear that there are multiple errors causing failures.
I would like to keep the cert = cert.DeepCopy() visible in the code, so that we can more easily see when DeepCopies are needed.

test/e2e/util/util.go Show resolved Hide resolved
@jetstack-bot jetstack-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 4, 2021
@jetstack-bot jetstack-bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 4, 2021
@jetstack-bot jetstack-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 4, 2021
Signed-off-by: Inteon <42113979+inteon@users.noreply.github.com>
Copy link
Member

@wallrj wallrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the latest adjustments @inteon

/lgtm
/approve
/kind cleanup

By("Waiting for Certificate to exist")
err := util.WaitForCertificateToExist(certClient, certificateSecretName, foreverTestTimeout)
cert, err := f.Helper().WaitForCertificateToExist(f.Namespace.Name, certificateSecretName, time.Second*60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here again, we seem to be using the name of the Secret as the name of the Certificate.

Suggested change
cert, err := f.Helper().WaitForCertificateToExist(f.Namespace.Name, certificateSecretName, time.Second*60)
cert, err := f.Helper().WaitForCertificateToExist(f.Namespace.Name, certificateName, time.Second*60)

I guess it works because both names are the same.

I can't face another round of code review and E2E tests. Let's just leave it.

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 5, 2021
@jetstack-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: inteon, wallrj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jetstack-bot jetstack-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 5, 2021
@jetstack-bot jetstack-bot merged commit 8d0c228 into cert-manager:master Aug 5, 2021
@jetstack-bot jetstack-bot added this to the v1.5 milestone Aug 5, 2021
@wallrj
Copy link
Member

wallrj commented Aug 7, 2021

/cherry-pick release-1.5

@jetstack-bot
Copy link
Contributor

@wallrj: new pull request created: #4322

In response to this:

/cherry-pick release-1.5

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/testing Issues relating to testing dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. ok-to-test release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants