Bug 1853253: remove expired TLS secret for Thanos Ruler #878
simonpasquier
commented
Jul 22, 2020
- I added CHANGELOG entry for this change.
- No user facing changes, so no entry in CHANGELOG was needed.
@simonpasquier: This pull request references Bugzilla bug 1853253, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
The first version I pushed deliberately omitted the fix, to verify that the test fails without it, and it did.
f6919d0 to 85d3f52
/retest
/hold cancel
cc @openshift/openshift-team-monitoring
/lgtm
/retest Please review the full test history for this PR and help us cut down flakes.
@@ -819,6 +844,11 @@ func assertGRPCTLSRotation(t *testing.T) {
	if err != nil {
		t.Fatal(err)
	}

	got := countGRPCSecrets(f.Ns) + countGRPCSecrets(f.UserWorkloadMonitoringNs)
This needs retries because the main grpc-tls secret might have been rotated but not yet propagated to the other secrets.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
85d3f52 to 46a5e1b
/hold cancel
	if err != nil {
		t.Fatalf("error waiting for grpc-tls secret: %v", err)
	}

	expected := countGRPCSecrets(f.Ns) + countGRPCSecrets(f.UserWorkloadMonitoringNs)
I think this calculation of the expected grpc secret count is racy. The test code here waits until the central grpc-tls secret is created. However, at this point in time there is no guarantee that all derived hashed grpc secrets have also been created, because their creation is orchestrated in separate task executions that run concurrently.
Instead, I propose changing this code to use a constant of five (5) expected secrets, because we know them beforehand. They are:
openshift-monitoring/grpc-tls
openshift-monitoring/prometheus-k8s-grpc-tls-[hash]
openshift-user-workload-monitoring/prometheus-user-workload-grpc-tls-[hash]
openshift-monitoring/thanos-querier-grpc-tls-[hash]
openshift-user-workload-monitoring/thanos-ruler-grpc-tls-[hash]
@openshift/openshift-team-monitoring wdyt?
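The proposal above could be sketched roughly as follows. The constant name `expectedGRPCSecrets`, the helpers `isGRPCSecret` and `countGRPCSecretNames`, the name-matching rule, and the `abc123` hash are all assumptions made for illustration; the real `countGRPCSecrets` in the test takes a namespace and queries the cluster.

```go
package main

import (
	"fmt"
	"strings"
)

// expectedGRPCSecrets is hardcoded on purpose: the set of grpc secrets is
// known in advance, and deriving the number from the cluster right after the
// central secret appears would race with the creation of the hashed copies.
//
//	openshift-monitoring/grpc-tls
//	openshift-monitoring/prometheus-k8s-grpc-tls-[hash]
//	openshift-monitoring/thanos-querier-grpc-tls-[hash]
//	openshift-user-workload-monitoring/prometheus-user-workload-grpc-tls-[hash]
//	openshift-user-workload-monitoring/thanos-ruler-grpc-tls-[hash]
const expectedGRPCSecrets = 5

// isGRPCSecret reports whether a secret name is the central grpc-tls secret
// or one of the hashed copies. The matching rule is an assumption.
func isGRPCSecret(name string) bool {
	return name == "grpc-tls" || strings.Contains(name, "-grpc-tls-")
}

// countGRPCSecretNames counts the grpc secrets in a list of secret names.
func countGRPCSecretNames(names []string) int {
	n := 0
	for _, name := range names {
		if isGRPCSecret(name) {
			n++
		}
	}
	return n
}

func main() {
	// Illustrative names only; "abc123" stands in for the real hash.
	monitoring := []string{
		"grpc-tls",
		"prometheus-k8s-grpc-tls-abc123",
		"thanos-querier-grpc-tls-abc123",
		"alertmanager-main",
	}
	userWorkload := []string{
		"prometheus-user-workload-grpc-tls-abc123",
		"thanos-ruler-grpc-tls-abc123",
	}

	got := countGRPCSecretNames(monitoring) + countGRPCSecretNames(userWorkload)
	fmt.Printf("got %d, expected %d\n", got, expectedGRPCSecrets)
}
```

The trade-off is that the constant must be updated by hand whenever a component that consumes the grpc-tls secret is added or removed, which is why an in-code comment listing the secrets is useful.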
Hardcoding the value here seems fine to me, but I would expect an in-code comment explaining it.
@paulfantom I added a comment in addition to the commit message, ptal.
/hold
/cc @openshift/openshift-team-monitoring ptal
@s-urbaniak: GitHub didn't allow me to request PR reviews from the following users: ptal. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.
/cc @openshift/openshift-team-monitoring
/test e2e-aws
e2e test failure seems to be unrelated to monitoring
… secrets. The calculation of the expected grpc secret count is potentially racy. The current test code waits until the central grpc-tls secret is created. However, at this point in time there is no guarantee that all derived hashed grpc secrets have also been created, because their creation is orchestrated in separate task executions that run concurrently. Instead, this change uses a constant of five (5) expected secrets, because we know them beforehand.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: paulfantom, simonpasquier
The full list of commands accepted by this bot can be found here. The pull request process is described here
/test e2e-aws
/hold cancel
/retest Please review the full test history for this PR and help us cut down flakes.
@simonpasquier: All pull requests linked via external trackers have merged: openshift/cluster-monitoring-operator#878. Bugzilla bug 1853253 has been moved to the MODIFIED state.