-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Stackdriver Logging e2e tests using PubSub #45255
Implement Stackdriver Logging e2e tests using PubSub #45255
Conversation
2e9736f
to
6d4d077
Compare
@fejta GCE e2e suite is failing, because of the following error:
Can we enable |
6d4d077
to
5254b89
Compare
@crassirostris which projects need this? all of them? |
@crassirostris We have an internal mechanism for managing IAM grants on our GCP projects. Sending you a link internally this this. Also fyi please |
@k8s-bot gce etcd3 e2e test this |
5254b89
to
fb44c88
Compare
@piosz Tests are passing, please take a look |
As discussed offline you need some syncrhonization around |
ab51f4a
to
f707437
Compare
f707437
to
030ce2d
Compare
WTF this comes from?
|
From HEAD:
So it seems the mentioned lib is not used at all. Did you try to remove it? |
@piosz I mean ofc I can, but wtf, why other people didn't encounter that before? |
ac7c121
to
a9adbff
Compare
@crassirostris: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@crassirostris All PRs going into 1.7 require an associated issue. Can you please file one if it doesn't exist already and reference in this PR body? |
@grodrigues3 Sure, thanks for reminding about this! |
a9adbff
to
018b85e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Does it work out of the box on Jenkins or do we need some service account for PubSub? I think the easiest way is to verify this is to merge it. Please check the Jenkins after the merge.
nodes := framework.GetReadySchedulableNodesOrDie(f.ClientSet).Items | ||
nodeCount := len(nodes) | ||
podCount := 30 * nodeCount | ||
loggingDuration := 10 * time.Minute | ||
linesPerSecond := 1000 * nodeCount | ||
linesPerPod := linesPerSecond * int(loggingDuration.Seconds()) / podCount | ||
ingestionTimeout := 60 * time.Minute | ||
ingestionTimeout := 20 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be, since we start ingesting log entries as soon as they start flowing. From the manual runs that's more than enough, but I don't know of any guarantees/SLOs here (docs say log entries should appear "right away". I suggest leaving this as it is and then applying a fix if that's not enough, though I think timing out would indicate a problem with fluentd not being performant enough, rather than test being too restrictive on time.
@@ -56,9 +60,6 @@ var _ = framework.KubeDescribe("Cluster level logging using GCL [Feature:Stackdr | |||
defer f.PodClient().Delete(podName, &meta_v1.DeleteOptions{}) | |||
} | |||
|
|||
By("Waiting for pods to succeed") | |||
time.Sleep(loggingDuration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this :D
test/e2e/cluster-logging/sd_utils.go
Outdated
} | ||
|
||
framework.Logf("Waiting for log sink to become operational") | ||
time.Sleep(1 * time.Minute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a todo here to find a better way of verifying whether the sink is working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a better way, since any solution wouldn't be able to reduce this delay to less than 10 sec, in reality probably more. Which IMO isn't worth additional complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But OTOH I agree that we shouldn't rely on sinks becoming operational faster, than in 1 minute
/retest |
I've done it already |
018b85e
to
3cada4c
Compare
/lgtm |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: crassirostris, dchen1107, piosz Associated issue: 47069 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue |
Makes tests faster and mitigates Stackdriver Logging quotas issue.
Fixes #47069