Stackdriver logging sink failed to create on large clusters #51700
Comments
This is part of the correctness suite and large cluster tests are release-blocking.
@x13n could you please take a look since @crassirostris is OOO?
I can find some time to check this today. @shyamjvs Is it possible to reproduce locally?
Yes, it should be reproducible on a large cluster. The issue is that the filter string grows linearly with the number of nodes and then hits the Stackdriver API limit of 20000 characters.
Alternatively, if you don't want to create a large cluster, artificially bloating the filter here https://github.com/kubernetes/kubernetes/blob/master/test/e2e/instrumentation/logging/stackdrvier/utils.go#L216 should also reproduce it.
I don't think we should filter by each node id individually, will try to see what will break if I just remove this.
After a couple of hours, the test is still running. I will get back to this on Monday.
Hah! We need to set up a filter to separate one test running in the same project from another, since there's no resource for the K8s node. I'll look into it.
Would really appreciate your help here
@crassirostris Would filtering by cluster name work?
@igorpeshansky The problem is that docker and kubelet logs are written against the GCE VM resource. There are two options: remove that filter and let all the logs flow, or make a prefix filter on the VM name.
@crassirostris Ah, then can GCE labels or tags be used?
@igorpeshansky There's |
…gs-filter
Automatic merge from submit-queue
Fix Stackdriver Logging tests for large clusters
Fixes #51700
Due to the limit on the length of the filter, filtering by every node in the cluster is not possible. Removing the filter shouldn't affect the tests, since the checks are made based on the node IDs in the cluster, which are unique anyway.
Ref #51718
The "Cluster level logging implemented by Stackdriver should ingest system logs from all nodes" e2e test is failing on our 2k-node gce clusters - https://k8s-testgrid.appspot.com/google-gce-scale#gce-large-correctness with the following error:
Seems to be because of creating a filter of O(#nodes) size:
cc @crassirostris @kubernetes/sig-instrumentation-bugs @kubernetes/sig-scalability-misc