-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pre submit keeps failing: investigate why it takes minutes for pod to be ready some times #537
Comments
Is it the pod or just the ingress. Could we be running out of external ip address ? or Backend Service ? I guess I need someone to help me look at devconsole to make sure. |
I don't know but it's really crippling - pretty much 100% failure lately on my PR: |
I had to recreate the cluster as some firewall rules were missing. Could you try it again? |
is this problem solved? |
@Brian-Xincheng-Zhang it's a bit less frequent but unfortunately no there is still a problem:
https://k8s-gubernator.appspot.com/build/istio-prow/pull/istio_istio/546/istio-presubmit/253/ ps: those seem more like errors or fatals than warning btw |
OK so I am sure we are running out of external ips. I asked for more but this is going to take some time. @mandarjog proposes to use pod ip in the meantime. The e2e framework already support this we just need to add a flag. @kyessenov I am not sure if the pilot e2e supports using the pod ip instead of the ingress. Anyways should we use pod ip instead of ingress ? |
For now I don't think there's a reason we need ingress over using pod ip, and I'm all for it if it means we get faster/more reliable test execution. Long term I think we will will want to use ingress resources since Istio plans to do work on ingress (e.g. policy via Mixer on ingress traffic), but we may still be able to get away with Pod IP of the ingress proxy pod in that case. Either way we're not there yet so I don't see a reason to use ingress/assign public IPs today. |
@ldemailly to provide background on why we use a global ip. |
Use service cluster IP if you want to address Istio services. Istio doesn't support direct access to pods (yet, by design, ...). |
We also should be using ingress resources for externally exposed services. We don't have to expose ingress proxy externally. |
The way e2e tests are structured, we do not need to expose ingress as LoadBalancer type, because all requests to ingress are made from within the cluster, via kubectl exec .... However, I would like to keep the istio.yaml as LoadBalancer type, and use NodePort only when running e2e. The fix is to add a type field to the ingress template, and set it depending on the context. |
We have to wait for @yutongz changes to be in in order to use the NodePort. @Brian-Xincheng-Zhang Can you make sure rbac rules are being updated correclty as this is blocking Yutong PR. |
now it's
https://k8s-gubernator.appspot.com/build/istio-prow/pull/istio_istio/535/istio-presubmit/277/ |
Exactly, we need to solve this first #550 |
This has been resolved by Kuat and Doug. Plus increasing IP on GCP. |
it's back: https://k8s-gubernator.appspot.com/build/istio-prow/pull/istio_istio/592/e2e-suite-rbac-auth/61/ W0823 20:06:22.939] I0823 20:06:22.938362 889 mixer_test.go:458] Error talking to productpage: Get http://10.150.0.6:30298/productpage: dial tcp 10.150.0.6:30298: getsockopt: connection refused |
So I looked at the last run from prow. https://prow.istio.io/?job=e2e-suite-no_rbac-no_auth, The last 36 runs have been passing without flake. The only errors that I see were mine, and I was acutally changing the framework. auth enabled never passes, and rbac + no_auth is flaky. It seems to mostly be the mixer test that is failing. I did find some istioctl exec format issues, and I am thinking this could be because we might be uploading the file (running the presubmit while doing the smoke test) when we are downloading it. I ll add some condition to check if an image exist before pushing (or even better building) it. |
check pilot's ? |
I counted less than 5% (3 actual flakes out of 69 runs) flakiness on the run combining last mixer-e2esmoktest pilot-e2esmoketest and e2e-suite-no_rbac-no_auth runs, which I think is pretty good for an e2e test. Obviously less would be much better :) Coming back on the exec format, it cannot be an issue when we download a binary that we uploading, it has to be related to the way we download and save the file. |
@ldemailly the thing that you opened is for rbac which is not presubmit, maybe it was at some point but by mistake. I would like to close this bug and open specific issues to help resolution. I think we should open different issues on why the rbac and auth test are failing. What do you think ? |
exec format error is that the file is actually missing, I guess we are not checking the http code :). Will file a bug for this. |
It is an RBAC issue only if the logs shows some error type cannot create/list/... . |
The only difference is that we apply the rbac yaml the rest of the test is
unchanged. I think it s safe to assume it is rbac related.
Anyway, we should capture those in different issue as I don't see the point
of leaving an endless issue opened.
…On Aug 27, 2017 7:51 AM, "Andra Cismaru" ***@***.***> wrote:
It is an RBAC issue only if the logs shows some error type cannot
create/list/... .
Otherwise, the simple fact it is rbac test failing does not indicate an
rbac issue.
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
<#537 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AQ80Z-NTLD08G2xZ3thj1QuaefW5qbVsks5scYJ4gaJpZM4Or9y6>
.
|
istio/old_pilot_repo#1102 for instance doesn't seem to go through |
One more time, the issue is with pilot-presubmit not the e2e. So I am closing this. Please open issues for new problems. I will open a new issue for the RBAC one. |
not sure the value of closing and opening new issues but sure... |
* Change mock redis to use fork repo to improve code coverage. * Remove CMakeLists. Former-commit-id: eb83106ee5253af53ca44d05b1edf15d4c9a4565
* Change mock redis to use fork repo to improve code coverage. * Remove CMakeLists. Former-commit-id: 2e6870ddc1af2f404fd95912ad28b314fd39a01d
* Updated mixer config to bring in latest changes Updated to get to 3194864 from istio/istio master branch. * Updated after running make gen
* Remove demo-auth profile Fixes istio#18646 * Really kill the demo-auth
Co-authored-by: maistra-bot <null>
…ntainers [jaeger] Adding optional initContainers for jaeger query and ingester deployments
see #533 and many build failures after > 2mins of retry
The text was updated successfully, but these errors were encountered: