kind e2e network policy use only 10 tests in parallel #26490
Conversation
/assign @joestringer let's wait for the CI run and judge based on the results. If it passes and the time to run is acceptable, this cannot make things worse, so it is safe to merge; at least it will be way better than how it is today 🙃
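The cap the PR title describes, running only 10 network policy tests in parallel, boils down to bounding how many specs are in flight at once. A minimal sketch of that idea as a counting semaphore; the function names and counts here are illustrative, not the actual CI wiring:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs total fake "specs" with at most limit in flight at once,
// and returns the peak concurrency observed. A real spec would create pods
// and apply network policies where the placeholder comment sits.
func runLimited(total, limit int) int64 {
	sem := make(chan struct{}, limit) // counting semaphore
	var inFlight, peak int64
	var wg sync.WaitGroup
	for i := 0; i < total; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{} // acquire a slot
			defer func() { <-sem }()
			n := atomic.AddInt64(&inFlight, 1)
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			// ... run one network policy spec here ...
			atomic.AddInt64(&inFlight, -1)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak in-flight specs:", runLimited(50, 10))
}
```

With the cap at 10, the peak never exceeds 10 no matter how many specs are queued, which is exactly the resource-starvation argument made later in the thread.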
it failed here too https://github.com/cilium/cilium/actions/runs/5382835799/jobs/9768795354?pr=26490, the problem is that some pods fail to become ready
that is surprising, since the probe execs inside the network namespace and connects to localhost
@squeed how can this be possible?
What do these entries mean?
why is the network policy tracking the internal communication? It should not have to trace internal traffic. Interesting, I didn't expect the same IP to be reused so soon
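The surprise about fast IP reuse can be pictured with a toy connection-tracking table keyed by the classic 5-tuple: when a new pod gets a recently freed IP, its flows can collide with stale entries left by the previous owner. This is a simplified model for illustration, not Cilium's actual datapath:

```go
package main

import "fmt"

// tuple is the classic conntrack key.
type tuple struct {
	proto            string
	srcIP, dstIP     string
	srcPort, dstPort int
}

// table is a toy conntrack map: key -> owning pod.
type table map[tuple]string

// track records a flow and reports whether it collided with a stale
// entry left behind by a previous owner of the same IP.
func (t table) track(k tuple, pod string) (collided bool) {
	if owner, ok := t[k]; ok && owner != pod {
		collided = true
	}
	t[k] = pod
	return collided
}

func main() {
	ct := table{}
	k := tuple{"tcp", "10.244.1.5", "10.244.2.9", 40000, 80}
	ct.track(k, "pod-a") // pod-a owned 10.244.1.5
	// pod-a is deleted, the IP is reused quickly by pod-b, and the
	// same 5-tuple shows up again before the old entry has expired.
	fmt.Println("stale entry hit:", ct.track(k, "pod-b"))
}
```

The faster pods churn (as they do when many network policy tests run at once), the more likely such a reuse window becomes.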
Force-pushed from 81e76ce to 408c7a3. CI: 1/1 success
it failed to install cilium this time
is that normal?
The network policy tests run a lot of pods, which can cause resource starvation on CI environments.

Signed-off-by: Antonio Ojea <aojea@google.com>
It was set to 0s and we observed problems with some pods having weird connectivity issues, even within the same network namespace. It is better to use the defaults unless required; this option was carried over from another job.

Signed-off-by: Antonio Ojea <aojea@google.com>
Sometimes this is caused by a runtime issue in the agent. Given that there were some disruptive changes in the tree that required all PRs to be rebased, it's possible you just hit that issue. Usually when I see this, I dig into the cilium-agent logs to inspect what it's doing and why there's a delay.
Changes LGTM, but I note that the CI job is failing.
I'm often confused by this, but will GitHub run the changes in this PR for the triggers in this workflow, or do we have to merge it first in order to validate?
Don't merge if it fails; I need to narrow this down
trying to get a failure with debug logs enabled #26639
Example of local execution: it got completely stuck on a pretty decent machine