Load multiple programs for one CollectionSpec loading #22025
Looks good to me, thank you!
/test
Hi @qmonnet, thanks for the review! Could you re-run the failed tests, please? From what I see |
Ok, didn't have to check the test results myself so I trust you. Just linking to the failed test results in case they repeat and we want to compare. /test-1.16-4.9 (previous run: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.16-kernel-4.9/3062/)
/test-runtime (previous run: https://jenkins.cilium.io/job/Cilium-PR-Runtime-net-next/3882/)
Update: I see multiple PRs failing because of the same issue, so I guess it's a common problem.
So, I rebased to the latest master, which includes the fix (#22043) for the latest failure. Could you please start tests once again?
/test
Please give me some time to review this, working through my queue.
Thanks, I'd love to include this in 1.13. Pointed out some changes I'd like to see.
@alexkats What's the reason for the frequent pushes? Are they just rebases?
@ti-mo, thanks for the review. I resolved all your comments. Regarding the frequent pushes, you're right: they were just rebases. I wanted to keep my branch up to date.
Thanks! Few more nits to address, then we'll run the full tests.
Took a look at all 4 failed tests, looks like a flake not related to these changes to me, but tell me if I'm wrong here.
@alexkats I'll kick off a round of tests now, no need to keep rebasing the branch.
/test
Job 'Cilium-PR-K8s-1.24-kernel-4.19' failed.
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
Thanks, I just wanted to include the latest test fixes from master.
I took a look at the failures, here's what I see:
Hi, please let me know if there's anything else I should do here.
This is a continuation of the effort towards reducing Pod Startup Latency for Kubernetes. Loading CollectionSpec once for multiple programs from the same object file helps to reduce the latency by ~300ms. Signed-off-by: Alex Katsman <alexkats@google.com>
/test
Soooo, we're down to 2 failures, and those are different from the previous ones 😊
UPD: Here's an issue for |
So what's the plan? Can we merge it? I believe we have enough signal that this change doesn't break anything, at least from the CI perspective: if we combine the last 2 runs, which differ only by a rebase, all tests passed with the latest changes.
Nice job on triaging all those flakes!
I agree. Reviews are in and the two test failures have been identified. Marking as ready to merge.
And one last question here: can it be a candidate for a backport to 1.12? Or is this a good item for an open session (unfortunately, I can't take part in it this week)?
1.13 is the first version of Cilium that will ship with the ebpf-go based ELF loader, so not really possible unless we backport all of #19159.
This is a continuation of the effort towards reducing Pod Startup Latency for Kubernetes (#22023).
This change is a follow-up to #20702 (Parallel reloading of ingress and egress eBPF programs). It introduces the ability to load multiple programs while loading and verifying the CollectionSpec just once per objPath. After the change, pod startup latency dropped by ~300ms (from 3.6 to 3.3 seconds for 100 nodes).
Signed-off-by: Alex Katsman <alexkats@google.com>
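
For readers skimming the thread, the idea described above can be illustrated with a minimal Go sketch. It is not the code from this PR and does not use the real ebpf-go API: `collectionSpec`, `specLoader`, and `loadPrograms` are hypothetical names, and the expensive parse-and-verify step is injected as a function so the sketch runs without a real object file. It only shows the shape of the change: the spec is loaded once per objPath, then every requested program is taken from that single spec.

```go
package main

import "fmt"

// collectionSpec is a hypothetical stand-in for a parsed and verified
// CollectionSpec; it only records which program names the object contains.
type collectionSpec struct {
	objPath  string
	programs map[string]bool
}

// specLoader is a hypothetical hook for the expensive step: parsing and
// verifying the object file at objPath.
type specLoader func(objPath string) (*collectionSpec, error)

// loadPrograms loads the spec for objPath exactly once and then resolves
// every requested program from it, instead of reloading the spec once per
// program. It returns "objPath:progName" labels for the resolved programs.
func loadPrograms(load specLoader, objPath string, progNames []string) ([]string, error) {
	spec, err := load(objPath) // the only call to the expensive loader
	if err != nil {
		return nil, fmt.Errorf("loading spec for %s: %w", objPath, err)
	}

	loaded := make([]string, 0, len(progNames))
	for _, name := range progNames {
		if !spec.programs[name] {
			return nil, fmt.Errorf("program %q not found in %s", name, objPath)
		}
		// A real loader would attach the program here; the sketch only
		// records that it was resolved from the shared spec.
		loaded = append(loaded, objPath+":"+name)
	}
	return loaded, nil
}
```

Calling `loadPrograms` with, say, an ingress and an egress program name for the same objPath invokes the loader once rather than twice, which is where a saving of the kind reported above would come from.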