health: Update Cilium agent to listen on nodeip #26845
Thanks for your PR @tamilmani1989. The changes look good on the agent side, but I think we should leave the cilium-health-responder behavior as is.
9e52189 to 4d3c8e1
Changes look good, thanks!
One last thing: could you please change the commit message to adhere to the guide here? Maybe add a brief description of the solved issue and a reference to it in a Fixes: ... line. For the title, I suggest something like health: Update daemon to listen on node ip.
Thanks! 🙏
4d3c8e1 to b5a0549
b5a0549 to b4749fc
Hi @tamilmani1989, what I meant was to change the commit message like this:
I was not referring to the PR title, but to the first line of the commit message. Sorry for the confusion.
b4749fc to ecc6d94
Nice! 🚀
@tamilmani1989 the CI failures are unrelated to the PR. They have already been fixed in main, could you please rebase?
ecc6d94 to becafd5
/test
@pippolo84 is anything else required for this PR?
The CI failures seem legit, are you up for investigating them? I think all of them should have the same root cause.
Commit 31aaf55006f1772c0e2af222a824fcb78c4e0cbd does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
31aaf55 to 663ef81
/test
@pippolo84 I'm not sure why the external workloads CI is failing. It doesn't look like a regular Cilium agent running on a node. Is there any way I can reproduce it locally? My guess is that Cilium is running as a non-host-networking pod on that node. If that's the case, is there any way to detect it in the Cilium agent code? The rest of the failures look flaky. This is the failure:
Thanks for this next iteration! 🙏
I've left some comments, mainly around the style.
To set up a local cluster with external workloads you can refer to the doc here.
I think you can check option.Config.JoinCluster to understand whether Cilium is running in a VM, since the feature is enabled based on that config flag.
2058113 to ee1d021
/test
I spent some time trying to create a repro but couldn't. When I tried creating an external workload following that doc, I lost the connection to the VM every time I ran this script
I skipped this change for external nodes for now and, if needed, can open a separate issue for that.
These CI failures look flaky, as they succeeded in the previous run
Agree. Let's try a re-run and, in case the flakiness is confirmed, I'm going to open an issue.
CI is green now, awesome! 💯
ee1d021 to 5b2b69e
/test
Hey @tamilmani1989, I took a closer look at the last changes and I think we can avoid the introduction of the helpers
Thanks in advance!
The reason I added the new API
The comment for that function reads: // GetIPv4 returns one of the IPv4 node address available with the following
The only case I can think of where the InternalIP won't be available is the external workload, but we already take care of that separately. Anyway, there's no harm in taking extra care, so let's leave your helpers as you suggested 👍
If GetIPv4 or GetIPv6 fails for some reason, then the Cilium agent would not listen on that IP family, as it's not appended to the address list. So we may need this piece of code as well to address it (it was removed in your version)
5b2b69e to f92bc1e
@pippolo84 I updated the PR based on the above comments and what I felt was right. I'm open to any alternative suggestions and can update accordingly. Thanks for taking the time to review this. Appreciate it. 👍
/test
Thank you for the further iteration 💯
I agree with your analysis regarding GetInternalIPv4 and GetInternalIPv6 returning nil. I've left a suggestion to improve the readability of the code when handling those cases.
a095428 to 1242c13
Update agenthealth to listen on the internal node IP instead of listening on all interfaces, except for the external workload scenario, where there is no internal IP concept. Listen on IPv4, IPv6, or both based on the Cilium config options EnableIPv4 and EnableIPv6. If for some reason it fails to get the node internal IP, it will listen on all addresses.
Fixes: cilium#23353
Signed-off-by: Tamilmani <tamanoha@microsoft.com>
1242c13 to 2a72412
Thanks! This turned out to be trickier than expected, great job! 🚀
/test
All required tests passing. Merging
This PR updates the agent health server to listen on the node IP instead of all interfaces.
Fixes: #23353