multus: allow using NADs without inspectable CIDRs #12778
deploy/examples/cluster-test.yaml
Outdated
network:
  provider: "multus"
  selectors:
    public: public-net
Note to self: TODO: remove this
Force-pushed from 8b8ee8e to d9e543e.
This is ready for review, but I am adding the do-not-merge label while I test this for DHCP networks. Results of this testing may inform some doc or minor code changes.
rookVersion: "myversion",
rookImage: "myversion",
These changes may not be technically necessary, but I wasted at least 5 hours adding the operator config to this struct, and to all of the places in the code where parents call it, before I realized that `rookVersion` was being used to hold the rook image string. I'd rather this not happen to anyone else.
Agreed, I have been confused by this as well
// This job is complex to overcome 2 specific limitations:
//  - CNI/Multus' `k8s.v1.cni.cncf.io/network-status` annotation only reports IP addrs, not CIDRs
//  - `ip addr show` in the pod shows CIDRs but also contains extra IPv6 SLAAC addrs in addition
//    to the addr(s) attached by CNI/Multus
// Rook must cross-reference both pieces of info to accurately read the CIDR(s) for the net.
// Both pieces of info must come from the same Pod+Container in order to be cross-ref'd.
// Use downward API to allow the cmd reporter job to report both pieces of info at the same time
I suggest reading this before reviewing.
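The cross-referencing described in the code comment above can be sketched roughly as follows. This is a minimal illustration and not Rook's implementation: `cidrsForNetwork`, the `networkStatus` struct, the network name, and the sample addresses are all hypothetical, and the sketch assumes the interface addresses have already been extracted from `ip addr show` as `address/prefix-length` strings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/netip"
)

// networkStatus holds the fields of one network-status annotation entry
// that this sketch needs; the real annotation carries more fields.
type networkStatus struct {
	Name string   `json:"name"`
	IPs  []string `json:"ips"`
}

// cidrsForNetwork is a hypothetical helper. It returns the network CIDR of
// every interface address whose IP is also listed in the annotation for
// netName. Addresses that the annotation does not list (for example IPv6
// SLAAC addresses) are ignored.
func cidrsForNetwork(annotation, netName string, ifaceAddrs []string) ([]string, error) {
	var statuses []networkStatus
	if err := json.Unmarshal([]byte(annotation), &statuses); err != nil {
		return nil, fmt.Errorf("parsing network-status annotation: %w", err)
	}

	// Collect the IPs that CNI/Multus reports for the requested network.
	attached := map[netip.Addr]bool{}
	for _, s := range statuses {
		if s.Name != netName {
			continue
		}
		for _, ip := range s.IPs {
			addr, err := netip.ParseAddr(ip)
			if err != nil {
				return nil, fmt.Errorf("parsing annotation IP %q: %w", ip, err)
			}
			attached[addr] = true
		}
	}

	// Keep only interface addresses that match, reduced to their network CIDR.
	var cidrs []string
	for _, raw := range ifaceAddrs {
		prefix, err := netip.ParsePrefix(raw)
		if err != nil {
			return nil, fmt.Errorf("parsing interface address %q: %w", raw, err)
		}
		if attached[prefix.Addr()] {
			cidrs = append(cidrs, prefix.Masked().String())
		}
	}
	return cidrs, nil
}

func main() {
	// Made-up sample data: one attached IPv4 address plus an extra SLAAC address.
	annotation := `[{"name": "rook-ceph/public-net", "interface": "net1", "ips": ["192.168.20.7"]}]`
	ifaceAddrs := []string{"192.168.20.7/24", "2001:db8::a1b2/64"}

	cidrs, err := cidrsForNetwork(annotation, "rook-ceph/public-net", ifaceAddrs)
	if err != nil {
		panic(err)
	}
	fmt.Println(cidrs) // prints [192.168.20.0/24]
}
```

Neither input is sufficient alone: the annotation lacks prefix lengths, and the interface listing cannot say which addresses came from CNI/Multus. That is why both have to be read from the same container.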
// TODO: do we need resource requests/requirements? it's just sleeping
// ... if needed, maybe logcollector, with small requirements would be a good option?
@travisn thoughts?
Some environments are strict about requiring resource requests/limits on all pods, so we at least need an option. The requirements are going to be very low. I'd suggest we define a new category of resources for this in the cluster CR, though by default I'd say we don't need to set them.
I removed this todo without resolving it in this PR. It looks like the cmd reporter doesn't set any resource requests/requirements at all, and it looks like `DetectCephVersion()` also doesn't set any resources. Can we instead take an item to handle resources for the cmd-reporter as a general work item? I think handling it here would bloat this PR further, and it seems like we don't have anyone with any big problems around this today.
Oh interesting, if the cmd reporter already doesn't set requests/requirements, then we can skip it for now.
Force-pushed from 3220bee to 48176d5.
There is some issue with the CRDs; maybe try `make crds` again?
Force-pushed from f05f38d to c898c3b.
The multus cluster test is failing:
Looks great overall, just some minor comments and questions.
}}}}}}
mnt := corev1.VolumeMount{
	Name:      "network-status",
	MountPath: "/tmp",
nit: What about a path like this? Maybe I just think of `/tmp` as a scratch folder for someone connecting to the pod.
- MountPath: "/tmp",
+ MountPath: "/var/lib/rook/multus",
@@ -1,49 +0,0 @@
################################################################################################################# |
The only multus examples are in the docs now? What about updating this example to match the doc example?
This was an accidental deletion. The file should be back now, but let me know if you see it disappear again. I have been having trouble getting it to stay put for some reason.
pkg/apis/ceph.rook.io/v1/types.go
Outdated
// A list of CIDRs.
type CIDRList []CIDR

// ^ Note on kubebuilder:validation:Pattern above:
Can the note be placed next to the pattern above? It seems disconnected to be several lines below.
Force-pushed from c898c3b to de88eda.
Working to address feedback now, but I was able to confirm that the new auto-detection method successfully gets a valid CIDR when the DHCP IPAM is used. 🎉
Force-pushed from de88eda to f5acbf1.
@BlaineEXE multus ci is failing because
pod/rook-ceph-network-cluster-canary-mgsqm 0/1 Init:1/2 0 6m18s 10.244.246.151 fv-az1031-835 <none> <none>
pod/rook-ceph-network-public-canary-vk4nr 0/1 Init:1/2 0 6m18s 10.244.246.150 fv-az1031-835 <none> <none>
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-09-06T00:15:59Z"
    message: 'containers with incomplete status: [wait-for-network-status-annotation]'
    reason: ContainersNotInitialized
    status: "False"
    type: Initialized
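The pod status above shows the `wait-for-network-status-annotation` init container never completing. As a rough illustration of what such a wait step has to do, here is a bounded polling loop in Go. It rests on assumptions: that the annotation reaches the container as a file via the downward API (as the earlier code comment suggests), and that the file stays missing or empty until Multus sets the annotation. The helper `waitForContent` and the file used in `main` are hypothetical, and the real init container may be implemented quite differently, for instance as a shell loop.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// waitForContent is a hypothetical helper. It calls read up to attempts
// times, pausing interval between calls, and returns the first non-empty
// result. Read errors are treated the same as "not ready yet".
func waitForContent(read func() ([]byte, error), attempts int, interval time.Duration) ([]byte, error) {
	for i := 0; i < attempts; i++ {
		data, err := read()
		if err == nil && len(data) > 0 {
			return data, nil
		}
		time.Sleep(interval)
	}
	return nil, errors.New("timed out waiting for non-empty content")
}

func main() {
	// Stand-in for the downward API file; the path and content are made up.
	dir, err := os.MkdirTemp("", "network-status")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "network-status")
	if err := os.WriteFile(path, []byte(`[{"name":"rook-ceph/public-net"}]`), 0o600); err != nil {
		panic(err)
	}

	data, err := waitForContent(func() ([]byte, error) { return os.ReadFile(path) }, 5, 100*time.Millisecond)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

Bounding the number of attempts turns a missing annotation into an explicit error instead of a pod that sits in `Init:1/2` indefinitely.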
Force-pushed from 5d4a322 to 0ba1578.
The CI is looking good, just reran a few tests to confirm they're intermittent. Good to see the multus test passing. I'm ready to approve assuming manual testing is completed.
@@ -363,12 +363,17 @@ ceph osd primary-affinity osd.0 0

## OSD Dedicated Network

!!! outdated
Is `outdated` one of the rendered keywords? I'm not even sure where the list is defined.
Good point. It looks like these are the ones we can use: https://squidfunk.github.io/mkdocs-material/reference/admonitions/
I think `info`, `tip` (a more eye-catching icon than `info`), or `warning` could be good choices.
I'm putting LGTM, I had a few questions that have been addressed. Thanks
Force-pushed from 0ba1578 to 3c43268.
multus: allow using NADs without inspectable CIDRs (backport #12778)
Description of your changes:
Change how Rook detects network CIDRs for Multus networks. The IPAM
configuration is only defined as an arbitrary string JSON blob with a
"type" field and nothing more. Rook's detection of CIDRs for whereabouts
had already grown out of date since the initial implementation.
Additionally, Rook did not support DHCP IPAM, which is a reasonable
choice for users. Moreover, Rook did not support CNI plugin chaining,
which further complicates NADs. Based on the CNI spec, network chaining
can make arbitrary changes to the network CIDRs set by the first-given
plugin.
All these problems make it more and more difficult for Rook to support
Multus by inspecting the NAD itself to predict network CIDRs. Instead,
it is better for Rook to treat the CNI process as a black box. To
preserve legacy functionality of auto-detecting networks and to make
that as robust as possible, change to a canary-style architecture like
that used for Ceph mons, from which Rook will detect the network CIDRs
if possible.
Also allow users to specify overrides for CIDR ranges. This allows Rook
to still support esoteric and unexpected NAD or network configurations
where a CIDR range is not detectable or where the range detected would
be incomplete. Because it may be impossible for Rook to understand the
network CIDRs holistically while residing only on a portion of the
network, this feature should have been present from Multus's inception.
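
A minimal sketch of the override behavior described above, under stated assumptions: the helper `effectiveCIDRs` is hypothetical, and the sketch assumes user-specified overrides fully replace auto-detected CIDRs rather than being merged with them. The code in this PR may resolve the two sources differently.

```go
package main

import (
	"fmt"
	"net/netip"
)

// effectiveCIDRs is a hypothetical helper. It prefers user-specified
// override CIDRs and falls back to auto-detected ones only when no
// override is given. Every chosen entry must parse as a CIDR and is
// normalized to its network address.
func effectiveCIDRs(overrides, detected []string) ([]string, error) {
	chosen := detected
	if len(overrides) > 0 {
		chosen = overrides
	}
	if len(chosen) == 0 {
		return nil, fmt.Errorf("no CIDRs were specified or detected")
	}

	out := make([]string, 0, len(chosen))
	for _, raw := range chosen {
		prefix, err := netip.ParsePrefix(raw)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", raw, err)
		}
		out = append(out, prefix.Masked().String())
	}
	return out, nil
}

func main() {
	// Auto-detection only saw one range; the user knows the network spans two.
	detected := []string{"192.168.0.0/24"}
	overrides := []string{"192.168.0.0/24", "192.168.1.0/24"}

	cidrs, err := effectiveCIDRs(overrides, detected)
	if err != nil {
		panic(err)
	}
	fmt.Println(cidrs) // prints [192.168.0.0/24 192.168.1.0/24]
}
```

The sample data mirrors the incomplete-detection case: a canary sitting on one portion of the network reports a single range, and the override supplies the rest.
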
Improving CIDR auto-detection and allowing users to specify overrides
for auto-detected CIDRs rounds out Rook's Multus support for CephCluster
(core/RADOS) installations. No further architectural changes should be
needed for CephClusters as regards application of public/cluster network
CIDRs for Multus networks.
Signed-off-by: Blaine Gardner <blaine.gardner@ibm.com>
Which issue is resolved by this Pull Request:
Resolves #12459