Add a pool configurability design proposal. #942
Conversation
design/pool-configuration.md
Outdated
@@ -0,0 +1,385 @@
# Address Pool configuration options
Just a general question for maintainers. Do we have a naming convention for design proposals?
AFAIK, no. But maybe we can move the ones that are already implemented to a subdirectory? Assuming most will be implemented, this should still be manageable.
Force-pushed 7408c79 to e992301
Force-pushed e992301 to b309edf
design/pool-configuration.md
Outdated
* Where / to whom the ip address must be advertised to
* How to allocate that ip to a given service

The general idea of this proposal is that these degree of configuration are all related to the AddressPool object, and
nit: degree -> degrees
Force-pushed b309edf to 9bc59c6
design/pool-configuration.md
Outdated
Currently, an IP assigned to a Service is advertised to all the possible destinations from all the nodes.
The only exception is the BGP peer [node selector](https://metallb.universe.tf/configuration/#limiting-peers-to-certain-nodes).

Here, we suggest to extend the configuration logic with a protocol dependent way to select the target(s) of the announcement.
independent?
Not sure I understand the question here.
This part is about choosing the targets of the advertisement, which (I believe) is protocol dependent because of the different nature of the "targets" in the BGP and the L2 context.
The part above (node / addresspool mapping) is meant to be protocol independent, and so is the one below (pool / service association).
Yeah and reading over this, I agree. I thought you were proposing doing something with L2 as well (selectively choosing interfaces) but you say below it wasn't included in this proposal. Do we think we will ever need or want to do this?
Force-pushed b65fc5a to 7a17df7
design/pool-configuration.md
Outdated
No issues were raised but this combination is a more flexible variant of the [peer node selector](https://metallb.universe.tf/configuration/#limiting-peers-to-certain-nodes).

With this combination, the user may choose a different set of targets depending on the node.
So node allocation and peer selection are kind of orthogonal. The exceptions are due to a node being unable to reach a peer. So where would we check for those exceptions? The webhook that you mention below?
I think that is clearer in the examples below
There are two types of node selection here:
- The pool node selector drives which nodes are allowed to advertise the ip
- The node selector in the BGP peer of a given pool tells that the IP must be advertised to that particular peer only from a given subset of nodes

A webhook would be able to spot configurations where the two selectors have no intersection, resulting in a pool that won't be advertised anywhere.
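To make the two selectors concrete, a sketch of how they could sit side by side (the `nodeSelectors` fields and their placement are assumptions based on this discussion, not the final API):

```yaml
apiVersion: metallb.io/v1beta1
kind: AddressPool
metadata:
  name: pool1
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.0/24
  # Pool node selector (hypothetical): only matching nodes
  # are allowed to advertise IPs from this pool at all.
  nodeSelectors:
  - matchLabels:
      kubernetes.io/hostname: hostA
---
apiVersion: metallb.io/v1beta1
kind: BGPPeer
metadata:
  name: peer1
  namespace: metallb-system
spec:
  myASN: 64512
  peerASN: 64512
  peerAddress: 172.30.0.3
  # Peer node selector: this peer only receives advertisements
  # from matching nodes. If this set has no intersection with the
  # pool's selector above, the pool is advertised nowhere.
  nodeSelectors:
  - matchLabels:
      kubernetes.io/hostname: hostA
```

A webhook could reject the configuration when the intersection of the two selectors is provably empty.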
design/pool-configuration.md
Outdated
### Pool / Service - Namespace association

The association of the pool with a particular service is orthogonal to the previously described feature.
I am trying to convince myself of this statement :D. So, if I understand this correctly, if I assign an address pool to a particular service, we can reach that service from any node that has a backend for the service - this is regardless of what MetalLB does as it is only advertising its location? .. which would seem to make them independent. Even if we don't advertise from a particular node, we could still actually reach it on that node.
This section is only about the way we associate the pool to a given service. Today it is just random: the first pool that has free IPs is associated to a given service.
Once we have an ip associated to the service, MetalLB will advertise it from a node that:
- has a speaker running (L2 only)
- in case of local traffic policy, has a running backend
- in case the node selector is implemented, the node must match the selector
This phase comes after the pool allocation to the service.
The only relation I can see is that if a user cares about pinning a particular pool to a service (or to a namespace), the reason is likely that they want to leverage some of the extra tweaks we are proposing here.
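For context, the one existing escape hatch from the random pool allocation is the `metallb.universe.tf/address-pool` service annotation, which requests an IP from a named pool. A minimal example (service name and pool name hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # Request an IP from a specific pool instead of
    # the first pool that happens to have free IPs.
    metallb.universe.tf/address-pool: pool1
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
```

The proposal moves this choice to the pool side (service/namespace selectors on the pool), rather than requiring every service to opt in via annotation.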
Thanks for driving this @fedepaol!
I'm not done going through the proposal, however I wanted to provide feedback ASAP for the things I've already read. So, I'm submitting a couple of comments.
Will submit another review in the upcoming days after reviewing the rest 🤞
Force-pushed 7a17df7 to ab28d3d
Force-pushed ab28d3d to 512b276
design/pool-configuration.md
Outdated
The [current implementation](https://github.com/metallb/metallb/blob/312b03cd3065687f25274486cd3ff5c79d6f6068/speaker/layer2_controller.go#L45) announces L2 pools only from those nodes
with a running speaker and an active endpoint for the service.

In order to be able to announce L2 from a subset of nodes, this logic must be changed, checking for active endpoints only if the TrafficPolicy is local (which is handled by the [BGP implementation](https://github.com/metallb/metallb/blob/5d236989e61683daf1eaad00777b81ce7dd98c5d/speaker/bgp_controller.go#L153)).
checking for active endpoints only if the TrafficPolicy is local was done:
metallb/speaker/layer2_controller.go, line 100 in 3616616:

```go
if svc.Spec.ExternalTrafficPolicy == v1.ServiceExternalTrafficPolicyTypeLocal {
```
Yep, I did it after drafting this proposal. Removing it.
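The announcement eligibility check being discussed can be sketched as follows. This is a simplified illustration of the logic, not MetalLB's actual code; the type and function names are invented for the example:

```go
package main

import "fmt"

type TrafficPolicy string

const (
	PolicyCluster TrafficPolicy = "Cluster"
	PolicyLocal   TrafficPolicy = "Local"
)

// Node captures the per-node facts a speaker would consult
// (names are illustrative, not MetalLB's internal structures).
type Node struct {
	SpeakerRunning  bool // an L2 speaker is running on the node
	HasLocalBackend bool // the service has an active endpoint on the node
	MatchesSelector bool // the node matches the pool/advertisement node selector
}

// shouldAnnounce sketches the check discussed above: a speaker must be
// running and the node must match the selector, while active local
// endpoints are required only when the traffic policy is Local.
func shouldAnnounce(n Node, policy TrafficPolicy) bool {
	if !n.SpeakerRunning || !n.MatchesSelector {
		return false
	}
	if policy == PolicyLocal && !n.HasLocalBackend {
		return false
	}
	return true
}

func main() {
	n := Node{SpeakerRunning: true, HasLocalBackend: false, MatchesSelector: true}
	fmt.Println(shouldAnnounce(n, PolicyCluster)) // true: Cluster policy ignores local backends
	fmt.Println(shouldAnnounce(n, PolicyLocal))   // false: Local policy requires a local backend
}
```

The point of the change is exactly the second `if`: the endpoint check moves behind the Local-policy condition instead of applying unconditionally.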
Thanks a lot for working on this @fedepaol. I really like your work around classifying the different issues we have around making address pools more flexible. This brings a lot of order and makes it easy to see the big picture 👍
IMO there is too much ambiguity in the "Proposal" section - see my inline comments. I haven't reviewed the sections which follow since I feel like I have to understand the "Proposal" section properly before I can evaluate the rest of the document. Specifically, I'm having a hard time understanding the relation between the various sections (i.e. "I understand the contents of this paragraph but I'm not sure why this paragraph is there").
If you could help me understand your intention by addressing my inline comments and ensure the various section titles make sense - that would be great. Please tag me again when done and I'll do another pass 🙏
design/pool-configuration.md
Outdated
The general idea of this proposal is that these degrees of configuration are all related to the AddressPool object, and
that these configurations are independent between each other.

### Node / Address Pool mapping
I'm a bit confused. Does this title refer to the 1st "point" in the previous list, namely "Which nodes should advertise the ip"? If this title and the following ones are an elaboration on the previous list, could we use the same text for the "points" and the titles of the respective sections? If this title and the following ones are not an elaboration of the list, could we explain what these are (e.g. are these use cases? Feature requests? Something else?)? In the latter case, how do these sections relate to the 3 "points" above?
Also, what does the `/` denote here? Could the slash be substituted with the word "to"? If there is directionality to the mapping, could we use `->` or something similar? If not, maybe `<>` would be better.
I'll try to re-organize always using the same title.
design/pool-configuration.md
Outdated
The following issues also fall in the same "speaker announcement logic" category:

Service with external backends
I don't understand the meaning of this line. Is it necessary? Does this line classify the two issues which follow? Could we reduce ambiguity here?
It does, but those issues are now closed, fixed by #976.
Will remove it.
design/pool-configuration.md
Outdated
* To which BGP neighbors the IP address should be advertised
* How to allocate that ip to a given service
Nit: Could we capitalize "IP" consistently throughout the document?
Force-pushed 7ae6a16 to eb703bc
Very nice work @fedepaol.
I've done a full pass over the design and added some comments. My biggest concern is not being able to influence the formation of BGP sessions separately from prefix filtering. In addition I am not 100% "sold" on the idea of adding a node selector to BGPAdvertisement
- see my inline comments.
Force-pushed eb5e4f1 to db48a73
May I request priorities on the L2Advertisement node selector?

```yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lsadvertisement
  namespace: metallb-system
spec:
  pools:
  - pool1
  - pool2
  nodeSelectors:
  - priority: 10
    matchExpression:
    - key: kubernetes.io/hostname
      operator: In
      values: [hostB, hostC]
  - priority: 50
    matchExpression:
    - key: kubernetes.io/hostname
      operator: In
      values: [hostA]
```
design/pool-configuration.md
Outdated
```yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lsadvertisement
```
nit: l2advertisement
Force-pushed db48a73 to 229a173
What about the issue of interface selection in the L2 mode? I don't see any mention of it in this proposal.
Force-pushed 229a173 to 87178b2
Force-pushed 87178b2 to 60557f1
Force-pushed f4fa533 to 4a7ec8f
Out of interest, with the nodes effectively selecting pools based on the highest priority, what if the priority of each pool differs for each node?
Not sure what you mean. The node selector will be in the announcement.
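As a sketch of that, following the L2Advertisement shape from the comment above, the node selector would live on the advertisement rather than on the node or the pool (field names are assumptions from this discussion, not the final API):

```yaml
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: bgpadvertisement
  namespace: metallb-system
spec:
  pools:
  - pool1
  # The selector is part of the advertisement, so each advertisement
  # independently decides which nodes may announce the pool's IPs.
  nodeSelectors:
  - matchExpression:
    - key: kubernetes.io/hostname
      operator: In
      values: [hostA, hostB]
```

Since the priority would also sit in the advertisement's selector list, a per-node, per-pool priority would be expressed by creating separate advertisements rather than by per-node configuration.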
@gclawes after thinking about it, the UX for Calico users won't be ideal. I'll rename ippool to
This proposal tries to collect all the requests made in separate issues in an organic way, and come up with a proposal that includes all of them to start a discussion. Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
Force-pushed d08ecc9 to fd4ecb4
Sounds good, that sidesteps the issue. For what it's worth, when using the built-in kubectl tab-completion, it will complete the fully qualified name of all CRDs anyway.
Merging this.
As discussed in the last community meeting, this proposal tries to collect all the requests made in separate issues in an organic way, and come up with a proposal that includes all of them to start a discussion.