
Correct egress IP assignment for egress node IPs #1668

Merged
2 commits merged into ovn-org:master from alexanderConstantinescu:egressip-node-fix on Oct 19, 2020

Conversation

@alexanderConstantinescu (Collaborator) commented Sep 7, 2020

While writing e2e tests for Egress IP (an upcoming PR), I realized a mistake on my part in the assignment procedure: when the specified egress IP is cluster node X's own IP address, we might assign it to a different node Y. This patch fixes that by adding a check ensuring that, if the specified egress IP is a cluster node's IP, it is assigned to exactly that node. If that node does not have the egress label, we do not perform an assignment and notify the user by triggering an event.

Some of the tests in egressip_test.go were simply wrong, verifying the complete opposite of what they should have. I obviously should have paid more attention to that and not let it slip by.

/assign @danwinship

- What this PR does and why is it needed

- Special notes for reviewers

- How to verify it

- Description for the changelog

@coveralls commented Sep 7, 2020

Coverage Status

Coverage increased (+0.2%) to 56.552% when pulling 15bb754 on alexanderConstantinescu:egressip-node-fix into 71ea9e2 on ovn-org:master.

@danwinship (Contributor)

> when the specified egress IP is cluster node X's own IP address, we might assign it to a different node Y. This patch fixes that by adding a check ensuring that, if the specified egress IP is a cluster node's IP, it is assigned to exactly that node.

Hm... in openshift-sdn it's not valid to use a node's actual IP as an egress IP, because the node IP is going to be used for random traffic from other pods on that node, so it's not useful as an identifiable egress IP.

@alexanderConstantinescu (Collaborator, Author)

> Hm... in openshift-sdn it's not valid to use a node's actual IP as an egress IP, because the node IP is going to be used for random traffic from other pods on that node, so it's not useful as an identifiable egress IP.

Dohh (insert pic of Homer Simpson)

Do we still want to guard against such a scenario (i.e. a user specifies an egress IP which happens to be a cluster node IP), or does that fall under the "don't do that" category?

@danwinship (Contributor)

We should check for it and consider it unsatisfiable/an error, like whatever we do if they request an egress IP that's not in any of the available CIDR ranges.

@alexanderConstantinescu (Collaborator, Author)

> We should check for it and consider it unsatisfiable/an error, like whatever we do if they request an egress IP that's not in any of the available CIDR ranges.

FYI @danwinship: I've updated the PR to do this.

When the specified egress IP equals a node's IP address, we can end up in a
situation where the assignment procedure assigns the egress IP to another
cluster node, thus missing the node actually hosting that IP address. This
patch fixes this by returning an error during the assignment procedure and
triggering an event notifying the user of this unsupported request.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
Previously we only logged an error when the specified
egress IP was invalid. Having worked on the previous commit,
I realized it is more coherent to return an error to the user,
notifying them of the incorrect definition. The risk otherwise is that,
if a user specifies multiple egress IPs and one is incorrect and we only
trigger an event, the event goes unnoticed but causes confusion later on.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
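
For context, a minimal sketch of the check these two commits describe, under simplified types: the eNode fields and isAnyClusterNodeIP mirror the review snippet further down, while validateEgressIP, the controller struct and the error wording are made up for illustration and are not the exact patch.

package egressip // illustrative sketch, not the ovn-kubernetes source

import (
	"fmt"
	"net"

	utilnet "k8s.io/utils/net"
)

// eNode mirrors the per-node record used by the egress IP allocator.
type eNode struct {
	name string
	v4IP net.IP
	v6IP net.IP
}

type controller struct {
	eIPAllocator map[string]*eNode
}

// isAnyClusterNodeIP returns the cluster node owning ip, or nil if no node owns it.
func (c *controller) isAnyClusterNodeIP(ip net.IP) *eNode {
	for _, node := range c.eIPAllocator {
		if (utilnet.IsIPv6(ip) && node.v6IP != nil && node.v6IP.Equal(ip)) ||
			(!utilnet.IsIPv6(ip) && node.v4IP != nil && node.v4IP.Equal(ip)) {
			return node
		}
	}
	return nil
}

// validateEgressIP treats a requested egress IP that is a node IP as
// unsatisfiable; the real patch also triggers an event so the user sees it.
func (c *controller) validateEgressIP(eIP net.IP) error {
	if node := c.isAnyClusterNodeIP(eIP); node != nil {
		return fmt.Errorf("egress IP %s is the IP of cluster node %s and cannot be assigned", eIP, node.name)
	}
	return nil
}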
@alexanderConstantinescu (Collaborator, Author)

@danwinship: could you please take a look at this again?

@danwinship (Contributor) left a comment

Some comments but I'm guessing they probably apply to a lot of the existing code as well, and could maybe be addressed in a general cleanup later

func (oc *Controller) isAnyClusterNodeIP(ip net.IP) *eNode {
	for _, eNode := range oc.eIPAllocator {
		if (utilnet.IsIPv6(ip) && eNode.v6IP != nil && eNode.v6IP.Equal(ip)) ||
			(!utilnet.IsIPv6(ip) && eNode.v4IP != nil && eNode.v4IP.Equal(ip)) {

fwiw just if ip.Equal(eNode.v6IP) || ip.Equal(eNode.v4IP) would work.
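
For what it's worth, a tiny standalone check of that behaviour (the addresses here are made up): net.IP.Equal returns false both for a nil argument and for a cross-family comparison, so the nil and IsIPv6 guards above are redundant.

package main

import (
	"fmt"
	"net"
)

func main() {
	ip := net.ParseIP("10.0.0.1")
	var v6IP net.IP // nil: node has no IPv6 address configured
	v4IP := net.ParseIP("10.0.0.1")

	fmt.Println(ip.Equal(v6IP))                     // false: nil argument
	fmt.Println(ip.Equal(v4IP))                     // true
	fmt.Println(net.ParseIP("fd00::1").Equal(v4IP)) // false: family mismatch
}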

(!utilnet.IsIPv6(eIPC) && eNodes[i].v4Subnet != nil && eNodes[i].v4Subnet.Contains(eIPC)) {
eNodes[i].tainted, oc.eIPAllocator[eNodes[i].name].allocations[eIPC.String()] = true, true
if (utilnet.IsIPv6(eIPC) && assignableNodes[i].v6Subnet != nil && assignableNodes[i].v6Subnet.Contains(eIPC)) ||
(!utilnet.IsIPv6(eIPC) && assignableNodes[i].v4Subnet != nil && assignableNodes[i].v4Subnet.Contains(eIPC)) {

similarly, could do just

if (assignableNodes[i].v6Subnet != nil && assignableNodes[i].v6Subnet.Contains(eIPC)) ||
	(assignableNodes[i].v4Subnet != nil && assignableNodes[i].v4Subnet.Contains(eIPC)) {

You're allowed to ask if an IPv6 subnet contains an IPv4 IP or vice versa. (The answer is just guaranteed to be "no".)
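
A quick standalone illustration of that point (the subnets and address are made up): net.IPNet.Contains simply answers false when the address family doesn't match the subnet's.

package main

import (
	"fmt"
	"net"
)

func main() {
	_, v6Subnet, _ := net.ParseCIDR("fd00::/64")
	_, v4Subnet, _ := net.ParseCIDR("10.0.0.0/24")
	eIP := net.ParseIP("10.0.0.5")

	fmt.Println(v6Subnet.Contains(eIP)) // false: IPv4 address vs IPv6 subnet
	fmt.Println(v4Subnet.Contains(eIP)) // true
}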

@danwinship merged commit 11f995f into ovn-org:master on Oct 19, 2020
alexanderConstantinescu added a commit to alexanderConstantinescu/ovn-kubernetes that referenced this pull request Oct 20, 2020
Now that local gateway mode is removed, we can clean up things like modeEgressIP
and the like, and just use one data container as a controller for egress IP
assignment/setup.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Cleanup egress IP conditions

There were a couple of nits on PR ovn-org#1668 that I didn't get the chance to
fix before it merged, so I am doing that here instead.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
alexanderConstantinescu added a commit to alexanderConstantinescu/ovn-kubernetes that referenced this pull request Oct 21, 2020
PR ovn-org#1668 re-worked the way egress IP assignment is done, making sure
that users do not accidentally assign a node IP as an egress IP and break
cluster networking. However, it introduced a bug: we now have a problem with
the reconciliation of nodes for egress, ultimately skipping any UPDATE event
for nodes. Thus, if ovnkube-node does not set k8s.ovn.org/node-primary-ifaddr
on the node in question before ovnkube-master receives the ADD for that node,
we never update our internal data for that node when we receive the UPDATE.
This commit fixes that.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
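
A hedged sketch of the fix this commit describes: only the annotation key comes from the commit message, while the function and parameter names are made up for illustration. The idea is that a node UPDATE which newly carries the annotation should be handled like an ADD instead of being skipped.

package egressip // illustrative sketch, not the actual ovn-kubernetes handler

const primaryIfAddrAnnotation = "k8s.ovn.org/node-primary-ifaddr"

// shouldReAddNodeForEgress reports whether a node UPDATE should be treated
// like an ADD: the annotation was missing when the node was first seen and
// has only now been set by ovnkube-node.
func shouldReAddNodeForEgress(oldAnnotations, newAnnotations map[string]string) bool {
	_, hadIt := oldAnnotations[primaryIfAddrAnnotation]
	_, hasIt := newAnnotations[primaryIfAddrAnnotation]
	return !hadIt && hasIt
}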
alexanderConstantinescu added a commit to alexanderConstantinescu/ovn-kubernetes that referenced this pull request Oct 26, 2020
Now that local gateway mode is removed, we can clean up things like modeEgressIP
and the like, and just use one data container as a controller for egress IP
assignment/setup.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Cleanup egress IP conditions

There were a couple of nits on PR ovn-org#1668 that I didn't get the chance to
fix before it merged, so I am doing that here instead.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
mccv1r0 pushed a commit to mccv1r0/ovn-kubernetes that referenced this pull request Oct 28, 2020
PR ovn-org#1668 re-worked the way egress IP assignment is done, making sure
that users do not accidentally assign a node IP as an egress IP and break
cluster networking. However, it introduced a bug: we now have a problem with
the reconciliation of nodes for egress, ultimately skipping any UPDATE event
for nodes. Thus, if ovnkube-node does not set k8s.ovn.org/node-primary-ifaddr
on the node in question before ovnkube-master receives the ADD for that node,
we never update our internal data for that node when we receive the UPDATE.
This commit fixes that.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
mccv1r0 pushed a commit to mccv1r0/ovn-kubernetes that referenced this pull request Oct 28, 2020
Now that local gateway mode is removed, we can clean up things like modeEgressIP
and the like, and just use one data container as a controller for egress IP
assignment/setup.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Cleanup egress IP conditions

There were a couple of nits on PR ovn-org#1668 that I didn't get the chance to
fix before it merged, so I am doing that here instead.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
alexanderConstantinescu added a commit to alexanderConstantinescu/ovn-kubernetes-1 that referenced this pull request Nov 16, 2020
PR ovn-org/ovn-kubernetes#1668 re-worked the way egress IP assignment is done,
making sure that users do not accidentally assign a node IP as an egress IP and
break cluster networking. However, it introduced a bug: we now have a problem
with the reconciliation of nodes for egress, ultimately skipping any UPDATE
event for nodes. Thus, if ovnkube-node does not set k8s.ovn.org/node-primary-ifaddr
on the node in question before ovnkube-master receives the ADD for that node,
we never update our internal data for that node when we receive the UPDATE.
This commit fixes that.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
alexanderConstantinescu added a commit to alexanderConstantinescu/ovn-kubernetes-1 that referenced this pull request Jan 4, 2021
Now that local gateway mode is removed, we can clean up things like modeEgressIP
and the like, and just use one data container as a controller for egress IP
assignment/setup.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Cleanup egress IP conditions

There were a couple of nits on PR ovn-org/ovn-kubernetes#1668 that I didn't get
the chance to fix before it merged, so I am doing that here instead.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
kyrtapz pushed a commit to kyrtapz/ovn-kubernetes that referenced this pull request Sep 25, 2023
…y-pick-1665-to-release-4.13

[release-4.13] OCPBUGS-13170:Fix bug that resulted in routes not be restored to a new vnic
Labels: None yet
Projects: None yet
Development: None yet
3 participants