Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes errors when adding NAT with existing entry #2105

Merged
merged 1 commit into from Mar 18, 2021

Conversation

trozet
Copy link
Contributor

@trozet trozet commented Mar 14, 2021

When adding SNAT per pod with an existing entry, the code will fail and
report an error because with OVN we cannot do lr-nat-del and lr-nat-add
in the same OVSDB transaction. This is because OVSDB evaluates the
lr-nat-add and thinks there will be a conflict because an entry already
exists.

Additionally, during gateway init for a node, this failure may also
occur. In which case the GR will be left with missing config.

To handle this case, split add and delete into separate transactions,
and only del/add if the entry is missing.

Signed-off-by: Tim Rozet trozet@redhat.com

@trozet
Copy link
Contributor Author

trozet commented Mar 14, 2021

@fedepaol PTAL

@coveralls
Copy link

coveralls commented Mar 14, 2021

Coverage Status

Coverage decreased (-0.09%) to 55.093% when pulling a3978f2 on trozet:fix_snat_del_add into d5c5205 on ovn-org:master.

go-controller/pkg/ovn/egressgw.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressgw.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressgw.go Outdated Show resolved Hide resolved
@trozet trozet changed the title Fixes errors when adding pod SNAT with existing entry Fixes errors when adding NAT with existing entry Mar 15, 2021
@trozet trozet requested review from aojea and fedepaol March 15, 2021 20:29
@aojea
Copy link
Contributor

aojea commented Mar 15, 2021

/lgtm

@girishmg
Copy link
Member

@trozet ovn-nbctl lr-nat-del is weird.

The usage for it is....

lr-nat-del ROUTER [TYPE [IP]]

.. in that when the TYPE is

  • snat the IP should be Logical IP
  • dnat or dnat_and_snat the IP should be External IP

So, your UpdateRouterNAT() will not work for dnat and dnat_and_snat NAT Types.

@girishmg
Copy link
Member

@trozet I also didn't understand your comment below

// Combining lr-nat-del and lr-nat-add will fail OVSDB transaction check because
// the entry will already exist so need to do these as two separate transactions

On my test machine, I could do:

# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             10.0.1.10                           10.192.0.0/16
# ovn-nbctl lr-nat-del GR_`hostname` snat 10.192.0.0/16 -- lr-nat-add GR_`hostname` snat 10.0.1.10 10.192.0.0/16
# echo $?
0
# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             10.0.1.10                           10.192.0.0/16
# 

So, I was able to delete and add in the same transaction. I might be missing something.

@aojea
Copy link
Contributor

aojea commented Mar 16, 2021

@trozet I also didn't understand your comment below

// Combining lr-nat-del and lr-nat-add will fail OVSDB transaction check because
// the entry will already exist so need to do these as two separate transactions

On my test machine, I could do:

# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             10.0.1.10                           10.192.0.0/16
# ovn-nbctl lr-nat-del GR_`hostname` snat 10.192.0.0/16 -- lr-nat-add GR_`hostname` snat 10.0.1.10 10.192.0.0/16
# echo $?
0
# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             10.0.1.10                           10.192.0.0/16
# 

So, I was able to delete and add in the same transaction. I might be missing something.

IIRC the problem is when you use --may-exist, quoting Dimitru from an internal discussion about this topic

it depends on the way the table is defined in the schema.. In this case, table "NAT" is "isRoot: false" which means that when the NAT record is not referenced anymore, ovsdb-server garbage collects it. All ovn-nbctl does at "lr-nat-del" is to detach the NAT record from the logical router. When this change is seen by ovsdb-server, it will also remove the NAT record. So, arguably, this is not a bug, but expected behavior for NAT. Logical_Switch otoh has "isRoot: true" so it would be safe to "ls-del -- ls-add "

@trozet
Copy link
Contributor Author

trozet commented Mar 16, 2021

@trozet I also didn't understand your comment below

// Combining lr-nat-del and lr-nat-add will fail OVSDB transaction check because
// the entry will already exist so need to do these as two separate transactions

On my test machine, I could do:

# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             10.0.1.10                           10.192.0.0/16
# ovn-nbctl lr-nat-del GR_`hostname` snat 10.192.0.0/16 -- lr-nat-add GR_`hostname` snat 10.0.1.10 10.192.0.0/16
# echo $?
0
# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             10.0.1.10                           10.192.0.0/16
# 

So, I was able to delete and add in the same transaction. I might be missing something.

@girishmg maybe you are using an older version of OVN/OVS? This is what I see:

[root@ovn-control-plane ~]# ovn-nbctl lr-nat-list GR_`hostname`
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             172.18.0.3                          10.244.0.0/16
[root@ovn-control-plane ~]# ovn-nbctl lr-nat-del GR_`hostname` snat 10.244.0.0/16 -- lr-nat-add GR_`hostname` snat 172.18.0.3 10.244.0.0/16
ovn-nbctl: 172.18.0.3, 10.244.0.0/16: a NAT with this external_ip and logical_ip already exists
[root@ovn-control-plane ~]# rpm -qa | grep ovn
ovn-20.12.0-25.fc33.x86_64
ovn-central-20.12.0-25.fc33.x86_64
ovn-host-20.12.0-25.fc33.x86_64
ovn-vtep-20.12.0-25.fc33.x86_64
[root@ovn-control-plane ~]# rpm -qa | grep openvs
openvswitch-2.15.0-1.fc33.x86_64
python3-openvswitch-2.15.0-1.fc33.x86_64

@trozet
Copy link
Contributor Author

trozet commented Mar 16, 2021

@trozet ovn-nbctl lr-nat-del is weird.

The usage for it is....

lr-nat-del ROUTER [TYPE [IP]]

.. in that when the TYPE is

  • snat the IP should be Logical IP
  • dnat or dnat_and_snat the IP should be External IP

So, your UpdateRouterNAT() will not work for dnat and dnat_and_snat NAT Types.

sigh...we really need to move to go-bindings. I'll change the function to only handle SNAT.

@trozet
Copy link
Contributor Author

trozet commented Mar 17, 2021

@dceara FYI this looks like a regression where lr-nat-del -- lr-nat-add used to work. I dont expect you to rush to fix this or anything, just wanted to highlight it in case there is a bigger regression here that went unnoticed.

@girishmg
Copy link
Member

@trozet thank you for not using lr-nat-list and regex :-)

Your changes LGTM. Looks like it is complaining about gofmt -s on master_test.go

When adding SNAT per pod with an existing entry, the code will fail and
report an error because with OVN we cannot do lr-nat-del and lr-nat-add
in the same OVSDB transaction. This is because OVSDB evaluates the
lr-nat-add and thinks there will be a conflict because an entry already
exists.

Additionally, during gateway init for a node, this failure may also
occur. In which case the GR will be left with missing config.

To handle this case, split add and delete into separate transactions,
and only del/add if the entry is missing.

Signed-off-by: Tim Rozet <trozet@redhat.com>
@girishmg
Copy link
Member

/gtm

@girishmg girishmg merged commit 651f333 into ovn-org:master Mar 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants