bpf: Handle fragments in SNAT flows #25340
Thanks for tackling this! It's been a long time coming :).
A few smaller notes, but the only thing that really needs addressing is how ct_lazy_lookup4() obtains the is_fragment variable. Happy to brainstorm further if passing around an additional parameter is too much hassle.
Thank you! Looks good, just the ct_lazy_lookup4() that needs adjusting.
Should we maybe expedite the checksum fixes through a separate PR?
Separated to another PR: #28768
The code that extracts ports in snat_v4_nat and snat_v4_rev_nat doesn't take into account that the packet may be a second or further fragment. In this case, garbage will be read instead of the real port numbers. Fix this by using the existing ipv4_ct_extract_l4_ports that takes into account fragmentation.

This change makes the behavior of the switch inside the aforementioned functions closer to ct_extract_ports4.

Consider fragments in snat_v4_rewrite_ingress, avoid rewriting ports if it's not the first fragment.

Fixes: 14a653a ("bpf/nat: introduce snat_v4_nat() and snat_v4_rev_nat functions")
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
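To illustrate why reading ports from a non-first fragment yields garbage, here is a minimal standalone sketch in plain C. It is not the Cilium code: every `sketch_` name and constant is hypothetical, and the function simply refuses to read ports from a later fragment. How ipv4_ct_extract_l4_ports actually resolves ports for such fragments is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for the IPv4 flags/fragment-offset field,
 * expressed in host byte order. */
#define SKETCH_IP_MF      0x2000u /* "more fragments" flag */
#define SKETCH_IP_OFFMASK 0x1fffu /* fragment offset, in 8-byte units */

struct sketch_ports {
	uint16_t sport;
	uint16_t dport;
};

/* Only a packet with fragment offset 0 (unfragmented, or the first
 * fragment) carries the L4 header right after the IPv4 header. */
static bool sketch_has_l4_header(uint16_t frag_off)
{
	return (frag_off & SKETCH_IP_OFFMASK) == 0;
}

/* Returns 0 and fills *ports when the L4 header is present.
 * Returns -1 for later fragments, where the bytes after the IPv4
 * header are payload and must not be interpreted as ports. */
static int sketch_extract_ports(uint16_t frag_off, const uint8_t *l4,
				size_t l4_len, struct sketch_ports *ports)
{
	assert(l4 != NULL && ports != NULL);

	if (!sketch_has_l4_header(frag_off))
		return -1;
	if (l4_len < 4)
		return -1;

	/* TCP and UDP both start with source port, then destination
	 * port, each 16 bits in network byte order. */
	ports->sport = (uint16_t)((l4[0] << 8) | l4[1]);
	ports->dport = (uint16_t)((l4[2] << 8) | l4[3]);
	return 0;
}
```

The same check is what would gate port rewriting: a caller in this sketch would skip the port update whenever `sketch_has_l4_header()` returns false, and only rewrite addresses.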
The next commit will use it to update metrics. Note that revalidate_data shouldn't be done in place, because the L3 header offset may not be equal to ETH_HLEN in XDP flows.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
The previous commit ("bpf: Handle fragments in SNAT flows") added an extra call to ipv4_ct_extract_l4_ports into the SNAT paths. It can lead to double-accounting of the same fragments, failing the "Supports IPv4 fragments" test, because ipv4_handle_fragmentation is called one more time in those flows. Fix it by moving the metric update out of ipv4_handle_fragmentation directly to __ct_lookup4.

Suggested-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
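The double-accounting fix can be pictured with a toy model in plain C. This is only a sketch under assumptions: the `sketch_` functions and the call structure are hypothetical stand-ins, not the real BPF code. The point it shows is that a counter bumped inside a helper grows with every call to that helper, whereas a counter bumped in the single lookup step grows once per packet.

```c
#include <stdint.h>

/* Hypothetical counter standing in for the fragment metric. */
static uint64_t sketch_frag_metric;

/* After the fix: the fragmentation helper has no metric side effect. */
static int sketch_handle_fragmentation(int is_fragment)
{
	return is_fragment ? 1 : 0;
}

/* Port extraction consults the fragmentation helper; it may now be
 * called any number of times per packet without touching the metric. */
static void sketch_extract_l4_ports(int is_fragment)
{
	(void)sketch_handle_fragmentation(is_fragment);
}

/* The lookup step runs once per packet in this model, so it is the
 * one place where the metric is updated. */
static void sketch_ct_lookup(int is_fragment)
{
	sketch_extract_l4_ports(is_fragment);
	if (is_fragment)
		sketch_frag_metric++;
}

/* Model of an SNAT path: one extra port extraction, then the lookup. */
static void sketch_snat_path(int is_fragment)
{
	sketch_extract_l4_ports(is_fragment);
	sketch_ct_lookup(is_fragment);
}
```

Had the increment stayed inside `sketch_handle_fragmentation()`, one fragment going through `sketch_snat_path()` would have been counted twice in this model.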
Replace snat_v4_rewrite_ingress with a generic snat_v4_rewrite_headers for better code reuse. This change should not alter the behavior: the new code should be functionally equivalent to the old one.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
ty Maxim, looks great!
The code that extracts ports in snat_v4_nat and snat_v4_rev_nat doesn't take into account that the packet may be a second or further fragment. In this case, garbage will be read instead of the real port numbers. Fix this by using the existing ipv4_ct_extract_l4_ports that takes into account fragmentation.
This change makes the behavior of the switch inside the aforementioned functions closer to ct_extract_ports4.
Consider fragments in snat_v4_rewrite_ingress, avoid rewriting ports if it's not the first fragment.
Fixes: 14a653a ("bpf/nat: introduce snat_v4_nat() and snat_v4_rev_nat functions")
This should be applied on top of #25112.