Drop packets with invalid NAT requirements in flow-filter #1341
Add a helper function to fake flow session information for a packet. It sets the destination VPC but uses mock-up data for the stateful NAT or port forwarding context, because the tests don't need to read it (and setting the real data might involve sorting out circular dependencies). This is in anticipation of adding more checks that involve flow session information.
Signed-off-by: Quentin Monnet <qmo@qmon.net>
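As a rough illustration of what such a test helper could look like, here is a minimal sketch. All the types below (`VpcId`, `NatState`, `FlowInfo`, `Packet`) and the function name `fake_flow_session_sketch` are hypothetical stand-ins invented for this example; the real dataplane types and the actual `fake_flow_session` signature are not shown on this page and likely differ.

```rust
// Hypothetical, simplified stand-ins for the dataplane types (assumptions,
// not the real definitions used by flow-filter).

/// Identifier of a VPC (assumed to be a plain integer here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VpcId(u32);

/// Placeholder for the stateful NAT / port forwarding context.
/// The tests never read it, so a single mock variant is enough.
#[derive(Debug, Clone, PartialEq, Eq)]
enum NatState {
    Mock,
}

/// Flow session information attached to a packet.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FlowInfo {
    dst_vpc: VpcId,
    nat_state: NatState,
}

/// A packet that may or may not carry flow session information.
#[derive(Debug, Default)]
struct Packet {
    flow_info: Option<FlowInfo>,
}

/// Attach fake flow session information to `packet`: the destination VPC
/// is real, the NAT context is mock-up data.
fn fake_flow_session_sketch(packet: &mut Packet, dst_vpc: VpcId) {
    packet.flow_info = Some(FlowInfo {
        dst_vpc,
        nat_state: NatState::Mock,
    });
}
```

A test would call the helper on a freshly built packet before running it through the stage, so that checks depending on the presence of flow information take the "flow attached" path.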
Pull request overview
This PR adds early validation in the flow-filter stage to drop packets with unsupported NAT requirements (destination masquerade or source port-forwarding) when no flow session info is attached. Previously these packets would pass through to downstream NAT stages, which would fail with confusing error messages.
Changes:
- Added check_nat_requirements() validation before setting NAT requirements for single-match lookups in flow-filter
- Updated tests to expect packets to be filtered/dropped in invalid NAT scenarios, and added new test cases with flow info attached
- Extracted fake_flow_session test helper to reduce duplication
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| flow-filter/src/lib.rs | Added check_nat_requirements() function and call it before set_nat_requirements for single-match lookups |
| flow-filter/src/tests.rs | Updated assertions for now-dropped packets, added test cases with flow info, extracted fake_flow_session helper |
    if check_nat_requirements(packet, &dst_data).is_err() {
        debug!(
            "{nfi}: Invalid NAT requirements found for flow {tuple}, dropping packet"
        );
        packet.done(DoneReason::Filtered);
        return;
    }
Some NAT requirements are not currently supported, including:
- Masquerading for destination IP address, when the packet has no flow
information attached
- Port forwarding for the source IP address/port, when the packet has no
flow information attached
The flow-filter stage has visibility on these NAT requirements, and on
the availability of flow session information for the packet. And yet, on
non-ambiguous lookup results, it will let packets go through even if the
NAT requirements are not valid. One consequence is that additional
processing is required, because it falls down to the relevant NAT stages
to check their context and dump the packet in that case. Another
consequence is that, once a NAT stage eventually dumps the packet, it
may do so for reasons that may not be obvious when looking at the log. For
example, we've observed logs such as:
ERROR dp-worker-8 dataplane_nat::stateful::apalloc: 256: No address pool found for source address 10.50.2.2.
Did we hit a bug when building the stateful NAT allocator?
ERROR dp-worker-8 dataplane_nat::stateful: 513: stateful-NAT: Error processing packet: allocation failed:
new NAT session creation denied
These logs are not incorrect, in the sense that in the context of the
stateful NAT stage, reaching that point might be a bug if we assumed
that the packet did need to be NAT-ed.
So in this commit, we add a check to the flow-filter stage to detect the
two cases described above, and to drop the packet with more helpful log
information when we get invalid NAT requirements.
We also adjust the unit tests affected by the change.
Signed-off-by: Quentin Monnet <qmo@qmon.net>
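The two unsupported cases described in the commit message can be sketched as a small validation function. This is only an illustration under stated assumptions: the types `NatMode`, `NatRequirements` and `NatCheckError`, and the name `check_nat_requirements_sketch`, are hypothetical and do not reflect the real `check_nat_requirements(packet, &dst_data)` signature or the actual flow-filter data structures.

```rust
// Hypothetical, simplified types (assumptions for illustration only).

/// NAT mode required on one side (source or destination) of a flow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NatMode {
    Static,
    Masquerade,
    PortForwarding,
}

/// NAT requirements resolved by the lookup for a packet.
#[derive(Debug, Clone, Copy, Default)]
struct NatRequirements {
    src: Option<NatMode>,
    dst: Option<NatMode>,
}

/// Reasons for rejecting a packet before it reaches the NAT stages.
#[derive(Debug, PartialEq, Eq)]
enum NatCheckError {
    DstMasqueradeWithoutFlow,
    SrcPortForwardingWithoutFlow,
}

/// Reject the two unsupported combinations when the packet carries no
/// flow session information; accept everything else.
fn check_nat_requirements_sketch(
    has_flow_info: bool,
    reqs: &NatRequirements,
) -> Result<(), NatCheckError> {
    if has_flow_info {
        // With flow information attached, both cases are considered valid.
        return Ok(());
    }
    if reqs.dst == Some(NatMode::Masquerade) {
        return Err(NatCheckError::DstMasqueradeWithoutFlow);
    }
    if reqs.src == Some(NatMode::PortForwarding) {
        return Err(NatCheckError::SrcPortForwardingWithoutFlow);
    }
    Ok(())
}
```

Returning a distinct error variant per case would let the caller log why the packet was dropped, which is the point of moving the check ahead of the NAT stages.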
a173135 to 9609b0e
What issue does this relate to?
I think the changes are okay. My concern is whether this could block legitimate traffic. Also, I believe the error occurs in the multiple-match case, which is not yet covered. I'd opt not to add risky logic there and instead let the next NFs drop packets as they see fit, because the checks are already there, but we may make a decision when the multiple-match case is addressed.
Moved to draft until we clear up the discussion on the relationship with #1342.
This PR was initially driven by a report on Slack about the confusing logs mentioned above. I don't think we've got a corresponding issue on GitHub.
Conversely, if we assume in the NAT stages that we only receive traffic that we should indeed NAT, we risk letting packets go through even though they're not legit 🤷. I still prefer the risk of dropping legit traffic.
As I replied to Copilot, I've been looking into it.
👍 I'll work on completing it
I don't believe it does, I think it addresses something different. #1342 bypasses flow-filter in the case when we have a valid, established flow for the packet; if there's no flow in place, it doesn't change the logic (and I expect we'd still get the same logs from stateful NAT as described above, for example). On the contrary, in the current PR,
Ok. I remember seeing something, somewhere, but could not recall the source.
Ok. I see what you mean.
NOT BLOCKING FOR 26.01
@Fredi-raspall Let me know what you think of this change. The flow-filter stage sounds like the right place to drop packets when we have invalid NAT requirements, but that pushes a bit more logic into the flow-filter stage.