-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use DSCP instead of tun_metadata0 in Traceflow #1466
Conversation
Thanks for your PR. The following commands are available:
|
Codecov Report
@@ Coverage Diff @@
## master #1466 +/- ##
==========================================
+ Coverage 68.24% 68.47% +0.23%
==========================================
Files 165 165
Lines 13107 13118 +11
==========================================
+ Hits 8945 8983 +38
+ Misses 3226 3195 -31
- Partials 936 940 +4
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, some minor comments
pkg/agent/openflow/pipeline.go
Outdated
// OfTraceflowMarkRange stores dataplaneTag at range 2-7 in DSCP field of IP header. | ||
// IPv4/v6 DSCP (Bits 2-7) Field supports exact match only. | ||
OfTraceflowMarkRange = binding.Range{2, 7} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a bit confusing. The DSCP field is bits 2 through 7 of the IP ToS field. I don't think "range 2-7 in DSCP field" makes sense.
We are only using this constant in the code below:
packetOutBuilder = packetOutBuilder.AddLoadAction(binding.NxmFieldIPTos, uint64(dataplaneTag), OfTraceflowMarkRange)
it may make more sense to define the constant locally where it is used, rather than here in the middle of the "register ranges"
BTW, isn't there an OVS field for DSCP we can use directly? OXM_OF_IP_DSCP
? @wenyingd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to traceflowMarkToSRange, move to a separate section and updated comments.
I prefer to store all ranges in same place, because it is easy to query.
Yes, OXM_OF_IP_DSCP is what we want, will test with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested OXM_OF_IP_DSCP, still requires range{0, 5}. We don't need to move to this.
FYI. - I think OVS folks are evaluating the tun_metadata fix and promised to give us an update today. It might still be possible to have the fix for Antrea 0.11. |
06d430d
to
88e864e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM, but would let team decide whether we go this way or wait for OVS fix.
OVS made a fix on kernel. https://lore.kernel.org/netdev/1604448694-19351-1-git-send-email-yihung.wei@gmail.com/T/#u |
@jianjuns @antoninbas , The fix made by OVS team requires the latest kernel version, I think it's a strict limitation for users to enable Antrea Traceflow. I still prefer to leverage DSCP to support Traceflow for both encap and no-encap. Please share your concerns about this solution. |
@vicky-liu that was my original concern, that the fix would require a patch to the OVS kernel datapath yes, we should move forward with the DSCP workaround |
Sure. Let us go the PR for 0.11. We can decide if we want to change to DSCP + INT in future releases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you plan to support TF for NoEncap and NetworkPolicyOnly mode? If so, we should update the AgentOptions.validate(), and also the antrea-agent.conf.
return nil, nil, err | ||
} | ||
} | ||
if outputPort == config.DefaultTunOFPort || outputPort == config.HostGatewayOFPort { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question - if the destination IP of the TF packet is indeed the gw0 IP, what should be the expected behavior here? And the action should be forwarded or delivered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we restrict destination must be Pod. If the destination is other IP, we cannot identify if the packet is properly delivered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In TraceflowSpec the destination could be an IP? And if it is the Pod IP, I think we still track "delivered"?
In theory we can handle the case destination IP is gw0 IP too. If you want not to support that in this PR, probably add a comment in InstallTraceflowFlows() to explain that.
Even when the IP is an external one, I think antrea-controller should handle it (as we know it is forwarded through gw0), and not to report timeout failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can handle this case by checking whether the dstIP is in Pod cache to identify if the action is Delivered or Forwarded, but I think we should submit another patch after this patch merged.
Thanks for your PR. The following commands are available:
|
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments only, lgtm
func (c *client) traceflowL2ForwardOutputFlows(dataplaneTag uint8, category cookie.Category) []binding.Flow { | ||
flows := []binding.Flow{} | ||
// Output and SendToController if output port is tunnel or gateway port. | ||
// The gw0 IP as Traceflow destination is not supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should open an issue for this, in order to document the limitation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Submitted #1500 for this enhancement.
c9e6634
to
7a67b5f
Compare
This patch changes Traceflow dataplaneTag from tun_metadata0 to DSCP, and remove the tunnel/encap mode restriction for inter-Node Traceflow.
/test-all |
/test-networkpolicy |
This patch changes Traceflow dataplaneTag from tun_metadata0 to DSCP, and remove the tunnel/encap mode restriction for inter-Node Traceflow.
This patch changes Traceflow dataplaneTag from tun_metadata0 to DSCP, and remove the tunnel/encap mode restriction for inter-Node Traceflow.
Seems I forgot to confirm this one. Could you answer? @gran-vmv |
I checked Traceflow in noEncap mode, and it can work. I think AgentOptions.validate() doesn't validate traceflow options, and the validation in agent/traceflow/traceflow_comtroller is removed by this PR. |
Then at least change the yaml and documentation. |
Currently no changes needed for yaml and doc. |
Ok. You are right. I thought we commented in the yaml that TF requires encap and Geneve. |
This patch changes Traceflow dataplaneTag from tun_metadata0 to DSCP, and remove the tunnel/encap mode restriction for inter-Node Traceflow.
This patch changes Traceflow dataplaneTag from tun_metadata0 to DSCP, and
removes the tunnel/encap mode restriction for inter-Node Traceflow.
This PR closes #1357