Pass vlan tagged packets back to the kernel stack. #16772
Please can you add a release note to the PR description, as documented in step 11 here.
fixed!
test-me-please
I've updated the PR to fix some CI problems (cmdref docs).
Any updates?
Looks good from a CLI perspective, but I need the experienced datapath folk to review the datapath aspects.
This seems much closer to striking the right balance between providing the configurability you're looking for without surprising users too much with traffic bypassing Cilium logic.
I've requested a few changes below to make the change easier to reason about, both from a user perspective (semantics/naming and traceability) but also from a developer perspective (make it super simple to follow the flow of logic, test it so we can be sure that different variations of the option produce the desired behaviour).
The only thing I didn't raise below is the complexity concern. I strongly suspect that if you configure too many VLANs in this option, bpf program load will fail and the Cilium container will crash out because it cannot configure the datapath in the way you have requested. This could happen either when you first enable the option (because you specified too many tags), or it could plausibly happen later on when you upgrade to a new release, because each release can change the datapath. When you upgrade, there could be just enough additional BPF instructions introduced by this logic that when combined with the rest of the new datapath logic, it pushes the program complexity over some limit. I don't know where that limit is, and this PR is not proposing any testing for it. For now if this is a super-advanced option, maybe you're willing to take on that risk and deal with the consequences / roll back & report issues upstream when you hit them during upgrade. But we may want to think a bit more about integration testing and how we can detect such problems earlier in the development cycle so we can start the discussions early on how to address them.
From my point of view it's rather safe to just bypass any non-zero VLAN tag to the kernel and allow it to do its job, but Cilium's policy is to control everything on devices controlled by Cilium. In 99% of situations there will be no VLAN tags, or 1-2 VLAN tags for 1-2 devices. But maybe it's a good idea to generate just
Thanks for the comments. Will get back tomorrow with changes.
Commit 3e74eb655d512e97d6a87478787341ba0e7126ca does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
I pressed 'update branch' by accident... Sorry...
What are the next steps? Is there anything more I can do to help finalize this?
Thanks. Marking as ready to merge. @kvaster As a follow-up, could you extend the kube-proxy-replacement guide by documenting your change?
Unassigning myself as this PR has 3 reviews already.
From a hubble and api perspective. LGTM. The datapath stuff makes sense to me too.
The bot has marked not ready-to-merge because there are still @cilium/helm and @cilium/hubble reviews outstanding. @errordeveloper, it looks like you were pinged for those aspects (including the hubble API changes). Could you take a brief look specifically at those areas or consider raising this in the sig-hubble so that someone with the relevant expertise can take a look?
I will do this as a follow up PR!
Protobuf and helm changes LGTM!
It seems that all reviews have passed...
// if it's vlan device and we're controlling vlan main device
// and either all vlans are allowed, or we're controlling vlan device or vlan is explicitly allowed
if ok && devices[vlan.ParentIndex] && (devices[vlan.Index] || allowedVlans[vlan.VlanId]) {
	vlansByIfIndex[vlan.ParentIndex] = append(vlansByIfIndex[vlan.ParentIndex], vlan.VlanId)
This appending of vlanIDs that may not be ordered introduced a new unit test flake #17104.
@kvaster would you be able to look into this? Usually the solution is just to sort the slice once it's been assembled but before it is used to generate code.
Never mind, looks like @jrajahalme beat you to it :) Feel free to review though: #17105
Was away... Seems problem was already solved... I will make one more PR soon as discussed - docs update.
Is there an upcoming 1.10 release that will include this fix in the near future?
Hi @jhutchins, typically more invasive features / changes are scheduled for the next upcoming release in order to minimize the risk of breakage for existing users. This will be included in the upcoming v1.11 release.
Follow-up for PR: cilium#16772 Signed-off-by: Viktor Kuzmin <kvaster@gmail.com>
VLAN packets will be caught by BPF on the main interface first. We
need to pass them back to the kernel stack. The VLAN info/tag will be stripped and
the packet will be re-enqueued on the proper interface or dropped.
Also, we do not need to process VLAN packets on egress, because either
we have already processed such packets (when the BPF program is attached
to the VLAN interface) or we do not need to process such packets at
all.
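As a rough mental model only, the ingress decision on the main interface can be sketched in Go. The real logic lives in the BPF datapath, not in Go. The `verdict` type and `ingressVerdict` function are hypothetical names, and two behaviours are assumptions that this description does not spell out: tagged packets whose tag is not allowed are dropped, and untagged packets continue through normal processing. The allow-list stands for the tags given to `--vlan-bpf-bypass`, with `0` meaning "allow all tags".

```go
package main

import "fmt"

// verdict is a hypothetical label for what happens to a packet seen on
// the main interface.
type verdict string

const (
	passToStack verdict = "pass-to-stack" // hand the packet back to the kernel stack
	drop        verdict = "drop"          // assumption: non-allowed tags are dropped
	process     verdict = "process"       // assumption: untagged traffic is handled as usual
)

// ingressVerdict models the decision for one packet.
// vlanID == 0 means the packet carries no VLAN tag.
// allowed holds the tags passed via --vlan-bpf-bypass; a 0 entry
// allows every tag.
func ingressVerdict(vlanID int, allowed map[int]bool) verdict {
	if vlanID == 0 {
		return process
	}
	if allowed[0] || allowed[vlanID] {
		return passToStack
	}
	return drop
}

func main() {
	allowed := map[int]bool{100: true}
	for _, id := range []int{0, 100, 200} {
		fmt.Printf("vlan %d -> %s\n", id, ingressVerdict(id, allowed))
	}
}
```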
VLAN tagged packets will be processed in the following way:
- VLAN tags may be explicitly allowed (`--vlan-bpf-bypass`); `--vlan-bpf-bypass 0` may be used for allowing all available VLAN tags.

Concerns:
- `*` should be used instead of `0` for allowing all available VLAN tags.

Previous pull request and discussion: #15534
Fixes: #14579
Signed-off-by: Viktor Kuzmin kvaster@gmail.com