-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add masquerade binding mechanism. #1746
Conversation
/cc @rmohr Can you please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some testing coverage expansion would be great.
@@ -868,6 +868,12 @@ func ValidateVirtualMachineInstanceSpec(field *k8sfield.Path, spec *v1.VirtualMa | |||
Message: fmt.Sprintf("Slirp interface only implemented with pod network"), | |||
Field: field.Child("domain", "devices", "interfaces").Index(idx).Child("name").String(), | |||
}) | |||
} else if iface.Proxy != nil && networkData.Pod == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a unit test for this scenario
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -798,6 +798,15 @@ func getCniInterfaceList(vmi *v1.VirtualMachineInstance) (ifaceListString string | |||
return | |||
} | |||
|
|||
func haveProxyInterface(vmi *v1.VirtualMachineInstance) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is it used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
@@ -935,7 +935,7 @@ func Convert_v1_VirtualMachine_To_api_Domain(vmi *v1.VirtualMachineInstance, dom | |||
domainIface.Address = addr | |||
} | |||
|
|||
if iface.Bridge != nil { | |||
if iface.Bridge != nil || iface.Proxy != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to get a unit test for converting a vmi with a proxy interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done
|
||
return iptablesObject.Append(table, chain, rulespec...) | ||
} | ||
func (h *NetworkUtilsHandler) GetAvailableAddrsFromCIDR(s string) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not exactly clear to me why these functions should be part of the handler and not just regular functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is here because I use it also as mock for the podinterface unit tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it makes sense in context of what we already have here.
ips = append(ips, fmt.Sprintf("%s/%d", ip.String(), subnet)) | ||
} | ||
|
||
if len(ips) < 4 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain in a comment why 4 is a special number here, and why you think this check belongs to this method (and not to its consumer who may probably be a better fit to tell whether the number is enough for their needs)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I change the function name and also add a comment.
} | ||
|
||
func (p *ProxyPodInterface) getVMNetwork() (string, string, error) { | ||
ips, err := Handler.GetAvailableAddrsFromCIDR(p.vmNetworkCIDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how huge do we expect the CIDRs to be? Is it a problem that, while you need just two addresses, you are going to generate all of them inside the handler method? (Which, in theory could be thousands of throw away entries).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I change the function to GetHostAndGwAddressesFromCIDR
just return the gw and the vm ip address.
The function stop after find 4 ips
|
||
err = Handler.LinkSetMaster(b.podNicLink, bridge) | ||
if err != nil { | ||
log.Log.Reason(err).Errorf("faild to connect interface %s to bridge %s", b.podInterfaceName, bridge.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"failed"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
if len(ips) < 2 { | ||
return "", "", fmt.Errorf("need 2 ip address in the CIDR but find only %d", len(ips)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"2 ip addresses", "found"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this. the function now return gw and vm addresses or error.
tests/vmi_networking_test.go
Outdated
_, err = virtClient.VirtualMachineInstance(tests.NamespaceTestDefault).Create(clientVMI) | ||
Expect(err).ToNot(HaveOccurred()) | ||
|
||
By("waiting for the virtual machines to by ready") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by ready -> become ready?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tests/vmi_networking_test.go
Outdated
&expect.BSnd{S: "echo $?\n"}, | ||
&expect.BExp{R: "0"}, | ||
}, 30) | ||
Expect(err).ToNot(HaveOccurred()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please also add a negative check that ports not registered with the container are NOT accessible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done
api/openapi-spec/swagger.json
Outdated
@@ -4538,6 +4538,9 @@ | |||
"$ref": "#/definitions/v1.Port" | |||
} | |||
}, | |||
"proxy": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider to call this masquerade
or nat
.
To me at least, proxy is rather a process which does the relaying.
And I'd also like to keep the process bound proxying free, because I hope that we'll have some process bound proxying in future, as it's the only K8s compatible way to do things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would the App Operator care if the proxy is implemented by a process, or as suggested here, by the kernel? I would assume that if we find an efficient way to do what you hope, we can just flip the implementation.
Why is kernel-based less k8s compatible?
(I'm not going for a holy war about the name proxy
, just willing to understand your position)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reach out to @SchSeba to understand the differences between a kernel and process based proxy approach for kubernetes.
A user would usually not care, but if sees a spec, then I would like to leave the best hints to give a user transparency. That#s why I favor to have a more kernel-ish name for this implementation and proxy
for a more process-ish implementation.
0180c54
to
39162e7
Compare
Rebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. There is a comment from @fabiand about naming of the feature that I think we should address (answer or adopt in code) before merging.
39162e7
to
3561cec
Compare
There is an error for the masquerade test
I will try to run the tests again |
ci test please |
@fabiand Change the interface as requested can you please take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wfm - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wfm
3561cec
to
5ef0385
Compare
Multiple error on the tests
Run functional tests with focus /hold |
2241a27
to
cb4c06e
Compare
cb4c06e
to
993d76a
Compare
993d76a
to
373d77f
Compare
@SchSeba you need to rebase it. Also, is hold label valid still? |
4b205e8
to
c914d54
Compare
tests/vmi_networking_test.go
Outdated
@@ -594,6 +594,70 @@ var _ = Describe("Networking", func() { | |||
}) | |||
}) | |||
|
|||
FContext("VirtualMachineInstance with masquerade binding mechanism", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, you disabled all tests but this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I know just want to be sure this test pass on the CI before I run it with all the other tests.
c914d54
to
d9acc0f
Compare
@booxter I will wait for the CI to finish and then I will remove the hold |
The logs look good now. Now I think this PR is ready to be merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM but I added a few comments.
|
||
err = Handler.LinkSetUp(bridgeNic) | ||
if err != nil { | ||
log.Log.Reason(err).Errorf("failed to bring link up for interface: %s", bridgeNic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong type, bridgeNic
is not a string.
FilteredLogger.Errorf format %s has arg bridgeNic of wrong type *kubevirt.io/kubevirt/vendor/github.com/vishvananda/netlink.Dummy
should be changed to bridgeNic.Name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one!
tests/vmi_networking_test.go
Outdated
pingVirtualMachine(serverVMI, "8.8.8.8", "\\$ ") | ||
pingVirtualMachine(clientVMI, "google.com", "\\$ ") | ||
|
||
By("start a tcp server") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep the same style everywhere, as example By("start a tcp server")
should be changed to By("starting a tcp server")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
vmi.Spec.Domain.Devices.Interfaces = []v1.Interface{iface1} | ||
|
||
domain := vmiToDomain(vmi, c) | ||
Expect(domain).ToNot(Equal(nil)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be changed to Expect(domain).ToNot(BeNil())
tests/vmi_networking_test.go
Outdated
}, 30) | ||
Expect(err).ToNot(HaveOccurred()) | ||
|
||
By("Reject the connection from the client to un registered port") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be unregistered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
p.podNicLink = link | ||
|
||
// Get interface MTU | ||
p.vif.Mtu = uint16(p.podNicLink.Attrs().MTU) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me this conversion looks dangerous, the range of uint16
is 0 to 65535 and MTU is int
type. I would add additional checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to add any other check here. I get the podNicLink from the system. I tried to create an interface with a bad MTU ( < 0 and > 65535) and it fails on the system level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, but what about jumbograms
? Do we support it or not?
According to RFC doc
A "jumbogram" is an IPv6 packet containing a payload longer than
65,535 octets. This document describes the IPv6 Jumbo Payload
option, which provides the means of specifying such large payload
lengths. It also describes the changes needed to TCP and UDP to make
use of jumbograms.
Jumbograms are relevant only to IPv6 nodes that may be attached to
links with a link MTU greater than 65,575 octets, and need not be
implemented or understood by IPv6 nodes that do not support
attachment to links with such large MTUs.
So if some crazy guy will need to use it then this conversion will fail.
Or maybe there is some check somewhere else which prevents using such a high MTU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah forget about jumbogram
comment, virtio
does not support it. But I still think there should be some additional check or VIF.Mtu
field should be the same type. It's just my opinion after all, it's up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be hard to chnage the variable to int because of the dhcp option
mtuArray := make([]byte, 2)
binary.BigEndian.PutUint16(mtuArray, mtu)
dhcpOptions := dhcp.Options{
dhcp.OptionSubnetMask: []byte(clientMask),
dhcp.OptionRouter: []byte(routerIP),
dhcp.OptionDomainNameServer: bytes.Join(dnsIPs, nil),
dhcp.OptionInterfaceMTU: mtuArray,
}
I can add if statement to check this < 0 and > 65535
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to guard the code and not rely on virtio limitations. A if statement is the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said... Isn't it the same bug that is perhaps present in other places of code? We may want to fix them too. I would be fine if e.g. this part is moved into a separate PR, as long as we are assured this happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this bug in at least one other place, for BridgePodInterface. (Not a surprise considering that a lot of this code is a copy-pasted and then adjusted code for bridge implementation.) @SchSeba please tell us what you prefer: fixing the issue here or sending a follow up PR. I am fine with both approaches. (I am not fine with leaving it not handled for long though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@booxter I just add this to the code for the bridge and the masquerade bindings.
Also added a unit test for that please see my last commit
Handler.StartDHCP(p.vif, p.gatewayAddr, p.bridgeInterfaceName, p.iface.DHCPOptions) | ||
} | ||
|
||
func (p *MasqueradePodInterface) decorateConfig() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point fo returning an error here when you always return nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is part of an interface. if I change it I will need to check all the other bindings
tests/vmi_networking_test.go
Outdated
It("should allow regular network connection", func() { | ||
By("creating two virtual machines") | ||
ports := []v1.Port{{Name: "http", Port: 8080}} | ||
serverVMI = tests.NewRandomVMIWithMasqueradeInterfaceEphemeralDiskAndUserdata(tests.ContainerDiskFor(tests.ContainerDiskCirros), "#!/bin/bash\necho 'hello'\n", ports) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's quite a long func name NewRandomVMIWithMasqueradeInterfaceEphemeralDiskAndUserdata
:), may I suggest to rename it and move it to local Context
scope? You're using this function only here so I have doubts if we need it in utils
. My opinion is, if someone will need to use it somewhere else than he/she should moved it then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Changes: * added new interface type (proxy) * added coreos/iptables package into glide * added unit tests * added functinal tests * updated some packages
d9acc0f
to
b514edf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfranczy Thanks for the review I updated the last commit with the requested changes can you please make another review?
p.podNicLink = link | ||
|
||
// Get interface MTU | ||
p.vif.Mtu = uint16(p.podNicLink.Attrs().MTU) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to add any other check here. I get the podNicLink from the system. I tried to create an interface with a bad MTU ( < 0 and > 65535) and it fails on the system level.
|
||
err = Handler.LinkSetUp(bridgeNic) | ||
if err != nil { | ||
log.Log.Reason(err).Errorf("failed to bring link up for interface: %s", bridgeNic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one!
Handler.StartDHCP(p.vif, p.gatewayAddr, p.bridgeInterfaceName, p.iface.DHCPOptions) | ||
} | ||
|
||
func (p *MasqueradePodInterface) decorateConfig() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is part of an interface. if I change it I will need to check all the other bindings
tests/vmi_networking_test.go
Outdated
It("should allow regular network connection", func() { | ||
By("creating two virtual machines") | ||
ports := []v1.Port{{Name: "http", Port: 8080}} | ||
serverVMI = tests.NewRandomVMIWithMasqueradeInterfaceEphemeralDiskAndUserdata(tests.ContainerDiskFor(tests.ContainerDiskCirros), "#!/bin/bash\necho 'hello'\n", ports) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tests/vmi_networking_test.go
Outdated
}, 30) | ||
Expect(err).ToNot(HaveOccurred()) | ||
|
||
By("Reject the connection from the client to un registered port") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tests/vmi_networking_test.go
Outdated
pingVirtualMachine(serverVMI, "8.8.8.8", "\\$ ") | ||
pingVirtualMachine(clientVMI, "google.com", "\\$ ") | ||
|
||
By("start a tcp server") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -193,6 +193,10 @@ func (b *BridgePodInterface) discoverPodNetworkInterface() error { | |||
b.vif.MAC = mac | |||
} | |||
|
|||
if b.podNicLink.Attrs().MTU < 0 || b.podNicLink.Attrs().MTU > 65535 { | |||
return fmt.Errorf("MTU value out of range ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this trailing space... :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, you could include the failed value too.
@SchSeba please add user guide docs |
Changes:
What this PR does / why we need it:
Add a new binding mechanism call proxy.
This implementation insert iptable rules on the virt-launcher pod.
MASQUERADE all the outgoing traffic from the vm with the ip of the pod.
NAT (SNAT and DNAT) all the ports configured on the vm yaml.
NAT all the localhost traffic from the configured ports to the vm.
Release note:
This change is