
document how to run openvswitch over ipsec tunnels between nodes #215

Closed
ibotty opened this issue Nov 26, 2015 · 21 comments

Comments

@ibotty
Contributor

ibotty commented Nov 26, 2015

It would be great if that could be documented. Is it as straightforward as it is for flannel (vxlan)? https://github.com/ibotty/blog-posts/blob/master/2015/08/01/Securing%20a%20flannel%20network.md

Or is it more involved?

@dcbw
Contributor

dcbw commented Dec 3, 2015

The best practice recommendation from the security team is to run host-level IPSec between the nodes and the master of the cluster, using the OpenShift-generated certificates for AAA with the IKE daemon (e.g., libreswan). Flannel doesn't really need to get involved, since its traffic would simply run on top of the IPSec-secured routes between each node/master.

But yes, this should certainly be documented. We've also got a card/task for figuring out how this could be better integrated into OpenShift installs so that it's easier to set up.

@ibotty
Contributor Author

ibotty commented Dec 3, 2015 via email

@dcbw
Contributor

dcbw commented Dec 3, 2015

When using openshift-sdn, flannel is not involved because openshift-sdn (and its subnet and multitenant plugins) does all pod addressing internally. If you are using flannel, you probably want to re-file this bug against the "openshift/origin" GitHub component. openshift-sdn is only relevant if you have given your openshift processes the '--network-plugin=redhat/openshift-ovs-XXXX' argument.

But in any case, the current plan for IPSec with openshift-sdn's subnet/multitenant plugins is to use host-based encryption so that both the data plane and the control plane between nodes and the master are encrypted. There are two IP addresses/subnets that are relevant here: (a) the node's IP address and (b) the pod network subnet.

Using host-based IPSec would secure all communication between the master and nodes by setting IPSec up for (a), including data and control planes. Any traffic that the OVS vswitch sends via the VXLAN tunnels would automatically be encrypted because the routes between all the nodes are encrypted, regardless of what traffic runs on top (VXLAN or anything else).

It's not really necessary to secure (b), because at least with the multitenant plugin, pods that are not part of the same OpenShift project are prevented from seeing each other's traffic anyway.

@ibotty
Contributor Author

ibotty commented Dec 3, 2015

No, everything is all right with flannel. I was asking about openshift-sdn.

Do I understand you correctly that transport-mode ipsec between the hosts will automagically encrypt traffic between the private vxlan subnets? That's great to know and something I will test soon(ish).

@dcbw
Contributor

dcbw commented Dec 3, 2015

I think there's a misunderstanding about subnets here. What I'm proposing has nothing to do with OpenShift, containers, or VXLAN at all; it's encryption between all the hosts in the system. So even 'ping master' and 'ping node3' would be encrypted. Any other intra-cluster traffic (which happens to include any VXLAN traffic between nodes) would be encrypted as a side effect of all cluster traffic being encrypted.

Also, to be clear, with openshift-sdn there are no "vxlan subnets"; I think what you mean is the "node container subnet". The VXLAN interface simply provides a tunnel between the container subnets on the different nodes.
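One quick way to verify this behaviour (a sketch; the interface name and peer IP below are placeholders) is to watch the wire between two nodes while pods on them talk to each other: with working transport-mode IPSec you should only see ESP (IP protocol 50), never cleartext VXLAN on UDP 4789.

# on one node; eth0 and 192.0.2.11 stand in for the uplink interface and a peer node's IP
tcpdump -ni eth0 'host 192.0.2.11 and (ip proto 50 or udp port 4789)'
# expected: only ESP packets; any cleartext UDP/4789 packets mean the VXLAN traffic is not being encrypted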

@ibotty
Contributor Author

ibotty commented Jan 12, 2016

We did not have that kind of misunderstanding; I was just confused about whether I needed a net-to-net ipsec configuration.

I am now running it, and it works exactly as you described, with transport-mode ipsec between the hosts.

There remains a bigger problem though. openshift-sdn (plugins/osdn/common.go) pokes holes in my firewall:

-A INPUT -p udp -m multiport --dports 4789 -m comment --comment "001 vxlan incoming" -j ACCEPT

I would prefer to be able to specify the chain where it inserts these rules, so that I can point it at a chain that only receives ipsec traffic. Should I file a separate issue?
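(For reference, a sketch of what such a restriction could look like with plain iptables, assuming the policy match module is available; this is not something openshift-sdn does today:)

# accept VXLAN only when the packet arrived inside an IPsec transport-mode SA, drop cleartext VXLAN
-A INPUT -p udp -m multiport --dports 4789 -m policy --dir in --pol ipsec --mode transport -j ACCEPT
-A INPUT -p udp -m multiport --dports 4789 -j DROP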

@danwinship
Contributor

I would prefer to be able to specify the chain where it inserts these rules, so that I can point it at a chain that only receives ipsec traffic. Should I file a separate issue?

Yes

@AlbertoPeon

We are also trying to set up IPSec for communication between nodes, but we haven't been able to make it work completely. In our deployment we are using racoon and ipsec-tools.

It would be great if @ibotty could give us some instructions on what to do, or if Red Hat could release some documentation on how to secure the OpenShift traffic with IPSec.

@ibotty
Contributor Author

ibotty commented Feb 10, 2016

I can provide a complete example with libreswan tomorrow, but it's standard transport mode between each pair of hosts. So for n hosts there are n-1 transport connections on each host. Do you need anything else?

@AlbertoPeon

That would be great, @ibotty :)

That is exactly our use case: encrypting all the N-to-N communication between nodes/masters.

@ibotty
Contributor Author

ibotty commented Feb 11, 2016

Here is my Ansible template. It uses explicit RSA keys (not a CA), but that's not important. It is pretty much the stock transport-mode ipsec configuration. There might also have been some firewall rules I had to add, but I'm not sure.

Do you have a working ipsec connection in transport mode between the hosts (can you ping the other hosts via ipsec)? If so, and vxlan traffic is (still) not working, we can compare firewall rules.

# {{ ansible_managed }}

{% for host in groups[ipsec_ansible_group] 
   if  hostvars[host].ipsec is defined
   and hostvars[host].ipsec.rsasigkey is defined 
   and host != inventory_hostname %}
{% set hvars = hostvars[host] %}

conn {{ ansible_fqdn }}-to-{{ hvars.ansible_fqdn }}
  type=transport
  leftid=@{{ ansible_fqdn }}
  left={{ ansible_default_ipv4.address }}
  leftrsasigkey={{ ipsec.rsasigkey }}
  rightid=@{{ hvars.ansible_fqdn }}
  right={{ hvars.ansible_default_ipv4.address }}
  rightrsasigkey={{ hvars.ipsec.rsasigkey }}
  authby=rsasig
  # load and initiate automatically
  auto=start

{% endfor %}
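For completeness: the rsasigkey values referenced above can be generated on each host with libreswan's own tooling, roughly like this (a sketch; exact flags vary between libreswan versions):

# one-time setup on each host
ipsec initnss              # create the NSS database if it does not exist yet
ipsec newhostkey           # generate the host RSA key
ipsec showhostkey --left   # prints a "leftrsasigkey=..." line; the value after '=' goes into that host's ipsec.rsasigkey variable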

@ibotty
Contributor Author

ibotty commented Feb 15, 2016

When upgrading to the new CentOS Atomic release, I ran into the "surprising behavior" that bigger requests timed out/stalled. I figured it was because of the MTU, and it was. So I had to lower the MTU from 1450 to 1400. Maybe that is expected, but it should be documented!
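For anyone else hitting this: the MTU is set in the node config and picked up after a node service restart (origin-node or atomic-openshift-node, depending on the install). A sketch, assuming an Origin node-config.yaml with a networkConfig stanza; the path and field names may differ slightly between releases:

# /etc/origin/node/node-config.yaml (path differs between installs)
networkConfig:
  mtu: 1400                                             # default is 1450; leave room for the extra ESP overhead
  networkPluginName: redhat/openshift-ovs-multitenant   # or redhat/openshift-ovs-subnet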

@AlbertoPeon

Hi @ibotty,

I'm back to this :)

In the end I managed to configure libreswan to encrypt all traffic between the nodes.
Everything seems to be working except for connections to the internal cluster IPs, which time out.

For example, pulling and pushing images from/to the internal registry hangs from the other nodes. However, querying the registry's internal hostsubnet IP (10.64.X.X) directly works as expected. Do you have any idea what could be happening?

Thanks in advance!

@ibotty
Contributor Author

ibotty commented Mar 30, 2016

Have a look at the route table (ip r). There was a version of openshift-sdn that did not add the required route.

@AlbertoPeon

Yes, the route is there. I've also tried to lower the MTU to 1400 as suggested in the previous post but no luck.

@ibotty
Contributor Author

ibotty commented Mar 30, 2016

I don't have much time right now (or ever), but can you paste the output of the following commands, so I can have a look at what's wrong?

> ip xfrm state
> ip a
> ip r
> iptables-save

You can edit out most of the openshift/kubernetes output from the iptables output; the KUBE-... chains in particular are not interesting.
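(For example, something like this would trim most of it; just a convenience sketch:)

iptables-save | grep -v KUBE-   # drop the long, uninteresting kube-proxy chains from the paste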

@AlbertoPeon

Hi again @ibotty,

Actually, we managed to configure the MTU properly on the nodes, and we can now connect and encrypt the traffic! Thank you very much for your help!

@ibotty
Contributor Author

ibotty commented Mar 30, 2016

Great! Was there anything else to do other than setting it in the origin node's config file?

@AlbertoPeon

No, in the end it was as simple as that! Thanks again for the help!

@dcbw
Contributor

dcbw commented Mar 31, 2016

@AlbertoPeon any chance you could share the details of how you configured it so that we could update the documentation with some examples?

@AlbertoPeon

Yes, of course! Although it's mainly what @ibotty described in his Ansible template.

The only difference is that an MTU of 1400 wasn't working for us; we had to lower it even further, to 1390. Apparently the overhead of ipsec is a little more than 50 bytes.
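As a rough budget (approximate numbers; the exact ESP overhead depends on the cipher, integrity algorithm and padding):

  1500  physical Ethernet MTU
 - ~60  ESP transport-mode overhead (ESP header, IV, padding, trailer, ICV)
 -  50  VXLAN encapsulation (outer Ethernet + IP + UDP + VXLAN headers)
 -----
 ~1390  MTU left for pod traffic inside the tunnel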
