[WIP, READY FOR REVIEW] Integration of policies with services and the Internet access#609
Merged
brecode merged 44 commits into contiv:master on Mar 1, 2018
Conversation
added 30 commits on February 13, 2018 09:32
Pods should be able to access Kubernetes services (e.g. DNS) even if they are isolated from the kube-system namespace by the installed K8s network policies. The opposite direction, however, is not exempted: a policy may disallow a kube-system pod from contacting a pod in another namespace.
This commit implements source NATing for all traffic leaving the cluster network, which in effect opens up Internet access for all pods. The SNAT was included in the Service plugin in order to keep all the NAT-related configuration in one place. The solution is to add the IP address of the default-GW interface into the pool of VPP/NAT44 addresses and to enable post-routing on that interface. Traffic going between cluster nodes must not be NATed, otherwise the ACLs of the destination node would no longer match against pod IPs, but rather against node IPs, which breaks the semantics. External traffic can be separated from internal traffic only with the assistance of VXLANs, therefore SNAT is not supported and gets disabled in the L2-only mode.
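A minimal Go sketch of this decision logic, using hypothetical types (NAT44Config and EnableSNAT are illustrative placeholders, not the actual Contiv/VPP or vpp-agent API):

```go
package main

import (
	"fmt"
	"net"
)

// NAT44Config is a hypothetical, simplified model of the VPP/NAT44
// configuration managed by the Service plugin.
type NAT44Config struct {
	AddressPool    []net.IP // addresses used for source NATing
	PostRoutingIfs []string // interfaces with post-routing SNAT enabled
}

// EnableSNAT illustrates the approach described above: the IP address of
// the default-GW interface is added to the NAT44 address pool and
// post-routing is enabled on that interface. SNAT is skipped entirely in
// the L2-only mode, where inter-node traffic cannot be distinguished from
// traffic heading to the Internet.
func EnableSNAT(cfg *NAT44Config, gwIfName string, gwIfIP net.IP, vxlanEnabled bool) error {
	if !vxlanEnabled {
		// Without VXLANs, inter-node traffic would be NATed as well,
		// breaking policy matching on the destination node.
		return fmt.Errorf("SNAT is not supported in the L2-only mode")
	}
	cfg.AddressPool = append(cfg.AddressPool, gwIfIP)
	cfg.PostRoutingIfs = append(cfg.PostRoutingIfs, gwIfName)
	return nil
}

func main() {
	cfg := &NAT44Config{}
	if err := EnableSNAT(cfg, "GigabitEthernet0/8/0", net.ParseIP("192.168.1.10"), true); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("NAT44 pool: %v, post-routing on: %v\n", cfg.AddressPool, cfg.PostRoutingIfs)
}
```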
RendererCache combines the capabilities of the VPPTCP and ACL caches
under a unified interface.
The rules are grouped into tables (ContivRuleTable type) and the
configuration is represented as a list of local tables, applied
on the ingress or the egress side of pods, and a single global table,
applied on the interfaces connecting the node with the rest
of the cluster.
The list of local tables is minimalistic in the sense that pods with
the same set of rules will share the same local table. Whether shared
tables are installed in one instance or as separate copies for each
associated pod is up to the renderer (usually determined by
the capabilities of the destination network stack).
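One way such sharing can be realized is to key local tables by a deterministic ID computed from the ordered rule set, so that pods whose policies render to identical rules map to the same table. A hedged sketch (ruleSetID is a hypothetical helper, not the cache's actual implementation):

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// ruleSetID derives a deterministic identifier from an ordered list of
// rendered rules (represented simply as strings here). Pods whose policies
// render to the same ordered rule set obtain the same ID and can therefore
// share one local table.
func ruleSetID(rules []string) string {
	h := sha1.New()
	for _, r := range rules {
		h.Write([]byte(r))
		h.Write([]byte{0}) // separator to avoid ambiguous concatenation
	}
	return fmt.Sprintf("%x", h.Sum(nil))[:10]
}

func main() {
	web := []string{"permit tcp any -> 10.1.1.0/24:80", "deny any"}
	db := []string{"permit tcp any -> 10.1.1.0/24:80", "deny any"}
	fmt.Println(ruleSetID(web) == ruleSetID(db)) // true: one shared table
}
```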
All tables match only one side of the traffic - either ingress or
egress, depending on the cache orientation as selected in the Init method.
The cache combines the received ingress and egress Contiv rules
into the single chosen direction in a way that maintains the original
semantic (the global table is introduced to accomplish the task).
The rules are ordered in tables such that if rule *r1* matches a subset
of the traffic matched by *r2*, then r1 precedes r2 in the list.
This is the order in which the rules should be applied by the rule
matching algorithm in the destination network stack (otherwise the
more specific rules could be overshadowed and never matched).
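A minimal Go sketch of one ordering that satisfies this requirement for nested CIDR-based rules - sorting by combined prefix length, so that more specific rules come first (ContivRule is trimmed down here; the real type also carries the action, protocol and ports):

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// ContivRule is a hypothetical, trimmed-down rule.
type ContivRule struct {
	Name    string
	SrcNet  *net.IPNet
	DestNet *net.IPNet
}

// prefixLen sums the prefix lengths of both networks; for nested CIDRs,
// a rule matching a subset of another rule's traffic always has the
// greater (or equal) sum.
func prefixLen(r *ContivRule) int {
	s, _ := r.SrcNet.Mask.Size()
	d, _ := r.DestNet.Mask.Size()
	return s + d
}

// orderRules sorts rules so that more specific rules (longer prefixes)
// come first, preventing them from being overshadowed by a first-match
// algorithm in the destination network stack.
func orderRules(rules []*ContivRule) {
	sort.SliceStable(rules, func(i, j int) bool {
		return prefixLen(rules[i]) > prefixLen(rules[j])
	})
}

func main() {
	_, podNet, _ := net.ParseCIDR("10.1.1.3/32")
	_, anyNet, _ := net.ParseCIDR("0.0.0.0/0")
	rules := []*ContivRule{
		{Name: "deny-all", SrcNet: anyNet, DestNet: anyNet},
		{Name: "allow-pod", SrcNet: podNet, DestNet: anyNet},
	}
	orderRules(rules)
	for _, r := range rules {
		fmt.Println(r.Name) // allow-pod first, deny-all last
	}
}
```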
Two types of tables are distinguished:
1. Local table: should be applied to match against traffic leaving
(IngressOrientation) or entering (EgressOrientation)
a selected subset of pods.
Every pod has at most one local table installed at
any given time. For a given local table, the set
of rules is immutable. Different content is treated
as a new local table (and the original table may
get unassigned from some or all originally
associated pods).
A local table always has at least one rule; otherwise
it is simply not tracked or returned by the cache.
2. Global table: should be applied to match against traffic entering
(IngressOrientation) or leaving (EgressOrientation)
the node. There is always exactly one global table
installed (per node).
The global table may contain an empty set of rules
(meaning ALLOW-ALL). A simplified sketch of both table
types follows below.
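In Go terms, the two table types could be sketched roughly as follows (a hypothetical, simplified rendition; the real ContivRuleTable carries more state):

```go
package renderer

// ContivRule is a stub standing in for the real rule type (action,
// IP networks, protocol and port ranges are omitted here).
type ContivRule struct{}

// TableType distinguishes the two kinds of tables tracked by the cache.
type TableType int

const (
	// Local tables match traffic leaving (IngressOrientation) or
	// entering (EgressOrientation) a selected subset of pods.
	Local TableType = iota
	// Global tables match traffic entering (IngressOrientation) or
	// leaving (EgressOrientation) the node; exactly one is installed
	// per node.
	Global
)

// ContivRuleTable groups an ordered, immutable set of rules.
type ContivRuleTable struct {
	Type  TableType
	Rules []*ContivRule       // never empty for local tables; an empty
	                          // global table means ALLOW-ALL
	Pods  map[string]struct{} // pods sharing this local table
}
```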
The Update() method is still to-be-done.
In Resync we are not able to *easily* fully reconstruct the policy configuration, most notably the IP addresses of pods. For pods that no longer exist after the resync, the IP address should not be needed anyway, therefore it can be nil.
added 13 commits on February 23, 2018 17:26
…nstalled." This reverts commit 2049644. Based on https://github.com/ahmetb/kubernetes-network-policy-recipes/blob/master/11-deny-egress-traffic-from-an-application.md it is clear that policies should apply to the kube-system namespace just like to any other namespace.
Member
LGTM
brecode approved these changes on Mar 1, 2018
WORK IN PROGRESS: PLEASE DO NOT MERGE YET
TO-BE-DONE:
This pull request primarily includes a refactor of the policy rendering code, which was necessary to adapt to the limitations of the VPP/NAT plugin. For policies we always need to evaluate rules against the original local IP addresses, not the NATed addresses of services or of the node itself. Inbound ACLs, however, see traffic only after the NAT translation, which makes them unusable for this purpose. Previously we were using both directions, but now we combine ingress with egress and install all rules into outbound ACLs. Furthermore, to apply access control to the inter-node and pod-to-Internet traffic, we need to reflect the ingress policies into a "global" ACL, installed on the node's output interfaces, also on the outbound side.
A detailed algorithm description + diagrams depicting the order of VPP nodes will be part of the documentation.
Similar restrictions are also present in the VPPTCP stack - each pod has only a single "local" table of rules assigned (evaluated in the ingress direction), and the stack additionally provides a single "global" table, evaluated in the ingress direction for traffic entering the node.
The equivalent limitations of the VPPTCP stack and of ACL+NAT (just a different orientation of the tables) have allowed us to unify the cache and the rendering algorithm to a large degree between the two renderers. That is the second contribution of this pull request.
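To illustrate, the shared cache could expose the orientation choice roughly like this (a hedged sketch; RendererCache here is a trimmed-down stand-in for the actual type, though Init, IngressOrientation and EgressOrientation are named in the commit messages above):

```go
package renderer

// Orientation selects the single direction into which the cache combines
// the received ingress and egress Contiv rules.
type Orientation int

const (
	// IngressOrientation: local tables match traffic leaving pods,
	// the global table matches traffic entering the node (VPPTCP).
	IngressOrientation Orientation = iota
	// EgressOrientation: local tables match traffic entering pods,
	// the global table matches traffic leaving the node (ACL+NAT).
	EgressOrientation
)

// RendererCache is a hypothetical, trimmed-down view of the unified cache
// shared by both renderers; only the orientation-related part is shown.
type RendererCache struct {
	orientation Orientation
}

// Init selects the orientation; from this point on all tables produced
// by the cache match only the chosen side of the traffic.
func (rc *RendererCache) Init(orientation Orientation) {
	rc.orientation = orientation
}
```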
The third contribution is in the service plugin: the plugin now also installs SNAT configuration which allows Internet access from pods. The SNAT is configured on the physical interface which acts as the default GW (host-VPP interconnect is not supported for Internet access). The implementation is DHCP-aware.
The only issue is that we cannot SNAT inter-node traffic, otherwise policies on the destination node would be evaluated against the NATed address of the source node and not against the source pod. The solution is to split inter-node traffic from pod-to-Internet traffic. This is possible with VXLANs (inter-node traffic is encapsulated, whereas pod-to-Internet traffic is not), or by having an additional physical interface which acts as the default GW. With VXLANs disabled and only one physical interface available, SNAT therefore gets disabled (and needs to be performed by an external NAT device). This is a limitation for which we don't have a workaround at the moment.
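A conceptual Go sketch of the decision (purely illustrative - in VPP the separation is actually achieved by the VXLAN encapsulation itself, not by a per-packet lookup like this):

```go
package main

import (
	"fmt"
	"net"
)

// shouldSNAT illustrates the split described above: traffic destined for
// another node or pod of the cluster must keep its original source address
// (so that the destination node's policies still match pod IPs), while
// everything else is source-NATed on its way out. clusterNets stands for
// the pod and node subnets of the cluster.
func shouldSNAT(dst net.IP, clusterNets []*net.IPNet, vxlanEnabled bool) bool {
	if !vxlanEnabled {
		// L2-only mode: internal and external traffic cannot be told
		// apart on the output interface, hence SNAT stays disabled.
		return false
	}
	for _, n := range clusterNets {
		if n.Contains(dst) {
			return false // inter-node / pod-to-pod: leave untouched
		}
	}
	return true // pod-to-Internet: SNAT
}

func main() {
	_, podCIDR, _ := net.ParseCIDR("10.1.0.0/16")
	_, nodeCIDR, _ := net.ParseCIDR("192.168.16.0/24")
	nets := []*net.IPNet{podCIDR, nodeCIDR}
	fmt.Println(shouldSNAT(net.ParseIP("10.1.2.3"), nets, true)) // false
	fmt.Println(shouldSNAT(net.ParseIP("8.8.8.8"), nets, true))  // true
}
```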
Both policies and services (+SNAT) have resync fully implemented, i.e. restart scenarios are supported.