This repository has been archived by the owner on Jun 20, 2024. It is now read-only.
Merge pull request #2585 from weaveworks/merge-weave-npc
Merge weave npc
Showing 24 changed files with 1,429 additions and 9 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
# Overview | ||
|
||
# ipsets | ||
|
||
The policy controller maintains a number of ipsets which are | ||
subsequently referred to by the iptables rules used to effect network | ||
policy specifications. These ipsets are created, modified and | ||
destroyed automatically in response to Pod, Namespace and | ||
NetworkPolicy object updates from the k8s API server: | ||
|
||
* A `hash:ip` set per namespace, containing the IP addresses of all | ||
pods in that namespace | ||
* A `list:set` per distinct (across all network policies in all | ||
namespaces) namespace selector mentioned in a network policy, | ||
containing the names of any of the above hash:ip sets whose | ||
corresponding namespace labels match the selector | ||
* A `hash:ip` set for each distinct (within the scope of the | ||
containing network policy's namespace) pod selector mentioned in a | ||
network policy, containing the IP addresses of all pods in the | ||
namespace whose labels match that selector | ||
|
||
ipset names are generated deterministically from a string | ||
representation of the corresponding label selector. Because ipset | ||
names are limited to 31 characters in length, this is done by taking a | ||
SHA hash of the selector string and then printing that out as a base | ||
85 string with a "weave-" prefix e.g.: | ||
|
||
weave-k?Z;25^M}|1s7P3|H9i;*;MhG | ||
|
||
Because pod selectors are scoped to a namespace, the same selector | ||
definition used in different namespaces must map to distinct ipsets. | ||
Consequently, for such selectors the namespace name is prepended to | ||
the label selector string before hashing to avoid clashes. | ||
|
||
# iptables chains | ||
|
||
The policy controller maintains two iptables chains in response to | ||
changes to pods, namespaces and network policies. One chain contains | ||
the ingress rules that implement the network policy specifications, | ||
and the other is used to bypass the ingress rules for namespaces which | ||
have an ingress isolation policy of `DefaultAllow`. | ||
|
||
## Dynamically maintained `WEAVE-NPC-DEFAULT` chain | ||
|
||
The policy controller maintains a rule in this chain for every | ||
namespace whose ingress isolation policy is `DefaultAllow`. The | ||
purpose of this rule is simply to ACCEPT any traffic destined for such | ||
namespaces before it reaches the ingress chain. | ||
|
||
``` | ||
iptables -A WEAVE-NPC-DEFAULT -m set --match-set $NSIPSET dst -j ACCEPT | ||
``` | ||
|
||
## Dynamically maintained `WEAVE-NPC-INGRESS` chain | ||
|
||
For each namespace network policy ingress rule peer/port combination: | ||
|
||
``` | ||
iptables -A WEAVE-NPC-INGRESS -p $PROTO [-m set --match-set $SRCSET src] -m set --match-set $DSTSET dst --dport $DPORT -j ACCEPT | ||
``` | ||
|
||
## Static `WEAVE-NPC` chain | ||
|
||
Static configuration: | ||
|
||
``` | ||
iptables -A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT | ||
iptables -A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT | ||
iptables -A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS | ||
``` | ||
|
||
# Steering traffic into the policy engine | ||
|
||
To direct traffic into the policy engine: | ||
|
||
iptables -A FORWARD -o weave -m physdev ! --physdev-out vethwe-bridge -j WEAVE-NPC | ||
iptables -A FORWARD -o weave -m physdev ! --physdev-out vethwe-bridge -j DROP | ||
|
||
Note this only affects traffic which egresses the bridge on a physical | ||
port which is not the Weave Net router - in other words, it is | ||
destined for an application container veth. The following traffic is | ||
affected: | ||
|
||
* Traffic bridged between local application containers | ||
* Traffic bridged from the router to a local application container | ||
* Traffic originating from the internet destined for nodeports - this | ||
is routed via the FORWARD chain to a container pod IP after DNAT | ||
|
||
The following traffic is NOT affected: | ||
|
||
* Traffic bridged from a local application container to the router | ||
* Traffic originating from processes in the host network namespace | ||
(e.g. kubelet health checks) | ||
* Traffic routed from an application container to the internet | ||
|
||
See these resources for helpful context: | ||
|
||
* http://ebtables.netfilter.org/br_fw_ia/br_fw_ia.html | ||
* https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg |
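The ipset naming scheme described in the design document above can be illustrated with a minimal sketch. Several details here are assumptions rather than facts from the commit: SHA-1 is assumed as the hash (a 160-bit digest fits in exactly 25 base-85 digits, which with the `weave-` prefix gives 31 characters), the RFC 1924 symbol set is assumed as the base-85 alphabet, and the `:` separator between namespace and selector is invented for illustration. `shortName`, `podSelectorName` and `nsSelectorName` are hypothetical names.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"math/big"
)

// alphabet is an assumed 85-symbol set (the RFC 1924 one). The real
// controller may use different symbols or a different ordering.
const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~"

// shortName hashes an arbitrary string with SHA-1 (an assumption) and
// renders the digest as 25 fixed-width base-85 digits, so the prefix
// plus the digits is exactly 31 characters long.
func shortName(s string) string {
	sum := sha1.Sum([]byte(s))
	n := new(big.Int).SetBytes(sum[:])
	base := big.NewInt(int64(len(alphabet)))
	rem := new(big.Int)
	digits := make([]byte, 25)
	for i := len(digits) - 1; i >= 0; i-- {
		n.DivMod(n, base, rem) // n becomes the quotient, rem the next digit
		digits[i] = alphabet[rem.Int64()]
	}
	return "weave-" + string(digits)
}

// podSelectorName prepends the namespace before hashing so that the same
// selector used in two namespaces yields two distinct set names.
// The ":" separator is an assumption made for this sketch.
func podSelectorName(namespace, selector string) string {
	return shortName(namespace + ":" + selector)
}

// nsSelectorName hashes the selector alone, because namespace selectors
// are distinct across all namespaces.
func nsSelectorName(selector string) string {
	return shortName(selector)
}

func main() {
	fmt.Println(podSelectorName("default", "role=db"))
	fmt.Println(podSelectorName("prod", "role=db"))
	fmt.Println(nsSelectorName("env=staging"))
}
```

Names produced this way are deterministic, so the controller could recompute a set name from a selector at any time instead of storing a mapping.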
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
package npc | ||
|
||
import ( | ||
"fmt" | ||
|
||
"k8s.io/client-go/pkg/api" | ||
extnapi "k8s.io/client-go/pkg/apis/extensions/v1beta1" | ||
"k8s.io/client-go/pkg/util/intstr" | ||
|
||
"github.com/weaveworks/weave/npc/ipset" | ||
) | ||
|
||
func (ns *ns) analysePolicy(policy *extnapi.NetworkPolicy) ( | ||
rules map[string]*ruleSpec, | ||
nsSelectors, podSelectors map[string]*selectorSpec, | ||
err error) { | ||
|
||
nsSelectors = make(map[string]*selectorSpec) | ||
podSelectors = make(map[string]*selectorSpec) | ||
rules = make(map[string]*ruleSpec) | ||
|
||
dstSelector, err := newSelectorSpec(&policy.Spec.PodSelector, ns.name, ipset.HashIP) | ||
if err != nil { | ||
return nil, nil, nil, err | ||
} | ||
podSelectors[dstSelector.key] = dstSelector | ||
|
||
for _, ingressRule := range policy.Spec.Ingress { | ||
if ingressRule.Ports != nil && len(ingressRule.Ports) == 0 { | ||
// Ports is empty, this rule matches no ports (no traffic matches). | ||
continue | ||
} | ||
|
||
if ingressRule.From != nil && len(ingressRule.From) == 0 { | ||
// From is empty, this rule matches no sources (no traffic matches). | ||
continue | ||
} | ||
|
||
if ingressRule.From == nil { | ||
// From is not provided, this rule matches all sources (traffic not restricted by source). | ||
if ingressRule.Ports == nil { | ||
// Ports is not provided, this rule matches all ports (traffic not restricted by port). | ||
rule := newRuleSpec(nil, nil, dstSelector, nil) | ||
rules[rule.key] = rule | ||
} else { | ||
// Ports is present and contains at least one item, then this rule allows traffic | ||
// only if the traffic matches at least one port in the ports list. | ||
withNormalisedProtoAndPort(ingressRule.Ports, func(proto, port string) { | ||
rule := newRuleSpec(&proto, nil, dstSelector, &port) | ||
rules[rule.key] = rule | ||
}) | ||
} | ||
} else { | ||
// From is present and contains at least one item, this rule allows traffic only if the | ||
// traffic matches at least one item in the from list. | ||
for _, peer := range ingressRule.From { | ||
var srcSelector *selectorSpec | ||
if peer.PodSelector != nil { | ||
srcSelector, err = newSelectorSpec(peer.PodSelector, ns.name, ipset.HashIP) | ||
if err != nil { | ||
return nil, nil, nil, err | ||
} | ||
podSelectors[srcSelector.key] = srcSelector | ||
} | ||
if peer.NamespaceSelector != nil { | ||
srcSelector, err = newSelectorSpec(peer.NamespaceSelector, "", ipset.ListSet) | ||
if err != nil { | ||
return nil, nil, nil, err | ||
} | ||
nsSelectors[srcSelector.key] = srcSelector | ||
} | ||
|
||
if ingressRule.Ports == nil { | ||
// Ports is not provided, this rule matches all ports (traffic not restricted by port). | ||
rule := newRuleSpec(nil, srcSelector, dstSelector, nil) | ||
rules[rule.key] = rule | ||
} else { | ||
// Ports is present and contains at least one item, then this rule allows traffic | ||
// only if the traffic matches at least one port in the ports list. | ||
withNormalisedProtoAndPort(ingressRule.Ports, func(proto, port string) { | ||
rule := newRuleSpec(&proto, srcSelector, dstSelector, &port) | ||
rules[rule.key] = rule | ||
}) | ||
} | ||
} | ||
} | ||
} | ||
|
||
return rules, nsSelectors, podSelectors, nil | ||
} | ||
|
||
func withNormalisedProtoAndPort(npps []extnapi.NetworkPolicyPort, f func(proto, port string)) { | ||
for _, npp := range npps { | ||
// If no proto is specified, default to TCP | ||
proto := string(api.ProtocolTCP) | ||
if npp.Protocol != nil { | ||
proto = string(*npp.Protocol) | ||
} | ||
|
||
// If no port is specified, match any port. Let iptables executable handle | ||
// service name resolution | ||
port := "0:65535" | ||
if npp.Port != nil { | ||
switch npp.Port.Type { | ||
case intstr.Int: | ||
port = fmt.Sprintf("%d", npp.Port.IntVal) | ||
case intstr.String: | ||
port = npp.Port.StrVal | ||
} | ||
} | ||
|
||
f(proto, port) | ||
} | ||
} |
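The defaulting behaviour of `withNormalisedProtoAndPort` can be tried in isolation with a minimal sketch that avoids the client-go dependency. `policyPort` and `normalise` are hypothetical stand-ins introduced here for illustration, not types or functions from this commit. The defaults mirror the code above: TCP when no protocol is given, and the full range `0:65535` when no port is given.

```go
package main

import "fmt"

// policyPort is a hypothetical stand-in for extnapi.NetworkPolicyPort.
// A nil field means the policy author left that field unspecified.
type policyPort struct {
	Protocol *string
	Port     *string // numeric port or service name, passed through as-is
}

// normalise applies the same defaults as withNormalisedProtoAndPort:
// TCP when no protocol is given, and every port when no port is given.
func normalise(p policyPort) (proto, port string) {
	proto = "TCP"
	if p.Protocol != nil {
		proto = *p.Protocol
	}
	port = "0:65535"
	if p.Port != nil {
		port = *p.Port
	}
	return proto, port
}

func main() {
	udp, dns, http := "UDP", "53", "http"
	ports := []policyPort{
		{},                           // nothing specified
		{Protocol: &udp, Port: &dns}, // fully specified
		{Port: &http},                // named port, left for iptables to resolve
	}
	for _, p := range ports {
		proto, port := normalise(p)
		fmt.Printf("-p %s --dport %s\n", proto, port)
	}
}
```

Passing a named port through unchanged matches the comment in the original function, which leaves service name resolution to the iptables executable.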
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
package npc | ||
|
||
// These types are defined in https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/network-policy.md | ||
// but not in the k8s API yet. Copies are included here for decoding the `net.beta.kubernetes.io/network-policy` | ||
// annotation specified in the above document. | ||
|
||
type IngressIsolationPolicy string | ||
|
||
const ( | ||
// Deny all ingress traffic to pods in this namespace. Ingress means | ||
// any incoming traffic to pods, whether that be from other pods within this namespace | ||
// or any source outside of this namespace. | ||
DefaultDeny IngressIsolationPolicy = "DefaultDeny" | ||
) | ||
|
||
// Standard NamespaceSpec object, modified to include a new | ||
// NamespaceNetworkPolicy field. | ||
type NamespaceSpec struct { | ||
// This is a pointer so that it can be left undefined. | ||
NetworkPolicy *NamespaceNetworkPolicy `json:"networkPolicy,omitempty"` | ||
} | ||
|
||
type NamespaceNetworkPolicy struct { | ||
// Ingress configuration for this namespace. This config is | ||
// applied to all pods within this namespace. For now, only | ||
// ingress is supported. This field is optional - if not | ||
// defined, then the cluster default for ingress is applied. | ||
Ingress *NamespaceIngressPolicy `json:"ingress,omitempty"` | ||
} | ||
|
||
// Configuration for ingress to pods within this namespace. | ||
// For now, this only supports specifying an isolation policy. | ||
type NamespaceIngressPolicy struct { | ||
// The isolation policy to apply to pods in this namespace. | ||
// Currently this field only supports "DefaultDeny", but could | ||
// be extended to support other policies in the future. When set to DefaultDeny, | ||
// pods in this namespace are denied ingress traffic by default. When not defined, | ||
// the cluster default ingress isolation policy is applied (currently allow all). | ||
Isolation *IngressIsolationPolicy `json:"isolation,omitempty"` | ||
} |
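The types above exist to decode the `net.beta.kubernetes.io/network-policy` namespace annotation. The following is a minimal sketch of that decoding, assuming the annotation value is a JSON document shaped like `NamespaceNetworkPolicy`. `isDefaultDeny` is a hypothetical helper, not a function from this commit, and the type definitions are repeated only so the example is self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type IngressIsolationPolicy string

const DefaultDeny IngressIsolationPolicy = "DefaultDeny"

type NamespaceNetworkPolicy struct {
	Ingress *NamespaceIngressPolicy `json:"ingress,omitempty"`
}

type NamespaceIngressPolicy struct {
	Isolation *IngressIsolationPolicy `json:"isolation,omitempty"`
}

// isDefaultDeny (hypothetical) reports whether an annotation value asks
// for DefaultDeny ingress isolation. A missing annotation, or one with
// missing fields, falls back to the cluster default, which the comments
// above describe as allow all.
func isDefaultDeny(annotation string) (bool, error) {
	if annotation == "" {
		return false, nil
	}
	var policy NamespaceNetworkPolicy
	if err := json.Unmarshal([]byte(annotation), &policy); err != nil {
		return false, err
	}
	return policy.Ingress != nil &&
		policy.Ingress.Isolation != nil &&
		*policy.Ingress.Isolation == DefaultDeny, nil
}

func main() {
	for _, value := range []string{
		`{"ingress":{"isolation":"DefaultDeny"}}`,
		`{"ingress":{}}`,
		``,
	} {
		deny, err := isDefaultDeny(value)
		fmt.Printf("%q -> deny=%v err=%v\n", value, deny, err)
	}
}
```

The pointer fields are what make it possible to tell "not set" apart from an explicit value, which is why each nil check comes before the comparison.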
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
package npc | ||
|
||
const ( | ||
TableFilter = "filter" | ||
|
||
MainChain = "WEAVE-NPC" | ||
DefaultChain = "WEAVE-NPC-DEFAULT" | ||
IngressChain = "WEAVE-NPC-INGRESS" | ||
) |