Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report policy revision implemented by the proxy in Endpoint model #3151

Merged
merged 8 commits into from
Mar 16, 2018

Conversation

joestringer
Copy link
Member

@joestringer joestringer commented Mar 15, 2018

This PR plumbs the Repository revision and endpoints into the NetworkPolicyDiscoveryService, where these are cached and associated with NetworkPolicyCache versions. When Envoy receives the relevant resources and acknowledges them, it looks up in this cache to determine which policy revision is being acked and calls back into the endpoint objects to update a new field, proxyPolicyRevision.

For more details, see #3144.

Fixes: #3122
Fixes: #3144

@joestringer joestringer added wip area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Mar 15, 2018
@joestringer joestringer requested a review from a team March 15, 2018 06:51
@joestringer joestringer requested review from a team as code owners March 15, 2018 06:51
@joestringer joestringer requested a review from a team March 15, 2018 06:51
@joestringer joestringer requested a review from a team as a code owner March 15, 2018 06:51
@joestringer joestringer requested a review from a team March 15, 2018 06:51
@joestringer joestringer requested a review from a team as a code owner March 15, 2018 06:51
"github.com/cilium/cilium/test/helpers"

. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not use dot imports


"github.com/cilium/cilium/test/helpers"

. "github.com/onsi/ginkgo"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not use dot imports

return
}

func NewNetworkPolicyEventList() *NetworkPolicyEventList {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function NewNetworkPolicyEventList should have comment or be unexported

@@ -68,6 +72,13 @@ func (m *NetworkPolicy) GetEgressPerPortPolicies() []*PortNetworkPolicy {
return nil
}

func (m *NetworkPolicy) GetPolicy() uint64 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method NetworkPolicy.GetPolicy should have comment or be unexported

}

func (m *NetworkPolicy) Reset() { *m = NetworkPolicy{} }
func (m *NetworkPolicy) String() string { return proto.CompactTextString(m) }
func (*NetworkPolicy) ProtoMessage() {}
func (*NetworkPolicy) Descriptor() ([]byte, []int) { return fileDescriptor3, []int{0} }

func (m *NetworkPolicy) GetPolicy() uint64 {
func (m *NetworkPolicy) GetName() string {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method NetworkPolicy.GetName should have comment or be unexported

@@ -823,6 +823,9 @@ definitions:
policy-revision:
description: The policy revision this endpoint is running on
type: integer
proxy-policy-revision:
description: The policy revision currently in use in the proxy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in use -> enforced (?)

)

var (
defaultipToBindingSize = 64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

GetIdentity() identity.NumericIdentity

// OnProxyPolicyUpdate is called when the proxy confirms that it
// has implemented a policy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemented -> applied


var (
defaultipToBindingSize = 64
unackedNPDSUpdates *NetworkPolicyEventList
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO(rlenglet) comment: move this into the XDSServer.

}
}

// Remove an endpoint from the list so that it no longer receives callbacks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove removes ...

name := strconv.FormatUint(uint64(id), 10)
NetworkPolicyCache.Upsert(NetworkPolicyTypeURL, name, networkPolicy, false)
// When successful, push them into the cache.
unackedNPDSUpdates.Insert(ep, policy.Revision, NetworkPolicyCache.GetVersion())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call wg.AddCompletion() to get a Completion and pass that to the Insert call to store it in the binding.

@@ -148,21 +148,21 @@ func (d *Daemon) RemoveProxyRedirect(e *endpoint.Endpoint, id string) error {

// UpdateNetworkPolicy adds or updates a network policy in the set
// published to L7 proxies.
func (d *Daemon) UpdateNetworkPolicy(id identity.NumericIdentity, policy *policy.L4Policy,
func (d *Daemon) UpdateNetworkPolicy(e *endpoint.Endpoint, policy *policy.L4Policy,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a wg *completion.WaitGroup parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// The policy identifier associated with the network policy. Corresponds to
// remote_policies entries in PortNetworkPolicyRule.
// Optional.
uint64 policy = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nitpick. I would move this up and use the following labels:

string name = 1;
uint64 policy = 2;
repeated PortNetworkPolicy ingress_per_port_policies = 3;
repeated PortNetworkPolicy egress_per_port_policies = 4;

@@ -1883,6 +1885,17 @@ func (e *Endpoint) bumpPolicyRevision(revision uint64) {
e.Mutex.Unlock()
}

// OnProxyPolicyUpdate is a callback used to update the Endpoint's
// proxyPolicyRevision when the specified revision has been implemented in the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemented -> applied

@@ -53,12 +53,12 @@ type Owner interface {

// UpdateNetworkPolicy adds or updates a network policy in the set
// published to L7 proxies.
UpdateNetworkPolicy(id identity.NumericIdentity, policy *policy.L4Policy,
UpdateNetworkPolicy(e *Endpoint, policy *policy.L4Policy,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a wg *completion.WaitGroup parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: Joe Stringer <joe@covalent.io>
}

func (m *NetworkPolicy) Reset() { *m = NetworkPolicy{} }
func (m *NetworkPolicy) String() string { return proto.CompactTextString(m) }
func (*NetworkPolicy) ProtoMessage() {}
func (*NetworkPolicy) Descriptor() ([]byte, []int) { return fileDescriptor3, []int{0} }

func (m *NetworkPolicy) GetName() string {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method NetworkPolicy.GetName should have comment or be unexported

@joestringer joestringer dismissed rlenglet’s stale review March 15, 2018 22:56

Addressed comments and rewrote to reuse AckingResourceMutatorWrapper. Please review.

labelsMap identity.IdentityCache, deniedIngressIdentities, deniedEgressIdentities map[identity.NumericIdentity]bool) error {
if d.l7Proxy == nil {
return fmt.Errorf("can't update network policy, proxy disabled")
}
return d.l7Proxy.UpdateNetworkPolicy(id, policy, labelsMap, deniedIngressIdentities, deniedEgressIdentities)
// TODO(jrajahalme): Pass e.ProxyWaitGroup down
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • @jrajahalme you'll need to pass e.ProxyWaitGroup down as part of your changes.

labelsMap identity.IdentityCache, deniedIngressIdentities, deniedEgressIdentities map[identity.NumericIdentity]bool) error {
if d.l7Proxy == nil {
return fmt.Errorf("can't update network policy, proxy disabled")
}
return d.l7Proxy.UpdateNetworkPolicy(id, policy, labelsMap, deniedIngressIdentities, deniedEgressIdentities)
// TODO(jrajahalme): Pass e.ProxyWaitGroup down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrajahalme heads up. We can't pass a WaitGroup yet, because no proxy is querying + acking the NetworkPolicy resources yet, so all regenerations would fail if we did.

Copy link
Contributor

@rlenglet rlenglet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for doing this! I just have 4 nitpicks. Nothing blocking.

ep.GetIPv4Address(),
}
policies := []*cilium.NetworkPolicy{}
for i := range ips {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: for _, ip := range ips { then use ip instead of ips[i].

name := strconv.FormatUint(uint64(id), 10)
NetworkPolicyCache.Upsert(NetworkPolicyTypeURL, name, networkPolicy, false)
// When successful, push them into the cache.
for i := range policies {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: for _, p := range policies { then use p instead of policies[i].

} else {
c = wg.AddCompletionWithCallback(callback)
}
nodeIDs := []string{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick:

nodeIDs := make([]string, 0, 1)

@@ -254,3 +254,8 @@ func (c *Cache) GetResources(ctx context.Context, typeURL string, lastVersion *u
cacheLog.Debugf("returning %d resources out of %d requested", len(res.Resources), len(resourceNames))
return res, nil
}

// GetVersion returns the version for the cache at the current point in time.
func (c *Cache) GetVersion() uint64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove. That seems unused.
Anyway it is not thread-safe as-is. This would require locking.

@joestringer
Copy link
Member Author

Something fishy happening in the unit tests, upserts into AckingNetworkPolicyMutator are not showing up in NetworkPolicyCache.

@rlenglet
Copy link
Contributor

@joestringer Maybe the order of initialization of variables in resources.go?

@rlenglet
Copy link
Contributor

You could move the initialization of the variables into an init()? Or move that into the StartXDSServer() in pkg/envoy/server.go.

@rlenglet
Copy link
Contributor

@joestringer which unit tests are failing?

@joestringer
Copy link
Member Author

joestringer commented Mar 16, 2018

@rlenglet DaemonEtcdSuite.TestRemovePolicy is failing, it sees an empty NetworkPolicyCache after inserting network policies.

I tried shifting the mutator update into an init() in pkg/envoy/resources.go, that didn't work. Will try the StartXDSServer() approach next.

@joestringer
Copy link
Member Author

joestringer commented Mar 16, 2018

I think the problem is that we moved endpoint.updateNetworkPolicy() but the tests are expecting it to be updated during dry runs. Fix incoming.

@joestringer joestringer force-pushed the submit/report-npds-revision branch 2 times, most recently from e36829f to 5830763 Compare March 16, 2018 00:42
@rlenglet
Copy link
Contributor

rlenglet commented Mar 16, 2018

Good call.
You should just call d.l7Proxy.UpdateNetworkPolicy with a nil WaitGroup in the dry run block.

Since there is endpoint-specific configuration which affects policy,
this patch changes network policies to be applied per endpoint (via IPs)
rather than per identity. Do this by defining an interface of the
relevant portions of the endpoint that are used when updating network
policy (NetworkPolicyEndpoint), and simplify UpdateNetworkPolicy() and
RemoveNetworkPolicy() to pass this new NetworkPolicyEndpoint around.

Fixes: cilium#3122

Signed-off-by: Joe Stringer <joe@covalent.io>
The existing path was updating the L7 Network Policies in the
potentially dry-run path inside regeneratePolicy(), however it should
only be applied when we are actually going to apply the policy changes
to the datapath. Refactor the function and call it later from
regenerateBPF().

Signed-off-by: Joe Stringer <joe@covalent.io>
A while ago, the spec was updated to use double graves to indicate
fields, to be compatible with sphinx notation for where these fields end
up in the documentation. We didn't update the API definitions at the
time. Do so. This should be a functional no-op.

Signed-off-by: Joe Stringer <joe@covalent.io>
An upcoming commit will need to identify which policy revision was used
to generate an L4/L7 policy, so when we construct the L4Policy, cache
the current policy repository revision in the L4Policy object.

Signed-off-by: Joe Stringer <joe@covalent.io>
This new field will report the policy revision that has been
acknowledged by the proxy, so it is clear to see whether the proxy is
synchronised with Cilium or not.

Signed-off-by: Joe Stringer <joe@covalent.io>
This callback will be called when the completion socket is closed. It
allows an interested party to immediately apply some behaviour at the
time that the completion occurs, rather than waiting on the completion
in the outer code. It will be used in an upcoming commit to reflect the
policy revision from proxy acknowledgements up to the endpoint.

Signed-off-by: Joe Stringer <joe@covalent.io>
When the NetworkPolicyDiscoveryService acknowledges oustanding
NetworkPolicyUpdates for an endpoint, push the corresponding policy
revision number into the endpoint.

Fixes: cilium#3144

Signed-off-by: Joe Stringer <joe@covalent.io>
@rlenglet
Copy link
Contributor

test-me-please

@rlenglet
Copy link
Contributor

The last test failed. A priori unrelated to this PR. This test seems flaky.

@joestringer
Copy link
Member Author

Suspecting another test flake in tests PolicyEnforcement updates, will re-run.

@joestringer
Copy link
Member Author

test-me-please

@rlenglet
Copy link
Contributor

@tgraf could you please review this? Thanks!

@tgraf tgraf merged commit 99a73a2 into cilium:master Mar 16, 2018
@joestringer joestringer deleted the submit/report-npds-revision branch March 16, 2018 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants