Allow replacing ingress network #2028

aboch · 2017-03-09T23:02:12Z

Please see moby/moby/pull/31714 for details.

Signed-off-by: Alessandro Boch aboch@docker.com

aboch · 2017-03-09T23:02:46Z

This will now need a rebase, but I wanted the review process to start.

aaronlehmann · 2017-03-09T23:19:59Z

api/objects.proto

+	// this information, so that on reboot or leader re-election, new leader
+	// will not automatically create the routing mesh if it cannot find one
+	// in store. 
+	bool routingMeshOff = 10;


A few comments here:

protobuf names should use underscores, for example routing_mesh_off. This gets converted to CamelCase by the protobuf compiler for Go code.

Maybe routing_mesh_disabled is better?

It's too bad we don't have a message inside Cluster to store networking options, to organize this and stuff like network_bootstrap_keys and encryption_key_lamport_clock. Maybe it's not too late to start one, though.

I did not opt for enabled/disabled because this variable indicates a state, which could be just temporary until the user creates a custom ingress network. I thought on/off was more appropriate for a state, while enabled/disabled is more for a config (which we do not have).

But not being a native speaker, routing_mesh_disabled works just as well for me. Will change to that.

I think it may be possible to avoid keeping this state. Right now we create the default cluster object here:

https://github.com/docker/swarmkit/blob/a5eb9c0c8c82c544819f3811f9bfba44c1546998/manager/manager.go#L882

What if we created the ingress network at the same time, but only if the CreateCluster call succeeded? This would ensure that a swarm cluster always gets an ingress network by default, but if it is deleted later on, it would not be recreated automatically.

I'm not sure if this would be sort of hacky. It seems like it would simplify things a bit, but I don't have a strong opinion.

I am happy to avoid the extra state, but the other options I had before were also sort of hacky, like adding a mock ingress network object to the store to denote that the ingress network need not be created.

IIUC, the allocator would attempt to create the default cluster object; a failure would indicate that this is a leader re-election or a cluster start after a shutdown. In that case I can do:

if ErrNoIngress && cannot create cluster object { routingMeshDisabled = true }

But what does it mean that the allocator succeeds in creating the default cluster object?
Once the allocator gets a chance to run, isn't the default cluster object already present in the store, as created by the manager? Or is the allocator cluster object a mock one not seen by anybody else?

I rejected adding the mock ingress network because, if the user downgrades docker, the previous version would not handle the mock network properly.

I was suggesting moving ingress network creation out of the allocator, to where the initial cluster and node objects are created (see link). It kind of makes sense to do that, since there would be a single place where initial resources are created.

Every time the manager enters the leader state, it tries to create a default cluster object, because this might be a fresh swarm that has never had a leader before. We could also create an ingress network at this time, if creating the default cluster object succeeded (which means this is indeed a fresh swarm).

Interesting.
Ideally I would prefer allocating the networking stuff all in one place, in allocator/network.go.
But what you are suggesting would mean just pushing the default ingress network object to the store, without allocating it yet.
Let me give it a try.

Tested it, it works. Thanks.
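
For readers following the thread, the approach agreed on above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: memStore, createCluster, becomeLeader, and ErrExist are hypothetical names, and the toy store only records which objects exist.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrExist is a hypothetical stand-in for a store's "already exists" error.
var ErrExist = errors.New("object already exists")

// memStore is a toy in-memory store, not swarmkit's MemoryStore.
type memStore struct {
	clusters map[string]bool
	networks map[string]bool
}

func newMemStore() *memStore {
	return &memStore{clusters: map[string]bool{}, networks: map[string]bool{}}
}

// createCluster fails with ErrExist when the cluster is already in the store.
func (s *memStore) createCluster(name string) error {
	if s.clusters[name] {
		return ErrExist
	}
	s.clusters[name] = true
	return nil
}

// becomeLeader mimics what a manager might do each time it enters the
// leader state.
func becomeLeader(s *memStore) {
	// Creating the default cluster only succeeds on a fresh swarm.
	if err := s.createCluster("default"); err != nil {
		return // existing swarm: leave the networks as the user left them
	}
	// Fresh swarm: store the default ingress network object.
	// Allocating it is left to a later step.
	s.networks["ingress"] = true
}

func main() {
	s := newMemStore()

	becomeLeader(s) // fresh swarm
	fmt.Println(s.networks["ingress"]) // true

	delete(s.networks, "ingress") // user removes the ingress network
	becomeLeader(s)               // leader re-election
	fmt.Println(s.networks["ingress"]) // false
}
```

The point of the design is that no extra flag has to be persisted: the success or failure of the cluster creation already tells the leader whether the swarm is fresh.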

aaronlehmann · 2017-03-09T23:21:42Z

api/specs.proto

@@ -321,6 +321,9 @@ message NetworkSpec {
 	// enabled(default case) no manual attachment to this network
 	// can happen.
 	bool attachable = 6;
+
+	// Ingress indicates this network will provide the routing-mesh.


Let's mention in the comment that legacy ingress networks won't have this flag set, and instead have the com.docker.swarm.internal label, and the name ingress.

aaronlehmann · 2017-03-09T23:24:23Z

manager/allocator/network.go

+		clusters, err = store.FindClusters(readTx, store.ByName(store.DefaultClusterName))
+	}); err != nil {
+		return errors.Wrapf(err,
+			"failed to retrieve cluster object to check routing mesh state during init: %v", err)


Remove the %v with the err argument. This will include err twice in the string.

thanks, I will change it.

aaronlehmann · 2017-03-09T23:26:15Z

manager/allocator/network.go

-	// If ingress network is not found, create one right away
-	// using the predefined template.
-	if len(networks) == 0 {
+	nc.clusterID = clusters[0].ID


You should check that clusters has a nonzero length. FindClusters will not return an error if it doesn't find any matching clusters.

There should always be a default cluster, so it would be a bug if one didn't exist. But I think it's better to return an error than crash in that case.

Good to know, I was in fact not sure about this.
I will follow the length check logic I saw in other parts of the code.
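
A minimal sketch of the length check discussed above. The cluster type and defaultClusterID are hypothetical names used only for illustration; the sketch assumes a lookup that returns an empty slice and a nil error when nothing matches.

```go
package main

import (
	"errors"
	"fmt"
)

// cluster is a hypothetical, stripped-down cluster object.
type cluster struct{ ID string }

// defaultClusterID returns the ID of the first cluster, or an error when
// the lookup came back empty, instead of panicking on clusters[0].
func defaultClusterID(clusters []cluster) (string, error) {
	if len(clusters) == 0 {
		return "", errors.New("default cluster not found")
	}
	return clusters[0].ID, nil
}

func main() {
	if _, err := defaultClusterID(nil); err != nil {
		fmt.Println("error:", err)
	}

	id, _ := defaultClusterID([]cluster{{ID: "c1"}})
	fmt.Println(id)
}
```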

aaronlehmann · 2017-03-09T23:29:44Z

manager/allocator/network.go

@@ -992,3 +1090,40 @@ func updateTaskStatus(t *api.Task, newStatus api.TaskState, message string) {
 	t.Status.Message = message
 	t.Status.Timestamp = ptypes.MustTimestampProto(time.Now())
 }
+
+func GetIngressNetwork(s *store.MemoryStore) (*api.Network, error) {


You'll need a comment to pass lint.

aaronlehmann · 2017-03-09T23:32:29Z

manager/controlapi/network.go

-	// if you change this function, you have to change createInternalNetwork in
-	// the tests to match it (except the part where we check the label).
-	if err := validateNetworkSpec(request.Spec, s.pg); err != nil {
+	if err := s.validateNetworkRequest(request.Spec, s.pg); err != nil {


This should be called inside the s.store.Update call below, to make sure multiple CreateNetwork invocations can't race to create multiple ingress networks.

You might consider changing GetIngressNetwork to take a store.ReadTx instead of a store.

Ah, good suggestion. I was under the impression that we only run a best-effort check, and that a concurrent ingress network creation would fail to be allocated by the allocator anyway.
If we can protect against the race, that is better.

aaronlehmann · 2017-03-09T23:33:26Z

manager/allocator/network.go

+	nc.clusterID = clusters[0].ID
+
+	// Check if we have the ingress network. If not found create
+	// it before reading all network objects for allocation.


All of the code to check for the ingress network and create it if necessary should be inside a store.Update transaction. Otherwise, something else could create an ingress network through the API at the same time.

Thanks for the advice. I will see if I can refactor it.

Actually this code is only executed on the elected leader once the allocator is started (run()).
Allocator is the only component which will check and create the ingress network if needed.
Also, I do not think the system can receive a user request to create an ingress network at this time.
Hopefully we do not need to worry about concurrent creations here.

aaronlehmann · 2017-03-09T23:33:44Z

manager/allocator/network.go

 		if err != nil {
-			return errors.Wrap(err, "failed to find ingress network after creating it")
+			return errors.Wrapf(err, "failed to find ingress network after creating it: %v", err)


%v", err is redundant

aaronlehmann · 2017-03-09T23:46:24Z

manager/controlapi/service.go

+		if doesServiceNeedIngress(service) {
+			if _, err := allocator.GetIngressNetwork(s.store); err != nil {
+				if grpc.Code(err) == codes.NotFound {
+					return grpc.Errorf(codes.PermissionDenied, "service needs ingress network, but ingress network is not present")


I think this should use codes.FailedPrecondition.

aaronlehmann · 2017-03-09T23:46:37Z

manager/controlapi/service.go

+	if doesServiceNeedIngress(service) {
+		if _, err := allocator.GetIngressNetwork(s.store); err != nil {
+			if grpc.Code(err) == codes.NotFound {
+				return nil, grpc.Errorf(codes.PermissionDenied, "service needs ingress network, but ingress network is not present")


I think this should use codes.FailedPrecondition.

Makes sense. Will change it.

aaronlehmann · 2017-03-09T23:47:43Z

manager/controlapi/service.go

@@ -424,6 +425,33 @@ func (s *Server) checkSecretExistence(tx store.Tx, spec *api.ServiceSpec) error
 	return nil
 }

+func doesServiceNeedIngress(srv *api.Service) bool {
+	if srv.Spec.Endpoint.Mode != api.ResolutionModeVirtualIP {


Endpoint will be nil if unspecified in the API request, so be sure to check before dereferencing it.

I have not hit this problem during testing. Probably the CLI client fills in a default endpoint object, as it defaults to VIP mode. Here the right thing to do is to check for nil. Thanks.
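
A sketch of the nil check. The types below are hypothetical, stripped-down stand-ins for the API objects, and the rule "VIP mode with at least one published port" is an assumption made for illustration; the full conditions of doesServiceNeedIngress are not shown in this thread.

```go
package main

import "fmt"

type resolutionMode int

const (
	modeVirtualIP resolutionMode = iota
	modeDNSRR
)

type portConfig struct{ PublishedPort uint32 }

type endpointSpec struct {
	Mode  resolutionMode
	Ports []portConfig
}

// serviceSpec leaves Endpoint nil when the API request does not specify one.
type serviceSpec struct{ Endpoint *endpointSpec }

// needsIngress guards against a nil Endpoint before reading its fields.
func needsIngress(spec serviceSpec) bool {
	if spec.Endpoint == nil {
		return false
	}
	if spec.Endpoint.Mode != modeVirtualIP {
		return false
	}
	// Assumed rule for this sketch: any published port needs the ingress network.
	return len(spec.Endpoint.Ports) > 0
}

func main() {
	// Endpoint is nil here; without the guard this would panic.
	fmt.Println(needsIngress(serviceSpec{})) // false

	vip := serviceSpec{Endpoint: &endpointSpec{
		Mode:  modeVirtualIP,
		Ports: []portConfig{{PublishedPort: 8080}},
	}}
	fmt.Println(needsIngress(vip)) // true
}
```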

aaronlehmann · 2017-03-09T23:48:51Z

manager/allocator/network.go

@@ -691,6 +786,9 @@ func (a *Allocator) allocateService(ctx context.Context, s *api.Service) error {
 		// world. Automatically attach the service to the ingress
 		// network only if it is not already done.
 		if isIngressNetworkNeeded(s) {
+			if nc.ingressNetwork == nil {
+				return fmt.Errorf("Ingress network is missing")


lowercase ingress

aaronlehmann · 2017-03-09T23:50:11Z

manager/allocator/network.go

+			return n, nil
+		}
+	}
+	return nil, grpc.Errorf(codes.NotFound, "no ingress network found")


I think it's better not to return a gRPC error here, because the allocator doesn't use gRPC at all. Maybe define an exported error and return that.

var ErrNoIngressNetwork = errors.New("no ingress network found")

Then callers can easily check if this error was returned, and translate to a gRPC error if appropriate.

It's because I have the controlapi functions which validate the service and network specs call this method.

Got it, I will export an error for that.

codecov · 2017-03-10T04:04:09Z

Codecov Report

Merging #2028 into master will decrease coverage by 0.11%.
The diff coverage is 43.81%.

@@            Coverage Diff             @@
##           master    #2028      +/-   ##
==========================================
- Coverage   53.93%   53.82%   -0.12%     
==========================================
  Files         109      109              
  Lines       19100    19187      +87     
==========================================
+ Hits        10302    10327      +25     
- Misses       7561     7608      +47     
- Partials     1237     1252      +15

Last update 1b08186...f39ead8.

aboch · 2017-03-10T04:59:25Z

@aaronlehmann Updated. I tested re-election with the new logic you suggested, and it works fine.

aaronlehmann · 2017-03-10T17:38:07Z

manager/controlapi/network.go

@@ -112,6 +114,13 @@ func (s *Server) CreateNetwork(ctx context.Context, request *api.CreateNetworkRe
 	}

 	err := s.store.Update(func(tx store.Tx) error {
+		if request.Spec.Ingress {
+			if n, err := allocator.GetIngressNetwork(s.store); err == nil {
+				return grpc.Errorf(codes.PermissionDenied, "ingress network (%s) is already present", n.ID)


codes.AlreadyExists

aaronlehmann · 2017-03-10T17:38:42Z

manager/controlapi/network.go

+		}
+		for _, srv := range services {
+			if doesServiceNeedIngress(srv) {
+				return grpc.Errorf(codes.PermissionDenied, "ingress network cannot be removed because service %s depends on it", srv.ID)


codes.FailedPrecondition

aaronlehmann · 2017-03-10T17:39:20Z

Design LGTM

dongluochen · 2017-03-13T17:59:39Z

manager/allocator/network.go

@@ -163,8 +111,11 @@ func (a *Allocator) doNetworkInit(ctx context.Context) (err error) {
 		}
 	}

+skipIngressNetworkAllocation:


I think this label is confusing because the flow from line 89 (nc.ingressNetwork = ingressNetwork) could also reach here.

Yes, code from line 89 is expected to reach here.

Let me see if turning the if blocks into a switch makes it clearer, so that I can avoid using the goto.

Signed-off-by: Alessandro Boch <aboch@docker.com>

aboch · 2017-03-13T18:37:38Z

Thanks @dongluochen . I reworked the code to avoid the goto, via a switch, and added some comments.
I think the code is easier to follow now. PTAL.

dongluochen

LGTM

dongluochen · 2017-03-13T18:59:31Z

manager/controlapi/network.go

@@ -112,6 +114,13 @@ func (s *Server) CreateNetwork(ctx context.Context, request *api.CreateNetworkRe
 	}

 	err := s.store.Update(func(tx store.Tx) error {
+		if request.Spec.Ingress {


nit: I think it is better to move this check into store.CreateNetwork(tx, n), where it already checks for duplicate names.

Not sure if it is appropriate to percolate the ingress notion down into store.createNetwork().
It is true that it now checks for duplicate network names, but that is a more generic check on a top-level resource name.
If you guys think it makes sense, I can move it there now.

I think either is fine.

aaronlehmann · 2017-03-13T19:29:27Z

LGTM

aaronlehmann · 2017-03-22T23:04:17Z

manager/controlapi/service.go

+		}
+	}
+	return false
+}


@aboch: Sorry for the post-merge comment. I was just looking at this again, and I noticed that this function is similar to (but not the same as) isIngressNetworkNeeded in the allocator. Are the differences correct?

Would it be possible to combine these functions as an exported function in the allocator? I think it's better for controlapi not to implement this logic directly.

Would it be possible to combine these functions as an exported function in the allocator

Agree

Regarding the differences, I want to double check. Probably some are redundant given the spec validation that happens first.

archisgore · 2017-05-09T15:08:20Z

Was this intended to prevent service connections to any "Internal" network at all, or merely the ingress network? From documentation, I assumed internal was a generic network type that application developers could use to place sensitive services in.

aboch · 2017-05-09T15:29:35Z

@archisgore I am not sure I follow, so I am adding some more background to put us on the same page for further clarifications.

This change is to allow the user to remove and/or replace the swarm ingress network. It gives interested users control over the ingress network configuration. The ingress network is the infrastructure network which provides the routing mesh for traffic ingressing the host.

In the swarmkit code this network was previously marked with a ...swarm.internal label, just for identification purposes.

In the docker networking model, there is also an internal top-level operator option which can be used when creating a network. It isolates the containers on that network from the outside world.

This internal option cannot be accepted for the ingress network; they are conflicting concepts.

archisgore · 2017-05-09T15:59:12Z

Oh okay. I see. I'm seeing some strange behavior with the latest builds where:

docker network create --driver=overlay --internal foobar

Followed by a RemoteAPI call to start a service connected to that network returns:

"Error":"Error response from daemon: rpc error: code = 3 desc = Service cannot be explicitly attached to \"foobar\" network which is a swarm internal network"

But if I do:

docker network create --driver=overlay foobar

The same call works. I searched for that error string and found this commit.

aboch · 2017-05-09T16:04:36Z

Thank you @archisgore. What you found is a bug. Will take care of it.

aboch mentioned this pull request Mar 9, 2017

Allow user to replace ingress network moby/moby#31714

Merged

aaronlehmann reviewed Mar 9, 2017

aaronlehmann reviewed Mar 10, 2017

aaronlehmann added the status/2-code-review label Mar 10, 2017

aboch mentioned this pull request Mar 12, 2017

Add verbose flag to network inspect to show all services & tasks in swarm mode moby/moby#31710

Merged

dongluochen reviewed Mar 13, 2017

Allow replacing ingress network

f39ead8

Signed-off-by: Alessandro Boch <aboch@docker.com>

dongluochen reviewed Mar 13, 2017

aaronlehmann merged commit 2b1b24b into moby:master Mar 13, 2017

aaronlehmann mentioned this pull request Mar 14, 2017

Vendor swarmkit 9fdea50 moby/moby#31808

Closed

aaronlehmann reviewed Mar 22, 2017

aboch mentioned this pull request May 9, 2017

Fix service network validation #2172

Merged
