Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: allow NW name that is the prefix of a swarm NW ID #27938

Merged
merged 1 commit into from Dec 16, 2016

Conversation

AkihiroSuda
Copy link
Member

- What I did

Allows creating a network of which name is the prefix of the ID of a swarm network

Previously, it doesn't allow creating such a network:

e.g.

$ docker network inspect -f '{{.Id}}' ingress
84xh9knigj6zyt00u31e26nj3
$ docker network create 84
Error response from daemon: network with name 84 already exists

Fix #27866 (Flaky test: DockerSwarmSuite.TestPruneNetwork)

- How I did it
Please look at api/server/router/network/network_routes.go

- How to verify it

$ docker network inspect -f '{{.Id}}' ingress
84xh9knigj6zyt00u31e26nj3
$ docker network create 84
40a945ea84e643bd5234751846e72aadd08a3afab91fbcd4ab3b6e44d3fda8f3

- Description for the changelog

api: allow NW name that is the prefix of a swarm NW ID

- A picture of a cute animal (not mandatory but encouraged)
image

Signed-off-by: Akihiro Suda suda.akihiro@lab.ntt.co.jp

@cpuguy83
Copy link
Member

cpuguy83 commented Nov 1, 2016

ping @mrjana

@@ -79,8 +80,16 @@ func (n *networkRouter) postNetworkCreate(ctx context.Context, w http.ResponseWr
return err
}

if _, err := n.clusterProvider.GetNetwork(create.Name); err == nil {
return libnetwork.NetworkNameError(create.Name)
if nw, err := n.clusterProvider.GetNetwork(create.Name); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check doesn't seem very useful. First, it's racy. Also, it does a prefix match, which is not what we want here.

n.clusterProvider.CreateNetwork will fail if the name conflicts. Could we just remove this check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, however, we need to check the conflict before calling backend.CreateNetwork(), so as to detect a conflict like this:

$  docker service create -d overlay foo
$  docker service create -d bridge foo 
Error response from daemon: network with name foo already exists

A solution I can come up with is to call clusterProvider.CreateNetwork() before calling backend.CreateNetwork() for swarm-scoped networks, but I'm not sure how we can detect the scope here.

WDYT?
cc @mrjana @mavenugo @aboch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion we should not even try to fix this. We should just fix the test to use longer names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand your opinion. But even for production, people may try to create a network named "n1", "n2"..
It would be great if we can find a solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. But I am afraid this cannot be guaranteed unless we serialize the network creation at API level. Which seems too much of a penalty.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @AkihiroSuda. It's ridiculous not to allow creating a network if its name is a prefix of another network's ID.

And using longer names in the test won't solve the problem, it will just make the failures less likely - so when they do happen, they will be very hard to reproduce and cause a lot of confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire check needs to be coordinated by the swarm manager. The prefix/name/id game can be handled but doing so correctly is subtle. I've written out the approach in moby/swarmkit#1279 (comment) and a few other times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it the same as partial ID matches for containers; ID > full name match > partial ID match

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it the same as partial ID matches for containers; ID > full name match > partial ID match

I am not sure. Typically, there should be a resolution component and the getter should only take ID. In this case, I think we hit the docker API, so ambiguity of id/name may be a problem.

@AkihiroSuda
Copy link
Member Author

Updated PR.
PTAL @aaronlehmann @aboch @stevvooe

This fix should be enough for the original issue #27866 .

However, this doesn't fix an issue about removing networks.

e.g.

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
0db1e31d181f        bridge              bridge              local
4f8a8c0144ad        docker_gwbridge     bridge              local
4aa7a6b845ef        host                host                local
kveaqnegpm66        ingress             overlay             swarm
82ba9477298d        kvea                bridge              swarm
7190e8d532c5        none                null                local
$ docker network rm kvea                                                                  
Error response from daemon: rpc error: code = 7 desc = ingress (kveaqnegpm66y75q9wj6urfwq) is a pre-defined network and cannot be removed

(docker network rm kvea should remove the network of which name iskvea, of which ID starts with 82ba9477298d)

For removal, we would need to implement the "set" approach which @stevvooe repeatedly mentioned (e.g. moby/swarmkit#1279 (comment), moby/swarmkit#1194 (comment)) in another PR.

@@ -80,7 +80,7 @@ func (n *networkRouter) postNetworkCreate(ctx context.Context, w http.ResponseWr
return err
}

if _, err := n.clusterProvider.GetNetwork(create.Name); err == nil {
if _, err := n.clusterProvider.GetNetworkByName(create.Name); err == nil {
return libnetwork.NetworkNameError(create.Name)
Copy link
Member

@thaJeztah thaJeztah Nov 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related to this PR, but wondering; should this check only be performed if create.CheckDuplicate is true? (see daemon/network.go#L256-L258)

Since allowing duplicate names is by design (#18864 (comment)), I wonder if it should error out.

(t.b.h. I would rather stop allowing duplicate names, because I think the name is used in many cases as the only reference)

I see this was added in #24158, so ping @mavenugo ^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, probably in another PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense, then we can confirm if it should be fixed / changed on that PR

@mavenugo
Copy link
Contributor

@AkihiroSuda @thaJeztah I can't think of a good use-case for this to be a useful fix. Yes. this is an inconvenience for some cases, but I don't see a strong need to get this PR. IMO living with this restriction is better than trying to fix it and cause more head-aches.

@aaronlehmann
Copy link
Contributor

@mavenugo: What downside do you see? The old code was querying for a duplicate network in a way that would do prefix matching on IDs, which is clearly wrong. This was causing random integration test failures (sometimes "n1" would be a prefix of a network with ID "n1..."). I know there are still some remaining issues, but would like to understand what headaches it may cause to correct this check.

@mavenugo
Copy link
Contributor

@aaronlehmann the existing UX accepts name, id and partial-id in many places. And this will cause confusion on those areas. and that is the reason @AkihiroSuda mentioned this : However, this doesn't fix an issue about removing networks. and I expect the same problem in other areas such as inspect.
am trying to understand the use-case for such a requirement. Especially because names are user-provided and can be anything. Why would someone need a overlapping name with an existing id ?

@aaronlehmann
Copy link
Contributor

One of the use cases is what I mentioned above. If you try to create a network named n1, and there is an existing network whose ID is the base36 string starting with n1..., it fails with the current code. This caused problems in an integration test. With the current code, it would be impossible to choose a name that is always okay, because any name can potentially overlap with an ID prefix.

I don't see this as adding a feature, but instead correcting incorrect validation. It's completely legitimate to have a name overlap with an ID prefix. We allow it for other kinds of objects. Networks are handled inconsistently and this is a bug.

There should be no issue with inspect. We always prefer exact name matches to ID prefix matches. This is what is done for all other objects I know of, such as images, containers, services, and so on. We need to make networks consistent with these other objects.

@mavenugo
Copy link
Contributor

@aaronlehmann okay. if this is something unique to network then i agree we should fix it. Do you prefer fixing the IT issue alone in this PR and address the rest later in other PRs ?

Also, the simpler fix for the IT issue would be to change the TestPruneNetwork to use a longer randomized network name. And we do rely on the uniqueness of IDs (since the randomization really works and doesn't cause flaky tests for other tests :) )

@aaronlehmann
Copy link
Contributor

okay. if this is something unique to network then i agree we should fix it. Do you prefer fixing the IT issue alone in this PR and address the rest later in other PRs ?

Yes, I think we should start by allowing names to overlap with ID prefixes, and in separate PRs we should fix the other issues.

Also, the simpler fix for the IT issue would be to change the TestPruneNetwork to use a longer randomized network name. And we do rely on the uniqueness of IDs (since the randomization really works and doesn't cause flaky tests for other tests :)

I agree that this would technically work and that we rely on ID uniqueness. My preference is to allow names to overlap with ID prefixes for consistency with other objects, as done in the latest revision of this PR.

if err != nil {
return apitypes.NetworkResource{}, err
}
return convert.BasicNetworkFromGRPC(*network), nil
}

// GetNetwork returns a cluster network by an ID.
func (c *Cluster) GetNetwork(input string) (apitypes.NetworkResource, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename input to id to make it clearer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// GetNetworkByName returns a cluster network by name.
func (c *Cluster) GetNetworkByName(input string) (apitypes.NetworkResource, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename input to name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@AkihiroSuda
Copy link
Member Author

updated PR, about @tiborvass 's comments

@AkihiroSuda
Copy link
Member Author

rebased

@AkihiroSuda AkihiroSuda changed the title api: allow NW name that is the prefix of a swarm NW ID [DNM] api: allow NW name that is the prefix of a swarm NW ID Dec 9, 2016
@AkihiroSuda
Copy link
Member Author

AkihiroSuda commented Dec 9, 2016

@thaJeztah thank you a lot for testing!
I'll look into that..

update: updated PR and added two tests: TestSwarmNetworkCreateDup, TestSwarmNetworkCreateDup2
hope CI will be green..

@AkihiroSuda AkihiroSuda force-pushed the ovlnw-27866 branch 3 times, most recently from 38d0542 to 9a36f17 Compare December 9, 2016 08:38
@AkihiroSuda AkihiroSuda changed the title [DNM] api: allow NW name that is the prefix of a swarm NW ID api: allow NW name that is the prefix of a swarm NW ID Dec 9, 2016
@aaronlehmann
Copy link
Contributor

daemon/cluster/cluster.go:1392:1: comment on exported method Cluster.GetNetworksByName should be of the form "GetNetworksByName ..."

@aaronlehmann aaronlehmann added the status/failing-ci Indicates that the PR in its current state fails the test suite label Dec 10, 2016
@AkihiroSuda AkihiroSuda removed the status/failing-ci Indicates that the PR in its current state fails the test suite label Dec 11, 2016
@AkihiroSuda
Copy link
Member Author

Now CI green

c.Assert(err, checker.IsNil, check.Commentf("out: %v", out))
}

// Reversed case for TestSwarmNeetworkCreateDup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: TestSwarmNeetworkCreateDup

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done (and rebased)

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

we should probably address #27938 (comment) though (in a separate PR)

@thaJeztah
Copy link
Member

oh, and needs a rebase 😢

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small comments (that could be seen as nits) but otherwise looks good

c.Assert(err, checker.IsNil, check.Commentf("out: %v", out))
}

// Test case for https://github.com/docker/docker/pull/27938#issuecomment-265768303
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cool if there could also be a description on what it does 👼 (linking the issue is 👍, but it would be even better with a small explanation so the reader doesn't have to necessary open his browser to know why it's there 👼)

Same above for TestSwarmNetworkCreateIssue27866

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

}

// Reversed case for TestSwarmNetworkCreateDup
func (s *DockerSwarmSuite) TestSwarmNetworkCreateDup2(c *check.C) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure these 2 tests could be on the same one 👼

@thaJeztah
Copy link
Member

ping @AkihiroSuda this needs a rebase, also, could you address @vdemeester's comments, then we should be ready to merge for 1.13.0-rc4

…f a swarm network

Previously, it doesn't allow creating such a network:

e.g.

    $ docker network inspect -f '{{.Id}}' ingress
    84xh9knigj6zyt00u31e26nj3
    $ docker network create 84
    Error response from daemon: network with name 84 already exists

Fix moby#27866

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

@thaJeztah @vdemeester updated PR, sorry for being late 😅

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐸

c.Assert(err, checker.IsNil, check.Commentf("out: %v", out))
}

// Test case for https://github.com/docker/docker/pull/27938#issuecomment-265768303
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@thaJeztah
Copy link
Member

all green 💚

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flaky test: DockerSwarmSuite.TestPruneNetwork