
Allow user to specify default address pools for docker networks #29376

Closed · wants to merge 2 commits into from

Conversation

@aboch (Contributor) commented Dec 13, 2016

- Description
When a user creates a network without specifying a --subnet, docker picks a subnet for the network from the static set 172.[17-31].0.0/16 and 192.168.[0-240].0/20 for local scope networks, and from the static set 10.[0-255].[0-255].0/24 for global scope networks.

For various reasons, several users have asked to be able to control these defaults.
This PR brings in a change that allows users to control the default address pools at daemon boot.

As an example,
dockerd --default-address-pools scope=local,base=10.10.0.0/16,size=24
would set the 256 pools 10.10.[0-255].0/24 as the defaults for local scope networks.

Multiple --default-address-pools options with scope=[local|global] can be specified.
To specify the option in the JSON config file:

{
  "default-address-pools": [
    {"scope": "local",  "base": "172.80.0.0/16", "size": 24},
    {"scope": "global", "base": "172.90.0.0/16", "size": 24}
  ]
}

- Notes
Fixes #21776

- A picture of a cute animal (not mandatory but encouraged)

[image: canenero]

@thaJeztah (Member)

This should probably also be configurable in the daemon.json configuration file

@aboch (Contributor, Author) commented Dec 14, 2016

> This should probably also be configurable in the daemon.json configuration file

Yes, it is taken care of automatically. I verified it works via dockerd --config-file <json file>.

@zarbis commented Dec 14, 2016

@aboch
I'm looking at those options, and --local-pools-size has a default of zero. Is that a special case, with the true default subnet mask taken from somewhere else?
Taken at face value it makes no sense, and it would probably break things if the only option specified were --local-base-pool, without --local-pools-size.
Maybe a meaningful default of 24 or 16?

@thaJeztah (Member)

> Yes, it is taken care of automatically. I verified it works via dockerd --config-file

Great! Looks like that needs some documentation as well in dockerd.md#daemon-configuration-file

@aboch would it make sense to have this option for overlay networks as well? For people running on older kernel versions, it's important that the overlay IP addresses don't overlap with IP addresses on the underlay network; having an option to automatically pick a non-overlapping range could benefit those users as well?

@aboch (Contributor, Author) commented Dec 14, 2016

@zarbis
That would be rejected: https://github.com/docker/docker/pull/29376/files#diff-67587c6d1152c2e1fe9e54fa4b2ffc04R67

if size <= 0 || size < ones {
	return fmt.Errorf("invalid pool size: %d", size)
}

In fact I added to the manual that the two options go hand in hand: "Specified along with --local-pools-size, it defines the set of predefined address pools [...]" and "This option must be specified along with --local-base-pool [...]".

I would like to keep this explicit; I would not add a default for the pool size if the user did not pass one. I prefer the daemon start to fail and the user to correct the config.
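As a standalone illustration of that check (a sketch using the standard net package, not the PR's code): ones comes from the base pool's mask, so with base 10.10.0.0/16 a size of 0 is rejected while 24 passes.

package main

import (
	"fmt"
	"net"
)

// validateSize mirrors the check quoted above: the pool size must be
// positive and at least as long as the base pool's mask length.
func validateSize(base string, size int) error {
	_, ipnet, err := net.ParseCIDR(base)
	if err != nil {
		return err
	}
	ones, _ := ipnet.Mask.Size()
	if size <= 0 || size < ones {
		return fmt.Errorf("invalid pool size: %d", size)
	}
	return nil
}

func main() {
	fmt.Println(validateSize("10.10.0.0/16", 0))  // invalid pool size: 0
	fmt.Println(validateSize("10.10.0.0/16", 24)) // <nil>
}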

@thaJeztah

> Looks like that needs some documentation as well in dockerd.md#daemon-configuration-file

Thanks, I will take care of it

> would it make sense to have this option for overlay networks as well? For people running on older kernel versions

I thought of it, but I felt addressing the defaults for local networks was the priority, given the number of requests and how easily this problem can be hit. In the multi-host scenario where the host kernel version is < 3.13, the user would anyway have to first make sure the chosen subnet for the overlay network does not overlap with any route on any of the cluster hosts; that is to say, reaching for the --subnet option is something he was going to do anyway.

BTW, this change does not prevent us from addressing the defaults for global networks if we see the need in future; it would just mean adding a couple of options to the UX.

@aboch (Contributor, Author) commented Dec 14, 2016

@zarbis In regard to your questions, I now see the point in not having two separate options (--local-base-pool and --local-pools-size) for one configuration: two options which cannot be specified independently.

@thaJeztah suggested offline that we support the "csv" notation, so that we could represent this as something like:

--predefined-pools scope=local, base=10.20.0.0/16, size=24

This is also future-proof if we ever decide to support this for global scope networks as well.

I will rework the PR.
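For illustration, parsing such a csv value in Go could look like this (a rough sketch under the proposed syntax; parsePoolFlag and the exact field set are assumptions, not the PR's code):

package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePoolFlag splits a value like
// "scope=local,base=10.20.0.0/16,size=24" into its fields.
func parsePoolFlag(val string) (scope, base string, size int, err error) {
	for _, field := range strings.Split(val, ",") {
		kv := strings.SplitN(strings.TrimSpace(field), "=", 2)
		if len(kv) != 2 {
			return "", "", 0, fmt.Errorf("invalid field: %q", field)
		}
		switch kv[0] {
		case "scope":
			scope = kv[1]
		case "base":
			base = kv[1]
		case "size":
			if size, err = strconv.Atoi(kv[1]); err != nil {
				return "", "", 0, err
			}
		default:
			return "", "", 0, fmt.Errorf("unknown key: %q", kv[0])
		}
	}
	return scope, base, size, nil
}

func main() {
	fmt.Println(parsePoolFlag("scope=local, base=10.20.0.0/16, size=24"))
	// local 10.20.0.0/16 24 <nil>
}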

@aboch aboch changed the title Allow user to specify predefined address pools for local networks Allow user to specify default address pools for docker networks Dec 20, 2016
@aboch (Contributor, Author) commented Dec 20, 2016

Thanks @thaJeztah. I reworked the PR to allow the user to set both local and global scope default address pools via csv options. PTAL when you get a chance.

@xeor commented Dec 28, 2016

Will this most likely be in 1.14?

@estesp (Contributor) commented Jan 12, 2017

What is the relationship between this feature and the current --fixed-cidr[-v6] options for the daemon (which I believe relate more to "local" scope networks)? I understand the --default-address-pools option can be specified multiple times to add more than one range, and can also affect the global scope defaults (which I don't think is possible today with any other flag). But will this replace/deprecate, or just augment, the other existing flag(s)? Given that --fixed-cidr-v6 allows an IPv6 range to be specified, will there be support in --default-address-pools for IPv6?

@aboch (Contributor, Author) commented Jan 12, 2017

@estesp
None.

If you specify --fixed-cidr-v6, you are explicitly telling docker which IPv6 subnet to use for the default bridge network.

The default pools are used when the user does not specify a subnet for the network, and only for IPv4 subnets for now.

It does not deprecate any existing IPAM-related options. It adds control over the default pools, which today are hard-coded: {172.[17-31].0.0/16} and {192.168.[0-240].0/20} for local scope networks and {10.[0-255].[0-255].0/24} for global scope networks.
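For reference, those hard-coded sets expand roughly as follows (an illustrative sketch; libnetwork's ipamutils package builds equivalent lists):

package main

import "fmt"

// Illustrative expansion of the hard-coded defaults described above.
func main() {
	var local, global []string
	for i := 17; i <= 31; i++ { // 172.[17-31].0.0/16
		local = append(local, fmt.Sprintf("172.%d.0.0/16", i))
	}
	for i := 0; i <= 240; i += 16 { // 192.168.[0-240].0/20
		local = append(local, fmt.Sprintf("192.168.%d.0/20", i))
	}
	for i := 0; i <= 255; i++ { // 10.[0-255].[0-255].0/24
		for j := 0; j <= 255; j++ {
			global = append(global, fmt.Sprintf("10.%d.%d.0/24", i, j))
		}
	}
	fmt.Println(len(local), len(global)) // 31 65536
}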

@@ -653,6 +653,9 @@ func configureKernelSecuritySupport(config *Config, driverName string) error {
 }

 func (daemon *Daemon) initNetworkController(config *Config, activeSandboxes map[string]interface{}) (libnetwork.NetworkController, error) {
+	if err := processPoolsConfig(config); err != nil {
+		return nil, err
+	}
Review comment:

One can configure --default-address-pools and decide to change it on a daemon reload. We can allow that change only if all the networks created before have been deleted, so the libnetwork controller has to be instantiated. But currently ipamutils is an independent package.

How about letting the libnetwork controller take the --default-address-pools config, do the required validity checks, and then call ipamutils.InitAddressPools with the new config, or fail the controller creation if the validity checks fail?

Review reply (@aboch):

I do not think we will allow the pools to be modified on reload.
Currently only two networking configs are allowed to be changed on config reload: --cluster-store and --cluster-advertise.
Even if in future we decide to support default-pool re-configuration, I don't think we should remove the existing networks using the old pools. The user is configuring a default, which takes effect from that moment on; it is understandable that existing networks will still be using the pool assigned to them at the time of creation.

The reason the default pools are controlled in the ipamutils pkg itself is that they need to be available to both libnetwork and swarm.

We could let the libnetwork controller handle the default pool config and then program the ipamutils pkg, but then how can we guarantee that docker or swarm code waits for the network controller to be up before invoking ipamutils methods?
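One way to make that ordering concern explicit would be an init-once guard inside ipamutils itself, so whichever consumer touches the pools first after daemon boot sees a consistent state. A purely illustrative sketch (not libnetwork's actual code; beyond InitAddressPools, mentioned above, all names here are assumptions):

// Illustrative sketch of an init-once guard in an ipamutils-like package.
package ipamutils

import (
	"fmt"
	"sync"
)

var (
	mu          sync.Mutex
	pools       []string
	initialized bool
)

// InitAddressPools seeds the default pools exactly once, at daemon boot.
// A second call fails instead of silently reprogramming the defaults.
func InitAddressPools(p []string) error {
	mu.Lock()
	defer mu.Unlock()
	if initialized {
		return fmt.Errorf("default address pools already initialized")
	}
	pools = append([]string(nil), p...)
	initialized = true
	return nil
}

// DefaultPools returns a copy of the configured pools.
func DefaultPools() []string {
	mu.Lock()
	defer mu.Unlock()
	return append([]string(nil), pools...)
}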

Review reply:

> I do not think we will allow the pools to be modified on reload.

I was referring to the case where the daemon is stopped and restarted with a new range. For modification through the config file, I agree that we don't have to support it.

From an operator's point of view, the goal is to configure the pool to avoid conflicts with the underlay networks or the host subnets. So after configuring a range, the only reason to modify it is that some subnets in the range can no longer be used by docker, in which case the user should have deleted the old networks before making this config change. If he doesn't, it's the user's mistake. But unless it is absolutely impossible for us to detect that misconfiguration and error out, we should do it: allowing that erroneous config can lead to hard-to-debug issues.

For swarm mode we have to think about the best way to support this config. Since swarm init is done independently of the daemon configs, it is more natural to associate the global range config with swarm init rather than with the engine config. But it should be possible to do the validity checks in that case as well, because swarm mode will deal only with the global range and the engine only with the local range.

Review reply (@aboch):

Given that docker networks are persisted across reboots, I do not think the new default pools should be applied to existing networks, and I do not think there would be such an expectation.

The defaults take effect when the user does not specify the option at the time of resource creation. Once the resource is created, its allocated properties should stay for the life of the resource and not be redefined based on new defaults.

In the scenario you depicted, I would expect the user to remove any network which may conflict with an underlying network anyway. He can then reload with the new defaults and start creating networks again. On a related note, if the user's environment is tightly dependent on the underlying network, I would expect him to explicitly choose the subnet when creating networks.

Review comment:

@aboch I actually do have an expectation of networks changing after a daemon reload.
The real-life situation why I'm subscribed to this PR: I've run out of networks spinning up a project with Docker Compose through a CI pipeline.
I expect to change the daemon options, reload the daemon, and have my networks compacted and a bunch of address space freed. It would be a life saver in OH SHI~ moments.
I understand that it introduces a lot of complication, but I just wanted to point out that some people will have those expectations.

Review comment (Contributor):

We're constantly running out of networks during local development and CI.
We're calling docker network ls -q | xargs docker network rm every 2 hours in our runners.

@LK4D4 (Contributor) commented Jan 27, 2017

@aboch any updates here?

@mlaventure (Contributor)

ping @aboch, still working on this?

@tedwardd

Fix here: tedwardd@e144a29

@schmunk42 (Contributor)

Would be so great if this could get merged.

CC: @handcode

@shoenig commented Mar 12, 2017

+1000 Anything blocking this from being merged? Anything we can do to help?

@aboch (Contributor, Author) commented Nov 9, 2017

> @aboch: why have you closed the PR?

Oops, sorry! I was cleaning up my old branches and totally forgot about this old PR.
Thanks @thaJeztah for retrieving the diffs!

Although I was able to recover the diffs from the GH tree, I cannot reopen this PR, given that the original branch is lost. I'll have to open a new PR, with changes in line with the comments.

@sstolz commented Nov 17, 2017

Hi guys, I'm not sure I fully understand: is a new PR going to be created to solve this problem?

Thanks a lot!

@Vanuan commented Nov 17, 2017

It looks like there are no volunteers :)

@stephgosling

Just another person chiming in that this is a) an issue for them and b) in need of a fix.

@sstolz commented Nov 17, 2017

Yep, I confirm that choosing the subnet pattern for automatic network creation is a must-have, or a should-have-been-must-have :) Just joking, but it would be great if you guys could integrate this PR; it's very disappointing and destabilizing not to have the ability to choose the pattern (and so the size) of the subnets.

@ppooppoo

+1

@esben commented Nov 22, 2017

@aboch So, are you going to open a new PR?

@tboerger

Seriously, a discussion of nearly one year for such an important fix on the daemon side, without any result? Forcing people to set a static range for every docker network is really a poor option.

@zarbis commented Dec 19, 2017

@tboerger I've just moved to Swarm; it lifts pretty much all the limitations discussed here, with no drawbacks.

@obitech commented Dec 19, 2017

+1, would love to see this as a feature

@Vanuan commented Dec 19, 2017

@zarbis did you manage to use swarm for the CI use case too? E.g. running tests, etc. It looks like swarm is cumbersome for that use case.

@zarbis commented Dec 19, 2017

@Vanuan it's just a matter of replacing docker-compose up with docker stack deploy and docker service update. I don't see added complexity here, unless you can point to a specific case.

@Vanuan commented Dec 19, 2017

@zarbis The main difference I see is that there's no alternative to docker-compose run. I.e. you have to make sure it's not restarted, run some watch loop, and parse out the running state so that you know when the tests finish.

@selansen (Contributor) commented Jan 4, 2018

I started working on this. I will pick up @aboch's changes and make the required changes.
As @mavenugo has pointed out, I will try to close the local-scope case first and then take care of swarm scope.

@Vanuan commented Jan 17, 2018

@selansen Any update?

@selansen (Contributor) commented Jan 17, 2018 via email

@tboerger

Hopefully we can get this into a release soon... Running lots of docker compose setups for CI on a large machine currently pretty much sucks.

@Vanuan commented Jan 27, 2018

It looks like we should move the discussion to #36054.

@benyanke

@aboch - any chance you'd be able to edit your initial post to state that the PR has moved there? I'm sure I can't be the only one who just read through the whole thread looking for the final status, only to find it has actually moved.

@bluikko commented Apr 3, 2019

The built-in default address pools should be documented on docs.docker.com - I believe one reason why all the docs pages have been overwhelmingly ranked negative is the annoying lack of information like this. It is as if the docs pages were written by someone with no idea how to write technical documentation.

@trapier (Contributor) commented Apr 17, 2019

Thanks @bluikko for pointing out this gap in the documentation. Relayed that report to docker/docs#8663.
