libnet/ipams/default: introduce a linear allocator #47768
base: master
Conversation
force-pushed from aae4040 to 9c6196f
force-pushed from 9c6196f to 59e3d2a
LGTM!
```diff
 }
-	if p.Addr().Is4() {
-		v4 = append(v4, p)
+	if n.Base.Addr().Is4() {
```
This hasn't changed, but `Is4` is false for an IPv4-mapped IPv6 address. It might be worth checking for that and storing the unmapped prefix in the IPv4 list, or just bailing out?
(We don't deal with mapped addresses in command line options either, but it might be less obvious here; the address pool just won't do anything useful. I'm not quite sure why someone would want to write IPv4 addresses as IPv6, but there's an issue somewhere asking us to allow it.)
@corhere Please make sure this won't break Swarm in any way (e.g. when a new leader is elected, etc.).
force-pushed from 59e3d2a to 304e175
Review WIP; I haven't even finished with address_space.go. I'll be back tomorrow.
```go
var last *ipamutils.NetworkToSplit
var discarded int
for i, imax := 0, len(predefined); i < imax; i++ {
	p := predefined[i-discarded]
	if last != nil && last.Overlaps(p.Base) {
		predefined = slices.Delete(predefined, i-discarded, i-discarded+1)
		discarded++
		continue
	}
	last = p
}
```
Since the `slices` package is already being used, may as well take full advantage.
Suggested change:

```go
predefined = slices.CompactFunc(predefined, func(last, p *ipamutils.NetworkToSplit) bool {
	return last.Overlaps(p.Base)
})
```
`slices.CompactFunc` works a bit differently. It expects a strict equality: it doesn't compare to the last non-duplicate found, but to 'current-1'. If you have the following subnets:

- 10.0.0.0/8
- 10.0.0.0/16
- 10.10.0.0/16

it tries to compare s1 == s2, and then s2 == s3. That's not what we want.
```go
for i, allocated := range aSpace.allocated {
	if nw.Addr().Compare(allocated.Addr()) < 0 {
```
`aSpace.allocated` is a sorted slice, which means binary searching is possible. Turn that O(n) search into O(log n) time complexity!
```go
func (aSpace *addrSpace) allocatePool(nw netip.Prefix) error {
	n, _ := slices.BinarySearchFunc(aSpace.allocated, nw, func(allocated, nw netip.Prefix) int {
		return nw.Addr().Compare(allocated.Addr())
	})
	aSpace.allocated = slices.Insert(aSpace.allocated, n, nw)
	aSpace.subnets[nw] = newPoolData(nw)
	return nil
}
```
Also, are duplicate allocations allowed? It would be trivial to detect this situation and return an error instead of inserting the duplicate entry into the slice.
The new allocator should work fine with Swarm. The CNM network allocator replays allocations as static assignments if there is an existing allocation in the Swarm state.
moby/libnetwork/cnmallocator/networkallocator.go, lines 866 to 873 in 4554d87:

```go
// If there is non-nil IPAM state always prefer those subnet
// configs over Spec configs.
if n.IPAM != nil {
	ipamConfigs = n.IPAM.Configs
} else if n.Spec.IPAM != nil {
	ipamConfigs = make([]*api.IPAMConfig, len(n.Spec.IPAM.Configs))
	copy(ipamConfigs, n.Spec.IPAM.Configs)
}
```
force-pushed from 304e175 to 8f4adfe
The previous allocator was subnetting address pools eagerly when the daemon started, and would then just iterate over that list whenever RequestPool was called. This was leading to high memory usage whenever IPv6 pools were configured with a target subnet size too different from the pool's prefix size. For instance: pool = fd00::/8, target size = /64 -- 2^(64-8) subnets would be generated upfront. This would take approx. 9 * 10^18 bits -- way too much for any computer in 2024.

Another noteworthy issue: the previous implementation was allocating a subnet, and then in another layer was checking whether the allocation conflicted with some 'reserved networks'. If so, the allocation would be retried, etc. To make it worse, 'reserved networks' would be recomputed on every iteration. This is totally inefficient, as there could be 'reserved networks' that fully overlap a given address pool (or many!).

To fix this issue, a new field `Exclude` is added to `RequestPool`. It's up to each driver to take it into account. Since we don't know whether this retry loop is useful for some remote IPAM driver, it's reimplemented bug-for-bug directly in the remote driver.

The new allocator uses a linear-search algorithm. It takes advantage of all lists (predefined pools, allocated subnets and reserved networks) being sorted, and logically combines 'allocated' and 'reserved' through a 'double cursor' to iterate over both lists at the same time while preserving the total order. At the same time, it iterates over 'predefined' pools and looks for the first empty space that would be a good fit.

Currently, the size of the allocated subnet is still dictated by each 'predefined' pool. We should consider hardcoding that size instead, and let users specify what subnet size they want. This wasn't possible before as the subnets were generated upfront. This new allocator should be able to deal with this easily.

The method used for static allocation has been updated to make sure the ascending order of 'allocated' is preserved. It's bug-for-bug compatible with the previous implementation.

One consequence of this new algorithm is that we don't keep track of where the last allocation happened; we just allocate the first free subnet we find. Before: allocate 10.0.1.0/24 and 10.0.2.0/24, deallocate 10.0.1.0/24, and a third allocation would yield 10.0.3.0/24. Now, the third allocation would yield 10.0.1.0/24 once again. As it doesn't change the semantics of the allocator, there's no reason to worry about that.

Finally, about 'reserved networks': the heuristics we use are now properly documented. It was discovered that we don't check routes for IPv6 allocations -- this can't be changed because there's no such thing as on-link routes for IPv6.

(Kudos to Rob Murray for coming up with the linear-search idea.)

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This normalization process does two things:

- Unmap IPv4-mapped IPv6 addrs. This ensures such address pools are part of the IPv4 address space.
- Mask the host ID. This was done by newAddrSpace, but `splitByIPFamily` is already validating / normalizing the address pools.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Nothing was validating whether address pools' `base` prefixes were larger than the target subnet `size` they're associated with. Since such invalid address pools would yield no subnets, the error could go unnoticed.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
force-pushed from 5cfd940 to b2fb88d
- What I did
The previous allocator was subnetting address pools eagerly when the daemon started, and would then just iterate over that list whenever RequestPool was called. This was leading to high memory usage whenever IPv6 pools were configured with a target subnet size too different from the pool's prefix size.
For instance: pool = fd00::/8, target size = /64 -- 2^(64-8) subnets would be generated upfront. This would take approx. 9 * 10^18 bits -- way too much for any computer in 2024.
Another noteworthy issue: the previous implementation was allocating a subnet, and then in another layer was checking whether the allocation conflicted with some 'reserved networks'. If so, the allocation would be retried, etc. To make it worse, 'reserved networks' would be recomputed on every iteration. This is totally inefficient, as there could be 'reserved networks' that fully overlap a given address pool (or many!).
To fix this issue, a new field `Exclude` is added to `RequestPool`. It's up to each driver to take it into account. Since we don't know whether this retry loop is useful for some remote IPAM driver, it's reimplemented bug-for-bug directly in the remote driver.

The new allocator uses a linear-search algorithm. It takes advantage of all lists (predefined pools, allocated subnets and reserved networks) being sorted, and logically combines 'allocated' and 'reserved' through a 'double cursor' to iterate over both lists at the same time while preserving the total order. At the same time, it iterates over 'predefined' pools and looks for the first empty space that would be a good fit.
Currently, the size of the allocated subnet is still dictated by each 'predefined' pool. We should consider hardcoding that size instead, and let users specify what subnet size they want. This wasn't possible before as the subnets were generated upfront. This new allocator should be able to deal with this easily.
The method used for static allocation has been updated to make sure the ascending order of 'allocated' is preserved. It's bug-for-bug compatible with the previous implementation.
One consequence of this new algorithm is that we don't keep track of where the last allocation happened; we just allocate the first free subnet we find.

Before:

- Allocate 10.0.1.0/24 and 10.0.2.0/24; deallocate 10.0.1.0/24; a third allocation would yield 10.0.3.0/24.

Now, the 3rd allocation would yield 10.0.1.0/24 once again.
As it doesn't change the semantics of the allocator, there's no reason to worry about that.
Finally, about 'reserved networks': the heuristics we use are now properly documented. It was discovered that we don't check routes for IPv6 allocations -- this can't be changed because there's no such thing as on-link routes for IPv6.
(Kudos to Rob Murray for coming up with the linear-search idea.)
- How to verify it
CI -- a bunch of tests have been added, some have been rewritten.
Or manually by creating, deleting and re-creating networks.
- Description for the changelog
- Introduce a new subnet allocator that can deal with IPv6 address pools of any size
- A picture of a cute animal (not mandatory but encouraged)