Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable grid subnet, supernet #1323

Merged
merged 9 commits into from Feb 28, 2017
Merged

Configurable grid subnet, supernet #1323

merged 9 commits into from Feb 28, 2017

Conversation

SpComb
Copy link
Contributor

@SpComb SpComb commented Nov 15, 2016

Fixes #1304

Server

  • Add Grid subnet, supernet to model, JSON views

  • Add HostNode overlay_ip to models, JSON views

    Uses IPAddr to compute the offset, allowing the use of node_number > 255 once any existing 10.81.1.X service containers are migrated over to the new 10.81.128.X range.

  • Support optional POST /grids {"subnet": "...", "supernet": "..."}

  • Migrate existing girds to use the current default subnet/supernet

CLI

  • Render grid subnet, host node overlay_ip from server JSON
  • Render kontena grid cloud-config --insecure-registry= from the grid subnet
  • Support kontena grid create --subnet= --supernet=

Agent

  • Add IPAddr helpers

  • Use node_info['grid']['subnet'] and node_info['overlay_ip'] for etcd, weave launchers

  • Use node_info['grid']['supernet'] for the IPAM launcher

  • Use node_info['grid]['subnet'] for the default kontena network

    Automatically compute the iprange from the upper half of the grid subnet.

@SpComb
Copy link
Contributor Author

SpComb commented Nov 21, 2016

Rebased for 0.17 git master

@jakolehm jakolehm modified the milestones: 1.0.0, 0.17.0, 1.1.0 Nov 24, 2016
@SpComb
Copy link
Contributor Author

SpComb commented Nov 30, 2016

Rebased for 1.0 git master.

TODO on fixing up the WeaveWorker#attach_overlay 0.16 container migration logic, which is currently hardcoded to assume the prior 10.81.0.0/19: #1336 (comment)

Yes, this should ideally check that the container.overlay_addr.network from the container label overlaps with the expected grid subnet, such that the existing container.overlay_addr is within the new grid subnet.

if container.overlay_cidr.network != grid_subnet && grid_subnet.include?(container.overlay_ip)

For 0.16 -> 0.17 it can be implicitly assumed that both of those are within the 10.81.0.0/#{OVERLAY_SUFFIX}. With #1323 the WeaveWorker will need access to the node_info['grid']['subnet'], and better IPAddr logic for this comparison.

This could be used to allow some limited form of grid subnet changes in the future.

We can't attach containers until we have the grid configuration, which changes the WeaveWorker` startup ordering again.

@SpComb SpComb self-assigned this Dec 7, 2016
@SpComb
Copy link
Contributor Author

SpComb commented Dec 7, 2016

Assigning to me as I still need to work on the configurable subnet migration.

@jakolehm
Copy link
Contributor

Is this progressing?

@SpComb
Copy link
Contributor Author

SpComb commented Dec 28, 2016

Is this progressing?

This is blocked on the hardcoded /16 subnet mask used for <1.0 container network migration in the WeaveWorker. Each attach event needs access to the grid/node info from the server, which only exists in the Celluloid notifications atm.

@jakolehm
Copy link
Contributor

But this can be only configured when grid is created, right? I can't see how this affects <1.0 migration?

@SpComb
Copy link
Contributor Author

SpComb commented Dec 28, 2016

The WeaveWorker needs to know if the io.kontena.container.overlay_cidr label on the container contains an old pre-migration subnet mask. If it does, then it rewrites it to the desired grid overlay subnet mask. For that to happen, it needs to know what the current actual grid overlay subnet mask is, and you can't update that into the pre-existing container labels.

For new service containers being deployed, it's indeed not an issue; the server assigns them an overlay_cidr and the agent just sets and uses that label.

@SpComb
Copy link
Contributor Author

SpComb commented Jan 12, 2017

Rebased to current git master.

Travis is failing on the flaky server specs.

@jakolehm jakolehm modified the milestones: 1.2.0, 1.1.0 Jan 23, 2017
@SpComb
Copy link
Contributor Author

SpComb commented Jan 23, 2017

The CLI/server parts of this are pretty straightforward, but the agent bits are pending on a major refactoring to clean up the agent:node_info notifications and weave actors.

Copy link
Contributor

@jnummelin jnummelin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments on code, overall looks good.

Documentation should be updated, especially regarding the supernet/subnet relation. That might not be really intuitive for all the Kontena users.

Some quick testing locally:

$ k grid create --supernet 10.90.0.0/12 --subnet 10.90.0.0/17 subnet-test

# ip addr show dev weave
33: weave: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UP qlen 1000
    link/ether de:58:13:a5:1e:ce brd ff:ff:ff:ff:ff:ff
    inet 10.90.0.1/17 scope global weave
       valid_lft forever preferred_lft forever
    inet6 fe80::dc58:13ff:fea5:1ece/64 scope link 
       valid_lft forever preferred_lft forever

$ k service create web nginx && k service deploy web
$ k service show web | grep ip:
      ip: 10.90.64.65

Everything works as expected.

# override label for existing containers that may need to be migrated
if container.overlay_network.nil?
# overlay network migration for 0.16 compat
# override overlay network /12 -> /16 suffix for existing containers that may need to be migrated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/12 network? Do you mean the old (< 1.0) /19 subnet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. I thought I rgrep'd for all of these; I confused the grid supernet with the old subnet :)

'grid' => {
'initial_size' => 3
'initial_size' => 3,
'subnet' => '10.81.0.0/16',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node_info is repeating quite often in the test cases, maybe do a let(:node_info)... over it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored specs to use context node_info

@@ -0,0 +1,48 @@
class IPAddr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're doing IPAddr patching in many places now, maybe it's time to roll these as a separate gem? (not maybe part of this PR, but overall)

Copy link
Contributor Author

@SpComb SpComb Jan 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the stdlib IPAddr is really inadequate. I figure these should be replaced with the ipaddress gem.

The worst part is the crazy-looking IPAddr operations on the server, like

(IPAddr.new(self.grid.subnet) | self.node_number).to_s

This would be make somewhat more sense with the agent IPAddr patching (IPAddr.new(self.grid.subnet)[self.node_number]), but the way the kontena repo is structured, there's no easy way to share code between across all three sub-components...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I fix all the server and agent stuff to start using ipaddress in this PR, or later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do it later as separate exercise.

@SpComb
Copy link
Contributor Author

SpComb commented Jan 26, 2017

The plugins will also need to fixed to use the grid['subnet'] for the docker --insecure-registry=

@SpComb
Copy link
Contributor Author

SpComb commented Jan 26, 2017

Documentation should be updated, especially regarding the supernet/subnet relation. That might not be really intuitive for all the Kontena users.

Added Kontena Network Model docs. Not sure what other docs should mention this? It's a pretty niche thing... until we have to start dealing with platforms/providers using 10.81.0.0/16 (like Packet, see #1304).

@SpComb
Copy link
Contributor Author

SpComb commented Jan 26, 2017

@kke IMO this should be mostly okay for 1.1, if we want to. Need to fix the plugins to support this after merging, though.

@SpComb SpComb removed their assignment Jan 26, 2017
Both of these private RFC1918 IP address spaces are used for internal overlay networking within each grid.
These overlay network addresses are only unique within a grid; different grids can (and will) use the same overlay network IP addresses.

The grid subnet (`10.81.0.0/16`) is used to provide overlay networking addresses for both host nodes and service contaienrs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo

@jnummelin
Copy link
Contributor

jnummelin commented Jan 26, 2017

I think the grid command reference should be updated to reflect this in /docs/using-kontena/grids.md

@SpComb
Copy link
Contributor Author

SpComb commented Jan 27, 2017

I think the grid command reference should be updated to reflect this in /docs/using-kontena/grids.md

Fixed several other things at the same time, see #1727

SpComb pushed a commit that referenced this pull request Feb 1, 2017
kke pushed a commit that referenced this pull request Feb 2, 2017
* docs grids: document grid options, initial nodes

* edit, fix links

* remove references to #1323
@SpComb
Copy link
Contributor Author

SpComb commented Feb 21, 2017

Rebased for latest 1.1.x master, pruned the /16 validation stuff, and re-added the grid subnet docs left out of #1727

@SpComb SpComb merged commit e38e1ec into master Feb 28, 2017
@SpComb SpComb deleted the grid-subnet-supernet branch February 28, 2017 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants