Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

microcloud init does not perform sanity checks on the uplink network input #210

Closed
IsaacJT opened this issue Nov 10, 2023 · 4 comments
Closed
Assignees
Labels
Feature New feature, not a bug

Comments

@IsaacJT
Copy link

IsaacJT commented Nov 10, 2023

microcloud init does not perform many sanity checks on the inputs for configuring the OVN uplink network. I can identify two cases where the user can easily make a mistake while entering the network configuration, which causes the init to fail at the last stage, requiring that the user starts again.

Case 1: first and last addresses are outside the gateway's subnet

$ sudo microcloud init
Waiting for LXD to start...
Using address "10.42.0.49" for MicroCloud
Limit search for other MicroCloud servers to 10.42.0.49/24? (yes/no) [default=yes]:
Scanning for eligible servers ...

 Selected "orangepi-1" at "10.42.0.198"
 Selected "orangepi-2" at "10.42.0.49"
 Selected "orangepi-3" at "10.42.0.217"

Would you like to set up local storage? (yes/no) [default=yes]: no
Would you like to set up distributed storage? (yes/no) [default=yes]:
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "orangepi-2" for remote storage pool
 Using 1 disk(s) on "orangepi-3" for remote storage pool
 Using 1 disk(s) on "orangepi-1" for remote storage pool

Configure distributed networking? (yes/no) [default=yes]:
Select exactly one network interface from each cluster member:

 Using "enP4p65s0" on "orangepi-2" for OVN uplink
 Using "enP4p65s0" on "orangepi-3" for OVN uplink
 Using "enP4p65s0" on "orangepi-1" for OVN uplink

Specify the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 10.1.1.1/24
Specify the first IPv4 address in the range to use with LXD: 10.2.2.50
Specify the last IPv4 address in the range to use with LXD: 10.2.2.100
Specify the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6):
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
Awaiting cluster formation ...
 Peer "orangepi-3" has joined the cluster
 Peer "orangepi-1" has joined the cluster
Configuring cluster-wide devices ...
Error: Failed allocating uplink port IPs on network "UPLINK": Failed to parse uplink IPv4 OVN ranges: IP range "10.2.2.50-10.2.2.100" does not fall within any of the allowed networks [10.1.1.0/24]

The first and last addresses should maybe be validated to ensure that they are within the gateway's subnet.

Case 2: uplink network is in the same subnet as the management network (here the management network has the subnet 10.0.0.0/8 and the uplink network is configured to 10.254.254.0/24)

$ sudo microcloud init
Waiting for LXD to start...
Using address "10.42.0.49" for MicroCloud
Limit search for other MicroCloud servers to 10.42.0.49/8? (yes/no) [default=yes]:
Scanning for eligible servers ...

 Selected "orangepi-2" at "10.42.0.49"
 Selected "orangepi-3" at "10.42.0.217"
 Selected "orangepi-1" at "10.42.0.198"

Would you like to set up local storage? (yes/no) [default=yes]: no
Would you like to set up distributed storage? (yes/no) [default=yes]:
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "orangepi-2" for remote storage pool
 Using 1 disk(s) on "orangepi-3" for remote storage pool
 Using 1 disk(s) on "orangepi-1" for remote storage pool

Configure distributed networking? (yes/no) [default=yes]:
Select exactly one network interface from each cluster member:

 Using "enP4p65s0" on "orangepi-2" for OVN uplink
 Using "enP4p65s0" on "orangepi-1" for OVN uplink
 Using "enP4p65s0" on "orangepi-3" for OVN uplink

Specify the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 10.254.254.1
Invalid input: invalid CIDR address: 10.254.254.1

Specify the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 10.254.254.1/24
Specify the first IPv4 address in the range to use with LXD: 10.254.254.10
Specify the last IPv4 address in the range to use with LXD: 10.254.254.254
Specify the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6):
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
Awaiting cluster formation ...
 Peer "orangepi-1" has joined the cluster
 Peer "orangepi-3" has joined the cluster
Configuring cluster-wide devices ...
Error: Failed generating auto config: Failed to automatically find an unused IPv4 subnet, manual configuration required
@tomponline
Copy link
Member

The first and last addresses should maybe be validated to ensure that they are within the gateway's subnet.

I agree we should do this.

@tomponline tomponline added the Feature New feature, not a bug label Nov 20, 2023
@tomponline
Copy link
Member

Case 2: uplink network is in the same subnet as the management network (here the management network has the subnet 10.0.0.0/8 and the uplink network is configured to 10.254.254.0/24)

@masnax yes lets do the same pre-check that LXD does in microcloud to allow earlier detection & correction, and then microcloud can provide the picked subnet directly to LXD.

@masnax masnax assigned MggMuggins and unassigned masnax Mar 21, 2024
@masnax
Copy link
Contributor

masnax commented Mar 21, 2024

@MggMuggins So for this one, I think the best place to put a validation check is at the end of the config collecting process just before we set up the cluster, so that the validation is consistent for both interactive and preseed initialization.

How this works in microcloud is that we carry a map[string]InitSystem throughout the initialization process, filling it with config for each potential node in the cluster as we go through interactive init, or all at once for preseed. Then we call setupCluster to actually initialize the daemons and set up clustering.

You can probably add a validateConfig or validateSystems function just before we call setupCluster in the 2 places we call it. For now, we can just include a check here to make sure that the selected IP ranges fit within the selected gateway. The network config is stored in InitSystem.Networks, populated with the return value of DefaultOVNNetwork.

Since we only apply the config keys when first setting up the cluster, you would need to check if we are bootstrapping (should be the case if the local node is included in the InitSystem map. We usually check that like this). There may be other network structs in the InitSystem.Networks slice, so you can check if the Type is physical and the Name is UPLINK to determine if you need to validate anything.

The actual validation should just be like this for now, we can grow the validation steps as needed in the future:

  • If network.Config[ipv4.gateway] is present, network.Config[ipv4.ovn.ranges] needs to either not exist or fit within the subnet specified by ipv4.gateway. LXD uses this helper to make the same validation so you can make it exportable and move it to the shared package so we can call it from microcloud.

  • If either network.Config[ipv4.gateway] or network.Config[ipv6.gateway] is present, it must not collide with any addresses of any system in the InitSystem map. Their addresses are stored at InitSystem.ServerInfo.Address, except for the local node, which is s.Address.

MggMuggins added a commit to MggMuggins/lxd that referenced this issue Mar 25, 2024
See canonical/microcloud#210; this is essentially a constructor so it
would be handy to be able to use it in microcloud.

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/lxd that referenced this issue Mar 25, 2024
See canonical/microcloud#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/lxd that referenced this issue Mar 25, 2024
See canonical/microcloud#210; this is essentially a constructor so it
would be handy to be able to use it in microcloud.

Also moves/renames `IPRangeOverlap` to a method.

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/lxd that referenced this issue Mar 25, 2024
See canonical/microcloud#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/lxd that referenced this issue Mar 25, 2024
See canonical/microcloud#210; this is essentially a constructor so it
would be handy to be able to use it in microcloud.

Also moves/renames `IPRangeOverlap` to a method `Overlaps`.

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/lxd that referenced this issue Mar 25, 2024
See canonical/microcloud#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
tomponline pushed a commit to tomponline/lxd that referenced this issue Mar 27, 2024
See canonical/microcloud#210; this is essentially a constructor so it
would be handy to be able to use it in microcloud.

Also moves/renames `IPRangeOverlap` to a method `Overlaps`.

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
tomponline pushed a commit to tomponline/lxd that referenced this issue Mar 27, 2024
See canonical/microcloud#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
tomponline pushed a commit to tomponline/lxd that referenced this issue Mar 28, 2024
See canonical/microcloud#210; this is essentially a constructor so it
would be handy to be able to use it in microcloud.

Also moves/renames `IPRangeOverlap` to a method `Overlaps`.

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
tomponline pushed a commit to tomponline/lxd that referenced this issue Mar 28, 2024
See canonical/microcloud#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 3, 2024
…ster

Fixes canonical#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 3, 2024
…ster

Fixes canonical#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 3, 2024
…ster

Fixes canonical#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 4, 2024
…ster

Fixes canonical#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 11, 2024
…ster

Fixes canonical#210

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 12, 2024
…ster

Fixes canonical#210

This catches two situations:

1. Invalid OVN ranges in UPLINK networks (range outside gateway subnet, etc)
2. UPLINK gateway subnets that contain any system's management address;
   these should be on separate interfaces and shouldn't conflict

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
@tomponline
Copy link
Member

tomponline commented Apr 17, 2024

Case 1: first and last addresses are outside the gateway's subnet

This should certainly be checked for (in both MicroCloud and LXD and if not already).

Case 2: uplink network is in the same subnet as the management network (here the management network has the subnet 10.0.0.0/8 and the uplink network is configured to 10.254.254.0/24)

This configuration is valid. It should be allowed to use the same or different subnets for the microcloud and ovn networks (after all the OVN uplink network interface may be connected to the same or different external network as the microcloud network interfac).

We should check that none of the OVN external IPv4 ranges cover the uplink gateway address, external DNS addresses or any of the microcloud member IPs (that's asking for trouble).

But the problem the OP is experiencing in case 2:

Error: Failed generating auto config: Failed to automatically find an unused IPv4 subnet, manual configuration required

is unrelated to the differing subnets and more likely because their host has a the entire 10.0.0.0 - 10.255.255.255 range in use on the microcloud interface (we can tell this because of the question Limit search for other MicroCloud servers to 10.42.0.49/8 the /8 is key here).

LXD is using this function to find a free OVN subnet:

https://github.com/canonical/lxd/blob/c8bc5de197b0cf7ea9843fda6ecb80cda10b4f57/lxd/network/network_utils.go#L598-L615

And this is considering any subnet in the host's routing table as unavailable. And thus it cannot find a free /24 in the 10/8 prefix.

So I would suggest creating a separate issue over at LXD's repo for the case 2 and then case 1 (and the other suggestions here) can be implemented in microcloud.

MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 18, 2024
…ster

Fixes canonical#210

1. Invalid OVN ranges in UPLINK networks (range outside gateway subnet, etc)
2. UPLINK OVN ranges that contain any system's management address;
3. Invalid dns nameserver IPs

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
MggMuggins added a commit to MggMuggins/microcloud that referenced this issue Apr 18, 2024
…ster

Fixes canonical#210

1. Invalid OVN ranges in UPLINK networks (range outside gateway subnet, etc)
2. UPLINK OVN ranges that contain any system's management address;
3. Invalid dns nameserver IPs

Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature New feature, not a bug
Projects
None yet
Development

No branches or pull requests

4 participants