
Add subscription feature MTU3900 #1672

Merged: 3 commits merged into Azure:master from mtu3900 on Aug 19, 2021

Conversation

@mbarnes (Contributor) commented Aug 12, 2021

Which issue this PR addresses:

This implements 10080838 Support for larger MTUs at cluster install time.

What this PR does / why we need it:

See Matt Woodson's summary in the ADO feature. (It's customer-specific so not sure I should paste details here.)

Test plan for issue:

So far only manual testing until the subscription feature flag exists.

Is there any documentation that needs to be updated for this PR?

Microsoft will be delivering documentation about the larger MTU in the October/November timeframe, but presumably it will not mention Azure Red Hat OpenShift. We're waiting to see what Microsoft delivers, but we'll likely need to add a mention under the Azure Red Hat OpenShift documentation.

@ghost commented Aug 12, 2021

CLA assistant check
All CLA requirements met.

Review thread on pkg/cluster/overridemtu.go (outdated, resolved)
@mbarnes (Contributor, Author) commented Aug 16, 2021

Maybe to clarify, here's my thinking on why I put this in ARO-RP. I could be wrong in my assumptions.

As best as I can tell, the installer code does not currently query Azure subscription flags. Replicating that functionality in the installer fork just for this feature seems inappropriate and unlikely to be accepted by upstream.

We could have also split the code across ARO-RP and the installer fork, with the subscription flag check in the RP and the "business logic" in the installer. But it was not clear to me how to add this logic to the installer in an API-compatible way aside from just bolting on public function(s) somewhere and calling them directly from the RP.

I imagine such an approach is also unlikely to be accepted by upstream since they'd be stuck maintaining what is, from their perspective, dead code. That would leave us with a permanent carry-patch in the installer fork, and I'm loath to make the installer rebase work any more difficult than it already is.

I understand this code is tightly coupled to the installer, but for the reasons above I thought the RP would be the best place.

Review threads on pkg/cluster/overridemtu.go (outdated, resolved)
@github-actions (bot) commented:

Please rebase pull request.

@github-actions bot added the needs-rebase (branch needs a rebase) label on Aug 18, 2021
FeatureFlagMTU3900 is the feature in the subscription that causes new
OpenShift cluster nodes to use the largest available Maximum Transmission
Unit (MTU) on Azure virtual networks, which as of late 2021 is 3900 bytes.
Otherwise cluster nodes will use the DHCP-provided MTU of 1500 bytes.
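
For readers following along, here is a minimal sketch of what a subscription feature-flag gate of this kind can look like in Go. The constant value, helper names, and types below are illustrative only and are not the actual code in pkg/cluster/overridemtu.go:

```go
package cluster

// Placeholder name for the registered feature; the real constant lives in the RP.
const featureFlagMTU3900 = "FeatureFlagMTU3900"

// registeredFeature is a simplified stand-in for the subscription feature
// registration returned by ARM.
type registeredFeature struct {
	Name  string
	State string
}

// isFeatureRegistered reports whether the subscription has the flag registered.
func isFeatureRegistered(features []registeredFeature, flag string) bool {
	for _, f := range features {
		if f.Name == flag && f.State == "Registered" {
			return true
		}
	}
	return false
}

// nodeMTU picks the install-time node MTU: the largest Azure vnet MTU (3900)
// when the subscription carries the flag, otherwise the DHCP default (1500).
func nodeMTU(features []registeredFeature) int {
	if isFeatureRegistered(features, featureFlagMTU3900) {
		return 3900
	}
	return 1500
}
```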
@mbarnes (Contributor, Author) commented Aug 18, 2021

Force-pushed a replacement commit that heavily refactors the code to add a 99-master-mtu MachineConfig manifest to the bootstrap node, so it's consistent with 99-worker-mtu (per @jewzaam's request).

I faked the subscription feature flag to test, but aside from that it works:

# Verify an MTU MachineConfig exists for each MachineConfigPool
$ oc get machineconfigs | grep mtu
99-master-mtu                                                                                 3.2.0             43m
99-worker-mtu                                                                                 3.2.0             43m

# Verify all nodes have MTU 3900 on eth0
$ for object in $(oc get nodes -o name --no-headers); do echo "$object:"; oc debug -q $object -- ip link show eth0; done
node/mbarnes-mtutest-w5zt8-master-0:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3900 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:11:49:4b brd ff:ff:ff:ff:ff:ff
node/mbarnes-mtutest-w5zt8-master-1:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3900 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:11:44:57 brd ff:ff:ff:ff:ff:ff
node/mbarnes-mtutest-w5zt8-master-2:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3900 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:11:4a:f7 brd ff:ff:ff:ff:ff:ff
node/mbarnes-mtutest-w5zt8-worker-eastus1-v9879:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3900 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:12:d8:43 brd ff:ff:ff:ff:ff:ff
node/mbarnes-mtutest-w5zt8-worker-eastus2-9vn6z:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3900 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:11:4a:5e brd ff:ff:ff:ff:ff:ff
node/mbarnes-mtutest-w5zt8-worker-eastus3-d4sdn:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 3900 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:11:57:57 brd ff:ff:ff:ff:ff:ff

@mbarnes removed the hold label on Aug 18, 2021
@jewzaam self-requested a review on Aug 18, 2021
@jewzaam previously approved these changes on Aug 18, 2021
@mbarnes (Contributor, Author) commented Aug 18, 2021

I got so focused on setting MTU on the hosts at install time that I forgot to check if cluster-network-operator is picking up the increased host MTU when deriving a default MTU for SDN/OVN.

It's not. 😞 Further investigation required.

getDefaultMTU is the cluster-network-operator function I thought would detect the higher host MTU. I wonder if the operator first comes up on the bootstrap node? This PR doesn't alter MTU on the bootstrap node.

$ oc describe network.config cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         Network
Metadata:
  Creation Timestamp:  2020-05-05T22:42:09Z
  Generation:          2
  Resource Version:    1866
  Self Link:           /apis/config.openshift.io/v1/networks/cluster
  UID:                 0ca443c4-9a3e-4b45-a75f-4d360f8927eb
Spec:
  Cluster Network:
    Cidr:         10.128.0.0/14
    Host Prefix:  23
  External IP:
    Policy:
  Network Type:  OpenShiftSDN
  Service Network:
    172.30.0.0/16
Status:
  Cluster Network:
    Cidr:               10.128.0.0/14
    Host Prefix:        23
  Cluster Network MTU:  1450    <---- * WRONG *
  Network Type:         OpenShiftSDN
  Service Network:
    172.30.0.0/16
Events:  <none>

@mbarnes (Contributor, Author) commented Aug 18, 2021

New commit gets SDN/OVN configured properly.

$ oc describe network.config cluster
...
Status:
  Cluster Network:
    Cidr:               10.128.0.0/14
    Host Prefix:        23
  Cluster Network MTU:  3800    <---- * CORRECT! *
  Network Type:         OVNKubernetes
  Service Network:
    172.30.0.0/16
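
For context, the derived cluster network MTU is the host MTU minus the overlay encapsulation overhead, which also explains the earlier 1450 (1500 minus the 50-byte SDN/VXLAN reservation). A rough sketch of the arithmetic, using the usual OpenShift overhead values rather than anything taken from this PR:

```go
package main

import "fmt"

func main() {
	hostMTU := 3900
	// Typical overlay encapsulation overheads in OpenShift:
	// OpenShiftSDN (VXLAN) reserves 50 bytes, OVN-Kubernetes (Geneve) reserves 100.
	fmt.Println("SDN cluster network MTU:", hostMTU-50)  // 3850
	fmt.Println("OVN cluster network MTU:", hostMTU-100) // 3800, matching the status above
}
```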

@jewzaam (Contributor) left a comment

Want to hold merging this until #1492, which is very close.

@troy0820 added the priority-high, size-medium, and ready-for-review labels on Aug 19, 2021
@troy0820 added the LGTM (Looks Good To Me) label on Aug 19, 2021
@jewzaam removed the hold label on Aug 19, 2021
@jewzaam merged commit e831008 into Azure:master on Aug 19, 2021
@mbarnes deleted the mtu3900 branch on Aug 19, 2021
@mjudeikis (Contributor) commented:
> As best as I can tell, the installer code does not currently query Azure subscription flags. Replicating that functionality in the installer fork just for this feature seems inappropriate and unlikely to be accepted by upstream.

We have other ways to do this. I can't help thinking that all this PR did was introduce one more code pattern into our code.
And on top of this, the change was made in the wrong way: we resolve the graph and then append after resolution, so it can break the graph.

And last, the newly added pattern has no tests at all. Not even E2E or unit tests.

I think this was merged prematurely.

@m1kola (Contributor) commented Aug 20, 2021

> As best as I can tell, the installer code does not currently query Azure subscription flags. Replicating that functionality in the installer fork just for this feature seems inappropriate and unlikely to be accepted by upstream.

  1. As far as I know, the installer doesn't validate whether Microsoft.Compute and other required RPs are enabled in a subscription, but it does allow you to install the cluster. So I would say that is not a problem.

  2. We have a way to gate ARO-specific features in the installer upstream, so we can already move this code upstream. And if/when MTU configuration becomes generally available without a flag, the condition which gates the MTU code can be removed and all Azure users can start benefiting from it.

> We could have also split the code across ARO-RP and the installer fork, with the subscription flag check in the RP and the "business logic" in the installer. But it was not clear to me how to add this logic to the installer in an API-compatible way aside from just bolting on public function(s) somewhere and calling them directly from the RP.

We already introduce custom machine configs in our installer fork. Take a look at the commits which introduce custom DNS as a reference. BTW, as far as I know OCP is going to work on supporting custom DNS. I imagine that having (more or less) properly implemented installer features in our fork is going to be a huge help (as would an enhancement proposal in this case).

Also, it is OK to ask for help if something is not clear. I can help with this (or at least point you in the right direction). I think Amber, MJ, and Ben can help, and probably some other people too.

> I imagine such an approach is also unlikely to be accepted by upstream since they'd be stuck maintaining what is, from their perspective, dead code.

Upstream is OK with having ARO- and OKD-specific code gated: we already have aro build tag support in the installer, and OKD has a similar build tag. See 1 (introduces the aro build tag) and 2. There is room for improvement (for example, we should run at least unit tests for gated code, etc.).

BTW, the second example, while ARO-specific, was still useful to the upstream: it was one of the requirements for Azure Stack Hub support.
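
For readers unfamiliar with the pattern, build-tag gating looks roughly like the sketch below; the package, file contents, and function are illustrative and not taken from the installer fork:

```go
//go:build aro
// +build aro

package azure

// largeMTURequested is a hypothetical ARO-gated helper. This file is only
// compiled when the installer is built with "-tags aro", so upstream builds
// never carry it and are not left maintaining dead code.
func largeMTURequested() bool {
	return true
}
```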

> That would leave us with a permanent carry-patch in the installer fork, and I'm loath to make the installer rebase work any more difficult than it already is.

Even if we end up with a permanent carry-patch (I doubt it), I think it will be better than doing a hack in the RP codebase which is inconsistent with what we already do (for DNS, etc.). I imagine that the installer patch will create a new asset (installer terminology), so it will be easy to rebase. I think it will be very similar to the DNS patch, but simpler/smaller.
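
For reference, a new installer asset would implement the upstream asset interface along these lines. This is a sketch only: the interface shape mirrors the upstream installer's pkg/asset, but the Parents stand-in and the MTUMachineConfigs asset are hypothetical:

```go
package manifests

// Parents is a simplified stand-in for the upstream asset.Parents type.
type Parents map[string][]byte

// Asset approximates the interface the upstream installer defines in
// pkg/asset; a carry-patch would add one more implementation of it.
type Asset interface {
	Dependencies() []Asset
	Generate(parents Parents) error
	Name() string
}

// MTUMachineConfigs is a hypothetical asset that would render the
// 99-master-mtu / 99-worker-mtu manifests inside the installer fork,
// instead of appending them from the RP after graph resolution.
type MTUMachineConfigs struct {
	FileData []byte
}

func (*MTUMachineConfigs) Name() string          { return "MTU MachineConfigs" }
func (*MTUMachineConfigs) Dependencies() []Asset { return nil }

func (a *MTUMachineConfigs) Generate(parents Parents) error {
	// Rendering of the actual MachineConfig YAML (setting the interface MTU
	// to 3900) is omitted from this sketch.
	a.FileData = []byte("# 99-master-mtu / 99-worker-mtu manifests would be rendered here\n")
	return nil
}
```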

And as mentioned above, we have at least two cases where ARO-specific carry patches were useful to the upstream in implementing new features.

Labels: LGTM, next-up, priority-high, ready-for-review, size-medium