Conversation

@JoelSpeed
Contributor

This PR adds the API to allow users to choose between standard network interfaces and AWS Elastic Fabric Adapter interfaces for the EC2 instance created by the Machine.

This is related to https://issues.redhat.com/browse/RFE-2236

More details on EFA can be found at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa.html

According to the AWS API reference docs, we should just need to set the interface type on the EC2 network interface request to either interface or efa. For now, I've kept the PR's two constants matching the EC2 types, though if we think the names could be improved, I'm open to suggestions.
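
For illustration, a minimal sketch of what the type and constants might look like, assuming the constant values mirror the EC2 types (names here are illustrative, not necessarily the merged API):

// AWSNetworkInterfaceType describes the type of network interface attached to the instance.
type AWSNetworkInterfaceType string

const (
	// NetworkInterfaceTypeInterface is a standard EC2 network interface ("interface").
	NetworkInterfaceTypeInterface AWSNetworkInterfaceType = "interface"
	// NetworkInterfaceTypeEFA is an Elastic Fabric Adapter ("efa") interface.
	NetworkInterfaceTypeEFA AWSNetworkInterfaceType = "efa"
)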

// it should use the default of its subnet.
// +optional
PublicIP *bool `json:"publicIp,omitempty"`
// NetworkInterfaceType specifies the type of network interface to be used for the primary
Contributor

Do we make use of secondary network interfaces? I don't mind only controlling this one, but I'd like to know.

Contributor Author

We only support a single network interface within Machine API across all of the core cloud providers. I'm not sure why this was done, as it predates me joining RH.

@JoelSpeed JoelSpeed force-pushed the aws-network-interface branch from cbad8a9 to bbf96a7 Compare November 23, 2021 14:49
@deads2k
Contributor

deads2k commented Nov 23, 2021

I don't see an enhancement linked here. I think there are some questions to answer before we grow our API surface.

  1. Does the regularly consuming team know this is coming? I think this is mostly the machine-api team. elmiko ACK - [OCPCLOUD-1354] Add Network Interface Type to AWS Machine Provider Spec #1065 (review)
  2. Does this work for masters? My understanding is that masters are created by the installer, not the machine-api. If this is the case, then the installer needs to review this API and determine how they will implement it. We need an installer ACK before this can merge.
  3. From the networking team, I'd like to understand whether this new mode causes any issues with debugging or constrains their future choices.
  4. Does the QE team know that this is adding to the list of configurations to test?

These would all generally be answered in an enhancement. If the installer says it doesn't impact them and the networking team says it doesn't impact them, then I'm OK to proceed without one, provided the machine-api team says it's easy and promises to write an automated test for it that will block payload promotion. We write enhancements to make this coordination easier.

While I can appreciate that some changes are more urgent than others, if we urgently deliver half a solution, or a solution that does not formalize a means of ensuring it continues to work over time, our future selves will not be pleased.

/hold

holding to get the ack/nack list from above worked out.

oh, and the API itself looks pretty reasonable.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 23, 2021
@JoelSpeed
Contributor Author

For reference, this is where/how this will be used within Machine API: openshift/machine-api-provider-aws#8

We will also add webhook validation for the values of the enum once these first two PRs are merged
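
As a rough sketch of what that enum check could look like (hypothetical function name, reusing the constant names sketched above; the real validation will live in the machine-api webhooks):

import "fmt"

// validateNetworkInterfaceType sketches the planned enum validation.
func validateNetworkInterfaceType(t AWSNetworkInterfaceType) error {
	switch t {
	case "", NetworkInterfaceTypeInterface, NetworkInterfaceTypeEFA:
		// Empty means no opinion; the platform default is used.
		return nil
	default:
		return fmt.Errorf("invalid networkInterfaceType %q, valid values are %q and %q",
			t, NetworkInterfaceTypeInterface, NetworkInterfaceTypeEFA)
	}
}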

Contributor

@elmiko elmiko left a comment

+1, generally looks good to me

@JoelSpeed
Contributor Author

Thanks for the feedback, David. Some notes in response to the questions:

Does the regularly consuming team know this is coming? I think this is mostly the machine-api team.

This will primarily be a Machine API feature, and I and others from the team have already reviewed it.

Does this work for masters? My understanding is that masters are created by the installer, not the machine-api. If this is the case, then the installer needs to review this API and determine how they will implement it. We need an installer ACK before this can merge.

I have tested the feature and have found no technical reason why it would break masters. You are correct that masters are created day 1 by the installer, and as such, if we want support for this feature on masters, it will need to be added to the installer.

As far as I have been told, this feature request from the customer was for day 2 initially, as it is possible to reconfigure existing VMs to use the feature, but they want autoscaling via MAPI to avoid toil once the cluster is bootstrapped.

For the installer's sake, I know they are very busy right now, but if they were to support this, the prerequisite is that the Go SDK and Terraform both support the feature.

Based on the docs, the Terraform provider has support, so we should be able to include the option (see also the code for the specific version).

Likewise, the SDK docs for the version used by the installer also include the option.
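
To make the mapping concrete, a hedged sketch of how the option surfaces in the EC2 request via aws-sdk-go v1 (subnet ID and instance sizing are placeholders, not values from this PR):

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// efaRunInstancesInput shows InterfaceType set on the primary network interface.
func efaRunInstancesInput() *ec2.RunInstancesInput {
	return &ec2.RunInstancesInput{
		MinCount: aws.Int64(1),
		MaxCount: aws.Int64(1),
		NetworkInterfaces: []*ec2.InstanceNetworkInterfaceSpecification{{
			DeviceIndex:   aws.Int64(0),
			SubnetId:      aws.String("subnet-0123456789abcdef0"), // placeholder
			InterfaceType: aws.String("efa"), // "interface" for a standard NIC
		}},
	}
}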

I believe the only thing prohibiting the installer here would be a lack of time.

promises to write an automated test for it that will block payload promotion

Just to clarify, you would like to see an E2E blocking job that runs all machines with this option enabled, to make sure all of our tests pass as normal?

@romfreiman

@deads2k we've been discussing this with @tkatarki, and having masters support this is not a requirement currently.
Bullet 3 is interesting - I assume that we should try running networking-related tests on those interfaces and see how it goes.
And, of course, QE has to validate it before releasing, but this is true of every feature we develop. I don't think it's different from supporting various instance types - do we have coverage for each and every instance type in our CI?

@tkatarki

Right, having masters also use EFA instances and EFA interfaces is not an ask for the MVP. @deads2k @romfreiman

@romfreiman

@danwinship does anyone have experience with EFA? Any thoughts on how the SDN could be affected?

@danwinship
Contributor

@danwinship does anyone have experience with EFA? Any thoughts on how the SDN could be affected?

I don't know anything about it, but from the docs it seems like EFAs offer a superset of the functionality of ordinary network adapters, so just switching the adapter type shouldn't break anything, hurt debuggability, or anything else.

The docs say "An EFA requires a security group that allows all inbound and outbound traffic to and from the security group itself." We don't currently have that, so something is going to have to make that happen when EFAs are enabled. AFAIK currently the security groups are created by the installer in an undocumented manner which is not guaranteed to remain unchanged in the future, so stuff will have to be figured out here. (And also of course, changing the security groups in this way means we'd be dropping all firewalling between worker nodes when using EFAs, meaning there will be less total network security, but maybe people who want to use EFAs don't care about that.)

It also seems like actually using the extra EFA functionality might require extra kernel modules, sysctl changes, etc? I don't see anything in the RFEs about making sure that the HPC functionality is actually usable with OCP when we enable EFAs, and if that hasn't happened, we should probably do that before adding APIs to enable them.
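
On the security-group point, a hedged sketch of the self-referencing allow-all rule the EFA docs require, using aws-sdk-go v1 (whoever ends up owning this would presumably do it in the installer or an operator; function and variable names are placeholders):

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// allowAllWithinGroup authorizes all inbound and outbound traffic between
// members of the same security group, as EFA requires.
func allowAllWithinGroup(client *ec2.EC2, groupID string) error {
	perms := []*ec2.IpPermission{{
		IpProtocol:       aws.String("-1"), // all protocols and ports
		UserIdGroupPairs: []*ec2.UserIdGroupPair{{GroupId: aws.String(groupID)}},
	}}
	if _, err := client.AuthorizeSecurityGroupIngress(&ec2.AuthorizeSecurityGroupIngressInput{
		GroupId:       aws.String(groupID),
		IpPermissions: perms,
	}); err != nil {
		return err
	}
	_, err := client.AuthorizeSecurityGroupEgress(&ec2.AuthorizeSecurityGroupEgressInput{
		GroupId:       aws.String(groupID),
		IpPermissions: perms,
	})
	return err
}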

@JoelSpeed
Contributor Author

Just wanted to add a note, as I'm being poked, that we have pressure from PM (@tkatarki knows more) to try and get this feature merged ahead of FF even without demonstrating that it can be used end to end. The suggestion was to merge this and the feature implementation under regression testing, rather than waiting for the full end-to-end flow to be demonstrated. I would assume that if we don't get it working by code freeze, we would revert the feature.

It also seems like actually using the extra EFA functionality might require extra kernel modules, sysctl changes, etc? I don't see anything in the RFEs about making sure that the HPC functionality is actually usable with OCP when we enable EFAs, and if that hasn't happened, we should probably do that before adding APIs to enable them.

There is work happening at the moment to get the full end-to-end feature working, with the extra required software deployed via daemonsets; @kwozyman is giving regular updates on Slack for these. He has also verified that this API and the associated machine patch work as expected, which allowed him to continue development of the full end-to-end feature.

@deads2k
Contributor

deads2k commented Dec 14, 2021

get this feature merged ahead of FF even without demonstrating that it can be used end to end.

I'm generally willing to merge APIs without backing implementations (we do this often), but we do it after an enhancement has described how the feature itself impacts other teams (installer, machine-api, and network all appear to be coming together here). Without knowing how the feature impacts other teams, without knowing the scope of additional changes like kernel modules and sysctl changes, and without seeing how this can be made consistent across workers and masters (even as day 2), I don't think merging now and working it out later is the right path.

The comments in this PR suggest to me that other teams have a stake in how this rolls out and they need a concrete design (openshift/enhancements) to read and comment against. That should happen before we merge an API and hope we can make a viable implementation without a design.

/hold

@JoelSpeed JoelSpeed force-pushed the aws-network-interface branch from 75493c2 to f24df3c Compare December 21, 2021 17:21
@JoelSpeed JoelSpeed force-pushed the aws-network-interface branch from f24df3c to 9ec7092 Compare January 25, 2022 12:48
@JoelSpeed
Contributor Author

/hold cancel

The enhancement has been approved and merged; I think we can carry on with this and prepare it for merging after the branch splits at the end of the week.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 25, 2022
Contributor

@alexander-demicev alexander-demicev left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 25, 2022
@lobziik
Contributor

lobziik commented Jan 25, 2022

/lgtm

@JoelSpeed
Contributor Author

Just to preempt a possible question about how this interacts with placement groups: IMO the two features should be considered separately. Each can be used on its own, and there are no cross-dependencies between the API types or the PRs implementing them.

The only reason these are linked at the moment is that we have a customer looking to get the absolute maximum performance from EFA networking; for that, you can drop latency from approximately 70-80µs to 20µs by using a cluster placement group with the EFA interface.

@deads2k
Contributor

deads2k commented Feb 3, 2022

/approve

@openshift-ci
Contributor

openshift-ci bot commented Feb 3, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexander-demichev, deads2k, JoelSpeed, lobziik


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2022
@JoelSpeed
Copy link
Contributor Author

/label docs-approved
/label px-approved
/label qe-approved

We are not a no-FF team, so I'm adding these labels manually

@openshift-ci openshift-ci bot added docs-approved Signifies that Docs has signed off on this PR px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR labels Feb 4, 2022
@openshift-ci
Contributor

openshift-ci bot commented Feb 4, 2022

@JoelSpeed: all tests passed!

@openshift-merge-robot openshift-merge-robot merged commit c4ca496 into openshift:master Feb 4, 2022
@JoelSpeed JoelSpeed deleted the aws-network-interface branch February 4, 2022 10:26