Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docs] Tested instance type lists for AWS/Azure/GCP #5517

Merged

Conversation

codyhoag
Copy link
Contributor

@codyhoag codyhoag commented Jan 6, 2022

@cuppett @staebler these are the supported instance type lists we compiled from our collaboration with QE a few weeks ago. Let me know your thoughts on the placement of this and any other feedback.

@yunjiang29 @MayXuQQ @jianli-wei can you review the lists for accuracy based on what QE has verified? Once these are merged, the hope is QE can update these docs when supported instance types are added/removed to reflect what is tested for each OCP version, which would auto-update in the official docs.

cc @vikram-redhat @sjstout

@openshift-ci openshift-ci bot requested review from jstuever and m1kola January 6, 2022 22:15
@MayXuQQ
Copy link
Contributor

MayXuQQ commented Jan 7, 2022

shall we list the family name ?

@codyhoag
Copy link
Contributor Author

codyhoag commented Jan 7, 2022

@MayXuQQ I attempted to follow the instance name patterns in the Azure docs. Does that format not accurately describe the supported types for OCP?

@MayXuQQ
Copy link
Contributor

MayXuQQ commented Jan 7, 2022

@MayXuQQ I attempted to follow the instance name patterns in the Azure docs. Does that format not accurately describe the supported types for OCP?

For public doc, please just list the vm size I've tested on https://docs.google.com/spreadsheets/d/1Vqk9s0jJWtdxXJ5jYCnPAfkuWMPa4bSP/edit#gid=98314921

Actually for Azure, we support all the vm size which match the minimum requirements(vCPUsAvailable), is it necessary list all the vm size ? Is it list our minimum requirements, and how to check the vm size's properties more useful (az vm list-skus -l centralus --query "[?resourceType=='virtualMachines' && capabilities[?name=='PremiumIO'].value==['True']].{Name:name, Family:family, Size:size, PremiumIO:capabilities[?name=='PremiumIO'].value, vCPUsAvailable:capabilities[?name=='vCPUsAvailable'].value, Mem:capabilities[?name=='MemoryGB'].value, HyperVG:capabilities[?name=='HyperVGenerations'].value}" )

@codyhoag
Copy link
Contributor Author

codyhoag commented Jan 7, 2022

@MayXuQQ your recommendations are similar to what we talked about with @yunjiang29 for AWS. The conclusion we came to is to specifically state the instance types we support; we've gotten feedback that only listing the minimum requirements is not a desired outcome for customers based on feedback.

I have updated the Azure list based on the VM sizes you provided. Let me know if there's any additional feedback.

@MayXuQQ
Copy link
Contributor

MayXuQQ commented Jan 7, 2022

for azure is OK

@staebler
Copy link
Contributor

LGTM

@codyhoag
Copy link
Contributor Author

@yunjiang29 @jianli-wei can you verify for AWS/GCP?

@yunjiang29
Copy link
Contributor

@codyhoag the currently supported machine types are https://docs.openshift.com/container-platform/4.9/installing/installing_aws/installing-aws-customizations.html#installation-supported-aws-machine-types_installing-aws-customizations, these have been tested by QE, so I suggest putting these instance type families only:

c4.*
c5.*
c5a.*
i3.*
m4.*
m5.*
m5a.*
m6i.*
r4.*
r5.*
r5a.*
t3.*
t3a.*

for the rest of the instance type families, QE will keep increasing the coverage, once the tests get passed, then we can add them to the list.

@codyhoag
Copy link
Contributor Author

@yunjiang29 thanks for that feedback. I made those changes. Can you verify the updated AWS instance types?

@yunjiang29
Copy link
Contributor

@codyhoag thanks for the updates, lgtm.

@jianli-wei
Copy link
Contributor

LGTM

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 26, 2022

@codyhoag: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-workers-rhel8 864240f link false /test e2e-aws-workers-rhel8
ci/prow/e2e-alibaba 864240f link true /test e2e-alibaba

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@makentenza
Copy link
Contributor

Hey @codyhoag I think we had this conversation in the past already and from my understanding we agreed to remove those lists and point the user to the minimal requirements for control plane and worker nodes. I have read the comments on this PR and I'd like to understand better this one: "we've gotten feedback that only listing the minimum requirements is not a desired outcome for customers based on feedback"

Since instance types are something cloud providers are updating frequently we are going to miss these all the time in the documentation and we'll get users confused. As an example, "r6i.*" instances are not part of your PR here, and I don't see any reason why we don't support those. This is just one example, but if we keep maintaining the full list of supported instances this will happen again and again.

If QE wants to include the instances we test I'm fine with that but those should not be called supported but tested.

cc. @staebler

@codyhoag
Copy link
Contributor Author

@makentenza I believe this direction originated from Engineering leadership. Our intention was to provide these lists for "tested instances" and also reference the minimum resources table for what we support. You can reference this Google Doc with what is planned for the official docs: https://docs.google.com/document/d/1XGZQRJshZD2V4eqLLflDflzGuYIn6zSzXIV9kHKoGIs/edit?usp=sharing.

@cuppett can you provide some perspective?

@codyhoag
Copy link
Contributor Author

From a docs perspective, my plan is to refer to the minimum resources table as what is "supported," and refer to the instance type lists as examples. This would allow readers to know what is supported, and also quickly reference the tested types for confirmation. We can update the wording to move away from "supported" to avoid confusion.

@makentenza
Copy link
Contributor

Thanks for the clarification @codyhoag, I think for users not to get confused is better to divide this information into 2 sections as you mentioned. The first one will make clear what are the minimum requirements per node type (vCPU, Memory and IOPS) to be supported and the second one will show what are the actual types we have tested. The key here is to avoid confusion between supported and tested and make it clear that we support the instance type as long the minimum requirements are met even if we haven't tested yet.

@codyhoag
Copy link
Contributor Author

@makentenza absolutely. I will plan to have this info separated with the table for official support, and the lists as tested examples 👍 . I'm going to rename the files in this PR to reflect "tested" so it's clear from the repo as well.

@codyhoag codyhoag changed the title [docs] Supported instance type lists for AWS/Azure/GCP [docs] Tested instance type lists for AWS/Azure/GCP Jan 31, 2022
@codyhoag
Copy link
Contributor Author

codyhoag commented Feb 8, 2022

@patrickdillon @jstuever I have received ACKs from Matthew and the related QE leads. Is there anything else we need to do to get this merged?

@codyhoag
Copy link
Contributor Author

codyhoag commented Feb 8, 2022

@staebler also forgot to loop you in again. Can we merge this now that we have all QE ACKs?

@jstuever
Copy link
Contributor

jstuever commented Feb 9, 2022

/lgtm
/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 9, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jstuever

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 9, 2022
@openshift-merge-robot openshift-merge-robot merged commit 0391519 into openshift:master Feb 9, 2022
@jstuever
Copy link
Contributor

jstuever commented Feb 9, 2022

/cherry-pick release-4.10

@openshift-cherrypick-robot

@jstuever: new pull request created: #5628

In response to this:

/cherry-pick release-4.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -0,0 +1,2 @@
* `c6g.*`
* `m6i.*`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while reviewing openshift/openshift-docs#41796, looks like it should be m6g.*, cc @aleskandro

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs/user/azure/tested_instance_types.md need be updated , now I'm waiting for the test result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yunjiang29 yes, it should

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants