Skip to content

Consolidate and bump default AWS instance type (m6i.xlarge)#2076

Merged
openshift-merge-robot merged 1 commit into
openshift:masterfrom
2uasimojo:HIVE-2235/aws-instance-type
Aug 9, 2023
Merged

Consolidate and bump default AWS instance type (m6i.xlarge)#2076
openshift-merge-robot merged 1 commit into
openshift:masterfrom
2uasimojo:HIVE-2235/aws-instance-type

Conversation

@2uasimojo
Copy link
Copy Markdown
Member

@2uasimojo 2uasimojo commented Aug 3, 2023

This commit cleans up how we handle instance types in hiveutil and e2e CI.

  • hiveutil create-cluster now accepts an --aws-instance-type option.
  • The default instance type is now m6i.xlarge. It was previously m5.xlarge for hiveutil and m4.large for the MachinePool scaling tests in our e2e CI.
  • This default now lives in one place, so it's easier to change later without missing things.
  • To allow easier noodling via CI rehearsals, e2e will feed a new $AWS_INSTANCE_TYPE environment variable through to the new --aws-instance-type param for create-cluster.

HIVE-2235

@2uasimojo
Copy link
Copy Markdown
Member Author

/assign @jstuever

@openshift-ci openshift-ci Bot requested review from abutcher and lleshchi August 3, 2023 20:23
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 3, 2023
@2uasimojo 2uasimojo force-pushed the HIVE-2235/aws-instance-type branch from 2af43fe to e8e40f8 Compare August 3, 2023 20:57
@codecov
Copy link
Copy Markdown

codecov Bot commented Aug 3, 2023

Codecov Report

Merging #2076 (8574253) into master (d99579a) will increase coverage by 0.01%.
Report is 8 commits behind head on master.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2076      +/-   ##
==========================================
+ Coverage   57.37%   57.39%   +0.01%     
==========================================
  Files         186      186              
  Lines       25712    25713       +1     
==========================================
+ Hits        14752    14757       +5     
+ Misses       9719     9717       -2     
+ Partials     1241     1239       -2     
Files Changed Coverage Δ
pkg/clusterresource/aws.go 90.62% <100.00%> (ø)
...g/controller/clusterpool/clusterpool_controller.go 56.35% <100.00%> (+0.05%) ⬆️

... and 1 file with indirect coverage changes

@2uasimojo
Copy link
Copy Markdown
Member Author

/test e2e-pool

This didn’t even get around to creating clusters. And hive-operator was in a weird hot loop. None of which should be related to an instance type change.

@2uasimojo
Copy link
Copy Markdown
Member Author

Oh. No, that's actually broken. I think I missed a path through clusterpools somehow.

@2uasimojo 2uasimojo force-pushed the HIVE-2235/aws-instance-type branch from e8e40f8 to e2d270d Compare August 4, 2023 15:52
@2uasimojo
Copy link
Copy Markdown
Member Author

hive-operator was in a weird hot loop

This was just the namespace watch filter. We weren't actually reconciling (despite the message).

@2uasimojo 2uasimojo force-pushed the HIVE-2235/aws-instance-type branch from e2d270d to bbe6e70 Compare August 4, 2023 20:40
@2uasimojo
Copy link
Copy Markdown
Member Author

/test e2e

unrelated flake

This commit cleans up how we handle instance types in hiveutil and e2e
CI.
- `hiveutil create-cluster` now accepts an `--aws-instance-type` option.
- The default instance type is now `m6i.xlarge`. It was previously
`m5.xlarge` for hiveutil and `m4.large` for the MachinePool scaling
tests in our e2e CI.
- This default now lives in one place, so it's easier to change later
without missing things.
- To allow easier noodling via CI rehearsals, `e2e` will feed a new
`$AWS_INSTANCE_TYPE` environment variable through to the new
`--aws-instance-type` param for `create-cluster`.

HIVE-2235
@2uasimojo 2uasimojo force-pushed the HIVE-2235/aws-instance-type branch from bbe6e70 to 8574253 Compare August 7, 2023 16:21
@2uasimojo
Copy link
Copy Markdown
Member Author

/test coverage

known taint flake

/test e2e-pool

I have absolutely no idea what happened here. It looks like we got past the previous error, but then something weird happened with the uninstall of the ("real") pool cluster.

@fxierh
Copy link
Copy Markdown
Contributor

fxierh commented Aug 8, 2023

Looks like 4.14 starts to use OVNKubernetes. Wondering if we should generate install-config with this networkType.

@2uasimojo
Copy link
Copy Markdown
Member Author

Looks like 4.14 starts to use OVNKubernetes. Wondering if we should generate install-config with this networkType.

This sounds like a really useful tip... if only I had some context.

Also, if this were a problem, wouldn't e2e have failed as well? It's using the same install-config builder as e2e-pool.

@2uasimojo
Copy link
Copy Markdown
Member Author

This actually looks like it pretty much succeeded, but ran up against the overall 2h test timeout.

/test e2e-pool

Even if this eventually works, we're obviously skating way too close to the maximum allowable test time. I'm not even sure what we would do about that.

@2uasimojo
Copy link
Copy Markdown
Member Author

I'm only guessing the slowdown has something to do with having increased the instance size. (Bigger instances take longer to... provision? hibernate? resume?) I believe I found about half an hour we can trim out of the test via #2080 -- will retest when that lands.

@fxierh
Copy link
Copy Markdown
Contributor

fxierh commented Aug 9, 2023

Sorry there might be a mix-up, I wan't referring to the e2e-pool job fail. What I meant is, in addition to the default AWS instance type (generated by hiveutil), do you think it's a good idea to also update the networkType (to align with Openshift Installer's default) ?

@2uasimojo
Copy link
Copy Markdown
Member Author

2uasimojo commented Aug 9, 2023

/test e2e-pool

#2080 has landed (and it does seem to have had the desired effect).

@jstuever
Copy link
Copy Markdown
Contributor

jstuever commented Aug 9, 2023

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Aug 9, 2023
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Aug 9, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, jstuever

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@2uasimojo
Copy link
Copy Markdown
Member Author

Sorry there might be a mix-up, I wan't referring to the e2e-pool job fail. What I meant is, in addition to the default AWS instance type (generated by hiveutil), do you think it's a good idea to also update the networkType (to align with Openshift Installer's default) ?

Gosh, I don't know. Is there a benefit to doing so? Will it make something faster or cheaper? Is OpenShiftSDN being deprecated/removed?

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Aug 9, 2023

@2uasimojo: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jstuever
Copy link
Copy Markdown
Contributor

jstuever commented Aug 9, 2023

Gosh, I don't know. Is there a benefit to doing so? Will it make something faster or cheaper? Is OpenShiftSDN being deprecated/removed?

I'd be curious if we could remove NetworkType and let things naturally use their defaults.

@openshift-merge-robot openshift-merge-robot merged commit a0c10c8 into openshift:master Aug 9, 2023
@2uasimojo 2uasimojo deleted the HIVE-2235/aws-instance-type branch August 9, 2023 19:09
@2uasimojo
Copy link
Copy Markdown
Member Author

I'd be curious if we could remove NetworkType and let things naturally use their defaults.

Okay, sure: #2085

Note: I'm pretty sure this is only affecting hiveutil (and thereby e2e).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants