[e2e] Increase the vCPU quota limit for EC2 instances #3002
Conversation
Force-pushed from 493016b to 479c309
/priority critical-urgent
test/e2e/shared/defaults.go (outdated)
ServiceCode: "ec2",
QuotaName: "Running On-Demand G and VT instances",
QuotaCode: "L-DB2E81BA",
DesiredMinimumValue: 32,
Do we need this many vCPUs? For one GPU node, 4 vCPU should be enough.
Your current vCPU limit is "Running On-Demand All G instances" = 8 vCPU.
A g4dn.xlarge instance has a footprint of 4 vCPU, so your account can already run two g4dn.xlarge instances based on your G instances limit of 8.
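The footprint arithmetic above can be sketched in a few lines of Go. The constants below are the values quoted in this comment (4 vCPU per g4dn.xlarge, an 8 vCPU G-instance limit); this is purely illustrative and not code from the repo:

```go
package main

import "fmt"

func main() {
	// Values quoted in the review comment above.
	const vcpuPerG4dnXlarge = 4 // vCPU footprint of one g4dn.xlarge
	const gInstanceLimit = 8    // "Running On-Demand All G instances" vCPU limit

	// How many g4dn.xlarge instances fit under the current limit.
	fmt.Println(gInstanceLimit / vcpuPerG4dnXlarge) // prints 2
}
```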
I have reduced it to 8 for now
Force-pushed from 479c309 to 4177f00
/test pull-cluster-api-provider-aws-e2e
The test took 5 hours and timed out.
Yeah, that is weird. Is it because we added some wrong value? I couldn't figure out the problem in a local run, as it works fine with my AWS account.
/retest
@sedefsavas I see the log below:
But again it's getting stuck at the same place while acquiring resources; I'm not sure how we should proceed.
This is the problem:
Until we figure out what the problem is, let's not run the e2e test here. It takes 5 hours and blocks other PRs' e2e tests.
test/e2e/shared/defaults.go (outdated)
@@ -154,6 +154,13 @@ func getLimitedResources() map[string]*ServiceQuota {
QuotaCode: "L-E9E9831D",
DesiredMinimumValue: 20,
}

serviceQuotas["ec2"] = &ServiceQuota{
crit. serviceQuotas is a map, and it is being overwritten here because we already set the "ec2" key for regular instances above. We need different keys for normal instances and GPU ones.
Also, we should update the initial resources file with ec2-normal and ec2-GPU entries, since their quotas are different.
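A minimal sketch of the suggested fix, using distinct map keys so the GPU quota no longer clobbers the normal-instance entry. The struct fields and quota codes are taken from the diff snippets in this thread; the "ec2-normal" quota name is a hypothetical placeholder and the real field types in defaults.go may differ:

```go
package main

import "fmt"

// ServiceQuota mirrors the struct shape shown in the diff above;
// this is a sketch, not the actual upstream definition.
type ServiceQuota struct {
	ServiceCode         string
	QuotaName           string
	QuotaCode           string
	DesiredMinimumValue int // int here for simplicity; the real type may differ
}

func getLimitedResources() map[string]*ServiceQuota {
	serviceQuotas := map[string]*ServiceQuota{}
	// Regular On-Demand instances get their own key...
	serviceQuotas["ec2-normal"] = &ServiceQuota{
		ServiceCode:         "ec2",
		QuotaName:           "Running On-Demand Standard instances", // placeholder name
		QuotaCode:           "L-E9E9831D",
		DesiredMinimumValue: 20,
	}
	// ...so the GPU quota no longer overwrites the entry above.
	serviceQuotas["ec2-GPU"] = &ServiceQuota{
		ServiceCode:         "ec2",
		QuotaName:           "Running On-Demand G and VT instances",
		QuotaCode:           "L-DB2E81BA",
		DesiredMinimumValue: 8,
	}
	return serviceQuotas
}

func main() {
	quotas := getLimitedResources()
	fmt.Println(len(quotas)) // prints 2: both entries survive
}
```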
Disabled for now: #3007
Force-pushed from 4177f00 to b8c4f9e
The current change works fine locally because the resource quota limit is sufficient in my AWS account; that's why I'm triggering it in the PR to check. It looks like it should go through now.
Force-pushed from b8c4f9e to 49ec752
Force-pushed from 49ec752 to e3025a1
/test pull-cluster-api-provider-aws-e2e
Force-pushed from de3e511 to d1c333b
/retest
Force-pushed from fb61838 to 50425d1
/test pull-cluster-api-provider-aws-e2e
Force-pushed from 50425d1 to e2aa26f
/assign @sedefsavas for approval
@Ankitasw: GitHub didn't allow me to assign the following users: for, approval.

Note that only kubernetes-sigs members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.

In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sedefsavas

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
What type of PR is this?
/kind failing-test

What this PR does / why we need it:
This PR fixes the vCPU quota limit issues hit while executing the e2e tests upstream.

Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):
Fixes #

Checklist:

Release note: