Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[E2E] Tests for MachinePool failing in open PRs #3295

Closed
Ankitasw opened this issue Mar 9, 2022 · 15 comments
Closed

[E2E] Tests for MachinePool failing in open PRs #3295

Ankitasw opened this issue Mar 9, 2022 · 15 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@Ankitasw
Copy link
Member

Ankitasw commented Mar 9, 2022

/kind bug

What steps did you take and what happened:
E2E tests for machinepools failing with below error

E0309 08:38:12.873061 1 awsmachinepool_controller.go:254] "msg"="error updating AWSMachinePool" "error"="unable to update ASG: failed to update ASG "machine-pool-oxhbj8-mp-0": ValidationError: Desired capacity:0 must be between the specified min size:1 and max size:4\n\tstatus code: 400, request id: 7a81e62c-9a96-4824-8510-69432b30f83c"

What did you expect to happen:
E2E tests should pass

Anything else you would like to add:
Probably happened because of #3255

Environment:

  • Cluster-api-provider-aws version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-priority labels Mar 9, 2022
@k8s-ci-robot
Copy link
Contributor

@Ankitasw: This issue is currently awaiting triage.

If CAPA/CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 9, 2022
@Ankitasw
Copy link
Member Author

Ankitasw commented Mar 9, 2022

/assign @shivi28

@k8s-ci-robot
Copy link
Contributor

@Ankitasw: The label(s) kind/test cannot be applied, because the repository doesn't have them.

In response to this:

/kind-failing test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

@Ankitasw: The label(s) kind/? cannot be applied, because the repository doesn't have them.

In response to this:

/kind ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Ankitasw
Copy link
Member Author

Ankitasw commented Mar 9, 2022

/kind failing-test

@k8s-ci-robot k8s-ci-robot added the kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. label Mar 9, 2022
@Ankitasw
Copy link
Member Author

Ankitasw commented Mar 9, 2022

I think it might be flaky as it didn't fail for this PR

@shivi28
Copy link
Contributor

shivi28 commented Mar 14, 2022

Thanks @Ankitasw for this issue. I will take a look.
Meanwhile if possible can you please share your findings on how's it's related with PR as you mentioned in the description.

@Ankitasw
Copy link
Member Author

Ankitasw commented Mar 14, 2022

Hi @shivi28, I don't have clear evidence to support that this failure is happening only because of the PR mentioned above, that's why raised this issue. Why I suspect it is we didn't test e2e in the above PR before merging and the changes are related to machinepool controller, so as part of this issue, we can find out whether E2E upgrade tests are failing because of above PR merge.
If it's not related to the PR once you have confirmed, we will rule out this possibility and then feel free to unassign it, maybe I will ask community members to check further.

@shivi28
Copy link
Contributor

shivi28 commented Mar 14, 2022

Actually test was failing only in upgrade PR because CAPI has enabled scaling machinePool to zero in latest release. To handle this change we need to enable zero as the minimum size for AWSMachinePools in CAPA, which is in progress.

And that's why tests were failing only in this PR and it's already on hold we can merge that after this fix.

May be we can close this issue as I can't see any other PR facing test failure.

cc: @sedefsavas

@Ankitasw
Copy link
Member Author

Thanks @shivi28 , if we have found the root cause for issue and can be fixed in other PR, we can attach this issue to open PR itself so that it will be closed with the PR. Wdyt?

@sedefsavas
Copy link
Contributor

@shivi28
Do you know why not allowing scale to 0 caused e2e test failures only during upgrade?
Was there a change in the cluster-api e2e tests we use?

@shivi28
Copy link
Contributor

shivi28 commented Mar 15, 2022

@shivi28 Do you know why not allowing scale to 0 caused e2e test failures only during upgrade? Was there a change in the cluster-api e2e tests we use?

Yes @sedefsavas they have added a new test case Scaling the machine pool to zero in CAPI e2e test

@mweibel
Copy link
Contributor

mweibel commented Mar 18, 2022

making sure this issue is also updated: kubernetes-sigs/cluster-api#6312 is a required fix for #3253 to pass. Not sure if we should skip that E2E for now, until that other PR is merged and released?

@Ankitasw
Copy link
Member Author

@mweibel I will leave this to you(since I was not involved in discussions) and let @sedefsavas take a look.

@sedefsavas
Copy link
Contributor

making sure this issue is also updated: kubernetes-sigs/cluster-api#6312 is a required fix for #3253 to pass. Not sure if we should skip that E2E for now, until that other PR is merged and released?

We can wait for kubernetes-sigs/cluster-api#6312 to be in before merging CAPI version, if there are no objections to that. Trying to avoid the risk of forgetting re-enabling the e2e test once disabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
None yet
Development

No branches or pull requests

5 participants