Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Scale and HPA support for VMIRS #1881

Merged
merged 4 commits into from Jan 8, 2019

Conversation

rmohr
Copy link
Member

@rmohr rmohr commented Dec 31, 2018

What this PR does / why we need it:

Enable and implement the scale subresouce for VMIRS instances. This allows the following:

First, running kubectl scale commands against VMIRS. For instance:

kubectl scale vmirs myvmirs --replicas 5

This requires the CustomResourceSubresources feature gate to be enables for clusters prior to 1.11.

Second, allow using the Horizontal Pod Autoscaler. For instance:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myhpa
spec:
  scaleTargetRef:
    kind: VirtualMachineInstanceReplicaSet
    name: vmi-replicaset-cirros
    apiVersion: kubevirt.io/v1alpha2
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

In combination with the HPA, there are two restrictions:

  • HPA works with CRDs starting from 1.11, tests are skipped for older cluster versions.
  • kubectl autoscale does not yet work with CRDs. Posting the HPA spec directly with kubectl create is necessary.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Documentation is at kubevirt/user-guide#190.

Release note:

Enable scale and HPA support for VMIRS

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/XL labels Dec 31, 2018
@rmohr
Copy link
Member Author

rmohr commented Dec 31, 2018

/cc @davidvossel @fabiand FYI

@kubevirt-bot
Copy link
Contributor

@rmohr: GitHub didn't allow me to request PR reviews from the following users: FYI.

Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @davidvossel @fabiand FYI

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rmohr
Copy link
Member Author

rmohr commented Jan 3, 2019

/cc @cynepco3hahue

@cynepco3hahue
Copy link

cynepco3hahue commented Jan 6, 2019

@rmohr I see failure under https://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/3003//artifact/check-patch.k8s-1.10.11.el7.x86_64/mock_logs/script/stdout_stderr.log

[Fail] VirtualMachineInstanceReplicaSet [It] should not scale when paused and scale when resume 
/root/go/src/kubevirt.io/kubevirt/tests/replicaset_test.go:355

I am pretty sure it does not relate, but can you check?

@rmohr
Copy link
Member Author

rmohr commented Jan 7, 2019

@cynepco3hahue thanks. Addressed. I use now PATCH where appropriate to avoid such collisions.

Copy link

@cynepco3hahue cynepco3hahue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmohr It has some new failures
it has some new failures

      jsonpatch replace operation does not apply: doc is missing key: /spec/paused
  occurred

"kubevirt.io/kubevirt/pkg/kubecli"
"kubevirt.io/kubevirt/tests"
)

var _ = Describe("VirtualMachineInstanceReplicaSet", func() {
var _ = FDescribe("VirtualMachineInstanceReplicaSet", func() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove F

Copy link

@cynepco3hahue cynepco3hahue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh gave approve by mistake

@rmohr rmohr force-pushed the scale_and_hpa branch 2 times, most recently from bbea318 to d3ab1df Compare January 7, 2019 13:45
Copy link
Member

@davidvossel davidvossel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, looks good to me. I just had a few questions.

pkg/kubecli/replicaset.go Outdated Show resolved Hide resolved
@@ -5893,6 +5893,10 @@
"$ref": "#/definitions/v1.VirtualMachineInstanceReplicaSetCondition"
}
},
"labelSelector": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the HPA detecting load to know when to scale? Is it using this labelSelector and calculating load based on the Pods that match this selector?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The metrics api needs to be active inside the cluster: https://kubernetes.io/docs/tasks/debug-application-cluster/core-metrics-pipeline/.

We will have to improve the situation a little bit where no readiness check is explicitly configured on the VMI and migrations come into play, but apart from that the algorithm can work with non-ready pods, which allows to cope with "boot" load.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, batch migrations may confuse HPA. The workload's calculated metrics during a bulk migrations might not accurately reflect the actual workload due to migration overhead.

@@ -32,4 +32,9 @@ spec:
- vmirss
singular: virtualmachineinstancereplicaset
scope: Namespaced
subresources:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires the CustomResourceSubresources feature gate to be enables for clusters prior to 1.11.

Is this CRD going to get rejected for 1.10 if the feature gate isn't enabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, will just strip the fields away from the CRD. I find that actually a little bit unpleasant in general but good in this case ...

Enabling the scale subresource for VMIRS allows to execute commands like

    kubectl scale vmirs myvmirs --scale 5.
VMIRS exposes now label selectors in a canonical form in their status.
The scale subresource picks that up and as a consequence finally enables
HPA support for VMIRS.
In order to not collide with status updates on VMIRS, use PATCH where
appropriate.
@rmohr
Copy link
Member Author

rmohr commented Jan 8, 2019

Ok test results look good and I opened kubernetes/kubernetes#72678 to add autoscale support for kubectl. Could someone have another look?

@cynepco3hahue
Copy link

lgtm

@cynepco3hahue cynepco3hahue merged commit 38b212c into kubevirt:master Jan 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants