Bucketize autoscaling metrics by timeframe not by pod name. #3289

Merged: 18 commits into knative:master, Feb 22, 2019

markusthoemmes
Contributor

@markusthoemmes markusthoemmes commented Feb 20, 2019

Fixes #2977
Fixes #2379

Proposed Changes

Stats are averaged within each specific timeframe instead of being averaged over the whole window. See the linked issues for more in-depth information.
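
To make the difference concrete, here is a minimal sketch in Go. It is not the code from this PR: the `Stat` type, the function names, the 2s `bucketSize`, and the exact aggregation order are assumptions chosen for illustration. It contrasts averaging each pod over the whole window with averaging per time bucket first.

```go
package main

import (
	"fmt"
	"time"
)

// bucketSize is an assumed bucket width, not a value taken from the PR.
const bucketSize = 2 * time.Second

// Stat is a hypothetical, simplified sample reported by a single pod.
type Stat struct {
	Time        time.Time
	PodName     string
	Concurrency float64
}

// mean tracks a running sum and count.
type mean struct {
	sum   float64
	count int
}

func (m *mean) add(v float64) {
	m.sum += v
	m.count++
}

func (m *mean) value() float64 {
	return m.sum / float64(m.count)
}

// averageByPod averages each pod over the whole window, then sums the pods.
func averageByPod(stats []Stat) float64 {
	perPod := map[string]*mean{}
	for _, s := range stats {
		if perPod[s.PodName] == nil {
			perPod[s.PodName] = &mean{}
		}
		perPod[s.PodName].add(s.Concurrency)
	}
	total := 0.0
	for _, m := range perPod {
		total += m.value()
	}
	return total
}

// averageByTimeframe sums the per-pod averages inside each time bucket and
// then averages those bucket totals across all observed buckets.
func averageByTimeframe(stats []Stat, size time.Duration) float64 {
	buckets := map[int64]map[string]*mean{}
	for _, s := range stats {
		key := s.Time.UnixNano() / int64(size)
		if buckets[key] == nil {
			buckets[key] = map[string]*mean{}
		}
		if buckets[key][s.PodName] == nil {
			buckets[key][s.PodName] = &mean{}
		}
		buckets[key][s.PodName].add(s.Concurrency)
	}
	if len(buckets) == 0 {
		return 0
	}
	total := 0.0
	for _, pods := range buckets {
		for _, m := range pods {
			total += m.value()
		}
	}
	return total / float64(len(buckets))
}

// sampleStats: pod "a" reports in both buckets, pod "b" only in the second.
func sampleStats() []Stat {
	base := time.Date(2019, 2, 22, 10, 0, 0, 0, time.UTC)
	return []Stat{
		{Time: base, PodName: "a", Concurrency: 10},
		{Time: base.Add(bucketSize), PodName: "a", Concurrency: 10},
		{Time: base.Add(bucketSize), PodName: "b", Concurrency: 10},
	}
}

func main() {
	stats := sampleStats()
	fmt.Printf("by pod: %.1f\n", averageByPod(stats))
	fmt.Printf("by timeframe: %.1f\n", averageByTimeframe(stats, bucketSize))
}
```

With this sample data the per-pod average yields 20, as if both pods had been serving for the whole window, while the per-timeframe average yields 15, which reflects that pod "b" was only present for half of it.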

Release Note

TBD

@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 20, 2019
@knative-prow-robot knative-prow-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 20, 2019
Contributor

@knative-prow-robot knative-prow-robot left a comment

@markusthoemmes: 0 warnings.

In response to this:

Fixes #2977

Proposed Changes

Stats are averaged in each specific timeframe vs. averaged over the whole window. See the linked issue for more in-depth information

Release Note

TBD

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 20, 2019
@knative-prow-robot knative-prow-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 22, 2019
@markusthoemmes markusthoemmes marked this pull request as ready for review February 22, 2019 10:13
@markusthoemmes markusthoemmes changed the title [WIP] Bucketize autoscaling metrics by timeframe not by pod name. Bucketize autoscaling metrics by timeframe not by pod name. Feb 22, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 22, 2019
@markusthoemmes
Contributor Author

Unrelated failure

/test pull-knative-serving-integration-tests

@markusthoemmes
Contributor Author

/assign @yanweiguo
/assign @k4leung4

Please let me know what you think.

@@ -281,7 +281,7 @@ func assertAutoscaleUpToNumPods(ctx *testContext, numPods int32) {
 	defer close(stopChan)

 	go func() {
-		if err := generateTraffic(ctx, int(numPods*10), 30*time.Second, stopChan); err != nil {
+		if err := generateTraffic(ctx, int(numPods*10), 60*time.Second, stopChan); err != nil {
Contributor Author

These changes stabilize the autoscaling tests. They have recently been adjusted to continue generating more traffic as soon as we hit the desired replica count. However, that's only been done on "Replicas", so we're in danger of overflowing if the pod takes a while to come up.

Likewise the amount of traffic being sent in (30s) can be juuuuuust about enough to cause us to scale up. After 60s it's guaranteed to (for the default window sizes).

Contributor

@vagababov vagababov left a comment

Superficial mostly. I need to re-read the PR again for the logic part, though it mostly makes sense to me.

@@ -668,3 +628,7 @@ func createEndpoints(ep *corev1.Endpoints) {
 	kubeClient.CoreV1().Endpoints(testNamespace).Create(ep)
 	kubeInformer.Core().V1().Endpoints().Informer().GetIndexer().Add(ep)
 }
+
+func roundedNow() time.Time {
Contributor

Is the reason to use roundedNow that it prevents flakiness, because some stats could fall outside the scale window if now() is used directly?

Contributor Author

Yes, it basically normalizes the instances of "now" so the test doesn't depend on when exactly it is executed. Especially when adding to "now" in the tests, we otherwise risk jumping into other buckets in the calculation. It makes the test deterministic.
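
The effect described here can be sketched as follows. The body of `roundedNow` is not shown in the diff, so `roundDown`, `sameBucket`, `exampleNow`, the 2s `bucketSize`, and the 1s `offset` below are all hypothetical names and values used only to illustrate why an unrounded "now" can push a test timestamp into a different bucket.

```go
package main

import (
	"fmt"
	"time"
)

// bucketSize and offset are assumed values, not taken from the PR.
const (
	bucketSize = 2 * time.Second
	offset     = time.Second
)

// roundDown is a hypothetical stand-in for the idea behind roundedNow:
// it snaps a timestamp to the start of its bucket.
func roundDown(t time.Time) time.Time {
	return t.Truncate(bucketSize)
}

// sameBucket reports whether a and b fall into the same time bucket.
func sameBucket(a, b time.Time) bool {
	return a.Truncate(bucketSize).Equal(b.Truncate(bucketSize))
}

// exampleNow is a fixed instant that sits 100ms before a bucket boundary,
// standing in for an unlucky call to time.Now().
func exampleNow() time.Time {
	return time.Date(2019, 2, 22, 10, 0, 1, 900_000_000, time.UTC)
}

func main() {
	now := exampleNow()
	rounded := roundDown(now)

	// Unrounded: adding the offset crosses the bucket boundary.
	fmt.Println("unrounded stays in bucket:", sameBucket(now, now.Add(offset)))
	// Rounded: the same offset stays inside the bucket.
	fmt.Println("rounded stays in bucket:", sameBucket(rounded, rounded.Add(offset)))
}
```

Whether an unrounded timestamp crosses a boundary depends on when the test happens to run, which is the source of flakiness the rounding avoids.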

Contributor

@vagababov vagababov left a comment

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 22, 2019
@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Feb 22, 2019
@knative-metrics-robot

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/autoscaler/autoscaler.go | 97.2% | 97.0% | -0.3 |

@k4leung4
Contributor

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 22, 2019
@yanweiguo
Contributor

/lgtm

@srinivashegde86
Contributor

/lgtm
/approve

@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: markusthoemmes, srinivashegde86

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 22, 2019
@knative-prow-robot knative-prow-robot merged commit 366aa03 into knative:master Feb 22, 2019