
[COST-1913] Keda implementation #471

Merged (2 commits into RedHatInsights:master on Nov 8, 2021)

Conversation

@maskarb (Member) commented Oct 15, 2021

This PR is an idea for a KEDA implementation, and it takes a different approach than #105: the CRD here adds an optional autoScaler field to each deployment. This field matches the KEDA ScaledObjectSpec, except that MinReplicaCount is left out of the spec in favor of the deployment's minReplicas.

An example ClowdApp. This definition will create a standard HPA based on CPU utilization for the hello-app deployment:

apiVersion: cloud.redhat.com/v1alpha1
kind: ClowdApp
metadata:
  name: hello
  namespace: default
spec:
  # The name of the ClowdEnvironment providing the services
  envName: env-default

  # The bulk of your App. This is where your running apps will live
  deployments:
  - name: app
    # Give details about your running pod
    podSpec:
      image: quay.io/psav/clowder-hello
    autoScaler:
      maxReplicaCount: 10
      triggers:
      - type: cpu
        metadata:
          type: Utilization
          value: "50"
    # Creates a Service on port 8000
    webServices:
      public:
        enabled: true
      metrics:
        enabled: true

  # Request kafka topics for your application here
  kafkaTopics:
    - replicas: 3
      partitions: 64
      topicName: topicOne

  # Creates a database if local mode, or uses RDS in production
  database:
    # Must specify both a name and a major postgres version
    name: jumpstart-db
    version: 12
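
For reference, a rough sketch of the kind of KEDA ScaledObject the autoScaler block above could translate into. This is an illustration only, not the rendering produced by this PR; the scaleTargetRef layout and the derived minReplicaCount are assumptions.

# Hypothetical rendered resource (illustration only, not taken from this PR)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hello-app
  namespace: default
spec:
  scaleTargetRef:
    name: hello-app          # the Deployment created for the "app" deployment of the "hello" ClowdApp
  minReplicaCount: 1         # derived from the deployment minReplicas (assumed to be 1 here)
  maxReplicaCount: 10        # copied from the autoScaler block
  triggers:
  - type: cpu
    metadata:
      type: Utilization
      value: "50"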

TODO:

  • build out tests
  • add docs
  • demo

@maskarb marked this pull request as draft on October 15, 2021 at 19:17
@psav (Collaborator) left a comment


Early look, looks good - just one small change at this point

apiVersion: v1
kind: Namespace
metadata:
  name: test-basic-app

The name here should match the name of the test, so in this case test-autoscaler

@psav added the labels enhancement (New feature or request) and pr-architectural-change (Required architect sign-off) on Oct 26, 2021
@kylape (Member) commented Oct 27, 2021

With #481, you should be able to refer to env.status.prometheus.hostname to fill in the Prometheus hostname for KEDA scale triggers. Once that's done, I think we can get this merged.
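
For illustration, a rough sketch of what a Prometheus-based trigger in the autoScaler block might look like once that hostname is available. The serverAddress value, metric name, query, and threshold below are hypothetical; they only show where env.status.prometheus.hostname would end up.

autoScaler:
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      # Clowder would substitute the value from env.status.prometheus.hostname here
      serverAddress: http://prometheus.example.svc:9090
      metricName: hello_app_requests_total             # hypothetical metric
      query: sum(rate(hello_app_requests_total[2m]))
      threshold: "100"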

@psav (Collaborator) commented Oct 29, 2021

> With #481, you should be able to refer to env.status.prometheus.hostname to fill in the Prometheus hostname for KEDA scale triggers. Once that's done, I think we can get this merged.

Ideally we'd like to have some tests around this, just ensuring that it puts out the right resources. Secondly, we need to run make generate && make manifests && make build-template && make api-docs.

@maskarb force-pushed the keda-implementation branch 2 times, most recently from 2e30155 to b2bf73d on November 5, 2021 at 16:44
@maskarb (Member, Author) commented Nov 5, 2021

@kylape and @psav, I made a few updates:

  1. Pulled the Prometheus hostname from the env.
  2. Added various kuttl tests (see the sketch below).
  3. Ran the make generate/manifests/build-template/api-docs commands.
  4. Removed IdleReplicaCount, because Clowder will always try to maintain the minReplicaCount.

With these updates, I think it is ready to go.
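
As a rough sketch of the kuttl idea (the file path, namespace, and resource names here are assumptions, not the actual test files in this PR): a kuttl assert step lists a partial resource, and the test passes once a matching object exists in the cluster.

# Hypothetical assert file, e.g. tests/kuttl/test-autoscaler/01-assert.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hello-app
  namespace: test-autoscaler
spec:
  maxReplicaCount: 10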

@maskarb marked this pull request as ready for review on November 5, 2021 at 16:54
@maskarb changed the title from "wip: [COST-1913] Keda implementation" to "[COST-1913] Keda implementation" on Nov 5, 2021
@kylape (Member) commented Nov 5, 2021

Gah, looks like there are some conflicts with #486. Also, I'm not sure if you GPG-signed your last commit, but our PR check requires that.

@maskarb (Member, Author) commented Nov 5, 2021

Ok, I'll resolve the conflicts and sign the next commit.

@maskarb (Member, Author) commented Nov 6, 2021

I looked into the failing test (test-disabled). I think I understand what is happening. It may be considered a regression, but I'm not sure.

For a Deployment, the default value for replicas is 1. So, if you don't set replicas, you'll get 1 replica.

For the failing test, we create the ClowdApp, set the ClowdApp disabled to true, set the Deployment replicas to 0, then set the ClowdApp disabled to false. The test is expecting reconciliation to cause the replica count to return to 1.

This worked previously because we were always setting the deployment replica count to the minReplicas specified in the ClowdApp spec. For this test, though, that value is not set. So the reconcile was previously unsetting the 0 replicas the test puts on the deployment, and the deployment then falls back to the default replica count of 1.

This behavior is implicit and I think it "just happened to work". Logically, I think it makes sense to specify a minReplicas count for this test. However, if this test is demonstrating the expected behavior of Clowder, then I'll need to update the deployment logic to reset a replica count of 0 back to the default of 1.

@kylape (Member) commented Nov 8, 2021

I think we should add minReplicas: 1 to the failing kuttl tests. I'm not aware of an intentional design around this specific edge case.
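
A sketch of the suggested change, assuming the failing test's ClowdApp deployment looks roughly like the earlier example; only the minReplicas line would be the addition:

deployments:
- name: app
  minReplicas: 1   # explicit minimum, so reconciliation restores the replica count
  podSpec:
    image: quay.io/psav/clowder-hello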

@kylape merged commit a68dc76 into RedHatInsights:master on Nov 8, 2021