Closed
Description
Running Knative 1.15 (OpenShift Serverless operator)
I'm struggling to understand how Knative scales when sending requests. I have a pod running a job (concurrency is 1) that takes a few minutes, and I'd like to scale out to 5 replicas. I have this configuration snippet:
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/max-scale: '5'
        autoscaling.knative.dev/min-scale: '1'
        autoscaling.knative.dev/scale-down-delay: 60s
        autoscaling.knative.dev/target: '1'
        autoscaling.knative.dev/target-utilization-percentage: '50'
        autoscaling.knative.dev/window: 6s
      creationTimestamp: null
    spec:
      containerConcurrency: 1
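For what it's worth, my rough mental model of how these annotations combine (a sketch only, based on my reading of the docs, not the actual autoscaler code): the observed concurrency is divided by an effective per-pod target of `target * target-utilization-percentage`, then clamped to the min/max bounds. With `target: 1` and 50% utilization, the effective target is 0.5, so even a single in-flight request should ask for 2 pods:

```python
import math

def desired_replicas(observed_concurrency: float,
                     target: float = 1.0,
                     utilization: float = 0.50,
                     max_scale: int = 5,
                     min_scale: int = 1) -> int:
    """Sketch of the sizing rule: observed concurrency divided by the
    effective per-pod target (target x utilization), clamped to bounds."""
    effective_target = target * utilization  # 1 * 0.5 = 0.5 here
    want = math.ceil(observed_concurrency / effective_target)
    return max(min_scale, min(max_scale, want))

print(desired_replicas(1))  # one in-flight request -> 2 pods
print(desired_replicas(3))  # 3 requests -> 6, clamped to max-scale 5
```

If this model is right, the first curl spawning a second replica makes sense, but I'd then expect the second curl to spawn more, which is not what I observe.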
I have implemented a readiness probe that returns 503 as long as the job is not finished (probed every second).
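For context, the probe is configured roughly like this (standard Kubernetes `readinessProbe` fields; the path and port here are placeholders, not my exact values):

```yaml
readinessProbe:
  httpGet:
    path: /ready        # placeholder endpoint; returns 503 while the job runs
    port: 8080          # placeholder container port
  periodSeconds: 1      # probed every second, as described above
  failureThreshold: 1
```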
In this scenario, here is what happens when I run curl requests every few seconds:
- 1st curl request: the request is routed to the available pod and a new replica is spawned --> expected
- 2nd curl request: the request is routed to the available pod but does NOT spawn a new replica --> unexpected (because we're exceeding the target utilization percentage?)
- 3rd: curl hangs (waits); no additional replicas are requested
- 4th: the autoscaler spawns two new replicas; the 3rd curl is routed to one of the new pods, while the 4th curl is routed to an existing pod and results in HTTP 503 (returned by the application), thus bypassing the readiness probe
- Subsequent curls: routed to the falsely-available pod and result in an application-level 503
Is all of this expected? According to the documentation, the autoscaler should add one new replica every time I run a curl request, so that it is ready for a future request.