update paasta docs for multi-metrics #3854

Merged 1 commit on May 8, 2024
58 changes: 55 additions & 3 deletions docs/source/autoscaling.rst
@@ -30,7 +30,7 @@ The HPA will ignore the load on your pods between when they first start up and w
This ensures that the HPA doesn't incorrectly scale up due to this warm-up CPU usage.

Autoscaling parameters are stored in an ``autoscaling`` attribute of your instances as a dictionary.
-Within the ``autoscaling`` attribute, setting a ``metrics_provider`` will allow you to specify a method that determines the utilization of your service.
+Within the ``autoscaling`` attribute, setting ``metrics_providers`` will allow you to specify one or more methods to determine the utilization of your service.
If no metrics provider is specified, the ``cpu`` metrics provider will be used.
Setting a ``setpoint`` specifies the target utilization for your service.
The default ``setpoint`` is 0.8 (80%).
@@ -46,8 +46,9 @@ Let's look at a sample Kubernetes config file:
     min_instances: 30
     max_instances: 50
     autoscaling:
-      metrics_provider: cpu
-      setpoint: 0.5
+      metrics_providers:
+      - type: cpu
+        setpoint: 0.5

This makes the instance ``main`` autoscale using the ``cpu`` metrics provider.
PaaSTA will aim to keep this service's CPU utilization at 50%.
@@ -77,6 +78,23 @@ The currently available metrics providers are:
With the ``gunicorn`` metrics provider, Paasta will configure your pods to run an additional container with the `statsd_exporter <https://github.com/prometheus/statsd_exporter>`_ image.
This sidecar will listen on port 9117 and receive stats from the gunicorn service. The ``statsd_exporter`` will translate the stats into Prometheus format, which Prometheus will scrape.
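As a sketch, an instance scaled with the ``gunicorn`` provider might be configured as follows (the instance name and numeric values here are illustrative, not taken from a real service):

.. sourcecode:: yaml

   main:
     min_instances: 3
     max_instances: 10
     autoscaling:
       metrics_providers:
       - type: gunicorn
         setpoint: 0.7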

:active-requests:
With the ``active-requests`` metrics provider, Paasta will use Envoy metrics to scale your service based on the amount
of incoming traffic. Note that, instead of using ``setpoint``, the active requests provider looks at the
``desired_active_requests_per_replica`` field of the autoscaling configuration to determine how to scale.
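   For example, a minimal ``active-requests`` configuration might look like the following sketch (the replica counts and target value are illustrative); note that ``desired_active_requests_per_replica`` replaces ``setpoint`` here:

   .. sourcecode:: yaml

      main:
        min_instances: 30
        max_instances: 50
        autoscaling:
          metrics_providers:
          - type: active-requests
            desired_active_requests_per_replica: 10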

:piscina:
This metrics provider is only valid for the Yelp-internal server-side-rendering (SSR) service. With the ``piscina``
metrics provider, Paasta will scale your SSR instance based on how many Piscina workers are busy.

:arbitrary_promql:
The ``arbitrary_promql`` metrics provider allows you to specify any Prometheus query you want using the `Prometheus
query language (PromQL) <https://prometheus.io/docs/prometheus/latest/querying/basics/>`_. The autoscaler will attempt
to scale your service to keep the value of this metric at whatever setpoint you specify.

.. warning:: Using arbitrary Prometheus queries to scale your service is challenging and should only be attempted by
   advanced users. Make sure you know exactly what you're doing, and test your changes thoroughly in a safe environment
   before deploying to production.
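As a rough illustration only, an ``arbitrary_promql`` configuration might look like the sketch below. The ``type`` and ``setpoint`` fields follow the pattern of the other providers; the key holding the query itself (``prometheus_adapter_config`` / ``metricsQuery`` here) and the query are assumptions for illustration — consult the full PaaSTA documentation for the actual schema:

.. sourcecode:: yaml

   main:
     max_instances: 50
     autoscaling:
       metrics_providers:
       - type: arbitrary_promql
         setpoint: 0.5
         # NOTE: key names below are illustrative assumptions, not the
         # confirmed PaaSTA schema.
         prometheus_adapter_config:
           metricsQuery: sum(rate(http_requests_total{paasta_service="my-service"}[5m]))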

Decision policies
^^^^^^^^^^^^^^^^^
@@ -101,6 +119,40 @@ The currently available decision policies are:
An external process should periodically decide how many replicas this service needs to run, and use the Paasta API to tell Paasta to scale.
See the :ref:`How to create a custom (bespoke) autoscaling method` section for details.

Using multiple metrics providers
--------------------------------

Paasta allows you to configure multiple metrics providers for your service from the list above. The service autoscaler
will scale your service up if *any* of the configured metrics exceeds its target value; conversely, it will
scale down only when *all* of the configured metrics are below their target values. You can configure multiple metrics
providers by supplying a list in the ``autoscaling.metrics_providers`` field, as follows:

.. sourcecode:: yaml

   ---
   main:
     cpus: 1
     mem: 300
     min_instances: 30
     max_instances: 50
     autoscaling:
       metrics_providers:
       - type: cpu
         setpoint: 0.5
       - type: active-requests
         desired_active_requests_per_replica: 10

There are a few restrictions on using multiple metrics for scaling your service, namely:

1. You cannot specify the same metrics provider multiple times.
2. You cannot use bespoke autoscaling (see Decision policies, above) with multiple metrics providers.
3. For Yelp-internal services, you cannot combine the PaaSTA autotuner with multiple metrics providers if one of
   those providers is ``cpu``. You must explicitly opt out of autotuning by setting a ``cpus`` value for the
   service instance.

If you run ``paasta validate`` for your service, it will check these conditions for you.
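For example, a hypothetical config like the following violates restriction 1 by listing the ``cpu`` provider twice, and ``paasta validate`` would flag it (the values here are illustrative):

.. sourcecode:: yaml

   main:
     max_instances: 50
     autoscaling:
       metrics_providers:
       - type: cpu
         setpoint: 0.5
       - type: cpu    # duplicate provider -- not allowed
         setpoint: 0.8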


How to create a custom (bespoke) autoscaling method
---------------------------------------------------
