
Throttle fetch requests on response size quota #7658

Merged: 4 commits, Jan 12, 2023

Conversation

@ZeDRoman (Contributor) commented Dec 8, 2022

Don't apply request-size throttling to fetch requests.
Fetch requests themselves are small, so there is no need to throttle them on request size.

Instead, fetch requests need to be throttled on response size. This PR creates a separate quota for response sizes, and that quota is used for fetch requests.

We cannot know a response's size before the response has been computed, so the current response-size rate is only available once the response is ready. Delaying a response after all the work has already been done makes no sense. Instead, we record the response size in the quota manager and use that history to delay the next fetch request that arrives at the node.
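The record-then-delay scheme described above can be sketched as a small model (Python for illustration only; the class and method names here are hypothetical, not the actual Redpanda C++ API):

```python
import time

class FetchResponseQuota:
    """Toy model of the record-then-delay scheme: response bytes are
    recorded after the response has been built (never delaying that
    response), and the *next* fetch is delayed if the observed byte
    rate exceeds the target."""

    def __init__(self, target_bytes_per_sec, window_sec=1.0):
        self.target = target_bytes_per_sec
        self.window = window_sec
        self.recorded = 0                     # bytes seen in current window
        self.window_start = time.monotonic()

    def record_response(self, size_bytes):
        # Called after the response is computed; only bookkeeping happens here.
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.recorded = 0
            self.window_start = now
        self.recorded += size_bytes

    def throttle_delay(self):
        # Called when the next fetch arrives: returns how long it should
        # wait so that the average rate falls back to the target.
        elapsed = max(time.monotonic() - self.window_start, 1e-9)
        if self.recorded / elapsed <= self.target:
            return 0.0
        return self.recorded / self.target - elapsed
```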

Backports Required

  • none - not a bug fix
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

Release Notes

Features

  • Added a fetch throttling mechanism: target_fetch_quota_byte_rate sets the target fetch size quota byte rate per client (bytes per second). Disabled by default.
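Assuming the property behaves like other Redpanda cluster properties, enabling it might look like this (the value, roughly 10 MB/s, is just an example):

```shell
# Hypothetical usage: cap fetch responses at about 10 MB/s per client_id.
# The property is unset (disabled) by default.
rpk cluster config set target_fetch_quota_byte_rate 10485760
```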

@ZeDRoman ZeDRoman force-pushed the throttle-response-size branch 3 times, most recently from 441e2e2 to a4bb7e3, December 8, 2022 16:17
@ZeDRoman ZeDRoman requested a review from dlex December 8, 2022 20:59
@ZeDRoman ZeDRoman force-pushed the throttle-response-size branch 5 times, most recently from 476d57f to 2ac5095, December 15, 2022 08:40
@ZeDRoman ZeDRoman changed the title Throttle on response size quota Throttle fetch requests on response size quota Dec 15, 2022
@ZeDRoman ZeDRoman marked this pull request as ready for review December 15, 2022 15:13
@@ -29,6 +29,7 @@
#include <absl/container/flat_hash_map.h>

#include <memory>
#include <string_view>
Contributor

string_view is not used in this header

src/v/kafka/server/connection_context.cc (review thread resolved)
@@ -302,6 +309,7 @@ class connection_context final
const bool _enable_authorizer;
ctx_log _authlog;
std::optional<security::tls::mtls_state> _mtls_state;
request_data _request_data;
Contributor

(discuss) Might it be a better idea to save this data in session_resources instead? As far as I understand, session_resources is an object that is local to the handling of a request and strongly associated with it, whereas connection_context is long-lived, has the same lifetime as the connection, and is shared between all the requests from that connection.

Contributor Author

Agree

"target_fetch_quota_byte_rate",
"Target fetch size quota byte rate (bytes per second) - disabled default",
{.needs_restart = needs_restart::no, .visibility = visibility::user},
std::nullopt)
Contributor

(discuss) I think there is some value in making the response TP rate limiting symmetric with request TP limiting. The request TP limit defaults to 2GB/s, and when it was introduced it was not 100% compatible with previous behavior. However, 2GB/s is quite a high limit and I don't think many of our customers will ever hit it. Please let me know if you see any pros to making it different in this case.

Contributor Author

I think we can leave this disabled by default to preserve the previous behavior.
It is hard to know how much customers really consume; somebody may have multiple consumers with the same client_id.
If a customer wants this limit, they can configure it.

rate,
*_target_fetch_tp_rate(),
it->second.tp_fetch_rate.window_size());
}
Contributor

I would suggest supplying rate_tracker& and uint32_t target_fetch_rate as arguments to quota_manager::throttle(). Then the enum request_type won't be needed, and the code will be saved from extra branching and from the code duplication above.
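The suggested refactor could look roughly like this (a Python stand-in for the C++ signatures; the names follow the comment, but the delay formula is purely illustrative):

```python
class RateTracker:
    """Stand-in for the C++ rate_tracker interface assumed here."""
    def __init__(self, rate):
        self._rate = rate

    def measure(self):
        return self._rate

def throttle(tracker, target_rate):
    # The caller picks which (tracker, target) pair to pass in, so no
    # request_type enum and no per-type branching is needed inside.
    rate = tracker.measure()
    if rate <= target_rate:
        return 0.0
    # delay (seconds) needed for the average to fall back to the target
    return (rate - target_rate) / target_rate
```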

src/v/kafka/server/quota_manager.cc (review thread resolved, outdated)
src/v/resource_mgmt/rate.h (review thread resolved)
src/v/kafka/server/quota_manager.cc (review thread resolved, outdated)
consumer.poll(timeout_ms=1000, max_records=1)
assert consumer.metrics(
)["consumer-fetch-manager-metrics"]["fetch-throttle-time-max"] > 0

Contributor

Besides verifying just that throttling occurs, it would be great to verify that the configured limit is actually honoured. For example, consume 100MB at a 10MB/s rate and make sure it does not complete in less than 10s.
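A check along those lines might be structured like this (a framework-independent sketch; consume_fn stands in for a single consumer poll and returns the number of bytes it received):

```python
import time

def assert_rate_limited(consume_fn, total_bytes, rate_bytes_per_sec):
    """Consume total_bytes and verify the transfer took at least
    total/rate seconds, i.e. the configured quota was honoured."""
    start = time.monotonic()
    consumed = 0
    while consumed < total_bytes:
        consumed += consume_fn()
    elapsed = time.monotonic() - start
    min_expected = total_bytes / rate_bytes_per_sec
    assert elapsed >= min_expected, (
        f"{total_bytes} bytes in {elapsed:.2f}s; at "
        f"{rate_bytes_per_sec} B/s it should take >= {min_expected:.2f}s")
```

For the example in the comment, that would be assert_rate_limited(poll_once, 100 * 2**20, 10 * 2**20).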

tests/rptest/tests/cluster_quota_test.py (review thread resolved, outdated)
@ZeDRoman ZeDRoman requested a review from dlex December 16, 2022 20:29
@ZeDRoman ZeDRoman force-pushed the throttle-response-size branch 4 times, most recently from 700eec8 to eb37c79, December 30, 2022 11:52
@emaxerrno (Contributor)

Question: per session (TCP), couldn't the source core keep a float32 rate of fetch bytes and predictively slow down the fetch request size by mutating the request object itself for some time period? Say, it resets every 3 seconds if no fetch is received.

@ZeDRoman (Contributor Author) commented Jan 6, 2023

Question: per session (TCP), couldn't the source core keep a float32 rate of fetch bytes and predictively slow down the fetch request size by mutating the request object itself for some time period? Say, it resets every 3 seconds if no fetch is received.

Do you mean something like predicting the next fetch size from previous results?
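For what it's worth, the idea in the question could be modeled like this (entirely hypothetical; the class name, smoothing constants, and reset period are made up for illustration):

```python
class PredictiveFetchLimiter:
    """Per-connection sketch: track an exponentially weighted rate of
    fetch bytes and shrink max_bytes on the next request when the rate
    is over target. The estimate resets after an idle period, as the
    question suggests."""

    def __init__(self, target_bytes_per_sec, reset_after=3.0):
        self.target = target_bytes_per_sec
        self.reset_after = reset_after
        self.ewma_rate = 0.0
        self.last_fetch = None

    def on_fetch(self, now, requested_max_bytes, last_response_bytes):
        if self.last_fetch is None or now - self.last_fetch > self.reset_after:
            self.ewma_rate = 0.0          # reset after an idle period
        else:
            dt = max(now - self.last_fetch, 1e-9)
            sample = last_response_bytes / dt
            self.ewma_rate = 0.8 * self.ewma_rate + 0.2 * sample
        self.last_fetch = now
        if self.ewma_rate > self.target:
            # mutate the request (shrink its size) instead of delaying it
            return max(1, int(requested_max_bytes * self.target / self.ewma_rate))
        return requested_max_bytes
```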

@dlex (Contributor) previously approved these changes Jan 7, 2023

LGTM

== fetch_api::key) {
_server.quota_mgr().record_fetch_tp(
resp_and_res.resources->request_data.client_id, msg.size());
}
Contributor

nit: I think the try...catch here is only intended to mask exceptions thrown in conn->write(), so I would rather place this call before the try.
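In Python terms, the suggested move looks like this (all names here are stand-ins for the C++ code under review):

```python
FETCH_API_KEY = 1  # assumed value, for illustration only

class QuotaMgr:
    def __init__(self):
        self.recorded = []

    def record_fetch_tp(self, client_id, size):
        self.recorded.append((client_id, size))

def process_response(quota_mgr, request_key, client_id, msg, conn):
    # Record the fetch response size before entering the try block, so
    # the except clause only masks failures from conn.write(), as intended.
    if request_key == FETCH_API_KEY:
        quota_mgr.record_fetch_tp(client_id, len(msg))
    try:
        conn.write(msg)
    except ConnectionError:
        pass  # only write() failures are swallowed
```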

Contributor Author

fixed

@ZeDRoman (Contributor Author) commented Jan 9, 2023

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

@ZeDRoman (Contributor Author)

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

@ZeDRoman (Contributor Author)

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

@ZeDRoman (Contributor Author)

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

@ZeDRoman ZeDRoman force-pushed the throttle-response-size branch 3 times, most recently from bbb0f75 to 0d2c37c, January 11, 2023 20:44
@ZeDRoman (Contributor Author)

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

@ZeDRoman (Contributor Author)

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

Do not apply request throttling to fetch requests.
Fetch requests are now throttled by response size.
If a fetch request breaks the quota limit, the next request will be
throttled according to the previous rate.
@ZeDRoman (Contributor Author)

/ci-repeat 10
skip-units
dt-repeat=100
tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest

@dotnwat (Member) left a comment

is this going to have conflicts with @dlex's PR that adds a new control at the kafka layer? we should make sure the sequence of merging makes sense.

@ZeDRoman ZeDRoman requested a review from dlex January 12, 2023 16:12
@dlex (Contributor) commented Jan 12, 2023

is this going to have conflicts

np, I will resolve them

@ZeDRoman ZeDRoman merged commit a903a64 into redpanda-data:dev Jan 12, 2023
@dlex dlex mentioned this pull request Jan 12, 2023
6 tasks