ratelimit: add support for emitting filter state stats for access logging #39086
Conversation
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to the API.

Force-pushed from 7f1adf4 to 4ca3fbd (compare)
…ging Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>
Thanks for the contribution. From my perspective, tracing is the best way to capture the overhead of the rate limit call. Cluster-level request-time buckets (histogram stats) are another option that could be used to observe the performance of the rate limit server.
Failing that, it's fine to add an additional simple timepoint in the downstream timing system. See DownstreamTiming::setValue.
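The timepoint suggestion above can be illustrated with a self-contained sketch. The types below are simplified stand-ins, not Envoy's actual API: the real DownstreamTiming lives in Envoy's StreamInfo and has a different interface; this only shows the idea of recording named monotonic timestamps per stream for later access logging.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a per-stream timing store: named monotonic
// timepoints, set once by a filter and queried later by access logging.
class DownstreamTiming {
public:
  using Clock = std::chrono::steady_clock;

  // Record a named timepoint (e.g. when the rate limit response arrived).
  void setValue(const std::string& key, Clock::time_point value) {
    timings_[key] = value;
  }

  // Look up a timepoint; absent keys yield nullopt so loggers can skip them.
  std::optional<Clock::time_point> getValue(const std::string& key) const {
    auto it = timings_.find(key);
    if (it == timings_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

private:
  std::map<std::string, Clock::time_point> timings_;
};
```

A logger could then subtract two such timepoints to derive the rate limit call's latency without any new filter state object.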
/**
 * Returns the StreamInfo of the current request if possible. By default this returns nullptr.
 */
virtual StreamInfo::StreamInfo const* streamInfo() const { return nullptr; }
The client will be refactored in the future to be shared by multiple requests, so I don't think this is a correct way to expose the inner stream info.
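The pattern under review can be sketched in isolation. The stand-in types below are hypothetical (not Envoy's real classes); the sketch shows why the default-nullptr accessor forces callers to null-check, and why a client shared by multiple requests has no single StreamInfo to return, which is the reviewer's concern.

```cpp
#include <cassert>

// Hypothetical stand-in for Envoy's StreamInfo type, for illustration only.
namespace StreamInfo {
struct StreamInfo {
  int response_code = 0;
};
} // namespace StreamInfo

// Base client interface: returns nullptr by default, so callers must check.
class Client {
public:
  virtual ~Client() = default;
  virtual StreamInfo::StreamInfo const* streamInfo() const { return nullptr; }
};

// Only a client bound to exactly one request can meaningfully override this;
// a client shared across requests would have no single StreamInfo to expose.
class PerRequestClient : public Client {
public:
  StreamInfo::StreamInfo const* streamInfo() const override { return &info_; }

private:
  StreamInfo::StreamInfo info_;
};
```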
// When set to true, the filter will emit per-stream stats for access logging. These stats will be
// stored in the filter state under the filter name. The filter will emit latency, bytes
// sent/received, upstream host, and upstream cluster info.
//
// .. note::
//   Stats are only emitted to the filter state if a check request is actually made to the rate
//   limit service.
bool emit_filter_state_stats = 16;
If this doesn't change any behavior and won't add too much overhead, then I think we can enable it by default and skip the additional flag.
class RateLimitLoggingInfo : public Envoy::StreamInfo::FilterState::Object {
public:
  RateLimitLoggingInfo() = default;

  absl::optional<std::chrono::microseconds> latency() const { return latency_; }
  absl::optional<uint64_t> bytesSent() const { return bytes_sent_; }
  absl::optional<uint64_t> bytesReceived() const { return bytes_received_; }
  Upstream::ClusterInfoConstSharedPtr clusterInfo() const { return cluster_info_; }
  Upstream::HostDescriptionConstSharedPtr upstreamHost() const { return upstream_host_; }

  void setLatency(std::chrono::microseconds latency) { latency_ = latency; }
  void setBytesSent(uint64_t bytes_sent) { bytes_sent_ = bytes_sent; }
  void setBytesReceived(uint64_t bytes_received) { bytes_received_ = bytes_received; }
  void setClusterInfo(Upstream::ClusterInfoConstSharedPtr cluster_info) {
    cluster_info_ = std::move(cluster_info);
  }
  void setUpstreamHost(Upstream::HostDescriptionConstSharedPtr upstream_host) {
    upstream_host_ = std::move(upstream_host);
  }

  bool hasFieldSupport() const override { return true; }

private:
  absl::optional<std::chrono::microseconds> latency_;
  absl::optional<uint64_t> bytes_sent_;
  absl::optional<uint64_t> bytes_received_;
  Upstream::ClusterInfoConstSharedPtr cluster_info_;
  Upstream::HostDescriptionConstSharedPtr upstream_host_;
};
I don't see how this works for logging. And I think this actually exposes too much info, which exceeds the original requirements of #39018. In a complex system like this, not doing something is often more important than doing something.
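The intended flow for logging can be sketched in a self-contained form. The types below are simplified stand-ins (Envoy's real FilterState uses StreamInfo::FilterState with typed data slots and lifetimes, not a plain map), and the filter-state key is assumed to be the filter name as the proto comment states; the point is only to show the write side (filter) and the read side (access logger) of the object.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string>

// Hypothetical stand-ins for Envoy's filter state machinery.
struct FilterStateObject {
  virtual ~FilterStateObject() = default;
};
using FilterState = std::map<std::string, std::shared_ptr<FilterStateObject>>;

// Simplified version of the RateLimitLoggingInfo object from this PR.
class RateLimitLoggingInfo : public FilterStateObject {
public:
  std::optional<std::chrono::microseconds> latency() const { return latency_; }
  std::optional<uint64_t> bytesSent() const { return bytes_sent_; }

  void setLatency(std::chrono::microseconds latency) { latency_ = latency; }
  void setBytesSent(uint64_t bytes_sent) { bytes_sent_ = bytes_sent; }

private:
  std::optional<std::chrono::microseconds> latency_;
  std::optional<uint64_t> bytes_sent_;
};

// Filter side: after the check request completes, store stats under the
// filter name so access loggers can find them later.
inline void recordRateLimitStats(FilterState& state,
                                 std::chrono::microseconds latency,
                                 uint64_t bytes_sent) {
  auto info = std::make_shared<RateLimitLoggingInfo>();
  info->setLatency(latency);
  info->setBytesSent(bytes_sent);
  state["envoy.filters.http.ratelimit"] = std::move(info);
}

// Access-log side: look up the object by key and read a field if present.
inline std::optional<std::chrono::microseconds>
loggedRateLimitLatency(const FilterState& state) {
  auto it = state.find("envoy.filters.http.ratelimit");
  if (it == state.end()) {
    return std::nullopt;
  }
  auto* info = dynamic_cast<const RateLimitLoggingInfo*>(it->second.get());
  return info ? info->latency() : std::nullopt;
}
```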
/wait

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
## Description

This PR adds support for emitting rate limiting stats to the filter state, accessible via access logs. It allows users to collect details about rate limit service interactions, including latency measurements, bytes sent/received, and upstream host information.

It introduces a new configuration option, emit_filter_state_stats, on the rate limit filter which, when enabled, stores detailed statistics about each rate limit request in the filter state. This data can then be accessed via access logs for monitoring and performance analysis. The feature is useful for tracking the performance overhead introduced by rate limiting in production environments and helps users better understand and optimize their rate limiting configuration.
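A configuration sketch of how the option might be enabled on the filter (the surrounding fields follow the existing envoy.filters.http.ratelimit v3 schema; cluster and domain names are placeholders, and emit_filter_state_stats is the field proposed in this PR):

```yaml
http_filters:
- name: envoy.filters.http.ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
    domain: my_domain                      # placeholder domain
    rate_limit_service:
      transport_api_version: V3
      grpc_service:
        envoy_grpc:
          cluster_name: ratelimit_cluster  # placeholder cluster
    # Proposed in this PR: store per-stream rate limit stats in filter state
    # so access loggers can read latency, bytes, and upstream info.
    emit_filter_state_stats: true
```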
Fixes: #39018
Commit Message: Add filter state statistics collection for rate limit filter
Additional Description: This PR adds a new configuration option, emit_filter_state_stats, which, when enabled, stores rate limit statistics in filter state. The statistics include latency measurements, bytes sent/received, upstream host info, and cluster info. These statistics can be accessed via access logs for monitoring purposes.
Risk Level: Low
Testing: Added integration tests
Docs Changes: Added
Release Notes: Added