Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(cardinality): Use a Lua script and in-memory cache for the cardinality limiter #2849

Merged
merged 12 commits into from
Dec 19, 2023

Conversation

Dav1dde
Copy link
Member

@Dav1dde Dav1dde commented Dec 13, 2023

Changes the cardinality limiter implementation to use a Lua script. The Lua script evaluates each individual hash and returns a decision wether this hash is accepted or rejected by the cardinality limiter, this allows us to save one additional round trip to Redis.

Additionally there is an in memory cache infront of Redis remembering previous decisions from the Redis script, allowing us to skip Redis calls alltogether.

Epic: #2717

@Dav1dde Dav1dde force-pushed the dav1d/cardinality-lua branch 3 times, most recently from 226d4b1 to 313f0df Compare December 13, 2023 15:12
@Dav1dde Dav1dde requested a review from jjbayer December 13, 2023 15:53
@Dav1dde Dav1dde self-assigned this Dec 13, 2023
@Dav1dde Dav1dde marked this pull request as ready for review December 13, 2023 17:25
@Dav1dde Dav1dde requested a review from a team as a code owner December 13, 2023 17:25
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, see comment on batch calls. I assume that getting reducing the number of redis commands is still beneficial, even within a Lua script.

relay-cardinality/src/cardinality.lua Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/statsd.rs Outdated Show resolved Hide resolved
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some open comments, but no blockers as far as I am concerned!

for i = 1, #KEYS do
local is_working_set = i == 1

for from, to in batches(#t, 7000, offset, max) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should batch exclusively on the Relay side. That would simplify this script, and we need some sort of batching there anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like batching in the lua script, makes the relay side easier to read and we have a guarantee that this never fails in lua (one less error to handle and invariant).

The lua implementation would get slightly easier, but we would still need a slicing function like batches (to slice offset and max), which now is just a minimal overhead to the batches function (just a change of the max value and the offset).

So overall imo we don't gain much by moving it to relay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we would still need a slicing function like batches

That makes sense!

So overall imo we don't gain much by moving it to relay.

Even if we keep batching on the Lua side, don't we also need batching on the Rust side? Not sure if we would run into size limits or network timeouts otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how Redis and the network behaves here, in my local tests it didn't make a difference but there is also no latency and Redis does not do anything else (e.g. rate limiting etc.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we batch and use a pipeline, I feel like that would make no difference then?

relay-cardinality/src/cardinality.lua Outdated Show resolved Hide resolved
relay-cardinality/src/cardinality.lua Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
@Dav1dde Dav1dde requested a review from jjbayer December 18, 2023 17:12
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of questions, but no blockers!

Please update the PR title & description to mention the cache as well as the Lua script.

for i = 1, #KEYS do
local is_working_set = i == 1

for from, to in batches(#t, 7000, offset, max) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we would still need a slicing function like batches

That makes sense!

So overall imo we don't gain much by moving it to relay.

Even if we keep batching on the Lua side, don't we also need batching on the Rust side? Not sure if we would run into size limits or network timeouts otherwise.

/// Implementation uses Redis sets to keep track of cardinality.
pub struct RedisSetLimiter {
redis: RedisPool,
window: SlidingWindow,
script: CardinalityScript,
cache: Cache,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Does the cache have to be a property of RedisSetLimiter? Since it's independent of redis, could it be part of CardinalityLimiter?

Or go fancy and use the Composite Pattern:

classDiagram
    Limiter <|-- CachedLimiter
    Limiter <|-- RedisSetLimiter
    CachedLimiter "1" *-- "2" Limiter
Loading

Copy link
Member Author

@Dav1dde Dav1dde Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 definitely can be improved, initially this was heavily tied into the Lua logic, but it is pretty separate now. Ill do this in a followup PR.

It is still a bit tricky since it does require the same sliding window logic and ties into the splitting by scope logic which is Redis specific.

relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
relay-cardinality/src/redis.rs Outdated Show resolved Hide resolved
@Dav1dde Dav1dde changed the title ref(cardinality): Use a lua script for the cardinality limiter ref(cardinality): Use a Lua script and in-memory cache for the cardinality limiter Dec 19, 2023
@Dav1dde Dav1dde merged commit c9198f9 into master Dec 19, 2023
21 checks passed
@Dav1dde Dav1dde deleted the dav1d/cardinality-lua branch December 19, 2023 13:20
jan-auer added a commit that referenced this pull request Dec 19, 2023
* master: (35 commits)
  fix(spans): Parse quotes in MySQL (#2846)
  ref(cardinality): Use a Lua script and in-memory cache for the cardinality limiter (#2849)
  fix(spans): Detect hex with fallback scrubber (#2868)
  release: 23.12.0
  Revert "ci: Update upload-artifact and download-artifact actions" (#2866)
  Revert "build: Update axum and http" (#2863)
  feat(spans): Allow resource.img spans (#2855)
  build: Update axum and http (#2844)
  fix(build): Add additional dependencies to the release build (#2858)
  ci: Update upload-artifact and download-artifact actions (#2861)
  feat(spans): Parse timestamps from strings (#2857)
  fix(spans): Scrub integer file extensions (#2856)
  feat(spans): Remove unused transaction tag from resource metrics (#2853)
  ref(cardinality): Recover buckets on cardinality limiter failure (#2852)
  feat(server): Org rate limit per metric bucket (#2836)
  ref(spans): List metric tags explicitly (#2834)
  feat(spans): Resource response sizes as measurements (#2845)
  feat(crons): Add thresholds to monitor config payload (#2842)
  feat(spans): Allow ingestion of metrics summary on spans (#2823)
  ref(crons): Add documentation to CheckInMessageType (#2840)
  ...
jan-auer added a commit that referenced this pull request Dec 20, 2023
* master:
  feat(processor): Add a metric to track the time of all messages (#2877)
  build(py): Update to python 3.9 (#2876)
  fix(spans): Collapse constants in SQL select (#2867)
  fix(spans): Parse quotes in MySQL (#2846)
  ref(cardinality): Use a Lua script and in-memory cache for the cardinality limiter (#2849)
  fix(spans): Detect hex with fallback scrubber (#2868)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants