Add frequency tables to TSDB. #2637
Merged
This adds a new data type, frequency tables, to the time series database and the API methods for:
The most notable implementation is in the Redis backend, which uses Lua scripting to build the frequency table data structure from a top-N index (modeled as a sorted set in Redis) and an estimation matrix for a Count-Min Sketch (implemented using a hash table). Together, these two structures allow estimated top-N queries over high (effectively unbounded) cardinality in almost fixed space. Counts are 100% accurate until the index is filled (and no extra space is used for the estimation matrix until this point). After that, the data structure switches to a probabilistic implementation: accuracy begins to degrade for less frequently observed items, but remains accurate for more frequently observed items. Check out the comment in src/sentry/scripts/tsdb/cmsketch.lua for a more detailed explanation of the implementation.

This adds 6 new metrics to the TSDB:
This depends on getsentry/rb#14, which adds a cluster.execute_commands method that can be used to run pipelines containing Lua scripts.
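
For readers unfamiliar with the approach, here is a minimal in-memory sketch of the frequency table idea described above: an exact top-N index that falls back to Count-Min Sketch estimates once the index is full. It is not the Lua script from this PR. The class name FrequencyTable, its parameters (capacity, depth, width), the SHA-1 based hashing, and the eviction rule are all assumptions made for illustration; a plain dict stands in for the Redis sorted set and another for the hash table that holds the estimation matrix.

```python
import hashlib
from collections import defaultdict


class FrequencyTable:
    """Illustrative only: exact top-N index backed by a Count-Min Sketch."""

    def __init__(self, capacity=3, depth=4, width=64):
        self.capacity = capacity  # maximum number of items kept in the index
        self.depth = depth        # number of hash rows in the estimation matrix
        self.width = width        # number of columns per row
        self.index = {}           # item -> count (stands in for the sorted set)
        self.matrix = None        # (row, column) -> count, allocated lazily

    def _cells(self, item):
        # One cell per row, chosen by hashing the row number with the item.
        for row in range(self.depth):
            digest = hashlib.sha1(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def _sketch_add(self, item, amount):
        # Add to every row and return the minimum cell value, which is the
        # Count-Min estimate (it can overestimate but never underestimate).
        estimate = None
        for cell in self._cells(item):
            self.matrix[cell] += amount
            value = self.matrix[cell]
            estimate = value if estimate is None else min(estimate, value)
        return estimate

    def increment(self, item, amount=1):
        if self.matrix is None:
            # Exact mode: no estimation matrix exists yet.
            if item in self.index or len(self.index) < self.capacity:
                self.index[item] = self.index.get(item, 0) + amount
                return
            # The index is full and a new item arrived: switch to estimation
            # and seed the matrix with the exact counts collected so far.
            self.matrix = defaultdict(int)
            for known, count in self.index.items():
                self._sketch_add(known, count)
        estimate = self._sketch_add(item, amount)
        if item in self.index or len(self.index) < self.capacity:
            self.index[item] = estimate
            return
        # Evict the smallest indexed item if the newcomer's estimate beats it.
        smallest = min(self.index, key=self.index.get)
        if estimate > self.index[smallest]:
            del self.index[smallest]
            self.index[item] = estimate

    def most_frequent(self, limit=None):
        # Highest counts first; ties broken by item name for stable output.
        ranked = sorted(self.index.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:limit]
```

While fewer than capacity distinct items have been seen, most_frequent() returns exact counts and matrix stays None, which mirrors the "no extra space until the index is filled" behavior. Once a new item arrives with the index full, every later count is an estimate that may be too high for rarely seen items that collide with frequent ones.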