Skip to content

Document index:quantile() #4975

@TarantoolBot

Description

@TarantoolBot

Related dev. issue(s): tarantool/tarantool#11111
Product: Tarantool
Since: 3.4
Root document: https://www.tarantool.io/en/doc/latest/reference/reference_lua/box_index/
SME: @ locker

Details

Scope:

The new index method finds a quantile point in an indexed data range.
It takes the quantile level (0 < L < 1) and optionally the target range
boundaries and returns such a key that the ratio of tuples less than
the key equals L.

Example:

local s = box.schema.space.create('test')
s:create_index('pk')

for i = 1, 100 do
    s:insert({i * i, i})
end

-- Find the median in the whole index.
s.index.pk:quantile(0.5) -- returns {2601}

-- Find the 90th percentile among all keys < 1000
s.index.pk:quantile(0.9, {}, {1000}) -- returns {784}

The target range is defined as the intersection of the following read
requests:

s:select(begin_key, {iterator = 'ge'})
s:select(end_key, {iterator = 'lt'})

This means that using the empty key {} or nil for both begin_key
and end_key finds the quantile in the whole index. The function raises
an error if there can't possibly be any tuples in the target range, i.e.
if key_begin >= key_end.

The function returns nil if there are no tuples in the target
range or if it failed to find the quantile due to the limitations of
the storage engine, which are described below.

The function is implemented by memtx and vinyl tree indexes only.
In memtx the function returns the quantile point exactly. It has
the logarithmic complexity. It may return nil only if there are
no tuples in the target range. There's guaranteed to be a tuple
in the target range corresponding to the returned key.

The vinyl implementation returns the quantile point with a reasonable
degree of error for performance considerations. It has the logarithmic
complexity as well. There may or may not be a tuple in the target
range corresponding to the returned key. The function may return nil
even if the target range is not empty, in particular, if there are
no disk layers or if the range is smaller than the vinyl page size.

The function doesn't yield.

The function doesn't participate in transactions.

Like other index methods, index:quantile() can be used through
a space object as space:quantile(), in which case it works as
a shortcut for the primary index.

For more details, see the RFC document: https://github.com/orgs/tarantool/discussions/11194
Requested by @ locker in tarantool/tarantool@a8df9b5.

UPD: Since Tarantool 3.6, MemCS engine supports this method as well.

Metadata

Metadata

Assignees

No one assigned

    Labels

    3.4indexRelated to Tarantool indexes

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions