Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add range search support for ETS table of type ordered_set #7399

Open
yushli opened this issue Jun 13, 2023 · 5 comments
Open

Add range search support for ETS table of type ordered_set #7399

yushli opened this issue Jun 13, 2023 · 5 comments
Assignees
Labels
enhancement team:VM Assigned to OTP team VM

Comments

@yushli
Copy link

yushli commented Jun 13, 2023

Is your feature request related to a problem? Please describe.
For example, suppose an ETS table of type ordered_set with key is timestamp, using
ets:select(Ets, ets:fun2ms(fun({Timestamp,Value} = Term) when Timestamp > Time1, Timestamp < Time2 -> Term end).
If ETS is smart enough to know that it is doing a range search then it just uses binary search to compare the limited set of items, otherwise it need to scan the whole table and compare all items which is very inefficient.
Also asked in Erlang forum: https://erlangforums.com/t/is-ets-table-of-type-ordered-set-able-to-do-efficient-range-search-with-match-spec/2689
Describe the solution you'd like
ETS table of type ordered_set is able to support efficient range search.

Describe alternatives you've considered
A simpler alternative would be to provide two functions getClosestSmaller or getClosestLarger that can be used to get the first one that is within the range. For example,
ets:getClosestLarger(Ets, 100) and ets:getClosestSmaller(Ets, 200) will get the bound of 100 < Key < 200. Then et:next() or ets:prev() can be used to iterate to do an efficient range search.

Additional context

@sverker
Copy link
Contributor

sverker commented Jun 13, 2023

getClosestLarger and getClosestSmaller already exist (for ordered_set). They are called ets:next and ets:prev :-)

Docs for ets:next says:
For table type ordered_set, the function always returns the next key after Key1 in term order, regardless whether Key1 ever existed in the table.

@yushli
Copy link
Author

yushli commented Jun 13, 2023

That is good to know. I should have read the doc more carefully. It provides a workaround to this problem. Thank you for pointing this out.

Still, I think it is good that ETS will add some support for doing this range search from match spec. It is more powerful, more flexible and adds room for improvement from within ETS boundary. For example, do a range search with other conditions, then only returns a subset of the items.

Even a step further, if ETS can support some parts of GraphQL with match spec that will greatly help improve the efficiency of data handling in Erlang. For example, current match spec supports using element to select the parts to return. It will help if it supports selecting a subset of a map to return.

@IngelaAndin IngelaAndin added the team:VM Assigned to OTP team VM label Jun 14, 2023
@epfahl
Copy link

epfahl commented Aug 10, 2023

As noted, a sequential range scan between two keys in an ordered_set can be accomplished with ets:next and ets:prev. But I echo support for the inclusion of a library function that executes efficient range queries for an ordered_set, similar to what is provided for Redis sorted set and other sorted set implementations, where it's often called slice.

@starbelly
Copy link
Contributor

Hi @sverker

Out of curiosity, I imagine this could be implemented in two ways. One would essentially be an erlang wrapper on ets, the other would be a bif. Do you think there would be any substantial performance gain that makes the latter worth doing?

@sverker
Copy link
Contributor

sverker commented Nov 30, 2023

The traversal of a sub-range of an ordered set already exists in the ets:select BIF implementation. It analyses the match spec and finds the smallest and largest possible keys and then searches that range of the ordered_set (sorted binary tree).
The problem is that is does not use the guard part of the match spec when doing this analysis.

An alternative could be to do a wrapper around ets:select (and friends) but only to analyse the matchspec. Find the possible key range in Erlang code and then pass that information down to a slightly modified ets:select BIF. That way we don't have to do the match spec analysis in C but still utilize the existing optimized C code for the table traversal with match spec execution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM
Projects
None yet
Development

No branches or pull requests

5 participants