Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automagically apply some functions to columns in WHERE, if func(column) present in PK #10685

Open
UnamedRus opened this issue May 6, 2020 · 3 comments
Labels
comp-optimizers Query optimizations feature

Comments

@UnamedRus
Copy link
Contributor

Use case
We have some column with very long values in it, but we want to use that column in PRIMARY KEY for faster query lookup (WHERE column = 'long_string' or WHERE column IN ('long_string1','long_string2')).
We dont want to load all this long values in ram, so we can use some deterministic function(hash, startsWith, slice) to reduce key size.
Describe the solution you'd like
Add some setting and optimization which would replace WHERE column = 'long_row' with
WHERE cityHash64(column) = cityHash64('long_string') AND column = 'long_string' (if table definition contains ORDER BY (cityHash64(column))

Describe alternatives you've considered
Sometimes we can write that optimization by yourselfs, but sometimes cant.
For example if we run queries from software with query builder.

@filimonov filimonov added the comp-optimizers Query optimizations label May 6, 2020
@amosbird
Copy link
Collaborator

Sounds reasonable. Perhaps we can have a special operator for this, like column ~ 'long_string' and some additional definitions added when creating merge tree tables.

@zhang2014
Copy link
Contributor

CityHash64 function is not injective, how does it work in a sparse index in this case?

BTW: Maybe bloom filter skip index is better?

@amosbird
Copy link
Collaborator

amosbird commented Sep 8, 2021

I just realize this can be achieved simply by extending KeyCondition. = is another case as an "always monotonic" indicator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp-optimizers Query optimizations feature
Projects
None yet
Development

No branches or pull requests

4 participants