Add arrays support for match function #40484

Open
pavel-sbor opened this issue Aug 22, 2022 · 1 comment
@pavel-sbor

Use case

We need to search columns of Array type using regular expressions. The current implementation of the 'match' function (https://clickhouse.com/docs/en/sql-reference/functions/string-search-functions/#matchhaystack-pattern) works only with single-value columns and doesn't support columns of Array type. We know it's possible to use something like arrayExists(x -> match(x,'<regex_value>'),<column>) = 1, but lambdas are too slow on large amounts of data (1 billion values or more).

Describe the solution you'd like

It would be nice to add support for columns of Array type to the 'match' function. The regular expression pattern should be applied to every array item: 0 should be returned if none of the items match the pattern, 1 otherwise. A separate function would also be acceptable.
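
To make the requested semantics concrete, here is a minimal Python sketch of the behaviour described above. The name `match_any` is hypothetical and is not an existing ClickHouse function. Python's `re` module stands in for the RE2 engine that ClickHouse uses, so pattern syntax may differ in places. Returning 0 for an empty array is an assumption.

```python
import re
from typing import Iterable


def match_any(haystacks: Iterable[str], pattern: str) -> int:
    """Return 1 if at least one array item matches the pattern, else 0.

    Hypothetical helper illustrating the requested semantics only.
    """
    # Compile once and reuse the compiled pattern for every item.
    regex = re.compile(pattern)
    # Unanchored search: the pattern may match anywhere inside an item.
    return int(any(regex.search(item) for item in haystacks))


print(match_any(["foo", "bar42"], r"\d+"))
print(match_any(["foo", "bar"], r"\d+"))
```

The result is intended to be the same as that of the arrayExists expression in the use case, evaluated for one row.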

Describe alternatives you've considered

Array iteration with a lambda, as shown above, could be a solution, but we need better lambda performance on large amounts of data (more than 1 billion values in a table).

@canhld94
Contributor

canhld94 commented Sep 1, 2022

I think the main issue here is not the lambda but the amount of data. Array columns are big and reading them is expensive (especially an array of strings, as in your case). I think what you need is an inverted index, which isn't supported yet.
