Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In otel-arrow, we're building a query engine based on Arrow and DataFusion. There are two features we'd like to support:
a) perform a case-insensitive match on a telemetry attribute's key (a string)
b) perform a case-insensitive match on some other string value
This was implemented in open-telemetry/otel-arrow#2501 using `ilike` and escaping the special LIKE characters (%, _, \). This is not ideal when the text we're comparing against contains any of those characters: once they are escaped, the pattern no longer passes the `!contains_like_pattern(pattern)` check, so the comparison is done with a regex match (which is slower) instead of simply using `eq_ignore_ascii_case`:
```rust
pub(crate) fn ilike(pattern: &'a str, is_ascii: bool) -> Result<Self, ArrowError> {
    if is_ascii && pattern.is_ascii() {
        if !contains_like_pattern(pattern) {
            return Ok(Self::IEqAscii(pattern));
        } else if pattern.ends_with('%')
            && !pattern.ends_with("\\%")
            && !contains_like_pattern(&pattern[..pattern.len() - 1])
        {
            return Ok(Self::IStartsWithAscii(&pattern[..pattern.len() - 1]));
        } else if pattern.starts_with('%') && !contains_like_pattern(&pattern[1..]) {
            return Ok(Self::IEndsWithAscii(&pattern[1..]));
        }
    }
    Ok(Self::Regex(regex_like(pattern, true)?))
}
```
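To make the problem concrete, here is a minimal std-only sketch of the escape-then-`ilike` workaround. `escape_like` is a hypothetical helper written for illustration (it is not the code from otel-arrow#2501), and the body of `contains_like_pattern` is my assumption about what the private arrow-string check looks for (`%`, `_`, or `\`), not a copy of it.

```rust
/// Hypothetical helper: escape the LIKE metacharacters so the text is
/// matched literally when passed to `ilike`.
fn escape_like(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        if matches!(c, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Assumed stand-in for arrow-string's private check: true if the
/// pattern contains `%`, `_`, or a backslash.
fn contains_like_pattern(pattern: &str) -> bool {
    pattern.bytes().any(|b| matches!(b, b'%' | b'_' | b'\\'))
}
```

Under these assumptions, a key such as `http_status` escapes to `http\_status`, which still contains a LIKE character, so `ilike` would skip the `IEqAscii` branch and build a regex. A key such as `host` is unchanged by escaping and would take the fast path.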
I think that if arrow-string exposed a way to evaluate `Predicate::IEqAscii` on my arrays, it would be simple for me to write a ScalarUDF that does what I need in my query engine.
Describe the solution you'd like
I'd like us to expose a `like::eq_ignore_ascii_case` function from the arrow-string crate that does an equality comparison on two string `Datum`s using a case-insensitive ASCII match.
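As a rough illustration of the semantics I have in mind, here is a std-only sketch that compares a nullable column of strings against a scalar. The signature is hypothetical: it uses slices and `Option` in place of Arrow arrays and `Datum`s, and it assumes nulls propagate to the output, as they do in other comparison kernels.

```rust
/// Hypothetical model of the requested kernel: element-wise ASCII
/// case-insensitive equality against a scalar, with nulls propagated.
fn eq_ignore_ascii_case(values: &[Option<&str>], needle: &str) -> Vec<Option<bool>> {
    values
        .iter()
        .copied()
        // No pattern parsing or escaping: `%`, `_`, and `\` are ordinary bytes.
        .map(|v| v.map(|s| s.eq_ignore_ascii_case(needle)))
        .collect()
}
```

Because no LIKE pattern is involved, text containing `%`, `_`, or `\` needs no escaping and never reaches the regex path.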
Describe alternatives you've considered
- Use `ilike` and escape the special characters (has performance overhead when the text contains special characters, because of the regex fallback)
- Duplicate the predicate code into my query-engine (fixing in arrow-string seemed like less work)
Additional context
The snippet above is from arrow-rs/arrow-string/src/predicate.rs, lines 68 to 81 at commit 7ad2299.