FR: split %like% into %like% and %rlike% #3333
Comments
|
should be resolved together with #2519 |
|
https://www.postgresql.org/docs/9.3/functions-matching.html Not too familiar with postgres myself, but I see |
|
I suggest not looking at pg older than 9.3. I believe postgres is more of a standard than spark. Anyway, best to add features to |
|
Agreed... though to be fair, the fully robust version of |
|
I guess breaking |
|
No objection to offering variants of |
|
I'm about to create a PR on this matter. I have one question: is it better to create one function with varying operators, or several functions? My idea was to change the function as such:
like <- function(vector, pattern, ignore.case = FALSE, fixed = FALSE)
{
  # Intended for use with a data.table 'where'
  # Don't use * or % like SQL's like. Uses regexpr syntax - more powerful.
  if (is.factor(vector)) {
    as.integer(vector) %in% grep(pattern, levels(vector), ignore.case = ignore.case, fixed = fixed)
  } else {
    # most usually character, but integer and numerics will be silently coerced by grepl
    grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed)
  }
  # returns 'logical' so can be combined with other where clauses.
}
"%like%" = like
"%ilike%" = function(vector, pattern, ignore.case) like(vector, pattern, ignore.case = TRUE)
"%flike%" = function(vector, pattern, fixed) like(vector, pattern, fixed = TRUE)
However, the usage of |
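To make the proposal concrete, here is a minimal, self-contained sketch of how the three operators might behave in base R. It is an illustration under stated assumptions, not the code that was merged: the sample vector is made up, and the wrappers take only two arguments because an R binary operator only ever receives its left and right operands.

```r
# Sketch of the proposed helper; mirrors the function quoted in the thread.
like <- function(vector, pattern, ignore.case = FALSE, fixed = FALSE) {
  if (is.factor(vector)) {
    # Match against the levels once, then map back to the elements.
    as.integer(vector) %in% grep(pattern, levels(vector),
                                 ignore.case = ignore.case, fixed = fixed)
  } else {
    grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed)
  }
}

# Two-argument wrappers (assumption: the unused third parameter is dropped).
`%like%`  <- function(vector, pattern) like(vector, pattern)
`%ilike%` <- function(vector, pattern) like(vector, pattern, ignore.case = TRUE)
`%flike%` <- function(vector, pattern) like(vector, pattern, fixed = TRUE)

# Hypothetical sample data.
x <- c("Mr. Smith", "mrs Jones", "Dr Brown")

r1 <- x %like% "^Mr"            # regex, case-sensitive
r2 <- x %ilike% "^mr"           # regex, case-insensitive
r3 <- x %flike% "Mr."           # literal substring "Mr."
r4 <- factor(x) %like% "Jones"  # factor input goes through the levels branch

print(r1)  # TRUE FALSE FALSE
print(r2)  # TRUE TRUE FALSE
print(r3)  # TRUE FALSE FALSE
print(r4)  # FALSE TRUE FALSE
```

Each result is a plain logical vector, so it can be combined with other conditions in the i argument of a data.table query.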
|
it will be possible as calling
|
|
@andreasLD what is the difference between proposed |
|
The gain seems too small to justify everyone sifting through their code to fix what breaks. Why not add a |
|
I like But still don't see the point of a |
|
At least it feels more natural for SQL users |
|
Oh... just noticed we have |
This would be mimicking the syntax of Spark SQL (like/rlike), which I quite like. %like% would begin to use fixed = TRUE (potentially a breaking change) and %rlike% would be more like the current %like%. The idea is to provide a fixed = TRUE option since this is more efficient.

Do a lot of people currently rely on %like% accepting regex?

It would also be possible to go the opposite way by offering e.g. %flike% as the fixed version of %like%, at the expense of being somewhat confusing for frequent users of Spark SQL & data.table (such as myself).
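For readers weighing the breaking change, the difference between regex and fixed matching can be sketched with base R's grepl alone. The sample strings below are made up for illustration; the example shows only how results differ and makes no claim about timings.

```r
# Hypothetical sample strings.
x <- c("a.b", "axb", "a+b")

# Regex matching, as the current %like% does: "." matches any single character.
regex_hits <- grepl("a.b", x)

# Fixed matching, as proposed: "." is treated as a literal dot.
fixed_hits <- grepl("a.b", x, fixed = TRUE)

print(regex_hits)  # TRUE TRUE TRUE
print(fixed_hits)  # TRUE FALSE FALSE
```

Any existing call whose pattern contains regex metacharacters would silently change its result if %like% switched to fixed matching, which is the compatibility concern raised in the thread.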