Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

str_detect always treats pattern as fixed string #168

Closed
dlindelof opened this issue Sep 26, 2018 · 7 comments
Closed

str_detect always treats pattern as fixed string #168

dlindelof opened this issue Sep 26, 2018 · 7 comments
Labels
bug an unexpected problem or unintended behavior func trans 🌍 Translation of individual functions to SQL wip work in progress
Milestone

Comments

@dlindelof
Copy link

When str_detect is used in a call to filter on a tbl, I expect pattern to be treated as a regular expression. Instead, it seems that it is treated as a fixed pattern:

> MASTER %>% filter(str_detect(FOO, "pen$")) %>% show_query
<SQL>
SELECT *
FROM "MASTER"
WHERE (INSTR("FOO", 'pen$') > 0)

The SQL INSTR function performs a case-insensitive search, and does not treat its second argument as a regex.

@colearendt
Copy link
Collaborator

What database are you using here?

@kieran-mace
Copy link

I have this issue too. I am using teradata.

@PNorvaisas
Copy link

I have the same problem with MySQL database. Need to collect data to be able to use patterns in str_detect.

@yutannihilation
Copy link
Member

str_detect() can be translated as INSTR(), CHARINDEX(), or CHARINDEX(), but neither of them is a regex function.

dbplyr/R/db-postgres.r

Lines 68 to 70 in 79d2a03

str_detect = function(string, pattern){
sql_expr(strpos(!!string, !!pattern) > 0L)
}

dbplyr/R/db-odbc-mssql.R

Lines 93 to 96 in 79d2a03

str_detect = function(string, pattern){
build_sql(
"CHARINDEX(", pattern, ", ", string, ") > 0"
)}

According to #151 (comment), this seems a design decision.

There's unfortunately no way that dbplyr can translate every stringr feature into every database backend.

You can directly specify a function particular to the backend, though. For example (this is only for illustration purpose and the SQLs are invalid for SQLite):

library(DBI)
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, iris)

# MySQL has REGEXP
tbl(con, "iris") %>%
  filter(FOO %regexp% "pen$") %>%
  show_query
#> <SQL>
#> SELECT *
#> FROM `iris`
#> WHERE (`FOO` REGEXP 'pen$')

# Teradata has REGEXP_LIKE
tbl(con, "iris") %>%
  filter(regexp_like(FOO, "pen$")) %>%
  show_query
#> <SQL>
#> SELECT *
#> FROM `iris`
#> WHERE (REGEXP_LIKE(`FOO`, 'pen$'))

Created on 2018-10-12 by the reprex package (v0.2.1)

@yutannihilation
Copy link
Member

@KMace
BTW, for tidyverse/stringr#262, you might feel you were using regex, but it was, sadly, just a bug (c.f. #81)... The dev version of dbplyr will give you 0 rows with the very same code.

@kieran-mace
Copy link

@yutannihilation
Oh wow. Thanks for that heads up.

@hadley hadley added bug an unexpected problem or unintended behavior func trans 🌍 Translation of individual functions to SQL labels Jan 2, 2019
@hadley
Copy link
Member

hadley commented Jan 3, 2019

@colearendt do you want to take this one on too?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior func trans 🌍 Translation of individual functions to SQL wip work in progress
Projects
None yet
Development

No branches or pull requests

6 participants