-
Notifications
You must be signed in to change notification settings - Fork 5
/
detection.Rmd
55 lines (46 loc) · 1.81 KB
/
detection.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
---
title: "And/Or Detection"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{And/Or Detection}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r setup}
library(strex)
```
## How it works
`strex` offers easy and/or versions of `stringr::str_detect()` via `str_detect_all()` and `str_detect_any()`. These are vectorized over `string` but not `pattern`. `stringr::fixed()` and `stringr::coll())` are handled correctly. Otherwise, `stringr` regular expressions are used. For `str_detect_all()`, a pattern argument `c("x", "y")` is converted to `"(?=.*x)(?=.*y)"`. For `str_detect_any()`, a pattern argument `c("x", "y")` is converted to `"x|y"`.
## Examples
```{r examples}
str_detect_all("quick brown fox", c("x", "y", "z"))
str_detect_all(c(".", "-"), ".")
str_detect_all(c(".", "-"), coll("."))
str_detect_all(c(".", "-"), coll("."), negate = TRUE)
str_detect_all(c(".", "-"), c(".", ":"))
str_detect_all(c(".", "-"), coll(c(".", ":")))
str_detect_all("xyzabc", c("a", "c", "z"))
str_detect_all(c("xyzabc", "abcxyz"), c(".b", "^x"))
str_detect_any("quick brown fox", c("x", "y", "z"))
str_detect_any(c(".", "-"), ".")
str_detect_any(c(".", "-"), coll("."))
str_detect_any(c(".", "-"), coll("."), negate = TRUE)
str_detect_any(c(".", "-"), c(".", ":"))
str_detect_any(c(".", "-"), coll(c(".", ":")))
str_detect_any(c("xyzabc", "abcxyz"), c(".b", "^x"))
```
## Performance
Unless you're doing a huge amount of computation, it won't matter, but FWIW, it's faster to convert to regex using `str_escape()` rather than using `coll()`.
```{r performance}
bench::mark(
str_detect_all(rep("*", 1000), rep(str_escape("*"), 555)),
str_detect_all(rep("*", 1000), coll(rep("*", 555))),
min_iterations = 100
)
```