Split incompatibility (stringr/stringi) #126

Closed
hadley opened this Issue Dec 3, 2014 · 3 comments

hadley commented Dec 3, 2014

I've realised there's a rather important difference between stringr's n and stringi's n_max:

ncol(stringi::stri_split_regex("", "a", n_max = 3, simplify = TRUE))
# [1] 0
ncol(stringr::str_split_fixed("", "a", n = 3))
# [1] 3

ncol(stringi::stri_split_regex("bab", "a", n_max = 3, simplify = TRUE))
# [1] 2
ncol(stringr::str_split_fixed("bab", "a", n = 3))
# [1] 3

ncol(stringi::stri_split_regex(character(), "a", n_max = 3, simplify = TRUE))
# [1] 0
ncol(stringr::str_split_fixed(character(), "a", n = 3))
# [1] 3

I noticed this because it breaks httr, which relies on str_split_fixed always returning exactly the same number of pieces.
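For callers in the same position as httr, one defensive option is to pad whatever matrix the splitter returns up to the expected width. The sketch below uses base R only; `pad_cols()` is a hypothetical helper written for illustration and is not part of stringr or stringi.

```r
# Hypothetical helper (not part of stringr or stringi): pad a character
# matrix on the right with `fill` until it has at least `n` columns.
pad_cols <- function(m, n, fill = "") {
  if (ncol(m) >= n) {
    return(m)  # already wide enough; never truncates
  }
  extra <- matrix(fill, nrow = nrow(m), ncol = n - ncol(m))
  cbind(m, extra)
}

# A 1 x 2 result (like the "bab" case above) becomes 1 x 3.
pad_cols(matrix(c("b", "b"), nrow = 1), 3)

# A result with zero columns still comes back with 3 columns.
dim(pad_cols(matrix(character(), nrow = 1, ncol = 0), 3))
```

This only papers over the difference on the caller's side; it does not change what either package returns.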

gagolews added this to the stringi-0.4 milestone Dec 4, 2014

gagolews self-assigned this Dec 4, 2014

gagolews added a commit that referenced this issue Dec 4, 2014

gagolews commented Dec 4, 2014

@hadley, for consistency, the n_max argument is now named n.

gagolews added a commit that referenced this issue Dec 4, 2014

(ref #126) [IMPORTANT CHANGE] `simplify=FALSE` in `stri_extract_all_*` and `stri_split_*` now calls `stri_list2matrix` with `fill=""`. `fill=NA_character_` may be obtained by using `simplify=NA`.

gagolews commented Dec 4, 2014

Works like a charm now:

> stringi::stri_split_regex(character(), "a", n = 3, simplify = TRUE)
     [,1] [,2] [,3]
> stringr::str_split_fixed(character(), "a", n = 3)
     [,1] [,2] [,3]
> stringi::stri_split_regex("", "a", n = 3, simplify = TRUE)
     [,1] [,2] [,3]
[1,] ""   ""   ""  
> stringr::str_split_fixed("", "a", n = 3)
     [,1] [,2] [,3]
[1,] ""   ""   ""  

> stringi::stri_split_regex("bab", "a", n = 3, simplify = TRUE)
     [,1] [,2] [,3]
[1,] "b"  "b"  ""  
> stringr::str_split_fixed("bab", "a", n = 3)
     [,1] [,2] [,3]
[1,] "b"  "b"  ""  

Note that I had to change the meaning of simplify=TRUE in stri_extract_all_* and stri_split_*. The old behavior may be obtained by setting simplify=NA now.
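To make the difference between the two settings concrete, here is a small sketch that assumes a stringi version including this change (0.4 or later) is installed. The two-element input is chosen for illustration so that the second row needs padding; the variable names are arbitrary.

```r
library(stringi)

x <- c("bab", "b")  # the second string yields fewer pieces than the first

# simplify = TRUE: short rows are padded with empty strings
filled_empty <- stri_split_regex(x, "a", n = 3, simplify = TRUE)

# simplify = NA: short rows are padded with NA_character_ (the old behavior)
filled_na <- stri_split_regex(x, "a", n = 3, simplify = NA)

filled_empty[2, 2] == ""   # padding cell is an empty string
is.na(filled_na[2, 2])     # padding cell is missing
```

Code that checks for missing pieces with `is.na()` would need `simplify = NA`; code that expects plain strings throughout, as `str_split_fixed` returns, fits `simplify = TRUE`.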

gagolews closed this in 0f93928 Dec 4, 2014

hadley commented Dec 4, 2014

Awesome, thanks!
