stri_count_boundaries #109

Closed
gagolews opened this Issue Oct 29, 2014 · 1 comment

Owner

gagolews commented Oct 29, 2014

Add stri_count_boundaries for counting the number of words, sentences, or characters in a string.

Something like:

sapply(stri_split_boundaries(stri_trans_nfkd("ąść"), stri_opts_brkiter(type="character")), length)

gagolews self-assigned this on Oct 29, 2014

gagolews added this to the stringi-0.3 milestone on Oct 29, 2014

Owner

gagolews commented Oct 30, 2014

> test <- "The\u00a0above-mentioned    features are very useful. Warm thanks to their developers."
> stri_count_boundaries(test, stri_opts_brkiter(type="word"))
[1] 28
> stri_count_boundaries(test, stri_opts_brkiter(type="sentence"))
[1] 2
> stri_count_boundaries(test, stri_opts_brkiter(type="character"))
[1] 81
> stri_count_words(test)
[1] 12
> 
> test2 <- stri_trans_nfkd("\u03c0\u0153\u0119\u00a9\u00df\u2190\u2193\u2192")
> stri_count_boundaries(test2, stri_opts_brkiter(type="character"))
[1] 8
> stri_length(test2)
[1] 9
> stri_numbytes(test2)
[1] 20
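
For readers comparing the three numbers reported for test2, here is a minimal sketch of why they differ. It assumes a recent stringi release, where the break-iterator options are passed through the named opts_brkiter argument rather than positionally as in the session above; the variable names n_split and n_count are only illustrative. NFKD decomposes "ę" (U+0119) into "e" plus a combining ogonek (U+0328), so the string holds 9 code points but only 8 user-perceived characters.

```r
library(stringi)

# NFKD turns U+0119 into "e" + U+0328 (combining ogonek); the other
# seven characters have no decomposition and stay as they are.
test2 <- stri_trans_nfkd("\u03c0\u0153\u0119\u00a9\u00df\u2190\u2193\u2192")

# Break-iterator options for user-perceived characters.
opts <- stri_opts_brkiter(type = "character")

# Workaround from the original request: split at the boundaries,
# then count the pieces of each string.
pieces <- stri_split_boundaries(test2, opts_brkiter = opts)
n_split <- vapply(pieces, length, integer(1))          # 8

# The function added for this issue gives the same count directly.
n_count <- stri_count_boundaries(test2, opts_brkiter = opts)  # 8

stri_length(test2)    # 9 code points (the ogonek is counted separately)
stri_numbytes(test2)  # 20 bytes in UTF-8
```

The gap between 28 and 12 for the first string has a similar explanation, as far as I can tell: with type="word" every segment between boundaries is counted, including whitespace and punctuation, while stri_count_words counts only the segments that are actual words. The exact type="word" total may vary with the ICU version in use.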