stri_order - multiple keys #219

Closed
gagolews opened this issue Apr 16, 2016 · 2 comments

gagolews commented Apr 16, 2016

Allow stri_order(list(x, y, z)), i.e., ordering with respect to multiple keys.

However, a common use case would be mixed-type inputs (e.g., data frame columns): numeric, character, factor, etc. Coercing such inputs to character [as usual in stringi] might not be the best idea.
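
To illustrate the concern about coercion, here is a minimal sketch in base R only. The vector len is made up for the example, and as.character() merely stands in for the coercion that stringi would apply; this is not a transcript of stringi's behaviour.

```r
# Made-up numeric key, e.g., one column of a data frame.
len <- c(10, 9, 2.5)

# Ordering the numbers as numbers: 2.5 < 9 < 10.
num_idx <- order(len)
print(len[num_idx])

# Ordering after coercion to character: the strings "10", "9", "2.5"
# are compared character by character (method = "radix" uses the C locale,
# so the result does not depend on the session's locale).
chr_idx <- order(as.character(len), method = "radix")
print(len[chr_idx])
```

The two permutations differ, which is why a numeric key should reach order() as a number rather than as a string.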

gagolews self-assigned this Apr 16, 2016
gagolews added this to the stringi-1.1 milestone Apr 16, 2016
gagolews removed this from the stringi-1.1 milestone Apr 21, 2018

gagolews commented Oct 1, 2020

stri_sort_key() could be used for this purpose: convert the data (character only? leave factors unconverted?) and pass the result to the base order().

Update: stri_sort_key returns bytes-encoded strings, which order does not accept. Moreover, order would have to be called with the C/POSIX locale set.

To do: introduce stri_rank


gagolews commented Apr 29, 2021

Sorting with respect to two or more keys can be performed, for example, as follows:

> x <- ToothGrowth[sample(nrow(ToothGrowth), 10), ]
> x[order(stri_rank(x$supp), x$len), ]
    len supp dose
35 14.5   OJ  0.5
41 19.7   OJ  1.0
54 24.5   OJ  2.0
52 26.4   OJ  2.0
44 26.4   OJ  1.0
10  7.0   VC  0.5
18 14.5   VC  1.0
14 17.3   VC  1.0
19 18.8   VC  1.0
24 25.5   VC  2.0
> x[order(-stri_rank(x$supp), x$len), ]
    len supp dose
10  7.0   VC  0.5
18 14.5   VC  1.0
14 17.3   VC  1.0
19 18.8   VC  1.0
24 25.5   VC  2.0
35 14.5   OJ  0.5
41 19.7   OJ  1.0
54 24.5   OJ  2.0
52 26.4   OJ  2.0
44 26.4   OJ  1.0


> X <- data.frame(a=c("b", NA, "b", "b", NA, "a", "a", "c"), b=runif(8))
> X[order(stri_rank(X$a), X$b), ]
     a          b
7    a 0.01393522
6    a 0.70113965
4    b 0.08760199
1    b 0.28077931
3    b 0.70474734
8    c 0.73574582
5 <NA> 0.22125663
2 <NA> 0.95334338

Because of this, I don't think we need another function.
I will include a relevant example in the manual.
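
For completeness, the same idea can be wrapped so that it takes a list of keys, as in the original request. This is only a sketch: order_by_keys is a hypothetical helper, not part of stringi, and the data frame below is made up so that the result is reproducible. It assumes that only character keys should go through stri_rank, while numeric and factor keys are passed to order() unchanged.

```r
library(stringi)

# Hypothetical helper (NOT a stringi function): order by several keys.
# Character keys are replaced by their collation ranks via stri_rank();
# numeric, factor, and other keys are handed to order() as they are.
# Arguments in ... are forwarded to stri_rank(), e.g., locale.
order_by_keys <- function(keys, ...) {
    keys <- lapply(keys, function(k) {
        if (is.character(k)) stri_rank(k, ...) else k
    })
    # unname() so that list names are not mistaken for order() arguments
    do.call(order, unname(keys))
}

# Made-up data with fixed values (no runif) for reproducibility.
X <- data.frame(
    a = c("b", NA, "b", "a", "a", "c"),
    b = c(0.3, 0.9, 0.1, 0.7, 0.2, 0.5)
)

idx <- order_by_keys(list(X$a, X$b), locale = "en_US")
print(X[idx, ])
```

Rows are grouped by a in collation order, sorted by b within each group, and the missing value in a ends up last, as in the session above.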

gagolews mentioned this issue Apr 29, 2021