Regex support for reorder verb #1325

Open
osevill opened this issue Jun 17, 2023 · 2 comments · Fixed by #1473
@osevill

osevill commented Jun 17, 2023

Would it be possible to allow regex matching when reordering the column headers of a CSV file?
The documentation describes reorder as requiring specific field names, e.g., "i" and "b" in mlr --opprint reorder -f i,b data/small

My use case is that I don't necessarily know the exact field names, but I know that some will start with the prefix XXX and others with YYY. I would like to reorder so that any fields starting with YYY (zero or more) come first, followed by any fields starting with XXX (zero or more).
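
Until regex matching is available, one possible workaround is to resolve the prefixes against the CSV header with standard tools and hand Miller an explicit field list. This is only a sketch: sample.csv and its field names are hypothetical, it assumes header names contain no embedded commas or quotes, and it uses cut -o (which keeps fields in the order listed) rather than reorder.

```shell
# Hypothetical sample file; XXX_/YYY_ stand in for the real prefixes.
printf 'id,XXX_b,YYY_a,other,XXX_a\n1,2,3,4,5\n' > sample.csv

# One field name per line, taken from the CSV header.
# Assumes simple header names with no embedded commas or quotes.
fields=$(head -n 1 sample.csv | tr ',' '\n')

# YYY fields first, then XXX fields, then everything else.
# "|| true" keeps an empty group (zero matches) from counting as a failure.
yyy=$(printf '%s\n' "$fields" | grep '^YYY' || true)
xxx=$(printf '%s\n' "$fields" | grep '^XXX' || true)
rest=$(printf '%s\n' "$fields" | grep -v -e '^YYY' -e '^XXX' || true)

# Join the non-empty names with commas.
ordered=$(printf '%s\n' "$yyy" "$xxx" "$rest" | grep -v '^$' | paste -sd, -)
echo "$ordered"

# cut -o retains the fields in the order given in the argument list.
if command -v mlr >/dev/null 2>&1; then
  mlr --csv cut -o -f "$ordered" sample.csv
fi
```

Because the list names every field, nothing is dropped by cut; only the column order changes.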

Thanks!

@indera

indera commented Jul 15, 2023

One option, if you know the column positions, is to rename the columns
and then sort by the names you chose.

See
https://miller.readthedocs.io/en/latest/csv-with-and-without-headers/

cat unknown_col.csv
abc, xxx_like, yyy_unlike
10, 1, z
11, 2, y
12, 3, x

Processing

tail -n +2 unknown_col.csv | mlr --csv --implicit-csv-header label a,xxx,yyy then sort -f yyy,xxx

a,xxx,yyy
12, 3, x
11, 2, y
10, 1, z

@johnkerl johnkerl changed the title Regex Support for Reorder Verb Regex support for reorder verb Aug 19, 2023
@johnkerl johnkerl self-assigned this Aug 20, 2023
@johnkerl johnkerl removed the on deck label Jan 21, 2024
@osevill
Author

osevill commented Jan 28, 2024

Thanks for adding this in v6.11!

Test file: reorder_regex_test_2.csv

I'm testing regex support for the reorder verb, and I'm seeing unexpected behavior.

For the attached file, why does this give the expected results:
mlr --c2p reorder -f 'aaa_aaa','ccc_aaa','bbb_aaa' ./reorder_regex_test_2.csv

but this doesn't (changing -f to -r):
mlr --c2p reorder -r 'aaa_aaa','ccc_aaa','bbb_aaa' ./reorder_regex_test_2.csv

With the second command, the column order of the results is 'bbb_aaa', 'aaa_aaa', 'ccc_aaa'.

I tried this first and also had unexpected results:
mlr --c2p reorder -r '^aaa.*$','^ccc.*$','^bbb.*$' ./reorder_regex_test_2.csv

This gave results in a similar column order: '^bbb.*$','^aaa.*$','^ccc.*$'

Providing just one regex seems to work fine, however:
mlr --c2p reorder -r '^aaa.*$' ./reorder_regex_test_2.csv

Am I using incorrect syntax to combine the regex fields? Apologies if I'm missing something obvious.
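
For comparison, here is a rough sketch of the ordering I expected from the three-regex command: fields grouped by pattern, in the order the patterns are listed. It does not explain what reorder -r does internally. The header of regex_sample.csv is hypothetical (the attached file is not reproduced here), the matching is done in awk rather than by Miller, and the resolved list is passed to cut -o, which keeps fields in the order given.

```shell
# Hypothetical header; the attached test file's real layout is not shown in this thread.
printf 'bbb_aaa,other,aaa_aaa,ccc_aaa\n1,2,3,4\n' > regex_sample.csv

# Resolve each regex against the header, in the order the regexes are listed.
# A field matched by an earlier regex is not emitted again by a later one.
ordered=$(head -n 1 regex_sample.csv | awk -F, -v pats='^aaa.*$,^ccc.*$,^bbb.*$' '
{
  n = split(pats, p, ",")
  out = ""
  for (i = 1; i <= n; i++) {
    for (j = 1; j <= NF; j++) {
      name = $j
      if (!(name in seen) && name ~ p[i]) {
        seen[name] = 1
        out = out (out == "" ? "" : ",") name
      }
    }
  }
  # Unmatched fields follow in their original order.
  for (j = 1; j <= NF; j++) {
    name = $j
    if (!(name in seen)) out = out (out == "" ? "" : ",") name
  }
  print out
}')
echo "$ordered"

# cut -o retains the fields in the order given in the argument list.
if command -v mlr >/dev/null 2>&1; then
  mlr --c2p cut -o -f "$ordered" regex_sample.csv
fi
```

The patterns are split on commas, so this sketch only handles regexes that contain no comma themselves.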

@johnkerl johnkerl reopened this Jan 28, 2024