Using `+` with `choose` field separator considered harmful
#21
Comments
That's a great point! And an interesting optimization unto itself... I'll update the benchmarks here shortly running I like your tool btw. I posted
No worries! No toes were harmed in the making of this software. When I first posted about
I figure out why The benchmarks have been updated and I apologize for the double issue running against the wrong input file and not taking the greedy splitting into account!
No worries, thanks for the update! I'll be back when I am done improving performance 😈
Hey there, I just discovered this project from following the breadcrumbs from your issue over on `choose`! I noticed you had some great benchmark reports (really impressive performance btw!). It looks like you are testing a couple choice inputs to `choose` using regexes with a `+` at the end. By default, field separators in `choose` are greedy, and so using `+` is not necessary, and actually hurts performance (my understanding is that it is typical for a regex engine's performance to be hurt by any repetition).

So where you have `choose -f '[[:space:]]+' -i ./hyper_data.txt 0 7 18 > /dev/null`, I recommend you instead use `choose -f '[[:space:]]' -i ./hyper_data.txt 0 7 18 > /dev/null` (note the lack of `+`). It looks like there are a few other spots as well unnecessarily using `+`.

Admittedly, this is something of a foot gun, so if you felt that it would be good to demonstrate both, that would also be fair. I am thinking of adding some docs explaining that this should be avoided, and possibly stripping repeat operators in code as well.
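
To illustrate why the trailing `+` is redundant, here is a rough Python sketch (not how `choose` is implemented). It assumes "greedy" means that runs of consecutive separators collapse into one, emulated here by dropping empty fields. Python's `re` module does not support POSIX classes like `[[:space:]]`, so `\s` stands in for it, and the function names and sample line are made up for the example.

```python
import re


def split_with_plus(line: str) -> list[str]:
    # The pattern carries its own repetition, so one match spans a whole
    # run of whitespace.
    return [field for field in re.split(r"\s+", line) if field]


def split_greedy(line: str) -> list[str]:
    # The pattern matches a single whitespace character. Runs of separators
    # yield empty fields, which are dropped to emulate greedy splitting.
    return [field for field in re.split(r"\s", line) if field]


line = "alpha   beta\t\tgamma  delta"
print(split_with_plus(line))
print(split_greedy(line))
```

Both calls produce the same list of fields, so under that assumption the `+` buys nothing in the output and only adds work for the regex engine.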

For reference, with a quick test using `time`, it looks like adding the `+` makes `choose` take very roughly twice as long: