-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support string alternative syntax (curly braces pattern) #2
Comments
/cc @alexcrichton (nothing important, no need to read/reply; just if you're interested) TL;DR: A while back I attempted a complete rewrite of glob. I called it glob-prime, as in I'll probably take a shot at implementing this alternation syntax whenever I'm bored if no one else gets around to it. I actually rewrote this glob package from scratch with that goal, to translate globs to regular expressions in order to make the code thinner and easier to expand/maintain. This is what Python does, and it wouldn't surprise me if others did the same. Adding this alternative syntax would've been very straightforward, as it would more or less map to a regex alternation, I also re-implemented the actual globbing algorithm to be one which, as this one currently does, splits the patterns on the path separator and then translates each one into a given token which has 'globbing' behavior defined for it. The goal was to make things much easier to expand: adding the alternative syntax would've basically consisted of creating an AST token for it, defining how it translates to a regex, and then defining a selector variant for it with its selection behavior, which would have been, for example, something like When all was said and done, however, I found it to be much slower. First, the new selection algorithm I used, which I actually got inspiration from python's glob for, wasn't exactly optimal because on each iteration I would have to re-trace through each of the selectors. Python has the Perhaps I could've optimized this by storing a reference to the old position or something, but I didn't get around to it because I also discovered that the regular expression approach in general was much slower. I already kind of expected this because regular expressions are much more general and have more machinery to handle other use cases, which can hardly compete against a hand-tuned specific parser like the one we have now. But I didn't expect it to be that much slower. I don't remember the numbers, but it was on the order of something like, glob would take maybe 50ns max with the benchmarking test set, and mine would take like 200,000ns. I figured I'd cut my losses and stick with the current globbing mechanism while keeping my regular expression back-end, but it was still something like 80,000ns to the current one's 50ns max. With a difference like that I figured my redesigns would never be accepted, even if it was still orders of magnitude faster than for example python's, and even though globbing isn't something that's typically done in a hot loop. So I figured I'd cut my losses once more and proposed what remained, which was relative path support (#25), error reporting (#26), recursive wildcard simplification (#27), and I/O error propagation (#28). I didn't really bother too much with attempting to optimize it further because it was mostly a temporary diversion for me, since I need this package for one of mine. But more importantly what we have now works fine enough, and globbing isn't necessarily very extensible, so it's not like we really required that extensibility; the gains would've mostly been in the area of maintainability, but this package shouldn't need to be touched too much for the aforementioned reason. |
Is there a plan to support string alternatives with curly braces, like
{foo,bar}program
matchingfooprogram
andbarprogram
?To avoid code duplication in the matcher, I was thinking it might be a good idea to use the
regex
crate to implement this (as well as the rest ofglob
).Support in other languages:
GLOB_BRACE
option.The text was updated successfully, but these errors were encountered: