ONIG_OPTION_FIND_LONGEST both interprets regex to be greedy and skips shorter matches, need to only be greedy but not skip #37

Closed
yurivict opened this issue Dec 8, 2016 · 3 comments

Comments


yurivict commented Dec 8, 2016

Current behavior of ONIG_OPTION_FIND_LONGEST causes two effects:

  • regex is matched in a "greedy" way (the longest possible match is found at every location)
  • shorter matches are skipped by onig_match, so that it returns only the longest matches (multiples are allowed)

I need it to match greedily without skipping anything. Is this currently possible?
If it is not, I would like to suggest splitting ONIG_OPTION_FIND_LONGEST into two new options: ONIG_OPTION_GREEDY and ONIG_OPTION_ONLY_LONGEST. It would be best to be able to specify the two behaviors separately.

Owner

kkos commented Dec 9, 2016

I can't imagine what "regex is matched in a 'greedy' way" or ONIG_OPTION_GREEDY would mean.
Is it different from a regex that does not include any non-greedy operators?

@yurivict
Author

For example, in the string "2 km 1 km/h 5 km/h" onig_search with regex "[0-9]+[[:space:]](km|km/h)" with ONIG_OPTION_NONE finds three matches: "2 km", "1 km" and "5 km", but with ONIG_OPTION_FIND_LONGEST the matches become greedy: "1 km/h" and "5 km/h".
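
The ONIG_OPTION_NONE result described above can be reproduced in any backtracking engine that tries alternatives from left to right. The sketch below uses Python's `re` module purely as a stand-in, not Oniguruma itself. `\s` replaces `[[:space:]]` because Python's `re` does not support POSIX bracket classes, and the helper name `all_matches` is made up for this illustration.

```python
import re

SUBJECT = "2 km 1 km/h 5 km/h"

# Same pattern as in the comment, with \s substituted for [[:space:]]
# (an assumption needed because Python's re lacks POSIX bracket classes).
PATTERN = re.compile(r"[0-9]+\s(km|km/h)")


def all_matches(regex, subject):
    """Return the text of every non-overlapping match, scanning left to right."""
    return [m.group(0) for m in regex.finditer(subject)]


if __name__ == "__main__":
    # "km" is listed first, so it wins even where "km/h" would also fit.
    print(all_matches(PATTERN, SUBJECT))  # ['2 km', '1 km', '5 km']
```

The first alternative that lets the whole pattern succeed is taken, which is why the `/h` suffix is never consumed.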

The option both makes the group match greedily and keeps only the longest matches. It isn't logical that the greediness of the group operator changes as well: some people might only need the longest matches, and others might only need the greedy group operator, yet this option does both at the same time.
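
To make the requested split concrete, the two behaviors can be sketched as separate steps. This is only an emulation in Python's `re`, not how Oniguruma implements ONIG_OPTION_FIND_LONGEST, and neither function corresponds to an existing Oniguruma option. `longest_at_each_position` and `only_longest` are hypothetical names, and `\s` again stands in for `[[:space:]]`.

```python
import re

SUBJECT = "2 km 1 km/h 5 km/h"
PATTERN = re.compile(r"[0-9]+\s(km|km/h)")  # \s substituted for [[:space:]]


def longest_at_each_position(regex, subject):
    """At each start position, keep the longest span the regex can match in full.

    Brute force (quadratic in the subject length); meant only to show the
    semantics, not to be an efficient implementation.
    """
    found = []
    pos = 0
    while pos < len(subject):
        # Try the longest candidate end first, then shrink.
        for end in range(len(subject), pos, -1):
            if regex.fullmatch(subject, pos, end):
                found.append(subject[pos:end])
                pos = end  # continue after this match
                break
        else:
            pos += 1  # nothing matches here; move on one character
    return found


def only_longest(matches):
    """Separate, optional step: drop every match shorter than the longest one."""
    if not matches:
        return []
    longest = max(len(m) for m in matches)
    return [m for m in matches if len(m) == longest]


if __name__ == "__main__":
    greedy = longest_at_each_position(PATTERN, SUBJECT)
    print(greedy)                # ['2 km', '1 km/h', '5 km/h']
    print(only_longest(greedy))  # ['1 km/h', '5 km/h']
```

The first list is the "greedy but nothing skipped" result asked for here. Applying the filter on top gives the same two matches that the thread reports for ONIG_OPTION_FIND_LONGEST.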

Owner

kkos commented Dec 12, 2016

It is too difficult to select the longest alternative at runtime.
And as you know, the alternation operator gives priority to alternatives from left to right.
I think that will be enough in many cases.
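
Left-to-right priority suggests a workaround that needs no option at all: list the longer alternative first, or make the suffix optional. The sketch below shows this with Python's `re` as a stand-in engine with the same ordered-alternation behavior. It is an illustration under that assumption, not a tested Oniguruma program. `\s` replaces `[[:space:]]`, and the constant and helper names are invented for the example.

```python
import re

SUBJECT = "2 km 1 km/h 5 km/h"

# Longer alternative first, so it is preferred wherever it fits.
LONGER_FIRST = re.compile(r"[0-9]+\s(km/h|km)")

# Equivalent rewrite: a greedy optional suffix instead of an alternation.
OPTIONAL_SUFFIX = re.compile(r"[0-9]+\skm(/h)?")


def matched_text(regex, subject):
    """Return the text of every non-overlapping match, scanning left to right."""
    return [m.group(0) for m in regex.finditer(subject)]


if __name__ == "__main__":
    print(matched_text(LONGER_FIRST, SUBJECT))     # ['2 km', '1 km/h', '5 km/h']
    print(matched_text(OPTIONAL_SUFFIX, SUBJECT))  # ['2 km', '1 km/h', '5 km/h']
```

Both patterns keep "2 km" and still pick up the full "km/h" where it is present. The approach relies on the pattern author knowing which alternatives are prefixes of others, so it does not help when alternatives come from user input.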

@kkos kkos closed this as completed Jul 24, 2018