New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ivy--regex-ignore-order matches too many results #296

Closed
Konubinix opened this Issue Nov 19, 2015 · 5 comments

Comments

Projects
None yet
2 participants
@Konubinix
Contributor

Konubinix commented Nov 19, 2015

AFAICU, the function ivy--regex-ignore-order should match the elements of the search in any order, but still make sure that ALL the elements are in the candidates

In practice, it matches candidates that contain as many elements as in the query, but not necessarily that match ALL the elements of the query one.

As an example, say you run

(completing-read ">"
                 '(
                   "foo bar"
                   "bar foo"
                   "bar bar"
                   "foo foo"
                   )
                 )

then search for "foo bar". I would expect it to match only two candidates: "foo bar" and "bar foo".

Instead, the four candidates are matched.

This is understandable, because the generated regexp is

(ivy--regex-ignore-order "foo bar") -> "\\(foo\\|bar\\).*?\\(foo\\|bar\\)"

I think that we would need to transposing the logic to match all possibilities of order, resulting in a query more like:

"\\(foo.*bar\\)\\|\\(bar.*foo\\)"

The problem with this proposal is that the regexp grows exponentially with the number of elements (more precisely, n elements need n! groups) to match. For instance "foo bar egg" would result in 6 groups:

"\\(foo.*bar.*egg\\)\\|\\(bar.*foo.*egg\\)\\|\\(bar.*egg.*foo\\)\\|\\(*foo.*egg.*bar\\)\\|\\(egg.*foo.*bar\\)\\|\\(egg.*bar.*foo\\)"

IIRC my language theory lectures, there is no other way to perform such match as ignore order with regexp since by definition they don't have a memory of the matched results. I don't know if it is still true with extended regexp.

We can take a look at the number of groups needed for few elements:

  • 3 elements => 6 groups
  • 4 elements => 24 goups
  • 5 elements => 120 groups # it is becoming large
  • 6 elements => 720 groups # too large I think

IMHO, we can assume that people barely use more than 5 elements in such a query

@Konubinix

This comment has been minimized.

Show comment
Hide comment
@Konubinix

Konubinix Nov 19, 2015

Contributor

I tried generating such big regexp to test emacs behavior. You can generate them with the following python snippet:

import itertools
print "\(" + "\)\|\(".join([".*".join(l) for l in list(itertools.permutations(["a", "b", "c", "d", "e", "f"]))]) + "\)"

With only 5 elements (a to e), there is no problem in a small buffer. With 6 elements, emacs complains that the regexp is too big...

Contributor

Konubinix commented Nov 19, 2015

I tried generating such big regexp to test emacs behavior. You can generate them with the following python snippet:

import itertools
print "\(" + "\)\|\(".join([".*".join(l) for l in list(itertools.permutations(["a", "b", "c", "d", "e", "f"]))]) + "\)"

With only 5 elements (a to e), there is no problem in a small buffer. With 6 elements, emacs complains that the regexp is too big...

@abo-abo

This comment has been minimized.

Show comment
Hide comment
@abo-abo

abo-abo Nov 20, 2015

Owner

4 elements => 24 goups

This implies 24 times slower matching. Considering that e.g. counsel-describe-function typically has 30k candidates, there could be a significant slowdown.

I added ivy--regex-ignore-order as a proof of concept, I don't think it's great. But if you like it and want to improve it, I'll accept a PR. The only thing is that you need the Emacs Copyright Assignment for changes >15 lines.

Owner

abo-abo commented Nov 20, 2015

4 elements => 24 goups

This implies 24 times slower matching. Considering that e.g. counsel-describe-function typically has 30k candidates, there could be a significant slowdown.

I added ivy--regex-ignore-order as a proof of concept, I don't think it's great. But if you like it and want to improve it, I'll accept a PR. The only thing is that you need the Emacs Copyright Assignment for changes >15 lines.

@Konubinix

This comment has been minimized.

Show comment
Hide comment
@Konubinix

Konubinix Nov 20, 2015

Contributor

I already have it since I contribute to org-mode :-).

I think that indeed it is too slow. Did you know ivy--regex-ignore-order had the behavior I describe?

Contributor

Konubinix commented Nov 20, 2015

I already have it since I contribute to org-mode :-).

I think that indeed it is too slow. Did you know ivy--regex-ignore-order had the behavior I describe?

@abo-abo

This comment has been minimized.

Show comment
Hide comment
@abo-abo

abo-abo Nov 20, 2015

Owner

Did you know ivy--regex-ignore-order had the behavior I describe?

No. But it's not too bad: choosing ivy--regex-ignore-order over ivy--regex means you want more candidates. And more candidates you get:)

Owner

abo-abo commented Nov 20, 2015

Did you know ivy--regex-ignore-order had the behavior I describe?

No. But it's not too bad: choosing ivy--regex-ignore-order over ivy--regex means you want more candidates. And more candidates you get:)

@Konubinix

This comment has been minimized.

Show comment
Hide comment
@Konubinix

Konubinix Nov 24, 2015

Contributor

You are right. Nonetheless, I have the feeling that you get more candidate than you expect when you use the function.

I propose to add a comment in the documentation string to indicate its behavior.

Contributor

Konubinix commented Nov 24, 2015

You are right. Nonetheless, I have the feeling that you get more candidate than you expect when you use the function.

I propose to add a comment in the documentation string to indicate its behavior.

Konubinix added a commit to Konubinix/swiper that referenced this issue Nov 24, 2015

@abo-abo abo-abo closed this Nov 24, 2015

abo-abo added a commit that referenced this issue Jan 7, 2016

Properly support matching ignoring order
* ivy.el (ivy--regex-ignore-order): Return a list of regexps rather than
  a single regexp.
(ivy--re-filter): New defun, extracted from `ivy--filter'.
(ivy--filter): Update.
(ivy--format-minibuffer-line): Add special highlighting for
`ivy--regex-ignore-order'.

* counsel.el (counsel--find-file-matcher): Use `ivy--re-filter'.

Fixes #296
Fixes #329
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment