Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integrating with flx #207

Closed
bling opened this issue Aug 27, 2015 · 24 comments
Closed

integrating with flx #207

bling opened this issue Aug 27, 2015 · 24 comments

Comments

@bling
Copy link
Contributor

bling commented Aug 27, 2015

i did some hacking and was able to get a prototype working with flx. here's what it looks like so far:

  (setq ivy-sort-functions-alist '((t . nil)))
  (setq my-ivy-cache (flx-make-string-cache))
  (defun ivy--filter (name candidates)
    (if (> (length name) 0)
        (let ((scores))
          (setq scores (mapcar (lambda (e)
                                 (let ((score (flx-score e ivy-text my-ivy-cache)))
                                   `(,e . ,score))) candidates))
          (setq scores (cl-delete-if-not (lambda (e) (cdr e)) scores))
          (setq scores (cl-sort scores (lambda (a b)
                                         (let ((score1 (cdr a))
                                               (score2 (cdr b)))
                                           (> (car score1) (car score2))))))
          (mapcar (lambda (e) (car e)) scores))
      candidates))

what i did was a brute-force approach...and it will break other functionality like auto-selecting the previous selection. i'm curious to see how to better integrate into ivy. worthy of note is that there is a (flx-make-filename-cache) which is optimized for files, so the algorithm would need to be applied differently depending on the source.

any help is appreciated. thanks!

@abo-abo
Copy link
Owner

abo-abo commented Sep 8, 2015

Thanks for the code. It's nice to keep in mind, but it's too slow in my opinion. For example of counsel-describe-variable with 10000 candidates, there's a noticeable delay after each key press. And I don't think it's an issue of your implementation, because I remember the similar delays in flx-ido. flx is a really cool idea, and it works great for small collections of candidates, but I'd like to have a method that works smoothly for typical collections (less than 100K).

Unfortunately, to get a smooth method, I see only two ways: implement flx in C, which makes it inflexible to amend, or implement a lightweight flx in Elisp, which means worse sorting with more speed.

@bling
Copy link
Contributor Author

bling commented Sep 13, 2015

interesting, describe-variable is plenty fast on my machine. the very first time it's pretty slow because it needs to populate the cache, but once that's done it's blazing fast and i don't notice a delay after each keystroke. it's actually faster than helm's implementation, because over there you can see the entire list flicker as it redraws everything.

@PythonNut
Copy link
Contributor

@bling I can see just a hint of lag when there are many candidates.

One strategy that I've used in the past (for adding flx sorting to company) was to sort only the first N shortest candidates. This may sound like a bad idea, but it works really well in practice: the math ends up such that you can rarely tell that such filtering is going on.

The advantage of this is a strict upper-bound on the sorting time, which is also user-configurable.

I'm sorry if this lisp is something of a monstrosity. I'm specifically using very low-level constructs to keep performance sane.

(with-eval-after-load 'ivy
  (require 'flx)
  (setq ivy-display-style t
        ivy-extra-directories nil
        ivy-wrap t
        ivy-use-virtual-buffers t
        ivy-sort-functions-alist '((t . nil)))

  (defvar my/ivy-cache (flx-make-string-cache))
  (defvar my/ivy-flx-limit 500)

  (defun nadvice/ivy--filter (name candidates)
    (if (= (length name) 0)
        candidates
      (let* (;; an optimized regex for fuzzy matching
             ;; "abc" → "\\`[^a]*a[^b]*b[^c]*c"
             (fuzzy-regex (concat "\\`"
                                  (mapconcat
                                   (lambda (x)
                                     (setq x (string x))
                                     (concat "[^" x "]*" (regexp-quote x)))
                                   name
                                   "")))

             ;; filter out non-fuzzy-matching candidates
             (cands (let ((res))
                      ;; this also lets us avoid a copy-seq
                      (dolist (cand candidates)
                        (when (string-match-p fuzzy-regex cand)
                          (push cand res)))
                      res))

             ;; partition the candidates into sorted and unsorted groups
             (cands-left)
             (cands-to-sort (if (< (length cands) my/ivy-flx-limit)
                                (progn
                                  (setq cands-left nil)
                                  cands)
                              (setq cands-left (sort (let ((res))
                                                       (dolist (cand cands)
                                                         (push cand res))
                                                       res)
                                                     (lambda (c1 c2)
                                                       (< (length c1)
                                                          (length c2)))))
                              (let ((num (min my/ivy-flx-limit
                                              (length cands)))
                                    (result nil))
                                ;; take the first num elements from cands-left
                                ;; and add them to result (cands-to-sort)
                                (while (and cands-left
                                            (>= (setq num (1- num)) 0))
                                  (push (pop cands-left) result))
                                result))))
        (append
         ;; compute all of the flx scores in one pass and sort
         (mapcar #'car
                 (sort (mapcar
                        (lambda (cand)
                          (cons cand
                                (or (car (flx-score cand
                                                    name
                                                    my/ivy-cache))
                                    0)))
                        cands-to-sort)
                       (lambda (c1 c2)
                         (> (cdr c1)
                            (cdr c2)))))

         ;; add the unsorted candidates
         cands-left))))

  (advice-add 'ivy--filter :override #'nadvice/ivy--filter))

@abo-abo does this version have the same performance problems? It works in real-time for me, but my computer is pretty decent.

@abo-abo
Copy link
Owner

abo-abo commented Sep 15, 2015

Thanks for the code, but it doesn't work currently for me (getting 0 candidates for M-x and bad selection for C-x b). I'm traveling this week, so I won't be coding much anyway. I'll see if I can adapt your code next week.

@bling
Copy link
Contributor Author

bling commented Sep 17, 2015

you just have to remove the "^" as the starting character, since that's an invalid fuzzy query.

i have the following to remove that automatically:

(cl-delete-if (lambda (e) (equal 'counsel-M-x (car e))) ivy-initial-inputs-alist)
(cl-delete-if (lambda (e) (equal 'counsel-describe-function (car e))) ivy-initial-inputs-alist)
(cl-delete-if (lambda (e) (equal 'counsel-describe-variable (car e))) ivy-initial-inputs-alist)

@PythonNut's optimizations are pretty slick -- doing an initial filter on the list had a noticeable performance boost, especially on the first query.

@PythonNut
Copy link
Contributor

The following should be even a little bit faster:

It avoids some useless data shuffling, and moves a hidden let binding out of the inner loop.

(with-eval-after-load 'ivy
  (require 'flx)

  (defvar my/ivy-cache (flx-make-string-cache))
  (defvar my/ivy-flx-limit 500)

  (defun nadvice/ivy--filter (name candidates)
    (if (= (length name) 0)
        candidates
      (let* (;; an optimized regex for fuzzy matching
             ;; "abc" → "\\`[^a]*a[^b]*b[^c]*c"
             (fuzzy-regex (concat "\\`"
                                  (mapconcat
                                   (lambda (x)
                                     (setq x (string x))
                                     (concat "[^" x "]*" (regexp-quote x)))
                                   name
                                   "")))

             ;; disable side-effects of string-match
             (inhibit-changing-match-data t)
             (cands-left)
             (cands-to-sort))

        ;; filter out non-matching candidates
        (dolist (cand candidates)
          (when (string-match fuzzy-regex cand)
            (push cand cands-left)))

        ;; pre-sort the candidates by length before partitioning
        (setq cands-left (sort cands-left
                               (lambda (c1 c2)
                                 (< (length c1)
                                    (length c2)))))

        ;; partition the candidates into sorted and unsorted groups
        (dotimes (_n (min (length cands-left) my/ivy-flx-limit))
          (push (pop cands-left) cands-to-sort))

        (append
         ;; compute all of the flx scores in one pass and sort
         (mapcar #'car
                 (sort (mapcar
                        (lambda (cand)
                          (cons cand
                                (car (flx-score cand
                                                name
                                                my/ivy-cache))))
                        cands-to-sort)
                       (lambda (c1 c2)
                         (> (cdr c1)
                            (cdr c2)))))

         ;; add the unsorted candidates
         cands-left))))

  (advice-add 'ivy--filter :override #'nadvice/ivy--filter))

@PythonNut
Copy link
Contributor

@bling here's a version that respects a leading ^.

(with-eval-after-load 'ivy
  (require 'flx)

  (defvar my/ivy-cache (flx-make-string-cache))
  (defvar my/ivy-flx-limit 500)

  (defun nadvice/ivy--filter (name candidates)
    (if (or (= (length name) 0)
            (string= name "^"))
        candidates
      (let* (;; an optimized regex for fuzzy matching
             ;; "abc" → "\\`[^a]*a[^b]*b[^c]*c"
             (fuzzy-regex (if (= (elt name 0) ?^)
                              (concat "^"
                                      (regexp-quote (substring name 1 2))
                                      (mapconcat
                                       (lambda (x)
                                         (setq x (string x))
                                         (concat "[^" x "]*" (regexp-quote x)))
                                       (substring name 2)
                                       ""))
                            (concat "^"
                                    (mapconcat
                                     (lambda (x)
                                       (setq x (string x))
                                       (concat "[^" x "]*" (regexp-quote x)))
                                     name
                                     ""))))

             (flx-name (if (= (elt name 0) ?^)
                           (substring name 1)
                         name))

             ;; disable side-effects of string-match
             (inhibit-changing-match-data t)
             (cands-left)
             (cands-to-sort))

        ;; filter out non-matching candidates
        (dolist (cand candidates)
          (when (string-match fuzzy-regex cand)
            (push cand cands-left)))

        ;; pre-sort the candidates by length before partitioning
        (setq cands-left (sort cands-left
                               (lambda (c1 c2)
                                 (< (length c1)
                                    (length c2)))))

        ;; partition the candidates into sorted and unsorted groups
        (dotimes (_n (min (length cands-left) my/ivy-flx-limit))
          (push (pop cands-left) cands-to-sort))

        (append
         ;; compute all of the flx scores in one pass and sort
         (mapcar #'car
                 (sort (mapcar
                        (lambda (cand)
                          (cons cand
                                (car (flx-score cand
                                                flx-name
                                                my/ivy-cache))))
                        cands-to-sort)
                       (lambda (c1 c2)
                         (> (cdr c1)
                            (cdr c2)))))

         ;; add the unsorted candidates
         cands-left))))

  (advice-add 'ivy--filter :override #'nadvice/ivy--filter))

As far as I can tell, the following still needs to be done:

I'll do as much of this as I can, but I might need some pointers on how to get started, especially on the first point.

@abo-abo
Copy link
Owner

abo-abo commented Sep 30, 2015

Thanks for the code, flx sorting should now work for ivy--regex-fuzzy. Please test.

abo-abo added a commit that referenced this issue Sep 30, 2015
* ivy-test.el (ivy--regex-fuzzy): Update test.

Re #207
@PythonNut
Copy link
Contributor

@abo-abo it seems to be working wonderfully. Is the new highlighting supposed to look like plp ⇒ package-list-packages, or does it highlight the whole candidate? I can't see any (fuzzy-style) highlighting being done.

@bling
Copy link
Contributor Author

bling commented Oct 1, 2015

@abo-abo awesome, works great over here. one last thing, do ivy sources have any metadata that could let us switch between (flx-make-string-cache) and (flx-make-filename-cache)? the latter has optimizations for files. thanks.

@abo-abo
Copy link
Owner

abo-abo commented Oct 2, 2015

Is the new highlighting supposed to look like plp ⇒ package-list-packages

Yes, it works like this for me.

@abo-abo
Copy link
Owner

abo-abo commented Oct 2, 2015

switch between (flx-make-string-cache) and (flx-make-filename-cache)?

There can be a dispatch on the collection function, e.g read-file-name-internal, but that's not very reliable. Otherwise, I could add user-side customization based on this-command. Try to use it as is for a while, and if you feel the need open a new issue.

@PythonNut
Copy link
Contributor

@abo-abo ah, but it requires (setq ivy-display-style 'fancy). Gotcha.

It looks like the highlighting simply grabs the first possible match, instead of highlighting the matches that flx scores highest, but that's not a terribly big deal.

@ilohmar
Copy link
Contributor

ilohmar commented Jan 5, 2016

I eventually realized that what was bugging me the last few days was exactly what @PythonNut remarked above: the sorting is correct (per the flx score), but the matching it scores is discarded, and ivy adds its own interpretation. Eg switching to a buffer "company-statistics-tests.el" by typing "cst" (flx-matching very well the initials), which highlights "_c_ompany-statistics-te_st_s.el". Is there any way to use flx's pattern (also returned by flx-score)? I guess it's not easy...

As it stands, it's quite irritating, as it re-trains my mind to think of flx matching the wrong way. Ironically, that's amplified by ivy's very nice highlighting of the match :)

abo-abo added a commit that referenced this issue Jan 6, 2016
* ivy.el (ivy--flx-sort): `flx-score' returns ((score i1 i2 i3 ...) str).
Use i1, i2, i3 ... to highlight str appropriately.
(ivy--format-minibuffer-line): Don't re-highlight str for
`ivy--regex-fuzzy', it was already highlighted by `ivy--flx-sort'.

Re #207
@abo-abo
Copy link
Owner

abo-abo commented Jan 6, 2016

Thanks, have a look.

@ilohmar
Copy link
Contributor

ilohmar commented Jan 6, 2016

Works wonderfully! Thanks a lot.

@PythonNut
Copy link
Contributor

@ilohmar hey, thanks for dropping by to clarify what I meant. I was pretty confusing, and I doubt this would have been cleared up without someone jumping in.

@bling
Copy link
Contributor Author

bling commented Jan 15, 2016

i've noticed some odd behavior, it looks like exact matches don't sort to the top. e.g. try searching for erc, it's not the top choice. is the algorithm being called incorrectly?

@abo-abo
Copy link
Owner

abo-abo commented Jan 15, 2016

@bling I've added a fix for the case when the input is a perfect match.

Otherwise, have a look at ivy--flx-sort: any time the number of matched candidates is more that 200, flx isn't used and the candidates are returned unsorted.

You can try to change the number to e.g. 2000, but this will introduce noticeable lag.

Another solution is to input ^erc instead of erc: this brings down the amount of matched cands from 1734 to 171.

@bling
Copy link
Contributor Author

bling commented Jan 15, 2016

interesting, 2000 is still very snappy on my machine. perhaps this can be made configurable?

@jojojames
Copy link
Contributor

Is it just me or evaling the snippet by @PythonNut above makes the fuzzy faster than the default fuzzy implementation.

@NightMachinery
Copy link

Is flx enabled by default for the fuzzy regex, or should I do sth to enable it?

@abo-abo
Copy link
Owner

abo-abo commented May 19, 2020

@NightMachinary Yes. If you have ivy--regex-fuzzy configured, then you just need to have flx or amx installed. Either one will be picked up automatically.

@basil-conto
Copy link
Collaborator

If you have ivy--regex-fuzzy configured

This means configuring the variable ivy-re-builders-alist to include the function ivy--regex-fuzzy. See (info "(ivy) Completion Styles") for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants