Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve sorting with collection functions #1592

Closed
wants to merge 3 commits into from

Conversation

@ericdanan
Copy link
Contributor

commented May 24, 2018

Commit 1734ea9 changed (cl-sort coll sort-fn) to (sort (copy-sequence coll) sort-fn) at one place in ivy-reset-state, line 1956. This PR simply does the same change in the same function, line 1952.

I'm not exactly sure why this matters, but it solves an issue I got in counsel-projectile, ericdanan/counsel-projectile/issues/87.

@ericdanan

This comment has been minimized.

Copy link
Contributor Author

commented May 24, 2018

I just added two commits related to sorting.

The first one deals with the case where ivy-read is called with a collection function and a :caller argument. If there is no entry for the collection function in ivy-sort-functions-alist, then look for a caller entry before falling back to the default t entry. For ivy-rotate-sort, look for collection then caller but not the t entry, I hesitated about this but since the effect of rotating is persistent, I thought it would be better not to rotate the t entry.

The second commit makes it so ivy-sort-max-size is always taken into account, since currently it is not in the case of a collection function.

@ericdanan ericdanan changed the title ivy.el (ivy-reset-state): use copy-sequence with sort Improve sorting with collection functions May 24, 2018

@ericdanan

This comment has been minimized.

Copy link
Contributor Author

commented May 24, 2018

I actually have a couple of questions regarding my second commit (try caller after collection):

  1. Wouldn't it be better to try caller before collection? I noticed that in ivy--sort caller has precedence over collection.
  2. In ivy-reset-state, when the collection is not a function, is the check for org-refile-history needed? I see it was added 3 years ago, before entries for org-refile and related were added to the default value of ivy-sort-functions-alist.

My first impression is yes for 1 and no for 2, but I'd rather wait for your opinion before pushing more commits.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented May 25, 2018

Commit 1734ea9 changed (cl-sort coll sort-fn) to (sort (copy-sequence coll) sort-fn) at one place in ivy-reset-state, line 1956.

No, commit 1734ea9 only changed cl-sort calls to the equivalent sort calls. That commit neither added nor removed any calls to copy-sequence. Perhaps you meant some other commit?

(if (functionp collection)
(when (and (not (eq collection 'read-file-name-internal))
(<= (length coll) ivy-sort-max-size)
(setq sort-fn (ivy--sort-function collection caller)))

This comment has been minimized.

Copy link
@basil-conto

basil-conto May 25, 2018

Collaborator

Note that this changes the fall-through logic. Previously, the alternative (else branch) of the if conditional would have been taken if, for example, (functionp collection) was satisfied, but (ivy--sort-function collection) was not. Now, this combination causes neither of the conditional if's branches to be taken. Is this intentional?

This comment has been minimized.

Copy link
@ericdanan

ericdanan May 25, 2018

Author Contributor

Yes but unless I'm wrong, taking the else branch would only have had any effect if (ivy--sort-function caller) was satisfied. But in that case, (ivy--sort-function collection caller) is now satisfied as well, so the effect should be the same I think.

This comment has been minimized.

Copy link
@basil-conto

basil-conto May 25, 2018

Collaborator

OK, thanks for explaining (the logic here has always confused me).

@ericdanan

This comment has been minimized.

Copy link
Contributor Author

commented May 25, 2018

You're right, the commit I linked to didn't have anything to do with copy-sequence, I got confused somehow. copy-sequence was called in one place and not the other from the beginning, I don't know what is the reason for that if any.

Are cl-sort and sort really equivalent? Looking at the definition of cl-sort, it does rely on sort but it's not clear to me that it is exactly the same.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented May 25, 2018

Are cl-sort and sort really equivalent? Looking at the definition of cl-sort, it does rely on sort but it's not clear to me that it is exactly the same.

Yes, cl-sort always boils down to a sort. The main difference is that cl-sort accepts convenient keywords such as :key.

ericdanan added a commit to ericdanan/counsel-projectile that referenced this pull request May 27, 2018
counsel-projectile, cp-switch-to-buffer: fix default sorting
When the collection is a function, it is this function that must be
added to `ivy-sort-functions-alist`. Adding the caller instead has no
effect.

Note that at the moment, sorting may not work correctly if the default
is changed, due to an issue in ivy with sorting with collection
functions. See abo-abo/swiper/pull/1592.

@abo-abo abo-abo closed this in c6b5b12 Jun 7, 2018

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Jun 7, 2018

Thanks, I've merged your code after some edits. Please check if it fixes your issue.

One change I did is remove copy-sequence if the collection is functionp. Since in that case we generate a new list by all-completions, so we should not care about sorting destructively here.
The main concern with copy-sequence is to not modify a collection list that got passed by the user to ivy-read, since that would be surprising and inconvenient.

@ericdanan

This comment has been minimized.

Copy link
Contributor Author

commented Jun 8, 2018

Thank you. Unfortunately my issue in counsel-projectile is not fixed, for some reason it seems to need copy-sequence even though the collection is functionp.

Here is a minimal example where the issue arises for me:

(defvar my/collvar)

(defun my/collfn (&rest _r)
  (setq my/collvar '("b" "a")))

(defun my/matchfn (regexp _candidates)
  (ivy--re-filter regexp my/collvar))

(ivy-read ">" #'my/collfn
	  :matcher #'my/matchfn
	  :sort t)

When I execute the ivy-read call, I only see "b" as candidate, (with the default value of ivy-sort-functions-alist, so ivy-string< is used).

I checked that the value my/collvar is correctly set to '("b" "a") just before the sort call of ivy--reset-state, but it is '("b") just after. So the sort call seems to alter the variable. If I add copy-sequence in that call, the issue disappears.

I agree that the example looks silly because my/collvar and my/matchfn are useless. In the real case with the counsel-projectile command, the collection function sets two such variables to split the cadndidates into two groups (each variable holds a list of candidates, and the collection function returns the concatenated list), and the matcher relies on these two variables rather than the global candidate list. But perhaps that's not a good way to go.

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Jun 8, 2018

But perhaps that's not a good way to go.

I think so too. Thing about ivy-read is that as a library it can't assume that the collection passed to it will be small, so it's a good idea to avoid extra copies unless necessary.
counsel-projectile as a user of the library can actually afford to return a copy-sequence from it's collection function, since it knows it's inexpensive (I assume that the candidate count isn't in tens of thousands).

Here's the suggested fix for your example:

(defvar my/collvar)

(defun my/collfn (&rest _r)
  (let ((res (list "b" "a")))
    (setq my/collvar (copy-sequence res))
    res))

(defun my/matchfn (regexp _candidates)
  (ivy--re-filter regexp my/collvar))

(ivy-read ">" #'my/collfn
	  :matcher #'my/matchfn
	  :sort t)
@abo-abo

This comment has been minimized.

Copy link
Owner

commented Jun 8, 2018

On the other hand, we're already calling sort on a potentially big collection. So it might be appropriate to just bite the bullet and do a copy-sequence in addition. This would relieve the users from extra work on edge cases but (slightly) decrease the overall performance.

@ericdanan

This comment has been minimized.

Copy link
Contributor Author

commented Jun 8, 2018

My projects are not in the tens of thousands of files, but one user who submitted an issue was working with 400k files!

Thanks a lot for the suggested fix. Indeed I can call copy-sequence in my collection function.

But I think one advantage of doing it in the call inside ivy--reset-state is that it only happens if the collection is smaller than ivy-sort-max-size, whose default value is 30k. So at least for very large collections, we avoid copy-sequence and sort alltogether. For collections close to the threshold, I have no idea if the performance impact of copy-sequence would be significant.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented Jun 9, 2018

FWIW, I agree with @ericdanan that Ivy should do TRT, namely not destructively modify any external data. Obviously only profiling can tell, but I doubt copy-sequence is where most of the time is spent. Besides, function collections are often used precisely because calculating the entire collection ahead of time is impractically costly, and, as has been mentioned, there are other thresholds in place to protect against large collections.

@ericdanan

This comment has been minimized.

Copy link
Contributor Author

commented Jun 10, 2018

I don't if that's relevant, I just used benchmark-run to measure the time taken by calling copy-sequence on dummy lists I generated:

(defvar my/list nil)

(dotimes (i 30000)
  (push (format "candidate%d" i) my/list))

(benchmark-run 1 (copy-sequence my/list))
=> (0.0026226 0 0.0)

(dotimes (i 400000)
  (push (format "candidate%d" i) my/list))

(benchmark-run 1 (copy-sequence my/list))
=> (0.0927932 1 0.06187020000000043)
@abo-abo

This comment has been minimized.

Copy link
Owner

commented Jun 11, 2018

But I think one advantage of doing it in the call inside ivy--reset-state is that it only happens if the collection is smaller than ivy-sort-max-size, whose default value is 30k. So at least for very large collections, we avoid copy-sequence and sort alltogether.

Makes sense. Please make a PR to bring back copy-sequence. Put this argument in the commit message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.