Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve sorting with collection functions #1592

Closed
wants to merge 3 commits into from
Closed

Conversation

ericdanan
Copy link
Contributor

@ericdanan ericdanan commented May 24, 2018

Commit 1734ea9 changed (cl-sort coll sort-fn) to (sort (copy-sequence coll) sort-fn) at one place in ivy-reset-state, line 1956. This PR simply does the same change in the same function, line 1952.

I'm not exactly sure why this matters, but it solves an issue I got in counsel-projectile, ericdanan/counsel-projectile/issues/87.

@ericdanan
Copy link
Contributor Author

@ericdanan ericdanan commented May 24, 2018

I just added two commits related to sorting.

The first one deals with the case where ivy-read is called with a collection function and a :caller argument. If there is no entry for the collection function in ivy-sort-functions-alist, then look for a caller entry before falling back to the default t entry. For ivy-rotate-sort, look for collection then caller but not the t entry, I hesitated about this but since the effect of rotating is persistent, I thought it would be better not to rotate the t entry.

The second commit makes it so ivy-sort-max-size is always taken into account, since currently it is not in the case of a collection function.

@ericdanan ericdanan changed the title ivy.el (ivy-reset-state): use copy-sequence with sort Improve sorting with collection functions May 24, 2018
@ericdanan
Copy link
Contributor Author

@ericdanan ericdanan commented May 24, 2018

I actually have a couple of questions regarding my second commit (try caller after collection):

  1. Wouldn't it be better to try caller before collection? I noticed that in ivy--sort caller has precedence over collection.
  2. In ivy-reset-state, when the collection is not a function, is the check for org-refile-history needed? I see it was added 3 years ago, before entries for org-refile and related were added to the default value of ivy-sort-functions-alist.

My first impression is yes for 1 and no for 2, but I'd rather wait for your opinion before pushing more commits.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented May 25, 2018

Commit 1734ea9 changed (cl-sort coll sort-fn) to (sort (copy-sequence coll) sort-fn) at one place in ivy-reset-state, line 1956.

No, commit 1734ea9 only changed cl-sort calls to the equivalent sort calls. That commit neither added nor removed any calls to copy-sequence. Perhaps you meant some other commit?

(if (functionp collection)
(when (and (not (eq collection 'read-file-name-internal))
(<= (length coll) ivy-sort-max-size)
(setq sort-fn (ivy--sort-function collection caller)))
Copy link
Collaborator

@basil-conto basil-conto May 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this changes the fall-through logic. Previously, the alternative (else branch) of the if conditional would have been taken if, for example, (functionp collection) was satisfied, but (ivy--sort-function collection) was not. Now, this combination causes neither of the conditional if's branches to be taken. Is this intentional?

Copy link
Contributor Author

@ericdanan ericdanan May 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but unless I'm wrong, taking the else branch would only have had any effect if (ivy--sort-function caller) was satisfied. But in that case, (ivy--sort-function collection caller) is now satisfied as well, so the effect should be the same I think.

Copy link
Collaborator

@basil-conto basil-conto May 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks for explaining (the logic here has always confused me).

@ericdanan
Copy link
Contributor Author

@ericdanan ericdanan commented May 25, 2018

You're right, the commit I linked to didn't have anything to do with copy-sequence, I got confused somehow. copy-sequence was called in one place and not the other from the beginning, I don't know what is the reason for that if any.

Are cl-sort and sort really equivalent? Looking at the definition of cl-sort, it does rely on sort but it's not clear to me that it is exactly the same.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented May 25, 2018

Are cl-sort and sort really equivalent? Looking at the definition of cl-sort, it does rely on sort but it's not clear to me that it is exactly the same.

Yes, cl-sort always boils down to a sort. The main difference is that cl-sort accepts convenient keywords such as :key.

ericdanan added a commit to ericdanan/counsel-projectile that referenced this issue May 27, 2018
When the collection is a function, it is this function that must be
added to `ivy-sort-functions-alist`. Adding the caller instead has no
effect.

Note that at the moment, sorting may not work correctly if the default
is changed, due to an issue in ivy with sorting with collection
functions. See abo-abo/swiper/pull/1592.
@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Jun 7, 2018

Thanks, I've merged your code after some edits. Please check if it fixes your issue.

One change I did is remove copy-sequence if the collection is functionp. Since in that case we generate a new list by all-completions, so we should not care about sorting destructively here.
The main concern with copy-sequence is to not modify a collection list that got passed by the user to ivy-read, since that would be surprising and inconvenient.

@ericdanan
Copy link
Contributor Author

@ericdanan ericdanan commented Jun 8, 2018

Thank you. Unfortunately my issue in counsel-projectile is not fixed, for some reason it seems to need copy-sequence even though the collection is functionp.

Here is a minimal example where the issue arises for me:

(defvar my/collvar)

(defun my/collfn (&rest _r)
  (setq my/collvar '("b" "a")))

(defun my/matchfn (regexp _candidates)
  (ivy--re-filter regexp my/collvar))

(ivy-read ">" #'my/collfn
	  :matcher #'my/matchfn
	  :sort t)

When I execute the ivy-read call, I only see "b" as candidate, (with the default value of ivy-sort-functions-alist, so ivy-string< is used).

I checked that the value my/collvar is correctly set to '("b" "a") just before the sort call of ivy--reset-state, but it is '("b") just after. So the sort call seems to alter the variable. If I add copy-sequence in that call, the issue disappears.

I agree that the example looks silly because my/collvar and my/matchfn are useless. In the real case with the counsel-projectile command, the collection function sets two such variables to split the cadndidates into two groups (each variable holds a list of candidates, and the collection function returns the concatenated list), and the matcher relies on these two variables rather than the global candidate list. But perhaps that's not a good way to go.

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Jun 8, 2018

But perhaps that's not a good way to go.

I think so too. Thing about ivy-read is that as a library it can't assume that the collection passed to it will be small, so it's a good idea to avoid extra copies unless necessary.
counsel-projectile as a user of the library can actually afford to return a copy-sequence from it's collection function, since it knows it's inexpensive (I assume that the candidate count isn't in tens of thousands).

Here's the suggested fix for your example:

(defvar my/collvar)

(defun my/collfn (&rest _r)
  (let ((res (list "b" "a")))
    (setq my/collvar (copy-sequence res))
    res))

(defun my/matchfn (regexp _candidates)
  (ivy--re-filter regexp my/collvar))

(ivy-read ">" #'my/collfn
	  :matcher #'my/matchfn
	  :sort t)

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Jun 8, 2018

On the other hand, we're already calling sort on a potentially big collection. So it might be appropriate to just bite the bullet and do a copy-sequence in addition. This would relieve the users from extra work on edge cases but (slightly) decrease the overall performance.

@ericdanan
Copy link
Contributor Author

@ericdanan ericdanan commented Jun 8, 2018

My projects are not in the tens of thousands of files, but one user who submitted an issue was working with 400k files!

Thanks a lot for the suggested fix. Indeed I can call copy-sequence in my collection function.

But I think one advantage of doing it in the call inside ivy--reset-state is that it only happens if the collection is smaller than ivy-sort-max-size, whose default value is 30k. So at least for very large collections, we avoid copy-sequence and sort alltogether. For collections close to the threshold, I have no idea if the performance impact of copy-sequence would be significant.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented Jun 9, 2018

FWIW, I agree with @ericdanan that Ivy should do TRT, namely not destructively modify any external data. Obviously only profiling can tell, but I doubt copy-sequence is where most of the time is spent. Besides, function collections are often used precisely because calculating the entire collection ahead of time is impractically costly, and, as has been mentioned, there are other thresholds in place to protect against large collections.

@ericdanan
Copy link
Contributor Author

@ericdanan ericdanan commented Jun 10, 2018

I don't if that's relevant, I just used benchmark-run to measure the time taken by calling copy-sequence on dummy lists I generated:

(defvar my/list nil)

(dotimes (i 30000)
  (push (format "candidate%d" i) my/list))

(benchmark-run 1 (copy-sequence my/list))
=> (0.0026226 0 0.0)

(dotimes (i 400000)
  (push (format "candidate%d" i) my/list))

(benchmark-run 1 (copy-sequence my/list))
=> (0.0927932 1 0.06187020000000043)

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Jun 11, 2018

But I think one advantage of doing it in the call inside ivy--reset-state is that it only happens if the collection is smaller than ivy-sort-max-size, whose default value is 30k. So at least for very large collections, we avoid copy-sequence and sort alltogether.

Makes sense. Please make a PR to bring back copy-sequence. Put this argument in the commit message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants