Sort perfect match first #2285

xeruf · 2020-12-14T09:34:54Z

junegunn · 2020-12-14T09:55:04Z

Mine gives dragon as the top match. You have something in your $FZF_DEFAULT_OPTS?

xeruf · 2020-12-14T10:21:11Z

Sorry, you are right:

printf 'fire-dragon 1\ndragon 2' | fzf --tiebreak=end --height=2

even though I can now understand why the algorithm selects it, it is not intuitive. The tiebreak end is there mainly for filenames, but it shouldn't prefer a longer entry because of that. Unfortunately there is no option to weigh two tiebreakers (end and length), they can only be ordered.

junegunn · 2020-12-14T12:59:15Z

--tiebreak is unintuitive and hard to understand especially with the new algorithm so I don't recommend relying on it. I even considered removing it but I kept it mostly for backward compatibility.
fzf was designed to be a context-free text filter rather than a path picker, so I decided that it should not be particularly optimized for file paths. It's a design choice. Maybe there are other programs that work better with file paths than fzf, but that is not a problem fzf wants to solve.
Consider not using --tiebreak option. The default option (length,index) is actually quite good for file paths.
There is a nice trick to make it prefer matches at the latter part of the path. We can take advantage of the fact that most file paths end with an extension, so simply put . at the end of your query. `abc.
```
# Compare "the-end" vs. "the-end."
fzf --height 10 --query 'the-end.' << EOF
the-end-is-near/foo-bar.sh
foo-bar/the-end-is-near.sh
EOF
```

xeruf · 2020-12-14T19:08:34Z

My point here is that I use fzf for paths in the overwhelming majority, and in that case it makes sense to use end rather than length as the tiebreaker (short paths such as under /etc often aren't what I'm looking for when searching for example for an rc file), that's why I set that as default option.
My fix here was to reverse the tiebreaks in that particular script.

That tip does indeed sound nice, though when searching through rc files is not particularly useful, as they rarely have an extension by convention.

The question remains whether the end-tiebreaker may be tweaked in a way that it yields the more expected output. I guess currently it goes by percentage or position from beginning, when it might just be better to count from behind, where in this example both are equal and thus the length-tiebreaker would select the correct option ;)

junegunn · 2020-12-15T01:29:34Z

when it might just be better to count from behind

We did that, and it didn't turn out great.

0541c0d

Sadly, there is no one-size-fits-all option. It's better to understand how the default ranking algorithm works and tweak the query string to assist this dumb filter to do a better job. fzf implements all major Readline key bindings so it's easy to move the cursor around and tweak the search terms once you get used to it.

> foo bar┃

# ALT-B
> foo ┃bar

# CTRL-U
> ┃bar

# Prepend /.
> /.┃bar

# CTRL-E
> /.bar┃

# CTRL-Y
> /.barfoo ┃

xeruf · 2020-12-15T10:54:00Z

printf 'fire-dragon\ndragon' | fzf --tiebreak=end --height=2

Just realized that even in this case entering "dragon" will select "fire-dragon", so I can't enter anything more to get a better search result.

The behavior is the same with a space:

printf 'fire dragon\ndragon' | fzf --tiebreak=end --height=2

But removing the delimiter sorts "dragon" first as soon as I enter a single 'd':

printf 'firedragon\ndragon' | fzf --tiebreak=end --height=2

So I'm assuming you are splitting on non-word characters and the matching against the terms.
Let me suggest some tweaks to the algorithm, even before the tiebreakers come into play:

a split by '-' should be weaker than other non-word characters, because it often shows run-together words whereas underscore, space, tab & co show a stronger separation
the percentage of the term matched should be factored in

xeruf · 2020-12-15T10:54:45Z

We did that, and it didn't turn out great.

0541c0d

What do you mean? Did you revert it?

junegunn · 2020-12-15T13:13:18Z

I can't enter anything more to get a better search result.

dragon ^d
dragon !f

Let me suggest some tweaks to the algorithm

Thanks for the input, but I'm not interested in fine-tuning the scoring algorithm because like I said above, fzf is a text filter and the criteria may not make sense with different types of input.

What do you mean? Did you revert it?

Yes, please read the commit message.

xeruf · 2020-12-15T17:39:01Z

Let me suggest some tweaks to the algorithm

Thanks for the input, but I'm not interested in fine-tuning the scoring algorithm because like I said above, fzf is a text filter and the criteria may not make sense with different types of input.

I thought these changes would be pretty general.

If you don't want to do more tweaking, I would suggest to at least pick the issue title back up and always prefer a perfect match.

The percentage-match could also be an additional tiebreaker.

What do you mean? Did you revert it?

Yes, please read the commit message.

oh woops, I read it the other way around, sorry ^^

xeruf · 2022-01-23T12:17:58Z

I am still interested in looking into improving this - look at this:
printf 'fire-dragonflame\ndragon' | fzf --tiebreak=index --height=2

I am wondering why the algorithm has to resort to the tiebreaker here at all?
Very similar to my my real-world issue in #2537.

junegunn · 2022-01-23T15:14:33Z

Because, unfortunately, the current scoring algorithm doesn't have a mechanism to prefer the latter. The scores calculated by the algorithm are just the same for the two cases.

https://github.com/junegunn/fzf/blob/master/src/algo/algo.go

Not saying there isn't room for improvement. We could try fine-tuning the algorithm.

xeruf · 2022-07-26T06:04:29Z

How about taking inspiration from https://github.com/jhawthorn/fzy/blob/master/ALGORITHM.md#fzys-scoring ?
I love the scoring in fzy but the features of fzf.

junegunn · 2022-07-29T03:10:03Z

Very similar to my my real-world issue in #2537

So your real problem here is that the default length tiebreak is not taking --nth into account (see). If you use the default length tiebreak, fzf will prefer dragon over fire-dragonflame because it's shorter. In the context of fuzzy matching, "the perfect match" is always the shortest one, so with length tiebreak, we can say that fzf already prefers the perfect match.

It looks like the algorithm of fzy isn't very different from that of fzf. The document you linked above is severely outdated and it describes the algorithm of fzf 0.14.0 and below. fzy doesn't have --nth or --tiebreak, so the situation isn't too different if you use the default options of fzf.

junegunn · 2022-07-29T06:07:02Z

But we could consider adding a bonus point to a character at the end of a word so for a query dragon, dragon light is preferred over dragonflight (before --tiebreak)

If I'm not mistaken, fzy isn't doing that either.

Favors the line with shorter matched chunk. A chunk is a set of consecutive non-whitespace characters. Unlike the default `length`, this new scheme works well with tabular input. # length prefers item #1, because the whole line is shorter, # chunk prefers item #2, because the matched chunk ("foo") is shorter fzf --height=6 --header-lines=2 --tiebreak=chunk --reverse --query=fo << "EOF" N | Field1 | Field2 | Field3 - | ------ | ------ | ------ 1 | hello | foobar | baz 2 | world | foo | bazbaz EOF If the input does not contain any spaces, `chunk` is equivalent to `length`. But we're not going to set it as the default because it is computationally more expensive. Close #2285 Close #2537 - Not the exact solution to --tiebreak=length not taking --nth into account, but this should work. And the added benefit is that it works well even when --nth is not provided. - Adding a bonus point to the last character of a word didn't turn out great. The order of the result suddenly changes when you type in the last character in the word producing a jarring effect.

junegunn closed this as completed Dec 15, 2020

xeruf mentioned this issue Jun 23, 2021

Tiebreak options do not suffice with nth #2537

Closed

10 tasks

xeruf mentioned this issue May 29, 2022

Elaborate on more differences jhawthorn/fzy#172

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sort perfect match first #2285

Sort perfect match first #2285

xeruf commented Dec 14, 2020

junegunn commented Dec 14, 2020

xeruf commented Dec 14, 2020

junegunn commented Dec 14, 2020 •

edited

xeruf commented Dec 14, 2020 •

edited

junegunn commented Dec 15, 2020

xeruf commented Dec 15, 2020 •

edited

xeruf commented Dec 15, 2020

junegunn commented Dec 15, 2020

xeruf commented Dec 15, 2020 •

edited

xeruf commented Jan 23, 2022

junegunn commented Jan 23, 2022

xeruf commented Jul 26, 2022

junegunn commented Jul 29, 2022 •

edited

junegunn commented Jul 29, 2022 •

edited

Sort perfect match first #2285

Sort perfect match first #2285

Comments

xeruf commented Dec 14, 2020

Info

Problem / Steps to reproduce

junegunn commented Dec 14, 2020

xeruf commented Dec 14, 2020

junegunn commented Dec 14, 2020 • edited

xeruf commented Dec 14, 2020 • edited

junegunn commented Dec 15, 2020

xeruf commented Dec 15, 2020 • edited

xeruf commented Dec 15, 2020

junegunn commented Dec 15, 2020

xeruf commented Dec 15, 2020 • edited

xeruf commented Jan 23, 2022

junegunn commented Jan 23, 2022

xeruf commented Jul 26, 2022

junegunn commented Jul 29, 2022 • edited

junegunn commented Jul 29, 2022 • edited

junegunn commented Dec 14, 2020 •

edited

xeruf commented Dec 14, 2020 •

edited

xeruf commented Dec 15, 2020 •

edited

xeruf commented Dec 15, 2020 •

edited

junegunn commented Jul 29, 2022 •

edited

junegunn commented Jul 29, 2022 •

edited