fuzzy match for typoes #34

danielb2 · 2020-03-14T23:02:08Z

Hi, I'm coming from autojump after a coworker recommended zoxide to me. I love the blazing speed of zoxide.

Below are some examples for fuzzy matching.

( Note: you can see the echo that's mentioned in #22 )

I also miss jo (open the matching directory; on a Mac this is something like open (zoxide query $argv)).
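For bash or zsh users, a jo-style helper along those lines might look like the sketch below. The function name jo is borrowed from autojump and is not something zoxide ships; the sketch assumes that zoxide query prints the best match and exits non-zero when nothing matches, and that open is the macOS opener.

```shell
# Hypothetical helper (not part of zoxide): resolve a directory, then open it.
jo() {
  local dir
  # Ask zoxide for the best match; bail out if there is none.
  dir=$(zoxide query -- "$@") || return 1
  # `open` is the macOS opener; on Linux, xdg-open is the usual substitute.
  open "$dir"
}
```

Usage would be something like jo proj to open the highest-ranked directory matching proj in the file manager.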

ajeetdsouza · 2020-04-08T16:02:29Z

@cole-h and I did some work on this, and it seems like this feature might take a while, since none of the existing solutions for Rust seem to suit our purpose.

bbqsrc · 2020-05-30T20:36:24Z

Have you tried https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance ? It considers transpositions. The strsim crate has a lot of useful stuff for this. :)
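To make the transposition point concrete, here is a rough sketch of the restricted variant of Damerau-Levenshtein (often called optimal string alignment), written as a shell function around awk. The function name osa_distance is made up for this illustration, it is not how zoxide or strsim implement anything, and it assumes the inputs contain no backslashes (awk -v would interpret them as escapes).

```shell
# osa_distance A B: prints the optimal string alignment distance between A and B,
# i.e. Levenshtein distance where swapping two adjacent characters costs 1.
osa_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = length(a); m = length(b)
    for (i = 0; i <= n; i++) d[i, 0] = i
    for (j = 0; j <= m; j++) d[0, j] = j
    for (i = 1; i <= n; i++) {
      for (j = 1; j <= m; j++) {
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        best = d[i-1, j] + 1                                      # deletion
        if (d[i, j-1] + 1 < best) best = d[i, j-1] + 1            # insertion
        if (d[i-1, j-1] + cost < best) best = d[i-1, j-1] + cost  # substitution
        if (i > 1 && j > 1 &&
            substr(a, i, 1) == substr(b, j-1, 1) &&
            substr(a, i-1, 1) == substr(b, j, 1) &&
            d[i-2, j-2] + 1 < best) best = d[i-2, j-2] + 1        # transposition
        d[i, j] = best
      }
    }
    print d[n, m]
  }'
}
```

With this, osa_distance dowlnoads downloads prints 1, because the swapped "ln" counts as a single edit, whereas plain Levenshtein would count it as 2.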

ajeetdsouza · 2020-05-31T16:49:12Z

@bbqsrc I have tried Levenshtein distance, but I haven't yet found a satisfactory way to apply it. The major problem is that the number of characters in a query is typically 3-4, so even one edit can significantly change the word. Moreover, there can be multiple words in a query which have to match the path in order. Allowing all of them to be fuzzy can sometimes significantly alter results.

I do have a rough algorithm in mind, but I'd appreciate any input!

bbqsrc · 2020-05-31T18:47:20Z

Note that Damerau-Levenshtein is a different algorithm derived from Levenshtein. You can introduce your own penalty weighting based on the distance of the result to decide whether a 'typo' is sufficient to be considered the winner. You can also pass the decision to the user.

I don't really have a convenient answer for your use case, unfortunately. But I'd be happy to provide feedback on any proposed algorithm, particularly if you can provide example cases of what you would hope the behaviour to be. My day-to-day work is basically in supporting linguistic tooling, so I hope I can be of some use. xD

ajeetdsouza · 2020-05-31T18:55:31Z

@bbqsrc Thanks, that would be great!

Regarding strsim, I don't think that fits my use case, since I'm trying to match a substring within the path, and I would require the last index of the match. I'll do some searching.

I have a hectic week ahead, but I'll get back to you with something concrete as soon as I can.

NightMachinery · 2020-06-02T10:56:39Z

I like a fuzzy mode like fzf. For example, js currently returns no result for me, while I have javascript in my zoxide db.

See this for turning the text into a fuzzy search and this for sorting them intelligently.

ajeetdsouza · 2020-06-02T19:45:32Z

@NightMachinery, I'm not sure that is the kind of fuzzy matching we're looking for. I think this issue is about fuzziness as a solution to minor spelling errors, where the correct spelling is one or two characters off the query.

fzf expects each of the input characters to be present in the output in order (no errors), but allows an arbitrary number of characters in between each matching character. This works well for fzf as a selector, but I have my doubts about how well it will work for something like zoxide. In my view, a typical database would contain a large number of paths with a j followed by an s, making such a search unreliable.

All the same, you'd get similar results if you tried something like z j s (with a space in between).
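As a rough illustration of that concern, the sketch below turns a query into an fzf-like "characters in order, anything in between" regex and runs it over a small, made-up list of paths. The helper name subseq_regex and the sample paths are hypothetical, and the helper assumes the query contains only letters and digits, so no regex escaping is attempted. This is not how fzf or zoxide actually match.

```shell
# subseq_regex QUERY: turns "js" into "j.*s" (an in-order subsequence match).
# Assumption: QUERY is alphanumeric, so nothing needs escaping.
subseq_regex() {
  printf '%s' "$1" | sed 's/./&.*/g; s/\.\*$//'
}

# Hypothetical database contents, for illustration only.
paths='/home/u/javascript
/home/u/projects
/home/u/jobs
/home/u/music'

# Print every path that contains a "j" followed somewhere later by an "s".
printf '%s\n' "$paths" | grep -i -e "$(subseq_regex js)"
```

Three of the four sample paths match (javascript, projects, and jobs), which shows how quickly a two-character subsequence query stops being selective.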

NightMachinery · 2020-06-02T21:53:38Z

@ajeetdsouza

I think that if you add a penalty to their score for drifting edit distances, things will be fine even if the fuzziness leads to lots of matches? (Also, on my system, I currently have no match for js at all. It seems to me that the assumption of big path DBs might not be all that true in lots of cases.)

Another interesting thing is to support a query like sc/ja for ~/scripts/javascript.

ajeetdsouza · 2020-06-03T09:12:57Z

@NightMachinery fzf already does add a penalty to edit distance, but still I doubt it would work well outside of this isolated case. However, since zoxide matches each keyword in order in the path, matching against sc/ja is already quite easy:

z sc ja

ajeetdsouza · 2021-03-05T17:55:47Z

Closing this - zoxide queries are almost always just a few characters long, and fuzzy matching would only lower their accuracy at this point. It also makes queries annoyingly unpredictable in the case where it's wrong.

bjesus · 2022-04-22T18:17:17Z

Sorry to bring this up again - but just raising the voice of those of us who use zoxide differently. I'm never typing down to get to downloads, but I'm more likely to type downlad or something like that. I just don't think like that: if I want to go to ABC I type ABC, not AB, but I might make some mistakes. My queries aren't just a few characters long and could very much benefit from such an algorithm - but perhaps it's just me 🤔

not-matthias · 2022-06-11T13:11:37Z

I'd also love such a feature. Especially because it's kind of annoying to rewrite the entire command just because of one small typo. You could only use fuzzy matching when there are no results.

ThatXliner · 2022-07-11T22:28:05Z

Maybe it could be configurable?

"You could only use fuzzy matching when there are no results."

Also this

danielb2 · 2024-01-13T09:05:22Z

@ajeetdsouza I decided to check in again. I have some really long directory names, and the whole point of a directory jumper is being lazy and not typing in a full path. I don't really get the logic. Can you reconsider?

AdrianArtiles · 2024-05-07T22:56:49Z

Since this is the top search result for zoxide fuzzy search, I thought I'd share what I do for my interactive workflow (the main way I use zoxide), in case it is helpful for others.
I do one of two things:

1. Use zi with a space between the characters I want to fuzzy search, like zi ar doc to match /Archives/misc/documentation. I mainly do this in interactive mode after calling zi.
2. Use my custom zf command below, which I've added to .zshrc. This is basically just using fzf, but ordered by zoxide's algorithm. I use eza for directory preview, but it should be clear how to customize how you're calling fzf.

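zf () {
  local dir
  # Pick an entry with fzf (ranked by zoxide's score), strip the score column, and cd only if something was selected
  dir=$(zoxide query --list --score | fzf --height 40% --layout reverse --info inline --border --preview "eza --all --group-directories-first --header --long --no-user --no-permissions --color=always {2..}" --no-sort | sed -E 's/^ *[0-9.]+ +//')
  [ -n "$dir" ] && cd "$dir"
}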

noelzubin · 2024-07-02T05:50:43Z

Can we do a search using Levenshtein distance with a threshold in case no results are found? This should keep zoxide as fast as it is now when a match exists; if not, it can try to pick the closest one. Having used autojump all this time, and given how often I make typos, this feature is indispensable. After all, even if it is slower than other algorithms, it's still faster than me going back and editing my shell command. @ajeetdsouza
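A user-side approximation of this idea can be sketched as a shell wrapper: try the normal lookup first, and only if it fails compare the query against the last component of each stored path. Everything here is an assumption for illustration: the names lev and zq are made up, the default threshold of 2 is arbitrary, the comparison is case-sensitive, and paths containing backslashes are not handled (awk -v interprets them as escapes). It relies only on zoxide query and zoxide query --list.

```shell
# lev A B: plain Levenshtein distance, computed in awk.
lev() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = length(a); m = length(b)
    for (i = 0; i <= n; i++) d[i, 0] = i
    for (j = 0; j <= m; j++) d[0, j] = j
    for (i = 1; i <= n; i++) {
      for (j = 1; j <= m; j++) {
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        best = d[i-1, j] + 1                                      # deletion
        if (d[i, j-1] + 1 < best) best = d[i, j-1] + 1            # insertion
        if (d[i-1, j-1] + cost < best) best = d[i-1, j-1] + cost  # substitution
        d[i, j] = best
      }
    }
    print d[n, m]
  }'
}

# zq QUERY [MAX_DISTANCE]: hypothetical wrapper, not part of zoxide.
zq() {
  local query max best best_dist path dist
  query=$1; max=${2:-2}; best=; best_dist=
  # Fast path: the normal lookup, unchanged when a match exists.
  zoxide query -- "$query" 2>/dev/null && return 0
  # Slow path: compare the query with the last component of every stored path.
  # Entries are read in the order zoxide lists them, so ties keep the earlier entry.
  while IFS= read -r path; do
    dist=$(lev "$query" "${path##*/}")
    if [ "$dist" -le "$max" ] && { [ -z "$best_dist" ] || [ "$dist" -lt "$best_dist" ]; }; then
      best=$path; best_dist=$dist
    fi
  done <<EOF
$(zoxide query --list)
EOF
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

Something like cd "$(zq downlad)" would then land in a downloads directory when no exact match exists. Note that a fixed threshold of 2 is very permissive for three- or four-character queries, which is the accuracy concern raised earlier in this thread, so the second argument may need tuning.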

xiaket mentioned this issue Mar 21, 2020

Add zsh completion #9

Closed

ajeetdsouza added the C-feature-request label May 2, 2020

ajeetdsouza closed this as completed Mar 5, 2021

tgross35 mentioned this issue Dec 17, 2022

Fuzzy matching with levenshtein: take 2 #503

Closed
