How should Nucleo work? #38

Closed
zjp-CN opened this issue Feb 18, 2024 · 2 comments

zjp-CN commented Feb 18, 2024

Thanks for creating this fuzzy-matching library.

I've run into a weird problem with the Nucleo struct.

Consider the following code, which you can run on rust-explorer:

use std::sync::Arc;
use nucleo::Nucleo;
use nucleo::pattern::{CaseMatching, Normalization};

fn main() {
    let mut matcher = init_fuzzy_matcher();
    let inject = matcher.injector();
    let list = ["foobar", "fxxoo", "oo", "a"];
    list.iter().for_each(|s| {
        inject.push(s, |_| {});
    });
    matcher
        .pattern
        .reparse(0, "f", CaseMatching::Ignore, Normalization::Smart, false);
    let _status = matcher.tick(1000);
    dbg!(matcher.pattern.column_pattern(0));

    let mut counter = 0;
    loop {
        let _status = matcher.tick(100);
        // if status.changed {
        let snapshot = matcher.snapshot();
        let total = snapshot.item_count();
        let got = snapshot.matched_item_count();
        let res: Vec<_> = snapshot
            .matched_items(..)
            .map(|item| item.data)
            .collect();
        dbg!(total, got, res);
        // }
        // if !status.running {
        //     break;
        // }
        println!("running");
        if counter > 4 {
            break;
        }
        counter += 1;
    }
}

type Matcher = Nucleo<&'static str>;

fn init_fuzzy_matcher() -> Matcher {
    Nucleo::new(
        nucleo::Config::DEFAULT,
        Arc::new(|| println!("notified")),
        None,
        1,
    )
}

The res is always empty:

[src/main.rs:34:9] total = 4
[src/main.rs:34:9] got = 0
[src/main.rs:34:9] res = []

With nucleo::Matcher, the same config, input, and needle string produce the desired output:

use nucleo::pattern::{Atom, AtomKind, CaseMatching, Normalization};
use nucleo::Matcher;

fn main() {
    let mut matcher = init_fuzzy_matcher();
    let list = ["foobar", "fxxoo", "oo", "a"];
    let res = Atom::new(
        "f",
        CaseMatching::Ignore,
        Normalization::Smart,
        AtomKind::Fuzzy,
        false,
    )
    .match_list(&list, &mut matcher);
    dbg!(res);
}

fn init_fuzzy_matcher() -> Matcher {
    Matcher::new(nucleo::Config::DEFAULT)
}

[src/main.rs:20:5] res = [
    (
        "foobar",
        36,
    ),
    (
        "fxxoo",
        36,
    ),
]

So the question is: what is the right way to use Nucleo? I see an issue asking for examples, but it has no replies.
I also scanned helix's source files. Although nucleo is one of its dependencies, what it actually uses is Matcher, not Nucleo.

zjp-CN commented Feb 18, 2024

Well, I think the problem comes from the signature of Injector::push:

// Injector<T>
pub fn push(
    &self,
    value: T,
    fill_columns: impl FnOnce(&mut [Utf32String])
) -> u32

I didn't use fill_columns to add the source string to the search list because I mistakenly thought value: T was like the T in Atom/Pattern:

// Atom/Pattern
pub fn match_list<T>(
    &self,
    items: impl IntoIterator<Item = T>,
    matcher: &mut Matcher
) -> Vec<(T, u16)>
where
    T: AsRef<str>,

Actually, I did notice that Injector<T> lacks an AsRef<str> bound, and I was wondering how the matcher knows the source string. Now I understand that T on Injector<T> and T on Pattern::match_list<T> mean different things: the former is an opaque payload, while the latter is the text being matched.
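To make the difference concrete, here is a minimal std-only sketch of the two styles. It is a toy model, not nucleo's implementation: `is_subsequence`, `match_list`, and `ToyStore` are hypothetical names, plain `String` stands in for `Utf32String`, and a case-insensitive subsequence test stands in for real fuzzy scoring.

```rust
/// Toy stand-in for fuzzy matching: true if `needle` is a case-insensitive
/// subsequence of `haystack`. This is NOT nucleo's scoring algorithm.
fn is_subsequence(needle: &str, haystack: &str) -> bool {
    let mut hay = haystack.chars().flat_map(char::to_lowercase);
    needle
        .chars()
        .flat_map(char::to_lowercase)
        .all(|n| hay.any(|h| h == n))
}

/// Style 1 (shaped like `Pattern::match_list`): the item itself is the text,
/// hence the `AsRef<str>` bound.
fn match_list<T: AsRef<str>>(needle: &str, items: impl IntoIterator<Item = T>) -> Vec<T> {
    items
        .into_iter()
        .filter(|item| is_subsequence(needle, item.as_ref()))
        .collect()
}

/// Style 2 (shaped like `Injector::push`): `T` is an opaque payload, so it
/// needs no bound. The text to match lives in separate column buffers.
struct ToyStore<T> {
    columns: usize,
    items: Vec<(T, Vec<String>)>,
}

impl<T> ToyStore<T> {
    fn new(columns: usize) -> Self {
        Self { columns, items: Vec::new() }
    }

    /// Stores `value` and lets the caller write the match text into the
    /// (initially empty) column buffers.
    fn push(&mut self, value: T, fill_columns: impl FnOnce(&mut [String])) {
        let mut cols = vec![String::new(); self.columns];
        fill_columns(&mut cols);
        self.items.push((value, cols));
    }

    fn item_count(&self) -> usize {
        self.items.len()
    }

    /// Returns the payloads whose text in `column` matches `needle`.
    fn matched(&self, column: usize, needle: &str) -> Vec<&T> {
        self.items
            .iter()
            .filter(|(_, cols)| is_subsequence(needle, &cols[column]))
            .map(|(value, _)| value)
            .collect()
    }
}
```

In this model, pushing with an empty callback such as `|_| {}` leaves every column blank, so the store counts four items but matches none of them for the needle "f". That mirrors the `total = 4`, `got = 0` output in the first comment.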

And Nucleo is indeed what I need. Here's the working code:

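// ...
    // `Idx` is not defined in this excerpt; from its use below it is
    // presumably a newtype over the list index, e.g. `struct Idx(usize);`.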
    let list = [
        "foobar".to_owned(),
        "fxxoo".to_owned(),
        "oo".to_owned(),
        "a long string".to_owned(),
    ];
    for (idx, item) in list.iter().enumerate() {
        inject.push(Idx(idx), |buf| {
            dbg!(buf.len());
            if let Some(buf) = buf.first_mut() {
                *buf = item.as_str().into();
            }
        });
    }
// ...
        snapshot
            .matched_items(..)
            .map(|item| &list[item.data.0])
            .collect();

The last thing I don't understand is why the argument of the fill_columns callback is &mut [Utf32String].

zjp-CN commented Feb 18, 2024

> The last thing I don't understand is why the argument in fill_columns callback is &mut [Utf32String].

Hah, I just realized it's because of the columns argument in Nucleo::<T>::new(..., columns). Its documentation says:

> Nucleo can match items with multiple orthogonal properties. columns indicates how many matching columns each item (and the pattern) has. The number of columns can not be changed after construction.

I created a one-column Nucleo<T>, so each call to Injector<T>::push should fill exactly one column of Utf32String.
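A minimal std-only sketch of the multi-column idea follows. It is a toy model under stated assumptions, not nucleo's code: `ToyTable` and its methods are hypothetical names, `String` stands in for `Utf32String`, and case-insensitive substring search stands in for fuzzy matching. It also assumes that an item matches only when every non-empty column needle matches its column.

```rust
/// Toy model of a match table whose column count is fixed at construction.
struct ToyTable<T> {
    /// One needle per column; an empty needle places no constraint.
    needles: Vec<String>,
    /// Each row pairs an opaque payload with one text buffer per column.
    rows: Vec<(T, Vec<String>)>,
}

impl<T> ToyTable<T> {
    /// The number of columns is chosen here and never changes afterwards.
    fn new(columns: usize) -> Self {
        Self {
            needles: vec![String::new(); columns],
            rows: Vec::new(),
        }
    }

    /// Sets the needle for a single column (stored lowercased).
    fn set_needle(&mut self, column: usize, needle: &str) {
        self.needles[column] = needle.to_lowercase();
    }

    /// The callback receives a slice with exactly one buffer per column,
    /// which is why its argument is a slice rather than a single string.
    fn push(&mut self, value: T, fill_columns: impl FnOnce(&mut [String])) {
        let mut cols = vec![String::new(); self.needles.len()];
        fill_columns(&mut cols);
        self.rows.push((value, cols));
    }

    /// Returns the payloads of rows where every column contains its needle.
    fn matched(&self) -> Vec<&T> {
        self.rows
            .iter()
            .filter(|(_, cols)| {
                self.needles
                    .iter()
                    .zip(cols)
                    .all(|(needle, col)| col.to_lowercase().contains(needle.as_str()))
            })
            .map(|(value, _)| value)
            .collect()
    }
}
```

With `ToyTable::new(2)`, for example, a file picker could put the file name in column 0 and the directory in column 1, then narrow the results by setting a needle on either column independently.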

zjp-CN closed this as completed Feb 18, 2024