Spurious matches with substring matching and non-ASCII #34
Comments
I'm actually getting the opposite problem now: it's not matching even though it should be.
Hmm, yeah, that is an orthogonal bug, fixed in c7893db.
Now I'm getting a crash:
let needle = Utf32String::from("adi");
let haystack =
    Utf32String::from("At the Road’s End - Seeming - SOL: A Self-Banishment Ritual");
let mut matcher = Matcher::new(Config::DEFAULT);
assert_ne!(
    matcher.substring_match(haystack.slice(..), needle.slice(..)),
    None
);
For this crash, it looks like the code just above the change in the commit that Pascal referenced enumerates to the end of the haystack, when it should enumerate only to the end minus the length of the needle.
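To illustrate the bound described above, here is a minimal std-only sketch of a codepoint-level substring search. It is not nucleo's implementation; `find_substring` is a hypothetical helper, and the only point is that start positions stop at `haystack.len() - needle.len()` so the comparison window never runs past the end of the haystack.

```rust
/// Hypothetical helper (not part of nucleo): naive substring search over
/// codepoints. Returns the index of the first window of `haystack` that
/// equals `needle`, or `None` if there is no such window.
fn find_substring(haystack: &[char], needle: &[char]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    // Last valid start position is "end minus the length of the needle".
    // `checked_sub` returns None when the needle is longer than the
    // haystack, so there is no underflow and no out-of-bounds slice.
    let last_start = haystack.len().checked_sub(needle.len())?;
    (0..=last_start).find(|&start| &haystack[start..start + needle.len()] == needle)
}
```

A reference function like this could also serve as an oracle in tests: collect both strings with `.chars().collect::<Vec<char>>()` and compare the result against what the matcher reports.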
Yeah, I have a fix for that, but I don't want to rush another fix (although it is yet another existing bug). I guess I just need to write more unicode-unicode substring tests. I never type Unicode, so I rarely run into these.
Yeah, I don't usually type Unicode, but I have a music collection with a lot of foreign stuff and weird smart quotes, and I'm writing a music player, so I've been hitting the ASCII-needle/Unicode-haystack codepath a lot.
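For anyone adding tests, a small std-only sketch of how a needle/haystack pair might be sorted into the four ASCII/Unicode combinations discussed in this thread. The `Codepath` enum and `classify` function are hypothetical names for illustration, and it is an assumption that the matcher dispatches on exactly this distinction; a needle forced into a Unicode representation would not be captured by `is_ascii`.

```rust
/// Hypothetical labels for the four needle/haystack combinations.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codepath {
    AsciiAscii,
    AsciiNeedleUnicodeHaystack,
    UnicodeNeedleAsciiHaystack,
    UnicodeUnicode,
}

/// Classify a pair by whether each string is pure ASCII.
/// Assumption: this mirrors the cases a test suite would want to cover,
/// not necessarily how the library chooses its internal codepath.
fn classify(needle: &str, haystack: &str) -> Codepath {
    match (needle.is_ascii(), haystack.is_ascii()) {
        (true, true) => Codepath::AsciiAscii,
        (true, false) => Codepath::AsciiNeedleUnicodeHaystack,
        (false, true) => Codepath::UnicodeNeedleAsciiHaystack,
        (false, false) => Codepath::UnicodeUnicode,
    }
}
```

With this, the crashing pair from earlier in the thread ("adi" against the title containing a smart quote) falls into the ASCII-needle/Unicode-haystack case, since the single `’` makes the haystack non-ASCII.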
This should pass, but it fails with a score of 30; running with indices indicates that only the first codepoint in the haystack matches. If I get rid of the Japanese text the match goes away as expected. Fuzzy, postfix, and prefix match all indicate that there is no match; it's only substring match that breaks.
Edit: If I use Utf32String::Unicode("lying".chars().collect()) there's no match, so I think the 'ascii needle, unicode haystack' codepath is the one with the problem.