Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QSSense fixup #2342

Merged
merged 6 commits into from May 16, 2017
Merged

QSSense fixup #2342

merged 6 commits into from May 16, 2017

Conversation

tiennou
Copy link
Member

@tiennou tiennou commented Apr 15, 2017

This is also a really old branch (2012 was a good year !).

So, we have 2 different versions of the ranking algorithm in-tree, both dating back from the Dawn of Time. The main ranking loop would use QSSense (through QSStringRanker). I haven't seen any use of the NSString category, but since it's hidden behind forwarding, it might be used sometimes.

They had a differing behavior (but I never found the time to thoroughly check).

The differing behavior was introduced in 87d328a actually. The comparison options are also a give-away (QSSense is case and diacritic insensitive and respect localization, the other is just case-insensitive.

So THE DIFFERENCE is what it is.

I've added a test case for it, because I wanted to 1) see it work 2) see what was different.

@skurfer
Copy link
Member

skurfer commented Apr 18, 2017

Cool!

Strange that Git doesn’t show QSSense.m getting moved or removed, but the original is gone, so I guess it “knows” somehow.

@tiennou
Copy link
Member Author

tiennou commented Apr 18, 2017

I think it's just the way GitHub show (or more precisely, doesn't) renames though.

Also, you might want to take a look at the test case. It has some "interesting" characteristics I didn't expect.


strRange.length += STEP;
score = QSScoreForAbbreviationWithRanges(str, CFSTR("test"), indexes, strRange, abbrRange);
XCTAssertEqualWithAccuracy(score, 0.76129, ACC);
Copy link
Member Author

@tiennou tiennou Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this. It actually matches higher just because I added one word (which I don't think should matter) anyway.

Just noticing my test is not what I expected too, I forgot there's a 't' 2nd place in "string" 😆.

Copy link
Member

@pjrobertson pjrobertson Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments to this section, I got my head round it, but it took me a while (e.g. line 89 you could say what each index in the NSIndexSet represents)

So here you're saying matching test with This excellent string tells us␣ gives a higher score than matching without the space at the end (This excellent string tells)? Seems strange to me!

Also can you explain what ACC is?

Copy link
Member Author

@tiennou tiennou Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I was actually surprised that the index set is for each subsequent matching character of the abbreviation in the search string, not just "the best match", as I expected (ie, it doesn't correspond to what is shown underlined in the UI).

Yes, that's the weird thing, scoring increases proportionally to the string's length. I think it's that way because of the change you made (which is why I built the test case), because the other test case doesn't behave like that.

ACC is just a constant I made up because non-determinism related to floating-point calculation.

Copy link
Member Author

@tiennou tiennou Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tiennou Perhaps it's bad form to comment on a year-old PR thread (and perhaps this has already been figured out), but I also ran into the index set issue in porting the scoring algorithm to JavaScript (and I've been poking into old PRs here to try to understand why the scores in TestQSSense.m don't quite line up with the results of my code).

Consider matching "test" against the string "t e s t tess!". The full abbreviation doesn't appear as a substring in the first loop, so it's reduced to "tes" and the function recurses. That substring is found near the end of the string, and those indices ([8, 9, 10], I think) get set in the mask. Then it looks for the remaining "t" in the rest of the string, but only finds "s!", so it has to start over and shrink the abbreviation again. Eventually, it finds the individual letters separated by spaces and returns a score. But the mask still contains those first hits that it gave up on.

My solution was to clear the mask just before the return 0 at the end of the function. I believe that's correct, as the 0 score means the algorithm will start over with a shorter abbreviation, and that shorter bit might hit in a different place, rendering the previous hits obsolete. If the UI isn't underlining all those letters, then something else must be cleaning up the mask. (I tried clamping it to the length of the abbreviation, but I don't think that gives the correct results in all cases.)

XCTAssertEqualObjects(indexes, results);

strRange.length += STEP;
score = QSScoreForAbbreviationWithRanges(str, CFSTR("test"), indexes, strRange, abbrRange);
Copy link
Member

@pjrobertson pjrobertson Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also shouldn't you use abbr here instead of re-defining CFSTR("test")?

Copy link
Member Author

@tiennou tiennou Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Will fix.

@skurfer
Copy link
Member

skurfer commented Apr 19, 2017

Grrrr. For some reason, my scheme is missing everything but “Quicksilver Tests”. I guess I need to manually put the rest back.

Should we also add some tests for strings that don’t match?

@skurfer
Copy link
Member

skurfer commented May 2, 2017

Another thought on the tests…

Do we really need to check for a specific score? A small change to the algorithm down the road doesn’t mean the score is “wrong”. Should we adjust the tests to just check that the scores for certain strings are higher or lower than others?

@skurfer skurfer merged commit 8108b0b into master May 16, 2017
1 of 2 checks passed
@skurfer skurfer deleted the qssense-fixup branch May 16, 2017
skurfer added a commit that referenced this issue May 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants