forked from sphinx-doc/sphinx
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upstream issue #11961: alternative de-duplication approach #1
Closed
jayaddison
wants to merge
16
commits into
wlach:master
from
jayaddison:issue-11961/pr-11942-review-experimentation
Closed
Changes from 10 commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
1a0e6a3
Fix duplicate search results
wlach eea244f
Another approach
wlach 6473d51
Review feedback
wlach d2648b2
HTML Search: Add test, refactoring to make that possible
wlach 2d82cb7
Keep anchor, de-duplicate on less keys
wlach 5819346
De-duplicate on document title only
wlach 4208655
experimentation: undo local fix attempt
jayaddison 17da9b1
experimentation: undo test expectation change
jayaddison 344274b
experimentation: alternative de-duplication method
jayaddison a94ab83
experimentation: compared to original, omit only the title field from…
jayaddison 1f2eb3d
experimentation: use jasmine.createSpy to avoid function contract cha…
jayaddison 2b7f49f
cleanup: remove unnecessary '.and.returnValue' call-chain
jayaddison 04ad76a
experimentation: update test case to avoid dependency on fix from sph…
jayaddison dcc3443
experimentation: apply patch derived from sphinx-doc/sphinx#11959
jayaddison e2d07a1
experimentation: expect anchor-less title result
jayaddison 251bf95
experimentation: cleanup: remove 'resultStr' de-dup key changes
jayaddison File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,9 @@ | ||
const DOCUMENTATION_OPTIONS = {}; | ||
|
||
// stub Stemmer / stopWords | ||
function Stemmer() { | ||
this.stemWord = function (word) { | ||
return word; | ||
} | ||
}; | ||
let stopwords = []; |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't see how this would pass the test below. 🤔 The body search and title search will have different
resultStr
's (since they have different anchors), so you'd see them both in the results.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test does succeed with that - could that imply a bug in the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's only one search result because my other fixes haven't been applied (I left them out for ease of review). If you rebase your branch on top of these pull requests the test should start failing:
sphinx-doc#11960
sphinx-doc#11958
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. Are you sure? I can still get the test to pass with a
docid
of1
instead of zero, and the partial-matching only affects the scoring, if I understand correctly?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look, I think it's actually sphinx-doc#11959 (the multiple matches) that allows multiple results to be generated as you'd expect. If you apply this squashed patch your tests should start failing: https://gist.github.com/wlach/74124608c2113d8eb9737cdced4762db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug raised as sphinx-doc#11965.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm confused-- that looks similar to my original solution which you rejected here:
sphinx-doc#11942 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not ideal, but it's smaller -- there's no
isDocumentTitle
, and no additional boolean passed down to the de-duplication phase.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and in fact, it might not even be required. if the index-construction issue is valid and fixed, then we could try removing it and relying purely on an updated de-duplication key)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I don't have a super strong opinion: I thought your original critique had a point but I don't think it matters all that much in the end.
I am pretty sure we still have the problem of the title search / term search returning more-or-less the same result. The unit test uses a hand-constructed index without this problem and still reproduces the bug.