forked from elastic/elasticsearch-definitive-guide
chapter24_part5: /270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc #154
Merged
4 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,33 +1,29 @@ | ||
[[fuzzy-scoring]] | ||
=== Scoring Fuzziness | ||
=== 模糊性评分 | ||
|
||
Users love fuzzy queries. They assume that these queries will somehow magically find | ||
the right combination of proper spellings.((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and"))) Unfortunately, the truth is | ||
somewhat more prosaic. | ||
|
||
Imagine that we have 1,000 documents containing ``Schwarzenegger,'' and just | ||
one document with the misspelling ``Schwarzeneger.'' According to the theory | ||
of <<tfidf,term frequency/inverse document frequency>>, the misspelling is | ||
much more relevant than the correct spelling, because it appears in far fewer | ||
documents! | ||
用户喜欢模糊查询。他们认为这种查询会魔法般的找到正确拼写组合。 | ||
((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and"))) | ||
很遗憾,实际效果平平。 | ||
|
||
In other words, if we were to treat fuzzy matches((("match query", "fuzzy match query"))) like any other match, we | ||
would favor misspellings over correct spellings, which would make for grumpy | ||
users. | ||
|
||
TIP: Fuzzy matching should not be used for scoring purposes--only to widen | ||
the net of matching terms in case there are misspellings. | ||
假设我们有1000个文档包含 ``Schwarzenegger`` ,只是一个文档的出现拼写错误 ``Schwarzeneger`` 。 | ||
根据 <<tfidf,term frequency/inverse document frequency>> 理论,这个拼写错误文档比拼写正确的相关度更高,因为错误拼写出现在更少的文档中! | ||
|
||
|
||
换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户抓狂。 | ||
|
||
|
||
TIP: 模糊匹配不应用于参与评分--只能在有拼写错误时扩大匹配项的范围。 | ||
|
||
|
||
默认情况下, `match` 查询给定所有的模糊匹配的恒定评分为1。这可以满足在结果列表的末尾添加潜在的匹配记录,并且没有干扰非模糊查询的相关性评分。 | ||
|
||
By default, the `match` query gives all fuzzy matches the constant score of 1. | ||
This is sufficient to add potential matches onto the end of the result list, | ||
without interfering with the relevance scoring of nonfuzzy queries. | ||
|
||
[TIP] | ||
================================================== | ||
|
||
Fuzzy queries alone are much less useful than they initially appear. They are | ||
better used as part of a ``bigger'' feature, such as the _search-as-you-type_ | ||
{ref}/search-suggesters-completion.html[`completion` suggester] or the | ||
_did-you-mean_ {ref}/search-suggesters-phrase.html[`phrase` suggester]. | ||
|
||
在模糊查询最初出现时很少能单独使用。他们更好的作为一个 ``bigger`` 场景的部分功能特性,如 _search-as-you-type_ | ||
{ref}/search-suggesters-completion.html[`完成` 建议]或 | ||
_did-you-mean_ {ref}/search-suggesters-phrase.html[`短语` 建议]。 | ||
================================================== |
In the preview, the bold background on the two references here looks a bit odd.
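For anyone checking the translated TF/IDF paragraph against the original, here is a minimal sketch of why the rare misspelling outscores the correct spelling. It assumes the classic Lucene inverse document frequency formula, `1 + ln(numDocs / (docFreq + 1))`, and an index holding only the 1,001 documents from the example. The function and variable names are illustrative, not part of the chapter or of any Elasticsearch API.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Inverse document frequency as in Lucene's classic TF/IDF similarity.

    Assumption: idf = 1 + ln(num_docs / (doc_freq + 1)).
    """
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Assumed index: 1,000 documents with "Schwarzenegger" plus 1 with "Schwarzeneger".
NUM_DOCS = 1001

idf_correct = classic_idf(doc_freq=1000, num_docs=NUM_DOCS)
idf_misspelled = classic_idf(doc_freq=1, num_docs=NUM_DOCS)

# The rarer term receives the larger weight, which is the problem the chapter describes.
print(f"correct spelling IDF:    {idf_correct:.3f}")
print(f"misspelled spelling IDF: {idf_misspelled:.3f}")
print(f"ratio: {idf_misspelled / idf_correct:.1f}x")
```

With these assumed counts the misspelled term is weighted several times higher than the correct one. That is why the chapter says fuzzy matches get a constant score rather than a TF/IDF score.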