From 7e1b68f51cd565c1005a22b5542d4f6b1b755b5d Mon Sep 17 00:00:00 2001
From: luotitan
Date: Fri, 29 Jul 2016 15:46:52 +0800
Subject: [PATCH 1/4] =?UTF-8?q?=E9=87=8D=E6=96=B0=E7=AC=AC=E4=B8=80?=
 =?UTF-8?q?=E6=AC=A1=E6=8F=90=E4=BA=A4?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../50_Scoring_fuzziness.asciidoc | 40 +++++++++----------
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
index f45176495..56ed92df3 100644
--- a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
+++ b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
@@ -1,33 +1,29 @@
 [[fuzzy-scoring]]
-=== Scoring Fuzziness
+=== 模糊性评分
 
-Users love fuzzy queries. They assume that these queries will somehow magically find
-the right combination of proper spellings.((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and"))) Unfortunately, the truth is
-somewhat more prosaic.
 
-Imagine that we have 1,000 documents containing ``Schwarzenegger,'' and just
-one document with the misspelling ``Schwarzeneger.'' According to the theory
-of <>, the misspelling is
-much more relevant than the correct spelling, because it appears in far fewer
-documents!
+用户喜欢模糊查询。他们认为这种查询会魔法般的找到正确拼写组合。
+((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and")))
+很遗憾,实际效果平平。
 
-In other words, if we were to treat fuzzy matches((("match query", "fuzzy match query"))) like any other match, we
-would favor misspellings over correct spellings, which would make for grumpy
-users.
 
-TIP: Fuzzy matching should not be used for scoring purposes--only to widen
-the net of matching terms in case there are misspellings.
+假设我们有1000个文档包含 ``Schwarzenegger'' ,只是一个文档的出现拼写错误 ``Schwarzeneger'' 。
+根据 <> 理论,这个拼写错误文档比拼写正确的相关度更高,因为它更少在文档中出现!
+
+
+换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户发狂。
+
+
+TIP: 模糊匹配不应用于参与评分--只能在有拼写错误时扩大匹配项的范围。
+
+
+默认情况下, `match` 查询给定所有的模糊匹配的恒定评分为1。这可以满足在结果列表的末尾添加潜在的匹配记录,并且没有干扰非模糊查询的相关性评分。
 
-By default, the `match` query gives all fuzzy matches the constant score of 1.
-This is sufficient to add potential matches onto the end of the result list,
-without interfering with the relevance scoring of nonfuzzy queries.
 
 
 [TIP]
 ==================================================
-Fuzzy queries alone are much less useful than they initially appear. They are
-better used as part of a ``bigger'' feature, such as the _search-as-you-type_
-{ref}/search-suggesters-completion.html[`completion` suggester] or the
-_did-you-mean_ {ref}/search-suggesters-phrase.html[`phrase` suggester].
-
+在模糊查询最初出现时很少能单独使用。他们更好的作为一个 ``bigger'' 场景的部分功能特性,如 _search-as-you-type_
+{ref}/search-suggesters-completion.html[`完成` 建议]或
+_did-you-mean_ {ref}/search-suggesters-phrase.html[`短语` 建议]。
 ==================================================

From d53ccfe6dfb4758773f2bbee46b308ea6a003845 Mon Sep 17 00:00:00 2001
From: luotitan
Date: Tue, 27 Sep 2016 22:56:35 +0800
Subject: [PATCH 2/4] =?UTF-8?q?=E4=BF=AE=E6=94=B9=E7=AC=A6=E5=8F=B7?=
 =?UTF-8?q?=E9=94=99=E8=AF=AF?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
index 56ed92df3..939811d77 100644
--- a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
+++ b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
@@ -7,11 +7,11 @@
 很遗憾,实际效果平平。
 
 
-假设我们有1000个文档包含 ``Schwarzenegger'' ,只是一个文档的出现拼写错误 ``Schwarzeneger'' 。
+假设我们有1000个文档包含 ``Schwarzenegger`` ,只是一个文档的出现拼写错误 ``Schwarzeneger`` 。
 根据 <> 理论,这个拼写错误文档比拼写正确的相关度更高,因为它更少在文档中出现!
 
 
-换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户发狂。
+换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户抓狂。
 
 
 TIP: 模糊匹配不应用于参与评分--只能在有拼写错误时扩大匹配项的范围。
@@ -23,7 +23,7 @@ TIP: 模糊匹配不应用于参与评分--只能在有拼写错误时扩大匹
 
 [TIP]
 ==================================================
-在模糊查询最初出现时很少能单独使用。他们更好的作为一个 ``bigger'' 场景的部分功能特性,如 _search-as-you-type_
+在模糊查询最初出现时很少能单独使用。他们更好的作为一个 ``bigger`` 场景的部分功能特性,如 _search-as-you-type_
 {ref}/search-suggesters-completion.html[`完成` 建议]或
 _did-you-mean_ {ref}/search-suggesters-phrase.html[`短语` 建议]。
 ==================================================

From eee602e9567cbd22053b94103d78c20a6319b2e7 Mon Sep 17 00:00:00 2001
From: luotitan
Date: Sun, 20 Nov 2016 23:02:32 +0800
Subject: [PATCH 3/4] =?UTF-8?q?=E6=8C=89=E7=85=A7nodesssreview=E6=84=8F?=
 =?UTF-8?q?=E8=A7=81=E4=BF=AE=E6=94=B9?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
index 939811d77..8fdf3b267 100644
--- a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
+++ b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
@@ -8,7 +8,7 @@
 
 
 假设我们有1000个文档包含 ``Schwarzenegger`` ,只是一个文档的出现拼写错误 ``Schwarzeneger`` 。
-根据 <> 理论,这个拼写错误文档比拼写正确的相关度更高,因为它更少在文档中出现!
+根据 <> 理论,这个拼写错误文档比拼写正确的相关度更高,因为错误拼写出现在更少在文档中!
 
 
 换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户抓狂。

From 944e8a27ad954820bb80a8da6f96e9f0aa88d9db Mon Sep 17 00:00:00 2001
From: luotitan
Date: Mon, 21 Nov 2016 10:28:58 +0800
Subject: [PATCH 4/4] =?UTF-8?q?=E9=94=99=E5=AD=97=E4=BF=AE=E6=94=B9?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
index 8fdf3b267..d98a16ccc 100644
--- a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
+++ b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
@@ -8,7 +8,7 @@
 
 
 假设我们有1000个文档包含 ``Schwarzenegger`` ,只是一个文档的出现拼写错误 ``Schwarzeneger`` 。
-根据 <> 理论,这个拼写错误文档比拼写正确的相关度更高,因为错误拼写出现在更少在文档中!
+根据 <> 理论,这个拼写错误文档比拼写正确的相关度更高,因为错误拼写出现在更少的文档中!
 
 
 换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户抓狂。