Add metrics api guide #236

xuezhong · 2018-10-26T04:23:32Z

add metrics.rst

…to develop

xuezhong · 2018-10-26T04:26:01Z

luotao1 · 2018-10-26T06:12:18Z

doc/fluid/api/api_guides/low_level/metrics.rst

@@ -0,0 +1,61 @@
+..  _api_guide_optimizer:


_api_guide_metrics:

luotao1 · 2018-10-26T06:13:18Z

doc/fluid/api/api_guides/low_level/metrics.rst

+..  _api_guide_optimizer:
+
+
+Metrics


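Please use Chinese for the section title: 评价指标 ("Evaluation Metrics").
Please change the section titles below to Chinese as well.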

luotao1 · 2018-10-26T06:16:27Z

doc/fluid/api/api_guides/low_level/metrics.rst

+------------------
+
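+:code:`Precision` is precision (准确率), which measures, in binary classification, the ratio of recalled true values to recalled values. :code:`Accuracy` is accuracy (正确率), which measures, in binary classification, the ratio of recalled true values to the total number of samples. Note that precision and accuracy are defined differently; the distinction is analogous to :code:`Variance` versus :code:`Bias` in error analysis. :code:`Recall` is the recall rate, which measures, in binary classification, the ratio of recalled values to the total number of samples. Precision and recall constrain each other, so a real model has to trade one off against the other; see `Precision_and_recall <https://en.wikipedia.org/wiki/Precision_and_recall>`_ .
+:code:`Auc` applies to evaluating binary classification models and computes the `cumulative area under the ROC curve <https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve>`_. :code:`Auc` is implemented in Python; if performance matters, :code:`fluid.layers.auc` can be used instead.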


Precision (准确率) :code:Precision: XXX

Accuracy (正确率) :code:Accuracy: XXX

Recall (召回率) :code:Recall: XXX

Cumulative area under the ROC curve (ROC曲线的累积面积率) :code:Auc: XXX
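
For reference, the three classification metrics discussed in this comment can be sketched in a few lines of plain Python. This is only a minimal sketch that assumes the standard definitions from the linked Precision_and_recall article: precision = TP / (TP + FP), recall = TP / (TP + FN), accuracy = (TP + TN) / total. The helper names `binary_counts`, `precision`, `recall`, and `accuracy` are hypothetical and are not the :code:`fluid.metrics` API.

```python
def binary_counts(preds, labels):
    """Count true/false positives and negatives for 0/1 predictions and labels."""
    tp = fp = tn = fn = 0
    for p, y in zip(preds, labels):
        if p == 1 and y == 1:
            tp += 1
        elif p == 1 and y == 0:
            fp += 1
        elif p == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were predicted correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Illustrative data only (not taken from the PR).
preds = [1, 1, 0, 1, 0, 0, 0, 0]
labels = [1, 0, 0, 1, 1, 1, 0, 0]
tp, fp, tn, fn = binary_counts(preds, labels)
print(precision(tp, fp), recall(tp, fn), accuracy(tp, fp, tn, fn))
```

On this toy data the three values differ, which shows why precision and accuracy should not be treated as interchangeable.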

luotao1 · 2018-10-26T06:20:29Z

doc/fluid/api/api_guides/low_level/metrics.rst

+
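+During or after neural network training, the model's training performance needs to be evaluated, which is generally done by computing the distance between all predicted values and all ground-truth values (labels). Different models use different metrics: for example, classification models commonly use :code:`AUC` to measure classification quality, and OCR models can use :code:`EditDistance` to measure recognition quality.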
+
+1.MetricBase


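Remove the numeric labels, and order the sections by how commonly they are used.

MetricBase is only needed when defining a custom metric. Most users will not need it, so consider leaving it out.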

luotao1 · 2018-10-26T06:21:08Z

doc/fluid/api/api_guides/low_level/metrics.rst

+
+For the API reference, see :ref:`api_fluid_metrics_CompositeMetric`
+
+3.Precision/Accuracy/Recall/Auc


This should be introduced first, since it is very widely used.

guoshengCS · 2018-10-27T05:38:28Z

doc/fluid/api/api_guides/low_level/metrics.rst

+Metrics
+#########
+
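+During or after neural network training, the model's training performance needs to be evaluated, which is generally done by computing the distance between all predicted values and all ground-truth values (labels). Different models use different metrics: for example, classification models commonly use :code:`AUC` to measure classification quality, and OCR models can use :code:`EditDistance` to measure recognition quality.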


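The evaluation method seems to depend much more on the task than on the model. I suggest revising where "model" and "task" are used, for example in "different models use different metrics" (不同模型会用不同的度量方法).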

guoshengCS · 2018-10-29T03:33:39Z

doc/fluid/api/api_guides/low_level/metrics.rst

+4.ChunkEvaluator
+------------------
+
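+:code:`ChunkEvaluator` is a grouping evaluation metric. It takes the output of the :code:`chunk_eval` interface, accumulates the grouping statistics of each minibatch, and finally computes precision, recall, and F1. :code:`ChunkEvaluator` supports four tagging schemes: IOB, IOE, IOBES, and IO. See `Chunking with Support Vector Machines <https://aclanthology.info/pdf/N/N01/N01-1025.pdf>`_ 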


I suggest translating "chunk" as 语块, and adding an example of a task scenario where it is used.

guoshengCS · 2018-10-29T03:46:13Z

doc/fluid/api/api_guides/low_level/metrics.rst

+2.CompositeMetric
+------------------
+
+:code:`CompositeMetric` combines multiple metrics: the predictions and ground-truth values only need to be supplied once per minibatch to obtain all of the metric values.


Agree with @luotao1: I suggest putting Precision/Accuracy/Recall/Auc first, and illustrating CompositeMetric using those other metrics as examples.
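
To make the "supply each minibatch once, read several metrics" idea concrete, here is a minimal plain-Python sketch. The classes `SimpleAccuracy`, `SimplePrecision`, and `SimpleComposite` are hypothetical stand-ins written for illustration; they only follow an update-then-evaluate pattern and should not be read as the actual :code:`fluid.metrics` implementation.

```python
class SimpleAccuracy:
    """Hypothetical metric: fraction of predictions equal to the label."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, labels):
        for p, y in zip(preds, labels):
            self.correct += int(p == y)
            self.total += 1

    def eval(self):
        return self.correct / self.total if self.total else 0.0


class SimplePrecision:
    """Hypothetical metric: TP / (TP + FP) for 0/1 predictions."""

    def __init__(self):
        self.tp = 0
        self.fp = 0

    def update(self, preds, labels):
        for p, y in zip(preds, labels):
            if p == 1:
                if y == 1:
                    self.tp += 1
                else:
                    self.fp += 1

    def eval(self):
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0


class SimpleComposite:
    """Hypothetical container that forwards each minibatch to every metric."""

    def __init__(self):
        self._metrics = []

    def add_metric(self, metric):
        self._metrics.append(metric)

    def update(self, preds, labels):
        for metric in self._metrics:
            metric.update(preds, labels)

    def eval(self):
        return [metric.eval() for metric in self._metrics]


composite = SimpleComposite()
composite.add_metric(SimpleAccuracy())
composite.add_metric(SimplePrecision())

# Two illustrative minibatches of (predictions, labels).
for preds, labels in [([1, 0, 1], [1, 0, 0]), ([1, 1], [1, 1])]:
    composite.update(preds, labels)

results = composite.eval()
print(results)
```

The design point is that the caller touches the data once per minibatch, while each metric keeps its own running counts.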

…to add_metrics_api_guide

xuezhong · 2018-10-29T07:58:36Z

I have revised the doc, incorporating everyone's suggestions.

luotao1 · 2018-10-29T08:08:38Z

doc/fluid/api/api_guides/low_level/metrics.rst

+
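+Evaluation Metrics
+##################
+During or after neural network training, the model's training performance needs to be evaluated, which is generally done by computing the distance between all predicted values and all ground-truth values (labels), and different types of tasks use different evaluation methods.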


After 需要评估模型的训练效果 ("the model's training performance needs to be evaluated"), change the comma to a period.

luotao1 · 2018-10-29T08:11:58Z

doc/fluid/api/api_guides/low_level/metrics.rst

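+Sequence labeling task evaluation
+---------------------------------
+In sequence labeling tasks, the model's primary goal is to group the input tokens into what are called chunks (语块).
+The chunk evaluation method, :code:`ChunkEvaluator`, takes the output of the :code:`chunk_eval` interface, accumulates the chunk statistics of each minibatch, and finally computes precision, recall, and F1. :code:`ChunkEvaluator` supports four tagging schemes: IOB, IOE, IOBES, and IO. See `Chunking with Support Vector Machines <https://aclanthology.info/pdf/N/N01/N01-1025.pdf>`_ 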


Line 38 is missing a period at the end.

guoshengCS · 2018-10-29T08:39:52Z

doc/fluid/api/api_guides/low_level/metrics.rst

+
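+Sequence labeling task evaluation
+---------------------------------
+In sequence labeling tasks, the model's primary goal is to group the input tokens into what are called chunks (语块).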


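I suggest revising this: besides grouping the tokens, the model also has to classify them. See the README at https://github.com/PaddlePaddle/models/tree/develop/legacy/sequence_tagging_for_ner

Right, this part says "primary goal". Classification could be mentioned; I left it out only because it has no bearing on what follows.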

guoshengCS · 2018-10-29T08:41:33Z

doc/fluid/api/api_guides/low_level/metrics.rst

+Edit distance, :code:`EditDistance`, measures the similarity of two strings. See `Edit_distance <https://en.wikipedia.org/wiki/Edit_distance>`_.
+
+For the API reference, see :ref:`api_fluid_metrics_EditDistance`
+


CompositeMetric could be added at the end.

CompositeMetric and MetricBase both seem rarely used, so I left them out. They also do not fit the task-based grouping used above.

guoshengCS · 2018-10-29T09:24:15Z

doc/fluid/api/api_guides/low_level/metrics.rst

+
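+Generation task evaluation
+--------------------------
+Generation tasks produce output directly from the input. In NLP tasks this means generating a new string; edit distance can be used to evaluate the distance between the generated string and the target string.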


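I suggest switching the example to OCR or speech recognition (both already available in models), which are tasks where order has to be preserved.
Also, I personally feel the "generation task evaluation" category may not be entirely reasonable: a generation task such as translation can also use a classification-style metric like Accuracy. Please consider whether there is a better way to organize this, and adjust it if so.

This evaluation method is mainly used in generation tasks such as translation. That is not to say translation uses only one evaluation method, and this point can be emphasized at the start. Organizing by task type is mainly meant to take the user's perspective, so it may not be very rigorous. But the guide is mainly there to orient users; for a precise description they still need to read the API docs.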
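
As a concrete illustration of the edit distance being discussed for order-sensitive outputs such as OCR or speech recognition, here is a minimal sketch of the classic Levenshtein dynamic program in plain Python. The function names `edit_distance` and `normalized_edit_distance` are hypothetical, and normalizing by the reference length is only one common convention, assumed here for illustration; this is not the :code:`EditDistance` implementation from the PR.

```python
def edit_distance(hyp, ref):
    """Levenshtein distance: minimum insertions, deletions, and substitutions
    needed to turn hyp into ref. Works on strings or lists of tokens."""
    prev = list(range(len(ref) + 1))  # distances from the empty prefix of hyp
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # deleting the first i items of hyp to reach an empty ref
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def normalized_edit_distance(hyp, ref):
    """Edit distance divided by the reference length (assumed convention)."""
    return edit_distance(hyp, ref) / len(ref) if len(ref) else float(len(hyp) > 0)


print(edit_distance("kitten", "sitting"))
print(normalized_edit_distance("kitten", "sitting"))
```

Because the recurrence walks both sequences in order, the score penalizes reordered output, which is why it suits tasks where the output order must be preserved.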

luotao1 · 2018-10-29T11:08:30Z

doc/fluid/api/api_guides/low_level/metrics.rst

@@ -34,16 +34,18 @@

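 Sequence labeling task evaluation
 ---------------------------------
-In sequence labeling tasks, the model's primary goal is to group the input tokens into what are called chunks (语块).
-The chunk evaluation method, :code:`ChunkEvaluator`, takes the output of the :code:`chunk_eval` interface, accumulates the chunk statistics of each minibatch, and finally computes precision, recall, and F1. :code:`ChunkEvaluator` supports four tagging schemes: IOB, IOE, IOBES, and IO. See `Chunking with Support Vector Machines <https://aclanthology.info/pdf/N/N01/N01-1025.pdf>`_ 
+In sequence labeling tasks, the model first groups the input tokens into what are called chunks (语块), and then classifies the tockens in each chunk. The classification can be evaluated with the evaluation methods for classification tasks, while the tocken grouping is evaluated with the chunk evaluation method.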


Typo: tocken -> token.

guoshengCS · 2018-10-29T12:01:21Z

doc/fluid/api/api_guides/low_level/metrics.rst

+
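+Sequence labeling task evaluation
+---------------------------------
+In sequence labeling tasks, the model first groups the input tokens into what are called chunks (语块), and then classifies the tockens in each chunk. The classification can be evaluated with the evaluation methods for classification tasks, while the tocken grouping is evaluated with the chunk evaluation method.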


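I suggest rewording this to say that sequence labeling tasks usually perform chunk segmentation and classification at the same time, so that users do not misread it as a two-stage process. ChunkEvaluator's evaluation also covers both of these.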
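
The point that chunk evaluation covers segmentation and classification together can be illustrated with a small plain-Python sketch for the IOB scheme. It assumes tags of the form `B-TYPE`, `I-TYPE`, and `O`, and counts a predicted chunk as correct only when its start, end, and type all match a gold chunk. The helpers `extract_chunks` and `chunk_prf` are hypothetical and are not the :code:`chunk_eval` or :code:`ChunkEvaluator` code.

```python
def extract_chunks(tags):
    """Return a set of (type, start, end) chunks from IOB tags; end is exclusive."""
    chunks = set()
    start, ctype = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, t = tag.partition("-")
        # Close the open chunk unless this tag continues it (I- with same type).
        if ctype is not None and (prefix != "I" or t != ctype):
            chunks.add((ctype, start, i))
            start, ctype = None, None
        # Open a chunk on B-, or leniently on an I- that has nothing to continue.
        if prefix == "B" or (prefix == "I" and ctype is None):
            start, ctype = i, t
    return chunks


def chunk_prf(batches):
    """Accumulate chunk counts over (pred_tags, gold_tags) pairs, then
    compute precision, recall, and F1 from the totals."""
    n_pred = n_gold = n_correct = 0
    for pred_tags, gold_tags in batches:
        pred = extract_chunks(pred_tags)
        gold = extract_chunks(gold_tags)
        n_pred += len(pred)
        n_gold += len(gold)
        n_correct += len(pred & gold)  # same boundaries and same type
    p = n_correct / n_pred if n_pred else 0.0
    r = n_correct / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Illustrative data: the second predicted chunk has the right span but the wrong type.
gold = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O", "O", "O"]
p, r, f1 = chunk_prf([(pred, gold)])
print(p, r, f1)
```

In this example the mistyped chunk counts against both precision and recall even though its boundaries are right, so a single score reflects both segmentation and classification errors.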

luotao1 · 2018-10-30T02:00:38Z

LGTM

shanyi15

LGTM,thanks!

xuezhong added 4 commits October 25, 2018 21:18

add metrics api guide

cb46521

Merge branch 'develop' of https://github.com/PaddlePaddle/FluidDoc in…

2b5186b

…to develop

update metrics.rst

a08272e

update metrics.rst

f8c3c73

shanyi15 requested review from luotao1 and shanyi15 October 26, 2018 04:27

shanyi15 added the API Guide docs related to API Guide label Oct 26, 2018

update metrics.rst

1937a67

luotao1 reviewed Oct 26, 2018

shanyi15 requested review from guoshengCS and shanyi15 and removed request for shanyi15 October 26, 2018 08:09

guoshengCS reviewed Oct 29, 2018

xuezhong added 2 commits October 29, 2018 14:51

Merge branch 'develop' of https://github.com/PaddlePaddle/FluidDoc in…

2136f93

…to add_metrics_api_guide

update metrics.rst

55dab19

luotao1 reviewed Oct 29, 2018

guoshengCS reviewed Oct 29, 2018

update metrics.rst

ab635ee

luotao1 reviewed Oct 29, 2018

guoshengCS reviewed Oct 29, 2018

update metrics.rst

42b2fa9

shanyi15 approved these changes Oct 30, 2018

shanyi15 merged commit 211ebcd into PaddlePaddle:develop Oct 30, 2018

shanyi15 mentioned this pull request Oct 30, 2018

API Guide - Evaluation Metrics (评价指标) - submission deadline October 22 #163

Closed
