
[MLLIB] [SPARK-2222] Add multiclass evaluation metrics #1155

Closed
wants to merge 15 commits

Conversation

avulanov
Contributor

Adding two classes:

  1. MulticlassMetrics implements various multiclass evaluation metrics
  2. MulticlassMetricsSuite implements unit tests for MulticlassMetrics
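
A minimal usage sketch of the new class (hedged: assumes an active SparkContext sc and toy data; the constructor signature matches the one listed in the QA output below):

import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.rdd.RDD

// (prediction, label) pairs; the values here are hypothetical sample data
val predictionAndLabels: RDD[(Double, Double)] =
  sc.parallelize(Seq((0.0, 0.0), (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)))
val metrics = new MulticlassMetrics(predictionAndLabels)
println(metrics.confusionMatrix)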

@AmplabJenkins

Can one of the admins verify this patch?

@xiejuncs

Nice work.

I am reading the implementation of MulticlassMetrics. According to your code, for the micro average you calculate recall and then set precision and the F1 measure equal to recall. I am not sure whether this makes sense.

According to this post: http://rushdishams.blogspot.com/2011/08/micro-and-macro-average-of-precision.html

Assume we have just three classes. For each class we have three numbers: true positives (tp), false positives (fp), and false negatives (fn). Hence we have tp1, fp1, and fn1 for class 1, and so on.

For Micro-Average Precision: (tp1 + tp2 + tp3) / (tp1 + tp2 + tp3 + fp1 + fp2 + fp3)
For Micro-Average Recall: (tp1 + tp2 + tp3) / (tp1 + tp2 + tp3 + fn1 + fn2 + fn3)
For Micro-Average F1Measure: it is just the harmonic mean of precision and recall.

Based on the above definitions, recall and precision should not be the same. Is that correct?

@avulanov
Contributor Author

The micro-averaged precision and recall are equal for a multiclass classifier because sum(fn_i) = sum(fp_i): both are just the sum of all off-diagonal elements of the confusion matrix. The F1 measure, as the harmonic mean of two equal numbers, is therefore also equal to P and R. For more details please refer to the book "Introduction to Information Retrieval" by Manning et al.
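
To illustrate with a hedged sketch (toy numbers, not code from the patch): for any square confusion matrix with rows as actual classes and columns as predicted classes, total false positives sum the off-diagonal entries by column and total false negatives sum them by row, so the two totals coincide.

// Toy 3x3 confusion matrix: rows = actual class, columns = predicted class
val c = Array(
  Array(5.0, 1.0, 0.0),
  Array(2.0, 7.0, 1.0),
  Array(0.0, 1.0, 4.0)
)
val n = c.length
val tp = (0 until n).map(i => c(i)(i)).sum                                      // 16.0
val fp = (0 until n).map(j => (0 until n).map(i => c(i)(j)).sum - c(j)(j)).sum  // 5.0
val fn = (0 until n).map(i => c(i).sum - c(i)(i)).sum                           // 5.0
println(tp / (tp + fp) == tp / (tp + fn))  // true: micro precision == micro recall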

@xiejuncs

It makes sense. You are right: sum(fn_i) = sum(fp_i), so recall and precision are the same. Thanks very much.

@SpyderRivera

👍

@mengxr
Contributor

mengxr commented Jul 2, 2014

Jenkins, add to whitelist.

@mengxr
Contributor

mengxr commented Jul 2, 2014

Jenkins, test this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16297/

* Evaluator for multiclass classification.
* NB: type Double both for prediction and label is retained
* for compatibility with model.predict that returns Double
* and MLUtils.loadLibSVMFile that loads class labels as Double
Contributor

It is not necessary to mention loadLibSVMFile in particular here. This is a "global" assumption in MLlib.

@SparkQA

SparkQA commented Jul 14, 2014

QA tests have started for PR 1155. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16619/consoleFull

@SparkQA

SparkQA commented Jul 14, 2014

QA results for PR 1155:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class MulticlassMetrics(predictionAndLabels: RDD[(Double, Double)]) {
* (equals to precision for multiclass classifier

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16619/consoleFull

@mengxr
Contributor

mengxr commented Jul 14, 2014

@avulanov In Scala, "for" is slower than "while". See https://issues.scala-lang.org/browse/SI-1338 for example. So please replace the for loop with two while loops in your implementation.
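
A hedged sketch of the kind of rewrite being requested (illustrative names, not the patch itself): a for-comprehension over a 2-D index space replaced by two nested while loops, which compile to plain loops on the Scala versions of that era.

// Before (sketch): for (i <- 0 until n; j <- 0 until n) m(i * n + j) = f(i, j)
// After: two while loops with manual counters, avoiding closure overhead.
def fill(n: Int, f: (Int, Int) => Double): Array[Double] = {
  val m = Array.ofDim[Double](n * n)
  var i = 0
  while (i < n) {
    var j = 0
    while (j < n) {
      m(i * n + j) = f(i, j)
      j += 1
    }
    i += 1
  }
  m
}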

* as in "labels"
*/
lazy val confusionMatrix: Matrix = {
val transposedFlatMatrix = Array.ofDim[Double](labels.size * labels.size)
Contributor

Save labels.size to n? Btw, I'm not sure whether we should use lazy val here, because the result matrix could be 1000x1000, unlike the other lazy vals used here.
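
A hedged sketch of the trade-off being raised (illustrative names, not the merged code): a lazy val caches its result for the lifetime of the object, which is fine for scalar metrics but pins an n x n matrix in memory; a def would recompute it on each call instead.

class MetricsSketch(n: Int) {
  lazy val scalarMetric: Double = n.toDouble                 // cheap; caching is harmless
  def bigMatrix: Array[Double] = Array.ofDim[Double](n * n)  // large; def avoids caching
}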

@avulanov
Contributor Author

@mengxr I've addressed your comments. Thanks for pointing me to the Scala issue.

@SparkQA

SparkQA commented Jul 15, 2014

QA tests have started for PR 1155. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16670/consoleFull

@mengxr
Contributor

mengxr commented Jul 15, 2014

@avulanov I made some minor updates and sent you a PR at avulanov#1. If it looks good to you, please merge that PR and the changes will show up here. Thanks!

@avulanov
Contributor Author

@mengxr done!

@SparkQA

SparkQA commented Jul 15, 2014

QA tests have started for PR 1155. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16671/consoleFull

@SparkQA

SparkQA commented Jul 15, 2014

QA results for PR 1155:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class MulticlassMetrics(predictionAndLabels: RDD[(Double, Double)]) {
* (equals to precision for multiclass classifier

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16671/consoleFull

@asfgit asfgit closed this in 04b01bb Jul 15, 2014
@mengxr
Contributor

mengxr commented Jul 15, 2014

Merged. Thanks for your contribution!

@avulanov
Contributor Author

Thanks! I'll be glad to contribute more.

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
Adding two classes:
1) MulticlassMetrics implements various multiclass evaluation metrics
2) MulticlassMetricsSuite implements unit tests for MulticlassMetrics

Author: Alexander Ulanov <nashb@yandex.ru>
Author: unknown <ulanov@ULANOV1.emea.hpqcorp.net>
Author: Xiangrui Meng <meng@databricks.com>

Closes apache#1155 from avulanov/master and squashes the following commits:

2eae80f [Alexander Ulanov] Merge pull request apache#1 from mengxr/avulanov-master
5ebeb08 [Xiangrui Meng] minor updates
79c3555 [Alexander Ulanov] Addressing reviewers comments mengxr
0fa9511 [Alexander Ulanov] Addressing reviewers comments mengxr
f0dadc9 [Alexander Ulanov] Addressing reviewers comments mengxr
4811378 [Alexander Ulanov] Removing println
87fb11f [Alexander Ulanov] Addressing reviewers comments mengxr. Added confusion matrix
e3db569 [Alexander Ulanov] Addressing reviewers comments mengxr. Added true positive rate and false positive rate. Test suite code style.
a7e8bf0 [Alexander Ulanov] Addressing reviewers comments mengxr
c3a77ad [Alexander Ulanov] Addressing reviewers comments mengxr
e2c91c3 [Alexander Ulanov] Fixes to mutliclass metics
d5ce981 [unknown] Comments about Double
a5c8ba4 [unknown] Unit tests. Class rename
fcee82d [unknown] Unit tests. Class rename
d535d62 [unknown] Multiclass evaluation
asfgit pushed a commit that referenced this pull request Nov 1, 2014
Implementation of various multi-label classification measures, including: Hamming-loss, strict and default Accuracy, macro-averaged Precision, Recall and F1-measure based on documents and labels, micro-averaged measures: https://issues.apache.org/jira/browse/SPARK-2329

Multi-class measures are currently in the following pull request: #1155

Author: Alexander Ulanov <nashb@yandex.ru>
Author: avulanov <nashb@yandex.ru>

Closes #1270 from avulanov/multilabelmetrics and squashes the following commits:

fc8175e [Alexander Ulanov] Merge with previous updates
43a613e [Alexander Ulanov] Addressing reviewers comments: change Set to Array
517a594 [avulanov] Addressing reviewers comments: Scala style
cf4222b [avulanov] Addressing reviewers comments: renaming. Added label method that returns the list of labels
1843f73 [Alexander Ulanov] Scala style fix
79e8476 [Alexander Ulanov] Replacing fold(_ + _) with sum as suggested by srowen
ca46765 [Alexander Ulanov] Cosmetic changes: Apache header and parameter explanation
40593f5 [Alexander Ulanov] Multi-label metrics: Hamming-loss, strict and normal accuracy, fix to macro measures, bunch of tests
ad62df0 [Alexander Ulanov] Comments and scala style check
154164b [Alexander Ulanov] Multilabel evaluation metics and tests: macro precision and recall averaged by docs, micro and per-class precision and recall averaged by class
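
For context, a hedged sketch of one of the measures listed above, Hamming loss, in plain Scala (not the merged MultilabelMetrics code): it is the size of the symmetric difference between the predicted and actual label sets, averaged over documents and normalized by the total number of labels.

def hammingLoss(pairs: Seq[(Set[Double], Set[Double])], numLabels: Int): Double = {
  val disagreements = pairs.map { case (pred, actual) =>
    ((pred diff actual) union (actual diff pred)).size  // symmetric difference
  }.sum
  disagreements.toDouble / (pairs.size * numLabels)
}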
@tolgap

tolgap commented Jan 19, 2015

@avulanov You have added a class called MulticlassMetrics, but I do not understand how it operates on multiclass classification. I would understand the usage if it accepted RDD[(Vector, Vector)], but it uses RDD[(Double, Double)], which looks to me like binary classification.

Can you give me an example for, say, the MNIST dataset (10 output neurons)? Thanks!

@avulanov
Contributor Author

@tolgap As the documentation suggests, MulticlassMetrics accepts predictionAndLabels, an RDD of (prediction, label) pairs, where prediction is the predicted class/label and label is the actual class/label.

For example:

import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.mllib.classification.ANNClassifier
import org.apache.spark.mllib.evaluation.MulticlassMetrics

/* load MNIST data in LibSVM format */
val data = MLUtils.loadLibSVMFile(sc, "mnist_file_in_svm_format")
val split = data.randomSplit(Array(0.9, 0.1), 11L)
val training = split(0)
val test = split(1)
/* train an ANN with one hidden layer of 32 neurons */
/* (input and output layer sizes are derived from the data) */
val model = ANNClassifier.train(training, Array[Int](32), 40, 1.0, 1e-4)
/* pair each test point's prediction with its true label */
val predictionAndLabels = test.map(lp => (model.predict(lp.features), lp.label))
val metrics = new MulticlassMetrics(predictionAndLabels)
println("Accuracy: " + metrics.precision)

@tolgap

tolgap commented Jan 21, 2015

@avulanov How many neurons does the output layer have in this case, 1 or 10? My current implementation has an output layer of 10 neurons, e.g.:

val output = Array[Double](7.466E-4, 4.16464E-9, 0.0, 0.0, 0.99462, /*..*/)

In this case, the example has the highest probability of being the digit 4 (the fifth element has the highest probability).
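
(A hedged aside: such an output vector is typically reduced to a single class label by taking the argmax, e.g.:)

// Index of the largest activation is the predicted class (4 for the vector above).
val predictedLabel = output.zipWithIndex.maxBy(_._1)._2.toDouble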

@avulanov
Contributor Author

@tolgap ANNClassifier will create 10 output neurons for MNIST, since 10 is the number of distinct labels derived from the data. Each class is usually encoded with a separate output neuron, especially when there are no explicit relations (or ordering) between the classes. If you wish to learn more, there is a good explanation here: http://www.faqs.org/faqs/ai-faq/neural-nets/part2/index.html
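
A minimal sketch of that encoding (illustrative, not ANNClassifier internals): during training, class k out of numClasses becomes a target vector with 1.0 at position k and 0.0 elsewhere.

def oneHot(k: Int, numClasses: Int): Array[Double] =
  Array.tabulate(numClasses)(i => if (i == k) 1.0 else 0.0)

// oneHot(4, 10) puts 1.0 at index 4 and 0.0 everywhere else, encoding the digit 4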
