[SPARK-39446][MLLIB][FOLLOWUP] Modify constructor of RankingMetrics class #36920

uchiiii · 2022-06-20T06:56:44Z

What changes were proposed in this pull request?

Merged the two constructors into one that takes RDD[_ <: Product].

Why are the changes needed?

To make code simpler.
To support even more inputs.
~~The previous code treats rel as an empty array when rel is not provided, which is not that beautiful. This change removes this.~~

Does this PR introduce any user-facing change?

NO

How was this patch tested?

zhengruifeng · 2022-06-20T07:38:09Z

mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala

I think we could create a private rdd and then use it internally to minimize changes.

something like this:

private val rdd = predictionAndLabels.map {
  case (pred: Array[T], lab: Array[T]) => ...
  case (pred: Array[T], lab: Array[T], rel: Array[Double]) => ...
}

Do you mean that we use rdd as a private variable whose type is RDD[(Array[T], Array[T], Array[Double])], and keep other methods almost the same?

yes, hope this can reduce modifications
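
For readers following along, the normalization suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: a plain Seq stands in for the RDD so that it runs without Spark, the element type is fixed to Int instead of a generic T, and the names NormalizeInput, Row, and normalize are hypothetical.

```scala
// Hypothetical sketch: a Seq stands in for the RDD so this runs without Spark.
object NormalizeInput {
  // Every row is brought to the same (pred, lab, rel) shape.
  type Row = (Array[Int], Array[Int], Array[Double])

  def normalize(rows: Seq[Product]): Seq[Row] = rows.map {
    // Two-element input: no relevance given, so use an empty array.
    case (pred: Array[Int], lab: Array[Int]) =>
      (pred, lab, Array.empty[Double])
    // Three-element input: keep the relevance scores as they are.
    case (pred: Array[Int], lab: Array[Int], rel: Array[Double]) =>
      (pred, lab, rel)
    // Anything else is rejected (assumed behavior for this sketch).
    case other =>
      throw new IllegalArgumentException(
        s"Expected (pred, lab) or (pred, lab, rel), got: $other")
  }
}
```

With this shape, the remaining methods only ever see triples, which is what keeps the other changes small.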

AmplabJenkins · 2022-06-20T14:31:38Z

Can one of the admins verify this patch?

uchiiii · 2022-06-20T23:25:50Z

mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala

Which do you think is better for ndcgAt?

The previous one, where whether to use binary relevance is decided based on whether rel is an empty array.

The current one, where whether to use binary relevance is decided directly from the user input.

IMHO, the current one is easier to understand.

@zhengruifeng @srowen
Sorry to interrupt you, but which do you think is better?

I think the previous one may be more concise.

Thank you for your opinion.

Hm, I thought the current one might be more concise for developers, because you can easily see that the calculation differs by input type (even though I wrote both of them).

Anyway, I changed this to the previous one.

srowen · 2022-06-21T13:19:27Z

mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala

Maybe we can rename this and the constructor arg; the constructor arg may also have relevance; this does not

We may not want to change the name, because MulticlassMetrics also has an argument named predictionAndLabels.

spark/mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala

Lines 28 to 35 in b588d07

/**
 * Evaluator for multiclass classification.
 *
 * @param predictionAndLabels an RDD of (prediction, label, weight, probability) or
 * (prediction, label, weight) or (prediction, label) tuples.
 */
@Since("1.1.0")
class MulticlassMetrics @Since("1.1.0") (predictionAndLabels: RDD[_ <: Product]) {

We changed the constructor's interface so that it can support more inputs. A specific name like predictionAndLabelsWithOptionalRelevance may go against that goal.

OK that's reasonable

Why discard rel here?

If rdd is an RDD[(Array[T], Array[T], Array[Double])], then in ndcgAt we can simply check whether rel is empty?

also, maybe we can rename rdd to a more meaningful name

WDYT @uchiiii ?

How about fullRDD?
IMO, names like predictionsAndLabelsAndRelevances are too long.

maybe predictionsLabelsRelevances?
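
To make the "check whether rel is empty" idea from this thread concrete, here is a rough, simplified sketch. It is not Spark's ndcgAt: it assumes rel(i) is the graded relevance of lab(i), that an empty rel means binary relevance, and that the graded gain is 2^rel - 1. The names NdcgSketch, gains, and discount are hypothetical.

```scala
// Hypothetical sketch, not Spark's implementation.
object NdcgSketch {
  def ndcgAt(pred: Array[Int], lab: Array[Int], rel: Array[Double], k: Int): Double = {
    require(k > 0, "ranking position k should be positive")
    // The mode is inferred from the data: empty rel means binary relevance.
    val useBinary = rel.isEmpty
    // Gain per ground-truth item: 1.0 in binary mode, 2^rel - 1 otherwise.
    // zip truncates if lab and rel differ in length (assumption of this sketch).
    val gains: Map[Int, Double] =
      if (useBinary) lab.map(item => item -> 1.0).toMap
      else lab.zip(rel).map { case (item, r) => item -> (math.pow(2.0, r) - 1.0) }.toMap
    // Logarithmic position discount; the log base cancels in the final ratio.
    def discount(i: Int): Double = 1.0 / math.log(i + 2.0)
    // DCG of the predicted ranking, truncated at k.
    val dcg = pred.take(k).zipWithIndex.map { case (item, i) =>
      gains.getOrElse(item, 0.0) * discount(i)
    }.sum
    // Ideal DCG: the best achievable ordering of the ground-truth gains.
    val idealDcg = gains.values.toArray.sortWith(_ > _).take(k).zipWithIndex.map {
      case (g, i) => g * discount(i)
    }.sum
    if (idealDcg == 0.0) 0.0 else dcg / idealDcg
  }
}
```

The trade-off discussed above shows up in the useBinary line: the mode is derived from whether rel is empty rather than from an explicit user flag, which keeps the method signature unchanged at the cost of a less explicit branch.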

srowen · 2022-06-25T19:16:54Z

Merged to master

github-actions bot added the MLLIB label Jun 20, 2022

uchiiii mentioned this pull request Jun 20, 2022

[SPARK-39446][MLLIB] Add relevance score for nDCG evaluation #36843

Closed

zhengruifeng changed the title ~~[MLLIB] Modify constructor of RankingMetrics class~~ [SPARK-39446][MLLIB][FOLLOWUP] Modify constructor of RankingMetrics class Jun 20, 2022

zhengruifeng reviewed Jun 20, 2022

uchiiii commented Jun 20, 2022

srowen reviewed Jun 21, 2022

uchiiii added 6 commits June 22, 2022 20:19

Modify constructor of RankingMetics class

37a22b2

Apply ./dev/scalafmt

b74679d

Modify to use private member rdd

5ca84f9

Remove unnecessary change

64549d1

Add arguments exception error handling

183b170

Revert the change of ndcg

6944723

uchiiii force-pushed the modify_ranking_metrics branch from da3c694 to 6944723 Compare June 22, 2022 11:20

zhengruifeng approved these changes Jun 24, 2022

srowen closed this in f465a3d Jun 25, 2022


4 participants
