
[SPARK-25758][ML] Deprecate computeCost on BisectingKMeans

## What changes were proposed in this pull request?

This PR deprecates the `computeCost` method on `BisectingKMeans` in favor of using `ClusteringEvaluator` to evaluate the clustering.
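For context, the quantity the deprecated method reports (the sum of squared distances between input points and their cluster centers) can be sketched in plain Python. This is only an illustration, not Spark's implementation: the `compute_cost` helper and the sample data are hypothetical, and the sketch assumes each point belongs to its nearest center, which simplifies how a bisecting k-means model assigns points.

```python
from math import inf


def compute_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center.

    Hypothetical helper for illustration; assumes nearest-center assignment.
    """
    total = 0.0
    for p in points:
        best = inf
        for c in centers:
            # Squared Euclidean distance between point p and center c.
            d2 = sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            best = min(best, d2)
        total += best
    return total


# Made-up sample data: two tight pairs of points, one center per pair.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
centers = [(0.5, 0.0), (10.5, 10.0)]
cost = compute_cost(points, centers)
```

Note that `ClusteringEvaluator` reports the silhouette score by default, which is a different quantity from this cost, so the two values are not directly comparable.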

## How was this patch tested?

NA

Closes #22756 from mgaido91/SPARK-25758.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
mgaido91 authored and dongjoon-hyun committed Oct 18, 2018
1 parent 15524c4 commit c2962546d9a5900a5628a31b83d2c4b22c3a7936
@@ -125,8 +125,13 @@ class BisectingKMeansModel private[ml] (
/**
* Computes the sum of squared distances between the input points and their corresponding cluster
* centers.
*
* @deprecated This method is deprecated and will be removed in 3.0.0. Use ClusteringEvaluator
* instead. You can also get the cost on the training dataset in the summary.
*/
@Since("2.0.0")
@deprecated("This method is deprecated and will be removed in 3.0.0. Use ClusteringEvaluator " +
"instead. You can also get the cost on the training dataset in the summary.", "2.4.0")
def computeCost(dataset: Dataset[_]): Double = {
SchemaUtils.validateVectorCompatibleColumn(dataset.schema, getFeaturesCol)
val data = DatasetUtils.columnToOldVector(dataset, getFeaturesCol)
@@ -540,7 +540,13 @@ def computeCost(self, dataset):
"""
Computes the sum of squared distances between the input points
and their corresponding cluster centers.
.. note:: Deprecated in 2.4.0. It will be removed in 3.0.0. Use ClusteringEvaluator instead.
You can also get the cost on the training dataset in the summary.
"""
warnings.warn("Deprecated in 2.4.0. It will be removed in 3.0.0. Use ClusteringEvaluator "
"instead. You can also get the cost on the training dataset in the summary.",
DeprecationWarning)
return self._call_java("computeCost", dataset)
@property
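
The Python change above relies on `warnings.warn` with `DeprecationWarning`. A minimal sketch of how a caller or a test might observe that warning follows. `FakeModel` and `call_and_collect` are hypothetical stand-ins introduced here so the snippet runs without Spark; only the warning text is taken from the diff.

```python
import warnings


class FakeModel:
    """Hypothetical stand-in for the model wrapper; not part of PySpark."""

    def computeCost(self, dataset):
        # Same message and category as in the diff above.
        warnings.warn("Deprecated in 2.4.0. It will be removed in 3.0.0. Use ClusteringEvaluator "
                      "instead. You can also get the cost on the training dataset in the summary.",
                      DeprecationWarning)
        return 0.0  # placeholder value; the real method delegates to the JVM


def call_and_collect(model, dataset):
    """Call computeCost and return its result with any warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        # DeprecationWarning is often filtered out by default; always record it here.
        warnings.simplefilter("always")
        result = model.computeCost(dataset)
    return result, caught


result, caught = call_and_collect(FakeModel(), dataset=None)
```

Recording warnings explicitly matters because Python's default filters frequently hide `DeprecationWarning`, so the message may not be visible unless the caller enables it.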
