[SPARK-22399][ML] update the location of reference paper

## What changes were proposed in this pull request?

Update the url of reference paper.

## How was this patch tested?

It is comments, so nothing tested.

Author: bomeng <bmeng@us.ibm.com>

Closes #19614 from bomeng/22399.
apache · Oct 31, 2017 · aa6db57
1 parent 1ff41d8

Showing 4 changed files with 7 additions and 6 deletions.
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
@@ -134,7 +134,7 @@ Refer to the [`GaussianMixture` Python docs](api/python/pyspark.mllib.html#pyspa
 
 Power iteration clustering (PIC) is a scalable and efficient algorithm for clustering vertices of a
 graph given pairwise similarities as edge properties,
-described in [Lin and Cohen, Power Iteration Clustering](http://www.icml2010.org/papers/387.pdf).
+described in [Lin and Cohen, Power Iteration Clustering](http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf).
 It computes a pseudo-eigenvector of the normalized affinity matrix of the graph via
 [power iteration](http://en.wikipedia.org/wiki/Power_iteration)  and uses it to cluster vertices.
 `spark.mllib` includes an implementation of PIC using GraphX as its backend.

diff --git a/...ples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala b/...ples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala
@@ -28,7 +28,8 @@ import org.apache.spark.mllib.clustering.PowerIterationClustering
 import org.apache.spark.rdd.RDD
 
 /**
- * An example Power Iteration Clustering http://www.icml2010.org/papers/387.pdf app.
+ * An example Power Iteration Clustering app.
+ * http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf
  * Takes an input of K concentric circles and the number of points in the innermost circle.
  * The output should be K clusters - each cluster containing precisely the points associated
  * with each of the input circles.

diff --git a/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala b/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
@@ -103,9 +103,9 @@ object PowerIterationClusteringModel extends Loader[PowerIterationClusteringMode
 
 /**
  * Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
- * <a href="http://www.icml2010.org/papers/387.pdf">Lin and Cohen</a>. From the abstract: PIC finds
- * a very low-dimensional embedding of a dataset using truncated power iteration on a normalized
- * pair-wise similarity matrix of the data.
+ * <a href="http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf">Lin and Cohen</a>.
+ * From the abstract: PIC finds a very low-dimensional embedding of a dataset using
+ * truncated power iteration on a normalized pair-wise similarity matrix of the data.
  *
  * @param k Number of clusters.
  * @param maxIterations Maximum number of iterations of the PIC algorithm.

diff --git a/python/pyspark/mllib/clustering.py b/python/pyspark/mllib/clustering.py
@@ -636,7 +636,7 @@ def load(cls, sc, path):
 class PowerIterationClustering(object):
     """
     Power Iteration Clustering (PIC), a scalable graph clustering algorithm
-    developed by [[http://www.icml2010.org/papers/387.pdf Lin and Cohen]].
+    developed by [[http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf Lin and Cohen]].
     From the abstract: PIC finds a very low-dimensional embedding of a
     dataset using truncated power iteration on a normalized pair-wise
     similarity matrix of the data.