[SPARK-25268][GRAPHX] run Parallel Personalized PageRank throws serialization Exception

## What changes were proposed in this pull request?
mapValues in Scala currently returns a Map that is not serializable. To avoid the serialization exception when running Parallel Personalized PageRank, we use map instead of mapValues.
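The difference can be checked outside Spark with a small sketch. It assumes plain Java serialization (Spark's default serializer) and uses a one-hot scala Vector[Double] as a stand-in for the Breeze sparse vector; the object name, the roundTrip helper, and the sample vertex ids are all hypothetical and not part of this change.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.util.Try

object MapValuesSerializationSketch {
  // Hypothetical helper: writes a value with Java serialization and reads it back.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // Hypothetical stand-in for the source vertex ids.
  val sources: Array[Long] = Array(10L, 20L, 30L)

  // One-hot vector with 1.0 at position i (stand-in for Vectors.sparse(...).asBreeze).
  def oneHot(i: Int): Vector[Double] =
    Vector.tabulate(sources.length)(j => if (j == i) 1.0 else 0.0)

  // Eager construction, same shape as the fix: map over the pairs, then toMap.
  val eager: Map[Long, Vector[Double]] =
    sources.zipWithIndex.map { case (vid, i) => (vid, oneHot(i)) }.toMap

  // Old shape: mapValues wraps the underlying map instead of building a new one.
  // Typed as AnyRef because the static result type differs across Scala versions.
  val viaMapValues: AnyRef = sources.zipWithIndex.toMap.mapValues(oneHot)

  def main(args: Array[String]): Unit = {
    println(s"map + toMap serializes: ${Try(roundTrip(eager)).isSuccess}")
    println(s"mapValues result serializes: ${Try(roundTrip(viaMapValues)).isSuccess}")
  }
}
```

The eager map is an ordinary immutable Map, so it survives the round trip unchanged. Whether the mapValues result serializes depends on the Scala version in use, which is why the sketch reports it at runtime rather than assuming an outcome.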

Please review http://spark.apache.org/contributing.html before opening a pull request.

Closes #22271 from shahidki31/master_latest.

Authored-by: Shahid <shahidki31@gmail.com>
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
shahidki31 authored and jkbradley committed Sep 6, 2018
1 parent 7ef6d1d commit 3b6591b
Showing 1 changed file with 5 additions and 3 deletions.
@@ -198,9 +198,11 @@ object PageRank extends Logging {

     val zero = Vectors.sparse(sources.size, List()).asBreeze
     // map of vid -> vector where for each vid, the _position of vid in source_ is set to 1.0
-    val sourcesInitMap = sources.zipWithIndex.toMap.mapValues { i =>
-      Vectors.sparse(sources.size, Array(i), Array(1.0)).asBreeze
-    }
+    val sourcesInitMap = sources.zipWithIndex.map { case (vid, i) =>
+      val v = Vectors.sparse(sources.size, Array(i), Array(1.0)).asBreeze
+      (vid, v)
+    }.toMap
+
     val sc = graph.vertices.sparkContext
     val sourcesInitMapBC = sc.broadcast(sourcesInitMap)
     // Initialize the PageRank graph with each edge attribute having