use sampleByKey for per user sampling

Debasish Das committed Nov 8, 2014
1 parent 10cbb37 commit f38a1b59e27907f2aa9bd732c5f9147b738d3a0f
Showing with 1 addition and 2 deletions.
  1. +1 −2 examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala
@@ -141,9 +141,8 @@ object MovieLensALS {

println(s"Got $numRatings ratings from $numUsers users on $numMovies movies.")

//val splits = ratings.randomSplit(Array(0.8, 0.2))
val fractions = (0 until numUsers.toInt).map(x => (x + 1, 0.8)).toMap

val training = ratings.map { x => (x.user, x) }.sampleByKey(false, fractions).map { x => x._2 }
val testSplit = ratings.subtract(training)
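
For readers unfamiliar with per-key sampling, the following is a minimal plain-Scala sketch of the idea behind the change, not the Spark implementation itself. It assumes that `sampleByKey` without replacement keeps each record independently with probability `fractions(key)`, so the 0.8 split is approximate per user rather than exact. `Rating` here is a hypothetical stand-in for the MLlib class, and `PerUserSplitSketch` and `split` are illustrative names that do not appear in the commit.

```scala
import scala.util.Random

// Hypothetical stand-in for MLlib's Rating(user, product, rating).
final case class Rating(user: Int, product: Int, rating: Double)

object PerUserSplitSketch {
  // Keep each rating independently with probability fractions(r.user).
  // Ratings that are kept form the training set; the rest form the test set.
  // A user id missing from `fractions` throws NoSuchElementException.
  def split(
      ratings: Seq[Rating],
      fractions: Map[Int, Double],
      seed: Long): (Seq[Rating], Seq[Rating]) = {
    val rng = new Random(seed)
    ratings.partition(r => rng.nextDouble() < fractions(r.user))
  }

  def main(args: Array[String]): Unit = {
    // Toy data: 3 users with 10 ratings each.
    val ratings = for (u <- 1 to 3; m <- 1 to 10) yield Rating(u, m, 3.0)

    // Build the fractions map from the user ids actually present,
    // instead of assuming ids run contiguously from 1 to numUsers.
    val fractions = ratings.map(_.user).distinct.map(u => u -> 0.8).toMap

    val (training, test) = split(ratings, fractions, seed = 42L)

    // Every rating lands in exactly one of the two sets.
    assert(training.size + test.size == ratings.size)
    println(s"training=${training.size}, test=${test.size}")
  }
}
```

Deriving `fractions` from the ids that occur in the data is a design choice for the sketch: the map in the diff covers ids 1 through `numUsers`, which only works when user ids are contiguous and start at 1.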
