From 17a71c357bcc5ca68f3fd11f49bb61a91603527a Mon Sep 17 00:00:00 2001
From: Mathias Andersen
Date: Wed, 7 Mar 2018 14:50:20 +0100
Subject: [PATCH] Added description of checkpointInterval parameter

The current interaction between ALS and checkpointInterval can lead to
unexpected behavior, so I have added an explicit description to reduce
confusion.
---
 docs/ml-collaborative-filtering.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/ml-collaborative-filtering.md b/docs/ml-collaborative-filtering.md
index 58f2d4b531e70..ea74acf661f4f 100644
--- a/docs/ml-collaborative-filtering.md
+++ b/docs/ml-collaborative-filtering.md
@@ -19,6 +19,7 @@ by a small set of latent factors that can be used to predict missing entries.
 algorithm to learn these latent factors. The implementation in `spark.ml` has the following parameters:
 
+* *checkpointInterval* sets how often, in iterations, ALS checkpoints its intermediate data, which helps with recovery when nodes fail and avoids StackOverflow exceptions caused by long lineage. **It is silently ignored if the checkpoint directory has not been set via `SparkContext.setCheckpointDir`.** (defaults to 10).
 * *numBlocks* is the number of blocks the users and items will be partitioned into in order to parallelize computation (defaults to 10).
 * *rank* is the number of latent factors in the model (defaults to 10).
 * *maxIter* is the maximum number of iterations to run (defaults to 10).
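
The patch describes *checkpointInterval* as silently ignored when no checkpoint directory is set. The sketch below is a simplified, hypothetical model of that rule in plain Python, not Spark's actual code: the helper names `should_checkpoint` and `checkpointed_iterations`, the 1-based iteration numbering, and the `/tmp/als-checkpoints` path are assumptions made for illustration.

```python
from typing import List, Optional


def should_checkpoint(iteration: int,
                      checkpoint_interval: int,
                      checkpoint_dir: Optional[str]) -> bool:
    """Hypothetical model of the rule described in the patch.

    A checkpoint happens only when a checkpoint directory is set,
    checkpointing is not disabled (-1), and the iteration number is a
    multiple of the interval.
    """
    if checkpoint_dir is None:
        # No directory configured: the interval is silently ignored.
        return False
    if checkpoint_interval == -1:
        # -1 disables checkpointing.
        return False
    return iteration % checkpoint_interval == 0


def checkpointed_iterations(max_iter: int,
                            checkpoint_interval: int,
                            checkpoint_dir: Optional[str]) -> List[int]:
    """List the iterations (assumed to be numbered from 1) that checkpoint."""
    return [i for i in range(1, max_iter + 1)
            if should_checkpoint(i, checkpoint_interval, checkpoint_dir)]


# With a directory set, an interval of 10 checkpoints at iterations 10 and 20.
print(checkpointed_iterations(20, 10, "/tmp/als-checkpoints"))  # [10, 20]
# Without a directory, no checkpoint is ever taken and no error is raised.
print(checkpointed_iterations(20, 10, None))  # []
```

In a real job, the directory would be set with `SparkContext.setCheckpointDir` before fitting the ALS model. If that call is missing, the second case above applies and lineage keeps growing across iterations.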