From 9deb0a5bacd52a1d4d54a92d9080fd9f02693e6a Mon Sep 17 00:00:00 2001
From: WeichenXu
Date: Thu, 2 Nov 2017 17:21:55 +0800
Subject: [PATCH] init pr

---
 docs/ml-tuning.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ml-tuning.md b/docs/ml-tuning.md
index 64dc46cf0c0e7..54d9cd21909df 100644
--- a/docs/ml-tuning.md
+++ b/docs/ml-tuning.md
@@ -55,7 +55,7 @@ for multiclass problems. The default metric used to choose the best `ParamMap` c
 method in each of these evaluators.
 
 To help construct the parameter grid, users can use the [`ParamGridBuilder`](api/scala/index.html#org.apache.spark.ml.tuning.ParamGridBuilder) utility.
-By default, sets of parameters from the parameter grid are evaluated in serial. Parameter evaluation can be done in parallel by setting `parallelism` with a value of 2 or more (a value of 1 will be serial) before running model selection with `CrossValidator` or `TrainValidationSplit` (NOTE: this is not yet supported in Python).
+By default, sets of parameters from the parameter grid are evaluated in serial. Parameter evaluation can be done in parallel by setting `parallelism` with a value of 2 or more (a value of 1 will be serial) before running model selection with `CrossValidator` or `TrainValidationSplit`.
 The value of `parallelism` should be chosen carefully to maximize parallelism without exceeding cluster resources, and larger values may not always lead to improved performance. Generally speaking, a value up to 10 should be sufficient for most clusters.
 
 # Cross-Validation
 
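
Review note: the `parallelism` behaviour described in the changed paragraph can be pictured with a small plain-Python sketch. This is not Spark code and not necessarily how Spark implements it; `evaluate_grid`, `fit_and_score`, and `toy_score` are hypothetical names introduced only for illustration, and the sketch assumes a metric where larger is better. A value of 1 evaluates the parameter sets one after another, while a value of 2 or more evaluates them on a bounded pool of worker threads.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_grid(param_grid, fit_and_score, parallelism=1):
    """Score every parameter set and return (best_params, scores).

    Hypothetical helper: parallelism=1 runs serially, while
    parallelism>=2 runs at most that many evaluations at once.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")

    if parallelism == 1:
        # Serial evaluation, one parameter set at a time.
        scores = [fit_and_score(params) for params in param_grid]
    else:
        # Bounded thread pool; map() returns results in input order.
        with ThreadPoolExecutor(max_workers=parallelism) as pool:
            scores = list(pool.map(fit_and_score, param_grid))

    # Assumes a larger score is better.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return param_grid[best_index], scores


def toy_score(params):
    # Stand-in for "fit a model and evaluate it"; peaks at regParam = 0.1.
    return -(params["regParam"] - 0.1) ** 2


grid = [{"regParam": r} for r in (0.01, 0.1, 1.0)]
best_serial, scores_serial = evaluate_grid(grid, toy_score, parallelism=1)
best_parallel, scores_parallel = evaluate_grid(grid, toy_score, parallelism=2)
```

Both calls select the same parameter set; only the scheduling differs. The pool size caps how many fits run at once, which is why a value larger than the cluster can serve concurrently brings no further speed-up. In Spark itself the value is set on the `CrossValidator` or `TrainValidationSplit` instance (for example via `setParallelism(2)` in the Scala API) rather than passed to a helper like this one.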