[SPARK-34093][ML] param maxDepth should check upper bound
### What changes were proposed in this pull request?
Update the `ParamValidators` check on `maxDepth` from `gtEq(0)` to `inRange(0, 30)`, and document the range in the param's doc string.

### Why are the changes needed?
The current implementation of tree models only supports `maxDepth` <= 30.
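
For context, the `maxDepth` doc string describes depth in terms of node counts (depth 0 is 1 leaf; depth 1 is 1 internal node plus 2 leaves). The following is a minimal Python sketch of that arithmetic, not Spark code. `full_tree_node_counts` is a hypothetical helper, and it assumes a full binary tree, which is the largest tree a given `maxDepth` allows.

```python
from typing import Tuple


def full_tree_node_counts(depth: int) -> Tuple[int, int]:
    """Return (internal_nodes, leaf_nodes) for a full binary tree of `depth`.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    if depth < 0:
        raise ValueError("depth must be nonnegative")
    leaves = 2 ** depth      # every level doubles the number of leaves
    internal = leaves - 1    # a full binary tree has one fewer internal node
    return internal, leaves


if __name__ == "__main__":
    for d in (0, 1, 30):
        internal, leaves = full_tree_node_counts(d)
        print(f"depth={d}: internal={internal}, leaves={leaves}, "
              f"total={internal + leaves}")
```

At depth 30 the total is 2^31 - 1 nodes, which is also the largest 32-bit signed integer. The commit message does not say whether that is the reason for the limit.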

### Does this PR introduce _any_ user-facing change?
Yes. If `maxDepth` > 30, parameter validation now fails fast.
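
The fail-fast behavior can be illustrated with a small stand-alone Python sketch. It does not call Spark. `in_range` and `set_max_depth` are hypothetical names that only mimic an inclusive range check like `ParamValidators.inRange(0, 30)` on the Scala side. Note that the Python hunk in this commit changes only the doc string, not any validation logic.

```python
def in_range(lower, upper):
    """Return a predicate that is True when lower <= value <= upper.

    Hypothetical stand-in for an inclusive range validator; not a Spark API.
    """
    def check(value):
        return lower <= value <= upper
    return check


def set_max_depth(params, value, is_valid=in_range(0, 30)):
    """Return a copy of `params` with maxDepth set, rejecting bad values early."""
    if not is_valid(value):
        # Fail at set time rather than later, when training would start.
        raise ValueError(
            f"maxDepth given invalid value {value}; must be in range [0, 30]."
        )
    updated = dict(params)
    updated["maxDepth"] = value
    return updated


if __name__ == "__main__":
    print(set_max_depth({}, 5))
    try:
        set_max_depth({}, 31)
    except ValueError as err:
        print("rejected:", err)
```

Both bounds are inclusive in this sketch, so 0 and 30 are accepted while -1 and 31 are rejected.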

### How was this patch tested?
Existing test suites.

Closes #31163 from zhengruifeng/param_maxDepth_upbound.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
zhengruifeng authored and srowen committed Jan 18, 2021
1 parent dee596e commit d8cbef1
Showing 2 changed files with 5 additions and 3 deletions.
@@ -60,8 +60,9 @@ private[ml] trait DecisionTreeParams extends PredictorParams
    */
   final val maxDepth: IntParam =
     new IntParam(this, "maxDepth", "Maximum depth of the tree. (Nonnegative)" +
-      " E.g., depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes.",
-      ParamValidators.gtEq(0))
+      " E.g., depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes." +
+      " Must be in range [0, 30].",
+      ParamValidators.inRange(0, 30))

   /**
    * Maximum number of bins used for discretizing continuous features and for choosing how to split
3 changes: 2 additions & 1 deletion python/pyspark/ml/tree.py
@@ -67,7 +67,8 @@ class _DecisionTreeParams(HasCheckpointInterval, HasSeed, HasWeightCol):
                      typeConverter=TypeConverters.toString)

     maxDepth = Param(Params._dummy(), "maxDepth", "Maximum depth of the tree. (>= 0) E.g., " +
-                     "depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes.",
+                     "depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes. " +
+                     "Must be in range [0, 30].",
                      typeConverter=TypeConverters.toInt)

     maxBins = Param(Params._dummy(), "maxBins", "Max number of bins for discretizing continuous " +