Commit

programming guide blurb
manishamde committed May 6, 2014
1 parent 8053fed commit 426bb28
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docs/mllib-decision-tree.md
@@ -93,6 +93,10 @@ The recursive tree construction is stopped at a node when one of the two conditi
1. The node depth is equal to the `maxDepth` training parameter
2. No split candidate leads to an information gain at the node.
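
The two stopping conditions above can be sketched as a small predicate. This is an illustrative sketch only, not the MLlib implementation: the function name `should_stop` and its arguments are hypothetical, and it assumes the information gain of every split candidate at the node has already been computed.

```python
def should_stop(node_depth, max_depth, candidate_gains):
    """Return True if tree construction should stop at this node.

    Hypothetical helper, not part of MLlib.
    candidate_gains: information gain of each split candidate at the node.
    """
    # Condition 1: the node has reached the configured maximum depth.
    if node_depth == max_depth:
        return True
    # Condition 2: no split candidate yields a positive information gain.
    if not any(gain > 0 for gain in candidate_gains):
        return True
    return False
```

For example, a node at depth 2 with `max_depth=5` keeps splitting only while at least one candidate has a positive gain.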

### Max memory requirements

For faster processing, the decision tree algorithm computes histograms for all nodes at a given level of the tree simultaneously. This can lead to high memory requirements at deeper levels of the tree and, potentially, to memory overflow errors. To alleviate this problem, a `maxMemoryInMB` training parameter is provided, which specifies the maximum amount of memory at the workers (twice as much at the master) to be allocated to the histogram computation. The default value is conservatively chosen to be 128 MB to allow the decision tree algorithm to work in most scenarios. Once the memory requirements for a level-wise computation cross the `maxMemoryInMB` threshold, the node training tasks at each subsequent level are split into smaller tasks.
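
To get a feel for when the threshold is crossed, the following back-of-the-envelope sketch estimates how many smaller tasks a level would need. It is not the MLlib code: the per-node cost formula (one 8-byte double per feature, bin, and statistic), the assumption of a binary tree with up to `2 ** level` nodes at a level (root at level 0), and the names `histogram_bytes_per_node` and `groups_at_level` are all assumptions made for illustration.

```python
import math

BYTES_PER_DOUBLE = 8  # assumption: each histogram entry is one 8-byte double


def histogram_bytes_per_node(num_features, num_bins, stats_per_bin):
    """Assumed histogram size for one node, in bytes (hypothetical formula)."""
    return BYTES_PER_DOUBLE * num_features * num_bins * stats_per_bin


def groups_at_level(level, bytes_per_node, max_memory_in_mb=128):
    """Estimate how many smaller tasks a level is split into.

    Assumes a binary tree, so a level holds at most 2 ** level nodes.
    """
    budget_bytes = max_memory_in_mb * 1024 * 1024
    # At least one node must fit in a group, even if it exceeds the budget.
    max_nodes_per_group = max(budget_bytes // bytes_per_node, 1)
    nodes_at_level = 2 ** level
    return math.ceil(nodes_at_level / max_nodes_per_group)


# Example: 100 features, 100 bins, 3 statistics per bin.
per_node = histogram_bytes_per_node(100, 100, 3)
for level in (9, 10, 12):
    print(level, groups_at_level(level, per_node))
```

Under these assumptions, shallow levels fit in a single task, and the number of tasks grows roughly in proportion to the node count once the budget is exceeded. Raising `max_memory_in_mb` pushes that point to a deeper level.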

### Practical limitations

1. The implemented algorithm reads both sparse and dense data. However, it is not optimized for
