## Decision tree hyperparameters

Decision tree hyperparameters control how the tree grows and how complex it becomes. A tree that is too complex overfits the training data; a tree that is too constrained underfits it.

#### Tree parts

- **Root node**: Starting node with all training data.  
- **Internal node**: Non-final node that has been split and has child nodes.  
- **Leaf node**: Final node; no more splits, outputs prediction.  
- **Branch**: Connection between nodes representing an answer (e.g., “yes” / “no”).
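
The parts listed above can be inspected on a fitted tree. The following is a minimal sketch, assuming scikit-learn is installed and using its bundled iris dataset; the dataset and `max_depth=2` are illustrative choices, not part of the notes above. It relies on the `tree_` attribute of a fitted `DecisionTreeClassifier`, where a leaf is marked by a child id of `-1`.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative data and settings (assumptions, chosen to keep the tree tiny).
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

tree = clf.tree_
# In scikit-learn's tree arrays a leaf has no children: its child ids are -1.
is_leaf = tree.children_left == -1

for node_id in range(tree.node_count):
    if is_leaf[node_id]:
        kind = "leaf"
    elif node_id == 0:
        kind = "root"
    else:
        kind = "internal"
    print(f"node {node_id}: {kind}, samples={tree.n_node_samples[node_id]}")

n_leaves = int(is_leaf.sum())
n_split_nodes = tree.node_count - n_leaves  # root plus internal nodes
```

Node 0 is the root and holds every training sample; each branch corresponds to following `children_left` or `children_right` from a split node.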

#### Key hyperparameters (simple view)

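- **criterion**: How to score splits (values shown are for scikit-learn's `DecisionTreeClassifier`).  
  - `"gini"` (default), `"entropy"`, or `"log_loss"` (equivalent to entropy) to measure how mixed the labels are in a node.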

- **splitter**: How to search for the split.  
  - `"best"` (default; evaluate the candidate thresholds and pick the best) or `"random"` (draw a random threshold per feature and pick the best of those).

- **max_features**: How many features to consider at each split.  
  - Integer → that many features.  
  - Float → that fraction of all features.  
  - `"sqrt"`, `"log2"` → functions of number of features.  
  - `None` → use all features (default).

- **max_depth**: Maximum depth (levels) of the tree.  
  - Integer → hard depth limit.  
  - `None` → grow until every leaf is pure or another limit (e.g., `min_samples_split`) stops further splitting (default).  
  - Smaller → simpler tree, less overfitting; too small → underfitting.

- **max_leaf_nodes**: Maximum number of leaves.  
  - Integer → cap on leaf count.  
  - `None` → no cap (default).

- **min_samples_split**: Minimum samples a node must contain before it may be split (default `2`).  
  - Integer → minimum count.  
  - Float → minimum fraction of the training samples (rounded up).  
  - Larger → fewer splits, less overfitting.

- **min_samples_leaf**: Minimum samples required in each leaf (default `1`).  
  - Integer → minimum count in every leaf.  
  - Float → minimum fraction of the training samples (rounded up).  
  - Larger → fewer, larger leaves; smoother predictions and less overfitting.
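
To see how these settings shape a tree, the sketch below fits one `DecisionTreeClassifier` per setting and reports the resulting depth and leaf count. It assumes scikit-learn and its bundled breast cancer dataset; the specific values (`3`, `8`, `40`, `20`) are hypothetical and chosen only for illustration, not as recommended settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hypothetical settings for illustration only (not tuned values).
settings = {
    "defaults": {},
    "max_depth=3": {"max_depth": 3},
    "max_leaf_nodes=8": {"max_leaf_nodes": 8},
    "min_samples_split=40": {"min_samples_split": 40},
    "min_samples_leaf=20": {"min_samples_leaf": 20},
    "entropy + sqrt features": {"criterion": "entropy", "max_features": "sqrt"},
}

models = {}
for name, params in settings.items():
    # random_state fixes tie-breaking and feature sampling for repeatability.
    model = DecisionTreeClassifier(random_state=0, **params).fit(X, y)
    models[name] = model
    print(f"{name:24s} depth={model.get_depth():2d} leaves={model.get_n_leaves():3d}")
```

Each limit is enforced as a hard constraint: the `max_depth=3` tree never exceeds three levels, and every leaf of the `min_samples_leaf=20` tree holds at least 20 training samples.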

#### Why these matter

- Making the tree **too flexible** (deep, many leaves, tiny leaves) → high risk of overfitting.  
- Adding limits (depth, leaf count, leaf size, minimum samples to split) usually makes the model more **robust**, so it generalizes better to new data; limits that are too strict cause underfitting instead.
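
One way to check this trade-off is to compare training and test accuracy for an unconstrained tree and a constrained one. The sketch below assumes scikit-learn and its bundled breast cancer dataset; the split ratio and the limits `max_depth=3` and `min_samples_leaf=5` are illustrative assumptions. Whether the constrained tree scores higher on the test set depends on the data and the split, so no particular result is claimed here.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fully grown tree: no limits, so it can fit the training set very closely.
unconstrained = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Constrained tree: illustrative limits, not tuned values.
constrained = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=5, random_state=0
).fit(X_train, y_train)

for name, model in [("unconstrained", unconstrained), ("constrained", constrained)]:
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    # A large gap between train and test accuracy is a sign of overfitting.
    print(f"{name:14s} train={train_acc:.3f} test={test_acc:.3f} gap={train_acc - test_acc:.3f}")
```

A train score far above the test score suggests the tree is too flexible; low scores on both suggest the limits are too strict.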
