Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression.

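Given training vectors $x_{i}\in\mathcal{R}^{n}$, $i = 1, \dots , l$, and a label vector $y\in\mathcal{R}^{l}$, a decision tree recursively partitions the feature space such that samples with the same labels are grouped together.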

Let:
> node = $m$ <br>
> data = $\mathcal{Q}$ <br>
> feature = $j$ <br>
> threshold = $t_{m}$ <br>
> each candidate split = $\theta$

$$\theta = (j, t_{m})$$
$$\mathcal{Q}_{left}(\theta) = \{(x,y) \mid x_{j} \le t_{m}\}$$
$$\mathcal{Q}_{right}(\theta) = \mathcal{Q} \setminus \mathcal{Q}_{left}(\theta)$$

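The impurity of a candidate split at node $m$ is computed with an impurity function $H()$, weighting each child by its share of the $N_{m}$ samples in the node ($n_{left}$ and $n_{right}$ are the child sample counts):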

$$G(\mathcal{Q},\theta) = \frac{n_{left}}{N_{m}}H(\mathcal{Q}_{left}(\theta)) + \frac{n_{right}}{N_{m}}H(\mathcal{Q}_{right} (\theta))$$

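Select the split that minimizes the impurity:
$$\theta^{*} = \operatorname{argmin}_{\theta}G(\mathcal{Q}, \theta)$$

Recurse on $\mathcal{Q}_{left}(\theta^{*})$ and $\mathcal{Q}_{right}(\theta^{*})$ until the maximum allowable depth is reached, $N_{m} \lt min_{samples}$, or $N_{m} = 1$.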
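
The split search above can be sketched in a few lines of NumPy. This is a minimal, brute-force illustration and not how scikit-learn implements it internally; the helper names `gini` and `best_split` and the toy data are assumptions made for this example.

```python
import numpy as np

def gini(y):
    """Gini impurity H of a 1-D array of non-negative integer class labels."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y) / len(y)
    return float(np.sum(p * (1.0 - p)))

def best_split(X, y, impurity=gini):
    """Exhaustively search theta = (j, t) minimizing G(Q, theta)."""
    n_samples, n_features = X.shape
    best_theta, best_g = None, np.inf
    for j in range(n_features):
        # Candidate thresholds: every observed value except the largest,
        # so that both children are non-empty.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t            # Q_left: samples with x_j <= t
            n_left = int(left.sum())
            n_right = n_samples - n_left   # Q_right: the remaining samples
            g = (n_left / n_samples) * impurity(y[left]) \
                + (n_right / n_samples) * impurity(y[~left])
            if g < best_g:                 # keep the first minimizer found
                best_theta, best_g = (j, float(t)), g
    return best_theta, best_g

# Hypothetical toy node: 4 samples, 2 features, 2 classes.
X = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 7.0], [4.0, 6.0]])
y = np.array([0, 0, 1, 1])
theta, g = best_split(X, y)
print(theta, g)
```

On this toy data the search picks feature 0 with threshold 2.0, which separates the two classes perfectly, so $G = 0$. A full tree would call `best_split` again on each child until a stopping condition is met.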

### Classification criteria ###
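The target is a classification outcome taking the values $0, 1, \dots , (K-1)$.
> node = $m$ <br>
> training data in node $m$ = $X_{m}$ <br>
> number of classes = $K$ <br>
> outcome = $0, 1, \dots , (K-1)$ <br>
> region = $R_{m}$ <br>
> number of observations in node $m$ = $N_{m}$ <br>
> proportion of class $k$ observations in node $m$ = $p_{mk}$

$$p_{mk} = \frac{1}{N_{m}}\sum_{x_{i}\in R_{m}} I(y_{i} = k)$$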

Gini measure of impurity
$$H(X_{m}) = \sum_{k}p_{mk}(1-p_{mk})$$

Cross entropy
$$H(X_{m}) = - \sum_{k}p_{mk}\log(p_{mk})$$

Misclassification
$$H(X_{m}) = 1 - \max_{k}(p_{mk})$$
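
As a quick numerical check, the three criteria can be evaluated on a single node. This is a sketch in plain NumPy; the function names and the example labels are hypothetical, and the natural logarithm is assumed for the cross entropy.

```python
import numpy as np

def class_proportions(y, n_classes):
    """p_mk: fraction of observations in the node belonging to each class k."""
    return np.bincount(y, minlength=n_classes) / len(y)

def gini(p):
    return float(np.sum(p * (1.0 - p)))

def cross_entropy(p):
    p = p[p > 0]                      # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log(p)))

def misclassification(p):
    return float(1.0 - np.max(p))

# Hypothetical node with three observations of class 0 and one of class 1.
y_node = np.array([0, 0, 0, 1])
p = class_proportions(y_node, n_classes=2)   # [0.75, 0.25]

print(gini(p))               # 0.375
print(cross_entropy(p))      # approximately 0.5623
print(misclassification(p))  # 0.25
```

All three measures are zero for a pure node and largest when the classes are evenly mixed.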


### Regression criteria ###
The target is a continuous value.
> node = $m$ <br>
> training data in node $m$ = $X_{m}$ <br>
> Region = $R_{m}$ <br>
> observations = $N_{m}$ <br>

Mean Squared Error : minimizes the $\mathcal{L}_{2}$ error using mean values at terminal nodes
$$c_{m} = \frac{1}{N_{m}}\sum_{i\in N_{m}}y_{i}$$
$$H(X_{m}) = \frac{1}{N_{m}}\sum_{i\in N_{m}} (y_{i} - c_{m})^{2}$$

Mean Absolute Error : minimizes the $\mathcal{L}_{1}$ error using median values at terminal nodes
$$\tilde{y}_{m} = \operatorname{median}_{i\in N_{m}}(y_{i})$$
$$H(X_{m}) = \frac{1}{N_{m}}\sum_{i\in N_{m}} |y_{i} - \tilde{y}_{m}|$$
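
The two regression criteria can be compared on one node as follows. This is a minimal NumPy sketch with hypothetical function names and made-up target values: the mean is the node prediction for the squared error, and the median is the node prediction for the absolute error.

```python
import numpy as np

def mse_impurity(y):
    """Mean squared deviation from the node mean c_m."""
    c = np.mean(y)
    return float(np.mean((y - c) ** 2))

def mae_impurity(y):
    """Mean absolute deviation from the node median."""
    med = np.median(y)
    return float(np.mean(np.abs(y - med)))

# Hypothetical node targets with one outlier.
y_node = np.array([1.0, 2.0, 3.0, 10.0])

print(mse_impurity(y_node))  # 12.5 (mean is 4.0)
print(mae_impurity(y_node))  # 2.5  (median is 2.5)
```

The outlier dominates the squared error far more than the absolute error, which is the usual reason to prefer the MAE criterion on noisy targets.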

```python 
from sklearn.tree import DecisionTreeClassifier
```
> * classes_
> * feature_importances_
> * max_features_
> * n_classes_
> * n_features_
> * n_outputs_
> * tree_

| Parameters | type | value | default |
|------------|------|-------|---------|
| criterion | string | optional | "gini" |
| splitter | string | optional | "best" |
| max_depth | int or None | optional | None |
| min_samples_split | int, float | optional | 2 |
| min_samples_leaf | int, float | optional | 1 |
| min_weight_fraction_leaf | float | optional | 0 |
| max_features | int, float, string or None | optional | None |
| random_state | int, RandomState instance or None | optional | None |
| max_leaf_nodes | int or None | optional | None |
| min_impurity_decrease | float | optional | 0 |
| min_impurity_split | float | | |
| class_weight | dict, list of dicts | "balanced" or None | None |
| presort | bool | | False |


| Methods | Parameters | return |
|---------|------------|--------|
| apply | X, check_input=True | X_leaves |
| decision_path | X, check_input=True | indicator |
| fit | X, y, sample_weight=None, check_input=True, X_idx_sorted=None | self |
| get_params | deep=True | params |
| predict | X, check_input=True | y |
| predict_log_proba | X | p |
| predict_proba | X, check_input=True | p |
| score | X, y, sample_weight=None | score |
| set_params | \*\*params | self |
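
A short usage sketch tying several of these methods together is given below. The iris dataset, the train/test split, and the hyperparameter values are choices made for this example only; the optional arguments `check_input` and `X_idx_sorted` are left at their defaults.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Example data: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Shallow tree using the Gini criterion; random_state fixes tie-breaking.
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)         # predicted class labels
proba = clf.predict_proba(X_test)    # class proportions of the reached leaf
leaves = clf.apply(X_test)           # index of the leaf each sample lands in
accuracy = clf.score(X_test, y_test) # mean accuracy on the held-out data

print(clf.classes_)
print(clf.feature_importances_)
print(accuracy)
```

Each row of `proba` sums to 1, and its columns follow the order of `clf.classes_`.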


```python 
from sklearn.tree import DecisionTreeRegressor
```
> * feature_importances_
> * max_features_
> * n_features_
> * n_outputs_
> * tree_

| Parameters | type | value | default |
|------------|------|-------|---------|
| criterion | string | optional | "mse" |
| splitter | string | optional | "best" |
| max_depth | int or None | optional | None |
| min_samples_split | int, float | optional | 2 |
| min_samples_leaf | int, float | optional | 1 |
| min_weight_fraction_leaf | float | optional | 0 |
| max_features | int, float, string or None | optional | None |
| random_state | int, RandomState instance or None | optional | None |
| max_leaf_nodes | int or None | optional | None |
| min_impurity_decrease | float | optional | 0 |
| min_impurity_split | float | | |
| presort | bool | | False |


| Methods | Parameters | return |
|---------|------------|--------|
| apply | X, check_input=True | X_leaves |
| decision_path | X, check_input=True | indicator |
| fit | X, y, sample_weight=None, check_input=True, X_idx_sorted=None | self |
| get_params | deep=True | params |
| predict | X, check_input=True | y |
| score | X, y, sample_weight=None | score |
| set_params | \*\*params | self |
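
The regressor is used the same way. The sketch below fits a shallow tree to a synthetic sine curve; the data and `max_depth=3` are assumptions for illustration, and `criterion` is left at its default because the accepted names for it differ between scikit-learn versions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression problem: y = sin(x) on 80 sorted points in [0, 5).
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()

reg = DecisionTreeRegressor(max_depth=3, random_state=0)
reg.fit(X, y)

y_pred = reg.predict(X)        # piecewise-constant predictions
r2 = reg.score(X, y)           # coefficient of determination R^2
path = reg.decision_path(X)    # sparse indicator: samples x nodes visited

print(r2)
print(np.unique(y_pred).size)  # at most 2**3 = 8 distinct leaf values
```

Because every leaf predicts a single constant, a tree of depth 3 can output at most eight distinct values, which is why the fitted curve looks like a staircase.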
