### 1. sklearn.tree.DecisionTreeClassifier

_class_ sklearn.tree.DecisionTreeClassifier(_*_, _criterion='gini'_, _splitter='best'_, _max_depth=None_, _min_samples_split=2_, _min_samples_leaf=1_, _min_weight_fraction_leaf=0.0_, _max_features=None_, _random_state=None_, _max_leaf_nodes=None_, _min_impurity_decrease=0.0_, _class_weight=None_, _ccp_alpha=0.0_) [[source]](https://github.com/scikit-learn/scikit-learn/blob/7db5b6a98/sklearn/tree/_classes.py#L595)

### 2. Parameters
**criterion** {“gini”, “entropy”, “log_loss”}, default=“gini”

The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity, and “log_loss” and “entropy”, both for the Shannon information gain; see [Mathematical formulation](https://scikit-learn.org/stable/modules/tree.html#tree-mathematical-formulation).
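
As a rough illustration of the two impurity measures named above, the sketch below computes Gini impurity and Shannon entropy from a list of class proportions. The helper names `gini` and `entropy` are hypothetical, and the base-2 logarithm is an assumption about units; this is not scikit-learn's internal implementation.

```python
import math


def gini(proportions):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    return 1.0 - sum(p * p for p in proportions)


def entropy(proportions):
    """Shannon entropy in bits: -sum(p_k * log2(p_k)), skipping empty classes."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)


pure = [1.0, 0.0, 0.0]         # every sample belongs to one class
mixed = [1 / 3, 1 / 3, 1 / 3]  # three equally likely classes

print("pure :", gini(pure), entropy(pure))
print("mixed:", gini(mixed), entropy(mixed))
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why either can be used to rank candidate splits.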

**splitter** {“best”, “random”}, default=“best”

The strategy used to choose the split at each node. Supported strategies are “best” to choose the best split and “random” to choose the best random split.

**max_depth** int, default=None

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples.
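
To see the effect of `max_depth` in practice, the following sketch fits one unrestricted tree and one tree capped at depth 3 on the full iris data, then compares their sizes with `get_depth()` and `get_n_leaves()`. The variable names and `random_state=0` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth=None: keep splitting until leaves are pure or too small to split.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# max_depth=3: stop after at most three levels of splits.
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

print("unrestricted depth :", full_tree.get_depth())
print("restricted depth   :", shallow_tree.get_depth())
print("unrestricted leaves:", full_tree.get_n_leaves())
print("restricted leaves  :", shallow_tree.get_n_leaves())
```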

**min_samples_split** int or float, default=2

The minimum number of samples required to split an internal node:

-   If int, then consider `min_samples_split` as the minimum number.

-   If float, then `min_samples_split` is a fraction and `ceil(min_samples_split * n_samples)` is the minimum number of samples for each split.
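
The int and float cases above can be sketched with a small helper. `effective_min_samples_split` is a hypothetical name used only for illustration; it mirrors the `ceil(min_samples_split * n_samples)` rule from the description and is not a scikit-learn function.

```python
import math


def effective_min_samples_split(min_samples_split, n_samples):
    """Resolve min_samples_split to an absolute sample count (hypothetical helper)."""
    if isinstance(min_samples_split, int):
        # An int is already an absolute number of samples.
        return min_samples_split
    # A float is a fraction of the training set, rounded up.
    return math.ceil(min_samples_split * n_samples)


# 120 is the training-set size for iris with test_size=0.2 (150 * 0.8).
print(effective_min_samples_split(2, 120))
print(effective_min_samples_split(0.25, 120))
print(effective_min_samples_split(0.05, 150))
```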

### 3. Attributes
**classes_** ndarray of shape (n_classes,) or list of ndarray

The class labels (single output problem), or a list of arrays of class labels (multi-output problem).

[`feature_importances_`](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier.feature_importances_ "sklearn.tree.DecisionTreeClassifier.feature_importances_") ndarray of shape (n_features,)

Return the feature importances.

In [1]:
# max_depth = 3
# Evaluation => accuracy

In [27]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import pandas as pd

# Create the DecisionTreeClassifier
df_clf = DecisionTreeClassifier(random_state=156)

# Load the iris data and split it into training and test sets
iris = load_iris()
x = iris.data
y = iris.target
iris_df = pd.DataFrame(x, columns=iris.feature_names)


x_train, x_test, y_train, y_test = train_test_split(iris_df, y, test_size=0.2, random_state=10)

# Train the DecisionTreeClassifier
df_clf.fit(x_train, y_train)
pred = df_clf.predict(x_test)

accuracy = accuracy_score(y_test, pred)
accuracy
pred
x_train.columns


Index(['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)',
       'petal width (cm)'],
      dtype='object')

In [26]:
df_clf_1 = DecisionTreeClassifier(max_depth=5)
df_clf_1.fit(x_train, y_train)
pred_1 = df_clf_1.predict(x_test)
accuracy_1 = accuracy_score(y_test, pred_1)
accuracy_1
pred_1

array([1, 2, 0, 1, 0, 1, 2, 1, 0, 1, 1, 2, 1, 0, 0, 2, 1, 0, 0, 0, 2, 2,
       2, 0, 1, 0, 1, 1, 2, 2])
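
The cells above fit trees with different depth settings but display only the last expression of each cell. As a minimal sketch, the loop below prints test accuracy side by side for a few `max_depth` values, reusing the same split parameters (`test_size=0.2`, `random_state=10`). The chosen depths and the fixed `random_state=156` on every tree are assumptions made for comparison.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10
)

# Map each max_depth setting to its accuracy on the held-out test set.
scores = {}
for depth in (3, 5, None):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=156)
    clf.fit(X_train, y_train)
    scores[depth] = accuracy_score(y_test, clf.predict(X_test))

for depth, score in scores.items():
    print(f"max_depth={depth}: test accuracy = {score:.3f}")
```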

In [17]:
df_clf.feature_importances_

array([0.01252348, 0.01669798, 0.90897639, 0.06180215])

In [24]:
df_clf_1.feature_importances_

array([0.        , 0.02632138, 0.92441221, 0.04926642])

In [4]:
df_clf.feature_names_in_

array(['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)',
       'petal width (cm)'], dtype=object)
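
The importance arrays above are easier to read when paired with the feature names. The sketch below fits a tree on the full iris data (an assumption; the cells above fit on the training split only) and prints the features ranked by importance. The importances are normalized, so they sum to 1.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=156).fit(iris.data, iris.target)

# Pair each feature name with its importance and sort in descending order.
ranked = sorted(
    zip(iris.feature_names, clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name:<20} {importance:.3f}")

total = sum(importance for _, importance in ranked)
print("sum of importances:", round(total, 6))
```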

### 4. export_graphviz

sklearn.tree.export_graphviz(_decision_tree_, _out_file=None_, _*_, _max_depth=None_, _feature_names=None_, _class_names=None_, _label='all'_, _filled=False_, _leaves_parallel=False_, _impurity=True_, _node_ids=False_, _proportion=False_, _rotate=False_, _rounded=False_, _special_characters=False_, _precision=3_, _fontname='helvetica'_) [[source]](https://github.com/scikit-learn/scikit-learn/blob/7db5b6a98/sklearn/tree/_export.py#L740)

In [5]:
from sklearn.tree import export_graphviz

# Calling export_graphviz() creates the tree.dot file specified by out_file
export_graphviz(df_clf, out_file="tree.dot", class_names=iris.target_names,
                feature_names=iris.feature_names, impurity=True, filled=True)

In [6]:
import graphviz

In [11]:
with open('tree.dot') as f:
    dot_graph = f.read()
graph = graphviz.Source(dot_graph)
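
Writing `tree.dot` and reading it back is one option. As an alternative sketch, `export_graphviz` returns the DOT source as a string when `out_file=None`, so the intermediate file can be skipped. The classifier here is refit on the full iris data with `max_depth=3`, purely to keep the example self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=156)
clf.fit(iris.data, iris.target)

# With out_file=None the DOT description is returned instead of written to disk.
dot_source = export_graphviz(
    clf,
    out_file=None,
    feature_names=iris.feature_names,
    class_names=list(iris.target_names),
    impurity=True,
    filled=True,
)
print(dot_source[:200])
```

The resulting string can be passed to `graphviz.Source(dot_source)` in the same way as the file contents read in the cell above.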

In [13]:
graph.render(filename='의사결정나무_img', directory='./', format='png')

'의사결정나무_img.png'
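
When Graphviz is not installed, a plain-text view of the same kind of tree may be enough. The sketch below uses `sklearn.tree.export_text`, which needs no external rendering tool. The shallow `max_depth=2` tree fit on the full iris data is an assumption chosen to keep the printed rules short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=156)
clf.fit(iris.data, iris.target)

# export_text renders the fitted tree as indented if/else style rules.
rules = export_text(clf, feature_names=iris.feature_names)
print(rules)
```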