# Decision trees are supervised machine learning algorithms used for both classification and regression tasks. They create a model that predicts the value of a target variable based on several input features.

# Here's a step-by-step explanation of the decision tree algorithm:

(1). Data Preparation: Start by preparing your dataset. Ensure that your data is in a structured format, with the target variable and input features properly defined.
    

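(2). Splitting Criteria: Decide on the splitting criterion to measure the quality of a split. For classification, common measures are Gini impurity and entropy. Gini impurity is the probability of misclassifying a randomly chosen sample from a node if it were labeled at random according to that node's class distribution, while entropy measures the average amount of information (in bits, with a base-2 logarithm) needed to identify the class of a sample in the node. Both are zero for a pure node. For regression, the usual criterion is the variance (mean squared error) of the target values within each node.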
    

(3). Tree Construction: Begin constructing the decision tree recursively. At each step, select the best feature and splitting point based on the splitting criterion. Split the dataset into subsets based on this selected feature and point. Repeat this process for each subset until a stopping condition is met.
    

(4). Stopping Conditions: Determine the stopping conditions for tree construction. This could include reaching a maximum depth, reaching a minimum number of samples in a node, or other predefined criteria.
    

(5). Leaf Node Prediction: Once a stopping condition is met, the node becomes a leaf. For classification, the leaf predicts the majority class of the training samples it contains; for regression, it predicts the mean target value of those samples.
    

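(6). Pruning: Optionally, prune the tree to reduce overfitting and improve generalization. Pre-pruning stops growth early by applying the stopping conditions from step (4); post-pruning grows the full tree first and then removes branches that contribute little, for example with cost-complexity pruning.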
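
To make steps (2) and (3) concrete, here is a minimal sketch in plain Python of the two impurity measures and a threshold search over a single numeric feature. The function names (`gini`, `entropy`, `best_threshold`) and the toy data are made up for illustration; this is not how scikit-learn implements it internally, only the same idea on a small scale.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over class proportions."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def best_threshold(values, labels, impurity=gini):
    """Try midpoints between sorted distinct feature values.

    Returns (threshold, weighted child impurity) for the best split,
    or (None, parent impurity) if no split improves on the parent.
    """
    n = len(labels)
    distinct = sorted(set(values))
    best = (None, impurity(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        # Weight each child's impurity by its share of the samples
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: one feature, two balanced classes
feature = [1.4, 1.3, 1.5, 4.7, 4.5, 5.1]
target = [0, 0, 0, 1, 1, 1]

print(gini(target))                     # 0.5
print(entropy(target))                  # 1.0
print(best_threshold(feature, target))  # (3.0, 0.0)
```

A full tree builder would run this search over every feature, split on the winner, and recurse into each subset until a stopping condition from step (4) applies.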
    

In [8]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score , r2_score

In [9]:
# Load the dataset (example: Iris dataset)
iris = datasets.load_iris()
X = iris.data
y = iris.target


In [10]:
# Hold out 20% of the samples for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [11]:
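# Default settings: Gini criterion and no depth limit, so the tree keeps splitting until its leaves are pure
dt = DecisionTreeClassifier()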

In [12]:
dt.fit(X_train , y_train) 

DecisionTreeClassifier()

In [13]:
y_pred = dt.predict(X_test) 

In [14]:
print("Accuracy score:" , accuracy_score(y_test , y_pred)) 

Accuracy score: 1.0
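
The classifier above uses default settings, so none of the stopping conditions from step (4) or the pruning from step (6) are active. As a rough sketch, these ideas map onto `DecisionTreeClassifier` constructor parameters: `max_depth` and `min_samples_leaf` act as pre-pruning, and `ccp_alpha` enables cost-complexity post-pruning. The specific values below (`max_depth=2`, `min_samples_leaf=5`, `ccp_alpha=0.02`) are illustrative assumptions, not tuned recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Unconstrained tree: grows until every leaf is pure
full = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pre-pruning: stop early via a depth limit and a minimum leaf size
shallow = DecisionTreeClassifier(
    max_depth=2, min_samples_leaf=5, random_state=42
).fit(X_train, y_train)

# Post-pruning: grow fully, then remove branches whose impurity reduction
# does not justify the added complexity (larger ccp_alpha prunes more)
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=42).fit(X_train, y_train)

for name, model in [("full", full), ("shallow", shallow), ("pruned", pruned)]:
    print(name, "depth:", model.get_depth(), "leaves:", model.get_n_leaves(),
          "test accuracy:", model.score(X_test, y_test))
```

In practice these values are usually chosen by cross-validation rather than fixed by hand.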


In [15]:
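# r2_score expects (y_true, y_pred) in that order. R2 is a regression metric,
# so on class labels it is not a meaningful measure; prefer accuracy here.
print("R2 Score:", r2_score(y_test, y_pred))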

R2 Score: 1.0
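
Since R2 is designed for continuous targets, it fits the regression use of decision trees mentioned at the start. The sketch below is one way to try that, assuming scikit-learn's bundled diabetes dataset as a stand-in regression problem; the dataset choice and `max_depth=3` are illustrative, not part of the walkthrough above.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Continuous target, so a regression tree and R2 are appropriate here
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Each leaf predicts the mean target value of its training samples
reg = DecisionTreeRegressor(max_depth=3, random_state=42)
reg.fit(X_train, y_train)

y_pred = reg.predict(X_test)

# Argument order matters: true values first, predictions second
r2 = r2_score(y_test, y_pred)
print("R2 Score:", r2)
```

An R2 of 1.0 would mean perfect predictions, 0.0 matches always predicting the mean of the test targets, and negative values are worse than that baseline.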
