## Decision trees

Decision trees provide a simple, interpretable way to make predictions in supervised machine learning by asking a sequence of yes/no questions about data features. They apply to both classification tasks, such as identifying iris flower species, and regression tasks, such as predicting house prices, and in both cases they learn from labeled training data.

#### Core Concepts

Decision trees mimic human decision-making with a flowchart-like structure: it starts at a root node, branches through decision nodes that test feature conditions, and ends at leaf nodes that hold the final predictions.

The root node represents the full training set, each decision node splits the samples that reach it on a feature threshold, and each leaf node assigns a class or value. Because a prediction is just a short chain of comparisons, trees are intuitive for beginners and need no complex math.
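The root/decision/leaf structure described above can be sketched in plain Python. This is a minimal illustration, not how any particular library stores its trees: the `Node` class, the `predict` helper, and the two thresholds are all assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a binary decision tree (hypothetical structure for illustration)."""
    feature: Optional[int] = None      # index of the feature tested at a decision node
    threshold: Optional[float] = None  # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[str] = None   # set only on leaf nodes


def predict(node: Node, x: Sequence[float]) -> str:
    """Walk from the root to a leaf, answering one yes/no question per node."""
    while node.prediction is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction


# Hand-built tree over two features: x[0] = petal length, x[1] = petal width (cm).
# The thresholds are illustrative assumptions, not learned from data.
root = Node(
    feature=0, threshold=2.5,
    left=Node(prediction="setosa"),
    right=Node(
        feature=1, threshold=1.7,
        left=Node(prediction="versicolor"),
        right=Node(prediction="virginica"),
    ),
)

print(predict(root, [1.4, 0.2]))  # short petal: takes the left branch at the root
```

Every prediction is a single root-to-leaf walk, which is why a tree's reasoning can be read off as a list of comparisons.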

#### Supervised Learning Basics

Unlike unsupervised learning, which finds patterns in unlabeled data, supervised learning uses labeled examples in which inputs (features) are paired with known outputs (targets). Decision trees need this setup to learn the splits that best separate classes or predict values, as in spam detection (classifying an email) or real estate pricing (predicting a continuous value). They handle both classification (discrete categories) and regression (continuous numbers), for example through the CART (Classification and Regression Trees) algorithm, and they can outperform linear models when the relationship between features and target is non-linear.
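To make the classification/regression distinction concrete, here is a small sketch using scikit-learn's `DecisionTreeClassifier` and `DecisionTreeRegressor`. The four-row dataset is made up for illustration; only the type of the targets differs between the two fits.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Toy labeled data (made up for illustration): one feature, four examples.
X = [[0.0], [1.0], [2.0], [3.0]]
y_class = [0, 0, 1, 1]            # discrete labels -> classification
y_value = [1.0, 1.2, 3.9, 4.1]    # continuous targets -> regression

clf = DecisionTreeClassifier(random_state=0).fit(X, y_class)
reg = DecisionTreeRegressor(random_state=0).fit(X, y_value)

print(clf.predict([[0.5], [2.5]]))  # predicted class labels
print(reg.predict([[0.5], [2.5]]))  # predicted numeric values
```

Both estimators share the same `fit`/`predict` interface; the classifier returns one of the labels it saw during training, while the regressor returns a numeric value derived from the training targets in the matching leaf.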

#### Iris Dataset Overview

The Iris dataset contains 150 samples, 50 from each of three species (Setosa, Versicolor, and Virginica). Each sample has four features: sepal length, sepal width, petal length, and petal width, all measured in centimeters.
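The dataset's shape can be checked directly, assuming scikit-learn is installed and the bundled copy from `load_iris()` is used:

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)                  # (samples, features)
print(iris.feature_names)       # the four measurements, in cm
print(list(iris.target_names))  # species names, indexed by the integer labels in y
print(np.bincount(y))           # number of samples per species
```

The targets are stored as integers, and `target_names` maps them back to species names.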

A 2D scatterplot of petal length vs. petal width reveals clusters: Setosa separates easily, while Versicolor and Virginica overlap slightly. Logistic regression draws straight-line boundaries on this plot and k-nearest neighbors relies on local similarity, whereas a decision tree builds stepwise, axis-aligned boundaries by recursively splitting on one feature at a time.
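One way to see the stepwise boundaries without plotting is to fit a shallow tree on the two petal features and print its rules. This is a sketch under assumptions: `max_depth=2` and `random_state=0` are arbitrary choices, and the exact thresholds printed may vary with the scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_petal = iris.data[:, 2:4]  # keep only petal length and petal width

# max_depth=2 is an arbitrary choice to keep the printed rules short.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_petal, iris.target)

# Each printed rule compares a single feature to a threshold, so the regions
# it carves out of the scatterplot are rectangles with axis-parallel edges.
print(export_text(shallow, feature_names=["petal length (cm)", "petal width (cm)"]))
```

Because every rule tests one feature at a time, the resulting boundary is a staircase of horizontal and vertical segments rather than a single slanted line.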


#### Building a Decision Tree

Start by loading the data (for example with scikit-learn's `load_iris()`), split it into 75% training and 25% test sets, and fit a classifier with `clf = DecisionTreeClassifier()` followed by `clf.fit(X_train, y_train)`. At each node, the algorithm greedily picks the feature and threshold that most reduce an impurity criterion such as Gini impurity or entropy. It repeats this until the leaves are pure or a stopping rule such as a depth limit is reached; limiting depth helps curb overfitting. Predict on the test data with `clf.predict(X_test)`. Accuracy on Iris is often 95% or higher, though it varies with the random split. Scikit-learn's plotting utilities, such as `plot_tree`, can then visualize the fitted tree.
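The steps above can be sketched end to end as follows. The seed `random_state=42` and the `gini` helper are assumptions added for illustration; the helper only serves to show that the impurity scikit-learn stores for the root node matches a hand computation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def gini(labels) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))


X, y = load_iris(return_X_y=True)

# 75% / 25% split; random_state=42 is an arbitrary seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# criterion="gini" is scikit-learn's default; "entropy" is the alternative named above.
clf = DecisionTreeClassifier(criterion="gini", random_state=42)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {accuracy:.3f}")

# The impurity stored for the root node should match a hand computation
# over the whole training set, since the root sees every training sample.
print(f"root Gini (hand-computed): {gini(y_train):.4f}")
print(f"root Gini (from the tree): {clf.tree_.impurity[0]:.4f}")
```

A Gini impurity of 0 means a node holds a single class, so the tree keeps splitting until its leaves reach that value or a stopping rule applies. The reported accuracy depends on the split, so a different seed may give a slightly different number.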

#### Advantages and Comparisons

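Decision trees offer transparency, since each prediction can be traced along a visible path. In principle they handle mixed numeric and categorical features, although scikit-learn's implementation expects numeric input, so categorical features need encoding first. They also capture non-linear patterns that logistic regression's straight-line boundaries miss on Iris data.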

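They are prone to overfitting noisy data, which pruning or a `max_depth` limit can curb, and ensembles such as random forests reduce the problem further. On Iris, decision trees, logistic regression, and k-NN typically reach similar accuracy, so any edge one model has over the others is small and depends on the split.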
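A rough way to check such comparisons is cross-validation over the same data. The sketch below is one possible setup, not a benchmark: the hyperparameters (`max_depth=3`, `n_estimators=100`, `n_neighbors=5`, `max_iter=1000`) and the choice of 5 folds are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters below are illustrative assumptions, not tuned values.
models = {
    "tree (unlimited depth)": DecisionTreeClassifier(random_state=0),
    "tree (max_depth=3)": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
}

# 5-fold cross-validation averages over several splits, which is a fairer
# basis for comparison than a single train/test split on 150 samples.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name:24s} mean accuracy: {score:.3f}")
```

With only 150 samples, differences of a percentage point or two between models are within the noise of the folds, so the ranking should be read with caution.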
