Question 1: What is a Decision Tree, and how does it work in the context of classification?



*   A decision tree is a supervised machine learning algorithm that uses a tree-like structure to model decisions and their possible consequences [1]. It breaks a dataset down into progressively smaller subsets while incrementally building the corresponding tree.
*   The final result is a tree with decision nodes (each representing a test on an attribute), branches (each representing an outcome of the test), and leaf nodes (each representing a final classification or decision) [1].


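For classification, the tree is built and used in four steps:

*   Attribute Selection (Root Node): the algorithm scores every attribute with an impurity measure such as Gini impurity or entropy and places the attribute that best separates the classes at the root.
*   Recursive Partitioning (Decision Nodes): the data is split on the chosen attribute, and the same selection step is repeated on each resulting subset.
*   Stopping Criteria (Leaf Nodes): a branch stops growing when its subset is pure, no useful split remains, or a limit such as a maximum depth is reached. The node then becomes a leaf labeled with the majority class.
*   Prediction: a new sample is routed from the root along the branches that match its attribute values, and the class of the leaf it reaches is the prediction.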
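
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: it assumes numeric features, binary threshold splits, Gini impurity as the selection measure, and a hypothetical six-row toy dataset. The helper names (`best_split`, `build`, `predict`) are made up for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Attribute selection: return the (feature, threshold) pair that lowers
    the weighted Gini impurity the most, or None if no split improves it."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build(rows, labels, depth=0, max_depth=3):
    """Recursive partitioning with two stopping criteria:
    no improving split (e.g. the node is pure) or max_depth reached."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        # Leaf node: predict the majority class of the samples that reached it.
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(node, row):
    """Prediction: walk from the root to a leaf following the tests."""
    while "leaf" not in node:
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]


# Hypothetical toy data: two numeric features, three classes.
rows = [[2.0, 1.0], [3.0, 1.5], [1.0, 4.0], [1.5, 5.0], [6.0, 4.5], [7.0, 5.0]]
labels = ["a", "a", "b", "b", "c", "c"]

tree = build(rows, labels)
train_predictions = [predict(tree, r) for r in rows]
print(train_predictions)
```

On this toy data the tree separates all three classes with two splits, so every training row is routed to a pure leaf.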










Question 2: Explain the concepts of Gini Impurity and Entropy as impurity measures. How do they impact the splits in a Decision Tree?
*  Gini Impurity and Entropy are both mathematical measures used as impurity metrics to quantify the "disorder" or "uncertainty" within a set of data points in a decision tree node. The goal of the decision tree algorithm is to select the splits that maximize the reduction in impurity, or increase the information gain, leading to purer leaf nodes.

*   Impact on Splits in a Decision Tree
Both metrics guide the decision tree algorithm in selecting the optimal feature and split point at each node:
*  Objective: The algorithm evaluates all possible splits and chooses the one that results in the greatest reduction in impurity (or greatest information gain) from the parent node to the resulting child nodes.

*  Gini's Impact: Gini impurity tends to favor splits that isolate the most frequent class in its own branch, and it is the criterion used by the CART (Classification and Regression Trees) algorithm. It is the default in scikit-learn and is slightly cheaper to compute than entropy because it needs no logarithm.
*  Entropy's Impact: Entropy tends to produce more balanced tree partitions and is used in algorithms like ID3 and C4.5. It is slightly slower to compute, and although the two criteria often choose similar splits, the resulting trees can differ. Some studies suggest entropy may perform better with imbalanced datasets.
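
Both measures can be computed directly from the class proportions $p_i$ in a node: Gini impurity is $1 - \sum_i p_i^2$ and entropy is $-\sum_i p_i \log_2 p_i$. The sketch below, using hypothetical label lists and a made-up helper `weighted_impurity`, shows how each measure scores a node and how the weighted impurity of the children is used to compare two candidate splits.

```python
import math
from collections import Counter


def class_probs(labels):
    """Proportion of each class among the labels in a node."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def gini(labels):
    """Gini impurity: 1 - sum(p_i^2). 0 for a pure node."""
    return 1.0 - sum(p ** 2 for p in class_probs(labels))


def entropy(labels):
    """Entropy in bits: sum(p_i * log2(1 / p_i)). 0 for a pure node."""
    return sum(p * math.log2(1.0 / p) for p in class_probs(labels))


def weighted_impurity(children, measure):
    """Impurity of a split: child impurities weighted by child size."""
    n = sum(len(child) for child in children)
    return sum(len(child) / n * measure(child) for child in children)


# Hypothetical parent node with a 50/50 class mix and two candidate splits.
parent = ["pos"] * 5 + ["neg"] * 5
split_a = [["pos"] * 4 + ["neg"], ["pos"] + ["neg"] * 4]          # fairly clean
split_b = [["pos"] * 3 + ["neg"] * 2, ["pos"] * 2 + ["neg"] * 3]  # barely better than the parent

for name, measure in (("gini", gini), ("entropy", entropy)):
    print(
        name,
        round(measure(parent), 3),
        round(weighted_impurity(split_a, measure), 3),
        round(weighted_impurity(split_b, measure), 3),
    )
```

For this example both measures rank `split_a` above `split_b`, which matches the common observation that the two criteria usually agree on which split is better even though their numeric scales differ.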







Question 3: What is the difference between Pre-Pruning and Post-Pruning in Decision Trees? Give one practical advantage of using each.

*   Pre-pruning and post-pruning are two methods used to prevent a decision tree from overfitting the training data by reducing its size or complexity.

*   Timing: Pre-pruning is done during the tree-building process. Post-pruning is done after the tree has been fully generated.

*   Approach: Pre-pruning stops the growth of branches early based on predefined criteria. Post-pruning grows the full tree first, then systematically removes or merges branches.
*   Decision Source: Pre-pruning uses heuristics or a validation dataset to decide whether to split. Post-pruning uses a validation dataset to evaluate and decide which branches to remove.

*   Advantage of pre-pruning (efficiency): Because growth stops early, the full tree is never built, so training is faster and uses less memory. This matters most on large datasets.
*   Advantage of post-pruning (optimal subtree identification): Because the entire tree is generated first, post-pruning explores a larger set of possible tree structures and can often find a better-generalizing tree than pre-pruning. The practical advantage is the potential for higher predictive accuracy on unseen data, since the algorithm can evaluate the full context of a branch before deciding to remove it.
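
The two strategies can be compared in scikit-learn roughly as follows. This is a sketch under several assumptions: it uses the Iris dataset, treats `max_depth` and `min_samples_leaf` as the pre-pruning criteria, and uses cost-complexity pruning (`ccp_alpha`), which is scikit-learn's form of post-pruning, with a held-out validation split to choose the pruning strength. The specific parameter values are illustrative, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Pre-pruning: growth is limited while the tree is being built.
pre = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5, random_state=42)
pre.fit(X_train, y_train)

# Post-pruning: grow the full tree first, then prune it back.
full = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
path = full.cost_complexity_pruning_path(X_train, y_train)

# Pick the pruning strength with the best validation accuracy.
# Alphas are in increasing order, so ties go to the smaller tree.
best_alpha, best_score = 0.0, full.score(X_val, y_val)
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative round-off
    candidate = DecisionTreeClassifier(ccp_alpha=alpha, random_state=42)
    score = candidate.fit(X_train, y_train).score(X_val, y_val)
    if score >= best_score:
        best_alpha, best_score = alpha, score

post = DecisionTreeClassifier(ccp_alpha=best_alpha, random_state=42)
post.fit(X_train, y_train)

for name, model in (("full", full), ("pre-pruned", pre), ("post-pruned", post)):
    print(name, model.tree_.node_count, round(model.score(X_val, y_val), 3))
```

The pre-pruned model is cheap because it never builds the deep tree, while the post-pruned model pays for the full tree plus one refit per candidate alpha in exchange for choosing among subtrees of the fully grown tree.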








Question 4: What is Information Gain in Decision Trees, and why is it important for choosing the best split?

*   Information Gain (IG) is a measure used in decision trees to quantify the expected reduction in entropy or impurity achieved by partitioning a dataset based on a specific attribute. It essentially measures how much "information" a feature provides about the target class.
*   Importance for Choosing the Best Split
Information Gain is the core criterion used by ID3 to construct a decision tree, and C4.5 uses a normalized variant of it called gain ratio. Its importance lies in the fact that it helps select the most effective attribute for splitting a node:

*   Maximizing Purity: The primary goal of a decision tree is to create "pure" leaf nodes, meaning nodes where all data points belong to the same class (entropy of zero). Information Gain directly measures how close a potential split gets to this goal.

*   Optimal Feature Selection: The algorithm evaluates the Information Gain for every available feature at a given node. The feature that yields the highest information gain is chosen as the optimal feature to split on at that point.
*   Reducing Uncertainty: A higher information gain means a greater reduction in uncertainty (entropy) about the final classification. By prioritizing features that maximize this gain, the decision tree algorithm ensures that the most relevant and discriminative attributes are placed higher up in the tree, leading to more efficient and accurate classification.

*   Prioritizing Meaningful Features: Features that create clearer, more homogeneous child nodes are prioritized, which helps in identifying the most impactful predictors in the dataset.
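
Written out, the information gain of splitting a set $S$ on attribute $A$ is $IG(S, A) = H(S) - \sum_{v} \frac{|S_v|}{|S|} H(S_v)$, where $H$ is entropy and $S_v$ is the subset of $S$ with value $v$ for $A$. The sketch below computes it for two categorical attributes on a small illustrative table in the style of the textbook play-tennis example. The table and the function names are assumptions made for this example only.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def information_gain(values, labels):
    """Parent entropy minus the size-weighted entropy of the child nodes
    obtained by grouping the samples on one attribute's values."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    n = len(labels)
    children = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - children


# Illustrative toy table: (outlook, wind, play).
records = [
    ("sunny", "weak", "no"), ("sunny", "strong", "no"),
    ("overcast", "weak", "yes"), ("rain", "weak", "yes"),
    ("rain", "weak", "yes"), ("rain", "strong", "no"),
    ("overcast", "strong", "yes"), ("sunny", "weak", "no"),
    ("sunny", "weak", "yes"), ("rain", "weak", "yes"),
    ("sunny", "strong", "yes"), ("overcast", "strong", "yes"),
    ("overcast", "weak", "yes"), ("rain", "strong", "no"),
]
outlook = [r[0] for r in records]
wind = [r[1] for r in records]
play = [r[2] for r in records]

ig_outlook = information_gain(outlook, play)
ig_wind = information_gain(wind, play)
print(round(ig_outlook, 3), round(ig_wind, 3))
```

Here `outlook` has the higher gain, so an algorithm that selects splits by information gain would test it before `wind`.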






Question 5: What are some common real-world applications of Decision Trees, and what are their main advantages and limitations?

*  Common Real-World Applications

Medical Diagnosis: Decision trees help doctors in diagnosing diseases by analyzing patient symptoms and medical history. A tree can guide a diagnostic process to determine the most likely condition [1].

Credit Risk Assessment: Financial institutions use decision trees to evaluate loan applications. By analyzing applicant data (income, credit history, debt levels), the model predicts the likelihood of default, helping determine whether to approve a loan [1].

Fraud Detection: In banking and e-commerce, decision trees are used to flag potentially fraudulent transactions by identifying patterns that deviate from normal customer behavior [1].

Marketing and Customer Relationship Management (CRM): Companies use them to segment customers and predict customer churn. This allows for targeted marketing campaigns to retain valuable customers [1].

Manufacturing Quality Control: Decision trees can analyze production data to identify factors leading to product defects, helping optimize manufacturing processes.


*   Advantages

Easy to Understand and Interpret: The tree structure is intuitive and can be easily visualized. Even non-technical stakeholders can understand the logic behind the decisions made by the model [1, 2].

Minimal Data Preparation: They require less data cleaning and preprocessing compared to many other algorithms. They can handle both numerical and categorical data without extensive normalization [1, 2].

Handle Non-Linear Relationships: Decision trees can effectively model non-linear relationships between features and the target variable, which many linear models cannot do easily.

White Box Model: The reasoning behind a prediction is entirely transparent, unlike "black box" models such as neural networks [2].
*   Limitations

Overfitting: Decision trees often overfit the training data, capturing noise and outliers, which leads to poor performance on new, unseen data if not properly pruned [1, 2].

Sensitive to Data Variations: A small change in the input data can result in a completely different tree structure, making the model unstable [2].

Bias towards Dominant Classes: The standard decision tree algorithms tend to be biased towards classes with a higher frequency or larger number of samples in the dataset [1].

Greedy, Locally Optimal Splits: Decision trees are built greedily, selecting the best split at each stage, which does not guarantee finding the globally optimal tree structure [2].




In [None]:
'''Question 6: Write a Python program to:
● Load the Iris Dataset
● Train a Decision Tree Classifier using the Gini criterion
● Print the model’s accuracy and feature importances'''
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
