1. What is a Decision Tree, and how does it work?

- A Decision Tree is a supervised learning algorithm used for both classification and regression tasks.
- It works by recursively partitioning the data into subsets based on the values of different input features.
- It creates a tree-like structure where:

1. Nodes: Represent tests on an attribute (feature).

2. Branches: Represent the outcome of the test (i.e., the possible values of the attribute).

3. Leaves: Represent the final decision or prediction (class label for classification, or value for regression).

How it works:

1. Start at the root node: The algorithm starts with the entire dataset at the root.

2. Select the best attribute: It chooses the "best" attribute to split the data based on a criterion (e.g., maximizing information gain or minimizing impurity).

3. Split the data: The data is split into subsets based on the values of the chosen attribute. Each subset becomes a branch of the tree.

4. Repeat: Steps 2 and 3 are repeated recursively for each subset (branch) until a stopping condition is met. Stopping conditions can include:

   - All data points in a subset belong to the same class (pure node).

   - The tree reaches a maximum depth.

   - The number of data points in a subset is below a threshold.

   - Further splitting does not significantly improve the purity of the subsets.

5. Assign a leaf node: When a stopping condition is met, a leaf node is created. For classification, the leaf node is assigned the most common class label in the subset. For regression, the leaf node is assigned the average value of the target variable in the subset.
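The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular library implements it: the helper names (`gini`, `best_split`, `build_tree`, `predict`) and the toy data are made up for this example, the split criterion is assumed to be weighted Gini impurity, and thresholds are taken from observed feature values rather than midpoints.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) pair with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send data down both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Split recursively until the node is pure, too deep, or cannot be improved."""
    split = None
    if len(set(labels)) > 1 and depth < max_depth:
        split = best_split(rows, labels)
    if split is None:
        # Leaf node: predict the most common class label in this subset
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the tests from the root down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

# Toy data (hypothetical): feature 0 separates the two classes
rows = [[2.0, 1.0], [3.0, 1.5], [6.0, 1.0], [7.0, 2.0]]
labels = ["a", "a", "b", "b"]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, [2.5, 1.2]), predict(tree, [6.5, 1.2]))
```

On this toy data a single split on feature 0 already produces two pure subsets, so the recursion stops after one level.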

3. What are impurity measures in Decision Trees?

- Impurity measures are used to evaluate the homogeneity of the target variable within a subset of the data.
- A node is considered "pure" if all data points in it belong to the same class (in classification) or have very similar target values (in regression). Impurity measures quantify how "mixed up" the classes are within a node.

Common impurity measures include:

- Gini Impurity: Measures the probability of misclassifying a randomly chosen element from the subset if it were randomly labeled according to the class distribution in the subset.

- Entropy: Measures the disorder or randomness of the class distribution in the subset.

- Variance: Used in regression trees; measures how far the target values in the subset spread around their mean.

4. What is the mathematical formula for Gini Impurity?

- For a node with n classes, let p_i be the proportion of data points belonging to class i in that node. Then, the Gini Impurity is calculated as:

- Gini Impurity = 1 - Σ (p_i)^2  (summation from i=1 to n)

Where:

- p_i is the probability of class i in the node (i.e., the proportion of instances belonging to class i).

- Σ represents the sum over all classes.
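As a quick check of the formula, here is a small sketch that evaluates it for a few class distributions. The function name `gini_impurity` is chosen for this example only, and the input is assumed to be a list of proportions that sum to 1.

```python
def gini_impurity(proportions):
    """Gini impurity 1 - sum(p_i^2) for class proportions that sum to 1."""
    return 1.0 - sum(p * p for p in proportions)

print(gini_impurity([1.0, 0.0]))                # pure node -> 0.0
print(gini_impurity([0.5, 0.5]))                # two classes, evenly mixed -> 0.5
print(gini_impurity([0.25, 0.25, 0.25, 0.25]))  # four classes, evenly mixed -> 0.75
```

A pure node has impurity 0, and an evenly mixed node with n classes reaches the maximum of 1 - 1/n.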

5. What is the mathematical formula for Entropy?

For a node with n classes, let p_i be the proportion of data points belonging to class i in that node. Then, the Entropy is calculated as:

Entropy = - Σ p_i * log2(p_i)  (summation from i=1 to n)

Where:

p_i is the probability of class i in the node.

log2 is the base-2 logarithm. (You can also use natural logarithm (ln or log_e) but the units of information are then in "nats" instead of "bits".)
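The same kind of check works for entropy. The sketch below assumes the input is a list of class proportions; the function name `entropy` is chosen for this example, and classes with p_i = 0 are skipped, following the usual convention that 0 * log2(0) counts as 0.

```python
import math

def entropy(proportions):
    """Shannon entropy in bits; classes with p = 0 contribute nothing."""
    return sum(-p * math.log2(p) for p in proportions if p > 0)

print(entropy([1.0, 0.0]))                # pure node
print(entropy([0.5, 0.5]))                # two classes, evenly mixed
print(entropy([0.25, 0.25, 0.25, 0.25]))  # four classes, evenly mixed
```

Entropy is 0 for a pure node and reaches its maximum, log2(n) bits, when all n classes are equally likely (1 bit for two classes, 2 bits for four).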

What is Information Gain, and how is it used in Decision Trees?

Information Gain (IG) measures the reduction in entropy or impurity achieved by splitting the data on a particular attribute. It's used to determine the best attribute to split on at each node of the tree. The attribute with the highest information gain is chosen.

Information Gain = Entropy(parent) - Σ [ (|children_v| / |parent|) * Entropy(children_v) ]

Where:

- Entropy(parent) is the entropy of the parent node before the split.

- children_v represents the child node for outcome v of the split, i.e., the subset of the parent's data points that take value v (or fall on side v of the threshold) for the chosen attribute.

- |children_v| is the number of data points in child node v.

- |parent| is the number of data points in the parent node.

The summation is over all child nodes resulting from the split.
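To make the formula concrete, the sketch below computes the information gain of two candidate splits of the same parent node. The helper names and the "yes"/"no" labels are made up for this illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy(parent) minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes"] * 5 + ["no"] * 5

# A split that separates the classes perfectly
perfect = [["yes"] * 5, ["no"] * 5]
# A split whose children are as mixed as the parent
useless = [["yes", "yes", "no", "no"], ["yes", "yes", "yes", "no", "no", "no"]]

print(information_gain(parent, perfect))
print(information_gain(parent, useless))
```

The perfect split recovers the full 1 bit of parent entropy, while the second split gains nothing because each child has the same 50/50 class mix as the parent.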

6. What is the difference between Gini Impurity and Entropy?

Calculation: Gini Impurity involves squaring probabilities and subtracting from 1, while Entropy involves logarithms and probabilities.

Computational Cost: Gini Impurity is generally faster to compute than Entropy because it avoids the computationally expensive logarithm operation.

Sensitivity: Entropy is slightly more sensitive to changes in class probabilities than Gini Impurity. In practice, the difference in performance between the two is often minimal. Both tend to produce similar trees.

Bias: Some argue that Gini is biased towards multi-valued attributes.

In most cases, the choice between Gini Impurity and Entropy doesn't significantly impact the performance of the decision tree. Gini is often the default due to its lower computational cost.

What is the mathematical explanation behind Decision Trees?

- At its core, a decision tree is a piece-wise constant approximation of the true underlying function. It attempts to create regions in the feature space where the target variable is relatively constant.

Here's a more detailed breakdown:

1. Feature Space Partitioning:
- A decision tree recursively divides the feature space into rectangular regions.
- Each split is based on a single feature and a threshold value. This creates a hierarchy of decisions that lead to increasingly homogeneous regions.

2. Optimization Problem:
- The goal is to find the "best" splits at each node. This is an optimization problem where we try to minimize the impurity of the resulting child nodes.
- The impurity measure (Gini, Entropy, Variance) acts as the objective function.

3. Greedy Approach:
- Finding the absolute best tree structure is computationally infeasible for large datasets.
- Therefore, decision tree algorithms typically use a greedy approach. At each node, they choose the split that provides the largest immediate reduction in impurity, without considering the long-term impact on the overall tree structure.

Mathematical Representation (Classification): For a classification tree, the prediction for a new instance x can be represented as:

f(x) =  c_m   if  x ∈ R_m

Where:

f(x) is the predicted class label for instance x.

R_m is the rectangular region in the feature space corresponding to leaf node m.

c_m is the most frequent class label in the training data that falls into region R_m.

Mathematical Representation (Regression): For a regression tree, the prediction for a new instance x can be represented as:

f(x) =  μ_m   if  x ∈ R_m

Where:

f(x) is the predicted value for instance x.

R_m is the rectangular region in the feature space corresponding to leaf node m.

μ_m is the average value of the target variable in the training data that falls into region R_m.
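The piecewise-constant form can be checked empirically with scikit-learn. The sketch below is an illustration on made-up synthetic data: it fits a shallow `DecisionTreeRegressor` (with the default squared-error criterion), uses `apply` to find which leaf each training point falls into, and confirms that the prediction inside each leaf equals the mean target of that leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D data (hypothetical): a noisy sine curve
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

leaf_ids = reg.apply(X)   # index of the leaf (region R_m) each sample falls into
preds = reg.predict(X)

for leaf in np.unique(leaf_ids):
    in_leaf = leaf_ids == leaf
    # Every prediction in the region equals the mean training target of that region
    assert np.allclose(preds[in_leaf], y[in_leaf].mean())

print("leaves:", reg.get_n_leaves())
print("distinct predicted values:", len(np.unique(preds)))
```

With `max_depth=3` the tree has at most 8 leaves, so the fitted function takes at most 8 distinct values no matter how many points it is evaluated on.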

What is Pre-Pruning in Decision Trees?

- Pre-pruning (also called early stopping) involves setting constraints on the tree growth during the training process.
- These constraints prevent the tree from becoming too complex and overfitting the training data. Common pre-pruning techniques include:

- Maximum Depth: Limiting the maximum depth of the tree.

- Minimum Samples per Leaf: Requiring a minimum number of data points in each leaf node.

- Minimum Samples per Split: Requiring a minimum number of data points in a node before it can be split.

- Maximum Number of Leaves: Limiting the total number of leaf nodes in the tree.

- Significance Test: Splitting a node only if the improvement in impurity is statistically significant.
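In scikit-learn, most of these constraints map onto constructor parameters of `DecisionTreeClassifier`. The sketch below shows one possible combination; the specific values are arbitrary choices for illustration, not recommendations. scikit-learn does not provide a statistical significance test for splits; `min_impurity_decrease` is a fixed threshold on the impurity reduction and is used here only as the closest available stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# No constraints: the tree grows until every leaf is pure
unconstrained = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruned tree (values chosen arbitrarily for this example)
constrained = DecisionTreeClassifier(
    max_depth=3,                 # maximum depth
    min_samples_leaf=5,          # minimum samples per leaf
    min_samples_split=10,        # minimum samples needed to split a node
    max_leaf_nodes=6,            # maximum number of leaves
    min_impurity_decrease=0.01,  # required impurity reduction per split
    random_state=0,
).fit(X, y)

print("unconstrained:", unconstrained.get_depth(), "levels,", unconstrained.get_n_leaves(), "leaves")
print("constrained:  ", constrained.get_depth(), "levels,", constrained.get_n_leaves(), "leaves")
```

In practice these values are usually tuned with cross-validation rather than fixed by hand.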

What is Post-Pruning in Decision Trees?

- Post-pruning (also called backward pruning) involves growing the tree to its full extent and then pruning (removing) branches that do not significantly improve performance on a validation set or based on a complexity cost function.

Common post-pruning techniques include:

- Reduced Error Pruning: Iteratively remove subtrees and replace them with leaf nodes. The subtree is removed only if the resulting tree performs better on a validation set.

- Cost Complexity Pruning (Weakest Link Pruning): Introduces a cost complexity parameter (alpha) that penalizes trees with more nodes. The algorithm finds the subtree that minimizes a cost complexity function:

Cost = Error + α * Number_of_Leaves

The tree is pruned by collapsing the internal node that results in the smallest increase in Cost. This process is repeated for various values of alpha. The final tree is selected based on its performance on a validation set.
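A small worked example shows how alpha trades error against tree size. The three candidate subtrees and their (error, number of leaves) values below are invented for illustration; only the cost formula comes from the text above.

```python
def cost(error, n_leaves, alpha):
    """Cost = Error + alpha * Number_of_Leaves."""
    return error + alpha * n_leaves

# Hypothetical candidate subtrees: name -> (error, number of leaves)
candidates = {"full": (0.00, 10), "medium": (0.05, 4), "stump": (0.20, 2)}

def best_subtree(alpha):
    """Name of the candidate with the lowest cost for a given alpha."""
    return min(candidates, key=lambda name: cost(*candidates[name], alpha))

for alpha in (0.0, 0.02, 0.1):
    print(alpha, best_subtree(alpha))
```

With alpha = 0 the full tree wins because leaves cost nothing. At alpha = 0.02 the medium tree is cheapest (0.05 + 0.02 * 4 = 0.13), and at alpha = 0.1 the penalty dominates and the two-leaf stump wins (0.20 + 0.1 * 2 = 0.40).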

What is the difference between Pre-Pruning and Post-Pruning?

| Feature | Pre-Pruning | Post-Pruning |
|---|---|---|
| Timing | Applied during the tree construction process. | Applied after the tree has been fully grown. |
| Method | Stops the tree from growing too deep/complex. | Removes branches from an already grown tree. |
| Data Usage | Only uses the training data. | Often uses a validation set in addition to training data. |
| Overfitting | Prevents overfitting by limiting complexity upfront. | Reduces overfitting by removing branches that do not generalize. |
| Complexity | Simpler to implement. | Can be computationally more expensive. |
| Potential | May underfit if constraints are too strict. | Can potentially achieve better generalization. |

11. What is a Decision Tree Regressor?

- A Decision Tree Regressor is a decision tree used for regression tasks. Instead of predicting a class label, it predicts a continuous numerical value.
- The splitting criteria are based on minimizing variance or mean squared error (MSE) within the leaf nodes.
- The prediction for a new instance is typically the average value of the target variable for the training instances that fall into the same leaf node.

12. What are the advantages and disadvantages of Decision Trees?

Advantages:

- Easy to understand and interpret: The tree structure is intuitive and allows for easy visualization of the decision-making process.

- Handles both numerical and categorical data: The algorithm can split on either type, so in principle little preprocessing (such as feature scaling or one-hot encoding) is needed. Some implementations, including scikit-learn, still require categorical features to be encoded as numbers.

- Non-parametric: Makes no assumptions about the distribution of the data.

- Feature importance: Can identify the most important features in the dataset.

- Relatively fast to train and predict: Compared to some other complex algorithms.

Disadvantages:

- Overfitting: Prone to overfitting the training data, especially with deep trees.

- High variance: Small changes in the training data can lead to significantly different tree structures.

- Bias towards features with many levels: When using Information Gain with categorical variables, features with more levels can appear to be more informative than they really are.

- Instability: Can be sensitive to small variations in the data.

- Struggles with smooth or diagonal relationships: Because each split tests a single feature and predictions are piecewise constant, a single tree needs many splits to approximate smooth functions or decision boundaries that depend on combinations of features.

- Greedy approach: The greedy algorithm may not find the globally optimal tree.

How does a Decision Tree handle missing values?

Decision trees can handle missing values in a few ways:

- Ignore the missing values: In some implementations, instances with missing values are simply excluded from the splitting process. This is the simplest approach but can lead to information loss.

- Imputation: Fill in the missing values with a plausible value, such as the mean, median, or mode of the feature.

- Surrogate splits: When a node is split, the algorithm can also identify "surrogate splits" – splits on other features that produce similar partitions of the data. If a data point has a missing value for the primary split feature, the surrogate split can be used instead. This approach is more sophisticated than ignoring or imputing values.

- Treat missing values as a separate category: For categorical features, missing values can be treated as a separate category and assigned to one of the branches.

- Fractional Observations: In some implementations, data points with missing values are passed down multiple branches with fractional weights based on the distribution of the non-missing values for that feature.
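Of these strategies, imputation is the easiest to show with scikit-learn. The sketch below chains a `SimpleImputer` and a `DecisionTreeClassifier` in a pipeline so the same median fill is applied at training and prediction time. The tiny dataset is made up for this example; surrogate splits and fractional observations are not shown here.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy data (hypothetical): the second feature has missing entries
X = np.array([
    [1.0, 20.0],
    [2.0, np.nan],
    [3.0, 22.0],
    [8.0, 40.0],
    [9.0, np.nan],
    [10.0, 44.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# The imputer learns each column's median on the training data,
# then the tree is trained on the filled-in values
model = make_pipeline(
    SimpleImputer(strategy="median"),
    DecisionTreeClassifier(random_state=0),
)
model.fit(X, y)

# Rows with missing values can be passed straight to predict
X_new = np.array([[2.5, np.nan], [9.5, np.nan]])
print(model.predict(X_new))  # [0 1]
```

Keeping the imputer inside the pipeline means the medians are computed from the training data only, which avoids leaking information from the test set.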

13. How does a Decision Tree handle categorical features?

- Many Decision Tree algorithms can handle categorical features directly, without requiring one-hot encoding or other transformations. The way it works depends on the type of categorical feature:

- Binary Categorical Features: For binary features (e.g., "yes" or "no"), the split is straightforward: one branch for "yes" and one branch for "no".

- Multi-valued Categorical Features: For features with more than two categories (e.g., "red", "green", "blue"):

  - Multi-way Split: The algorithm can create a separate branch for each category. This can lead to a large number of branches and potential overfitting.

  - Binary Split: The algorithm can search for the best binary split, grouping the categories into two subsets. This is often done using an exhaustive search or a heuristic algorithm. For instance, if the categorical variable is "color" with options "red", "blue", and "green", the tree might try splitting the data into "red" vs. ("blue", "green") or "blue" vs. ("red", "green"), and so on, picking the split that optimizes the impurity measure.
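The exhaustive search over binary splits can be sketched by enumerating every way of dividing the categories into two non-empty groups. The function name `binary_partitions` is made up for this illustration; a real implementation would go on to score each candidate with the impurity measure.

```python
from itertools import combinations

def binary_partitions(categories):
    """Yield each split of the categories into two non-empty groups, once."""
    cats = sorted(categories)
    first, rest = cats[0], cats[1:]
    # Keep the first category on the left so mirrored splits are not repeated
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            left = {first, *extra}
            right = set(cats) - left
            if right:  # skip the case where everything lands on the left
                yield left, right

splits = list(binary_partitions(["red", "green", "blue"]))
for left, right in splits:
    print(sorted(left), "vs", sorted(right))
print(len(splits), "candidate splits")
```

For k categories there are 2^(k-1) - 1 such splits (3 for three colors, 7 for four categories), which is why exhaustive search becomes expensive for high-cardinality features and heuristics are used instead.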

What are some real-world applications of Decision Trees?

1. Credit Risk Assessment:

Application: Banks and financial institutions use decision trees to evaluate the creditworthiness of loan applicants.

How it works: The tree uses factors like credit history, income, employment status, and debt-to-income ratio to classify applicants as low-risk or high-risk, helping to decide whether to approve a loan or not.

2. Medical Diagnosis:

Application: Decision trees can assist doctors in diagnosing diseases based on patient symptoms, medical history, and test results.

How it works: The tree uses a series of questions or tests to narrow down the possible diagnoses. For example, a decision tree might help determine if a patient has a particular type of cancer based on the presence or absence of certain symptoms and biomarkers.

3. Customer Relationship Management (CRM):

Application: Businesses use decision trees to analyze customer behavior and predict customer churn (likelihood of leaving).

How it works: The tree uses customer data like demographics, purchase history, website activity, and customer service interactions to identify customers who are at risk of churning. This allows companies to proactively offer incentives or personalized service to retain them.

4. Fraud Detection:

Application: Financial institutions and e-commerce companies use decision trees to detect fraudulent transactions.

How it works: The tree uses transaction data like amount, location, time, and device information to identify patterns that are indicative of fraud. Transactions that are flagged as suspicious can be further investigated.

5. Image Classification:

Application: While deep learning models are more common for complex image classification, decision trees (or ensembles of decision trees like Random Forests) can be used for simpler image classification tasks.

How it works: The tree uses pixel values or features extracted from images (e.g., edges, textures) to classify the image into different categories (e.g., cat, dog, car).

6. Spam Filtering:

Application: Email providers use decision trees to filter out spam emails.

How it works: The tree uses features like the sender's address, the subject line, the content of the email, and the presence of certain keywords to classify emails as spam or not spam.

7. Marketing:

Application: Businesses use decision trees to segment customers and target them with personalized marketing campaigns.

How it works: The tree uses customer data like demographics, purchase history, and online behavior to identify different customer segments. Each segment can then be targeted with specific marketing messages and offers.

8. Process Optimization:

Application: Manufacturing companies use decision trees to optimize their production processes.

How it works: The tree uses data on machine performance, materials, and environmental conditions to identify factors that are affecting production efficiency and quality.

9. Risk Management:

Application: Insurance companies use decision trees to assess risk associated with insuring individuals or assets.

How it works: The tree uses factors like age, health, lifestyle, location, and property characteristics to estimate the likelihood of a claim.

10. Recommender Systems:

Application: Decision trees can be used as part of recommender systems to suggest products or services to users.

How it works: The tree uses user data like past purchases, browsing history, and demographics to predict what items the user might be interested in.

Key Reasons for Use:

- Interpretability: Decision trees are easily understandable by non-technical stakeholders, making them valuable for explaining decisions.

- Feature Importance: They provide insights into which features are most influential in making predictions.

- Versatility: They can handle both categorical and numerical data and can be used for classification and regression tasks.

- Speed: They are relatively fast to train and make predictions, making them suitable for real-time applications. (However, more complex ensembles like Random Forests can be slower).



In [None]:
# 1. Write a Python program to train a Decision Tree Classifier on the Iris dataset and print the model accuracy
import numpy as np
from sklearn.datasets import load_iris, fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor, export_graphviz
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.preprocessing import StandardScaler
import graphviz
def train_iris_gini():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    clf = DecisionTreeClassifier(random_state=42)  # Gini impurity by default
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print("Iris Decision Tree (Gini) Accuracy:", accuracy)
train_iris_gini()


Iris Decision Tree (Gini) Accuracy: 1.0


In [None]:
# 2. Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances
def train_iris_gini_feature_importance():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    clf = DecisionTreeClassifier(random_state=42)  # Gini impurity by default
    clf.fit(X_train, y_train)

    print("Iris Decision Tree (Gini) Feature Importances:", clf.feature_importances_)
train_iris_gini_feature_importance()


Iris Decision Tree (Gini) Feature Importances: [0.         0.01911002 0.89326355 0.08762643]


In [None]:

# 3. Write a Python program to train a Decision Tree Classifier with Entropy and print the accuracy
def train_iris_entropy():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print("Iris Decision Tree (Entropy) Accuracy:", accuracy)
train_iris_entropy()

Iris Decision Tree (Entropy) Accuracy: 0.9777777777777777


In [None]:
# 4. Write a Python program to train a Decision Tree Regressor on a housing dataset and evaluate using Mean Squared Error (MSE)
def train_housing_regressor():
    housing = fetch_california_housing()
    X = housing.data
    y = housing.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    reg = DecisionTreeRegressor(random_state=42)
    reg.fit(X_train, y_train)
    y_pred = reg.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print("Housing Decision Tree Regressor MSE:", mse)
train_housing_regressor()

Housing Decision Tree Regressor MSE: 0.5280096503174904


In [None]:
# 5. Write a Python program to train a Decision Tree Classifier and visualize the tree using graphviz
def train_iris_visualize_tree():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    clf = DecisionTreeClassifier(random_state=42)
    clf.fit(X_train, y_train)

    dot_data = export_graphviz(clf, out_file=None,
                                feature_names=iris.feature_names,
                                class_names=iris.target_names,
                                filled=True, rounded=True,
                                special_characters=True)
    graph = graphviz.Source(dot_data)
    graph.render("iris_decision_tree") # saves the tree as iris_decision_tree.pdf (or .png, etc.)
    print("Decision tree visualization saved to iris_decision_tree.pdf")
train_iris_visualize_tree()

Decision tree visualization saved to iris_decision_tree.pdf


In [None]:
# 6. Train Decision Tree Classifier with max_depth=3 and compare with a fully grown tree
def train_iris_compare_depth():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Fully grown tree
    clf_full = DecisionTreeClassifier(random_state=42)
    clf_full.fit(X_train, y_train)
    y_pred_full = clf_full.predict(X_test)
    accuracy_full = accuracy_score(y_test, y_pred_full)
    print("Iris Decision Tree (Full Depth) Accuracy:", accuracy_full)

    # Tree with max_depth=3
    clf_depth3 = DecisionTreeClassifier(max_depth=3, random_state=42)
    clf_depth3.fit(X_train, y_train)
    y_pred_depth3 = clf_depth3.predict(X_test)
    accuracy_depth3 = accuracy_score(y_test, y_pred_depth3)
    print("Iris Decision Tree (Max Depth 3) Accuracy:", accuracy_depth3)

train_iris_compare_depth()


Iris Decision Tree (Full Depth) Accuracy: 1.0
Iris Decision Tree (Max Depth 3) Accuracy: 1.0


In [None]:
# 7. Train Decision Tree Classifier with min_samples_split=5 and compare with a default tree
def train_iris_compare_min_samples_split():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Default tree
    clf_default = DecisionTreeClassifier(random_state=42)
    clf_default.fit(X_train, y_train)
    y_pred_default = clf_default.predict(X_test)
    accuracy_default = accuracy_score(y_test, y_pred_default)
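    print("Iris Decision Tree (Default) Accuracy:", accuracy_default)

    # Tree with min_samples_split=5: a node needs at least 5 samples before it can be split
    clf_min_split = DecisionTreeClassifier(min_samples_split=5, random_state=42)
    clf_min_split.fit(X_train, y_train)
    y_pred_min_split = clf_min_split.predict(X_test)
    accuracy_min_split = accuracy_score(y_test, y_pred_min_split)
    print("Iris Decision Tree (min_samples_split=5) Accuracy:", accuracy_min_split)


train_iris_compare_min_samples_split()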

Iris Decision Tree (Default) Accuracy: 1.0


In [None]:
# 8. Apply feature scaling before training a Decision Tree Classifier and compare with unscaled data
def train_iris_compare_scaling():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Unscaled data
    clf_unscaled = DecisionTreeClassifier(random_state=42)
    clf_unscaled.fit(X_train, y_train)
    y_pred_unscaled = clf_unscaled.predict(X_test)
    accuracy_unscaled = accuracy_score(y_test, y_pred_unscaled)
    print("Iris Decision Tree (Unscaled) Accuracy:", accuracy_unscaled)

    # Scaled data
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    clf_scaled = DecisionTreeClassifier(random_state=42)
    clf_scaled.fit(X_train_scaled, y_train)
    y_pred_scaled = clf_scaled.predict(X_test_scaled)
    accuracy_scaled = accuracy_score(y_test, y_pred_scaled)
    print("Iris Decision Tree (Scaled) Accuracy:", accuracy_scaled)
train_iris_compare_scaling()

Iris Decision Tree (Unscaled) Accuracy: 1.0
Iris Decision Tree (Scaled) Accuracy: 1.0


In [None]:
# One-vs-Rest (OvR) Decision Tree Classifier for Multiclass Classification

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import accuracy_score

# Generate a synthetic multiclass dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=8,
                           n_redundant=2, n_classes=3, random_state=42)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Decision Tree Classifier
dt_classifier = DecisionTreeClassifier(random_state=42)

# Wrap the Decision Tree Classifier with OneVsRestClassifier
ovr_classifier = OneVsRestClassifier(dt_classifier)

# Train the One-vs-Rest Classifier
ovr_classifier.fit(X_train, y_train)

# Make predictions on the test set
y_pred = ovr_classifier.predict(X_test)

# Evaluate the performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

Accuracy: 0.7600


In [None]:
# Write a Python program to train a Decision Tree Classifier and display the feature importance scores
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=42)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Decision Tree Classifier
dt_classifier = DecisionTreeClassifier(random_state=42)

# Train the Decision Tree Classifier
dt_classifier.fit(X_train, y_train)

# Get feature importances
feature_importances = dt_classifier.feature_importances_

# Print feature importances
print("Feature Importances:")
for i, importance in enumerate(feature_importances):
    print(f"Feature {i+1}: {importance:.4f}")

Feature Importances:
Feature 1: 0.2218
Feature 2: 0.0211
Feature 3: 0.0082
Feature 4: 0.1048
Feature 5: 0.2503
Feature 6: 0.1242
Feature 7: 0.0253
Feature 8: 0.0097
Feature 9: 0.0213
Feature 10: 0.2132


In [None]:
# Write a Python program to train a Decision Tree Regressor with max_depth=5 and compare its performance with an unrestricted tree
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate some synthetic regression data
np.random.seed(42)
X = np.random.rand(100, 1)
y = 3 * X.squeeze() + 1.5 * np.random.randn(100) # Add some noise

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Decision Tree Regressor with max_depth=5
dt_regressor_limited = DecisionTreeRegressor(max_depth=5, random_state=42)

# Create an unrestricted Decision Tree Regressor
dt_regressor_unrestricted = DecisionTreeRegressor(random_state=42)

# Train the regressors
dt_regressor_limited.fit(X_train, y_train)
dt_regressor_unrestricted.fit(X_train, y_train)

# Make predictions
y_pred_limited = dt_regressor_limited.predict(X_test)
y_pred_unrestricted = dt_regressor_unrestricted.predict(X_test)

# Evaluate the performance (Mean Squared Error)
mse_limited = mean_squared_error(y_test, y_pred_limited)
mse_unrestricted = mean_squared_error(y_test, y_pred_unrestricted)

print(f"Mean Squared Error (max_depth=5): {mse_limited:.4f}")
print(f"Mean Squared Error (unrestricted): {mse_unrestricted:.4f}")

Mean Squared Error (max_depth=5): 1.0712
Mean Squared Error (unrestricted): 2.0347


In [None]:
# Write a Python program to train a Decision Tree Classifier, apply Cost Complexity Pruning (CCP), and visualize its effect on accuracy
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=42)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Decision Tree Classifier
dt_classifier = DecisionTreeClassifier(random_state=42)

# Get the cost complexity pruning path
path = dt_classifier.cost_complexity_pruning_path(X_train, y_train)
ccp_alphas, impurities = path.ccp_alphas, path.impurities

# Train a Decision Tree for each alpha value
clfs = []
for ccp_alpha in ccp_alphas:
    clf = DecisionTreeClassifier(random_state=42, ccp_alpha=ccp_alpha)
    clf.fit(X_train, y_train)
    clfs.append(clf)

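# Drop the last alpha: it prunes the tree down to a single root node, which is a trivial model
clfs = clfs[:-1]
ccp_alphas = ccp_alphas[:-1]

# Evaluate the accuracy for each tree
train_scores = [clf.score(X_train, y_train) for clf in clfs]
test_scores = [clf.score(X_test, y_test) for clf in clfs]

# The lines below are a minimal completion, assuming a train/test accuracy plot was intended
plt.figure(figsize=(8, 5))
plt.plot(ccp_alphas, train_scores, marker="o", drawstyle="steps-post", label="train")
plt.plot(ccp_alphas, test_scores, marker="o", drawstyle="steps-post", label="test")
plt.xlabel("ccp_alpha")
plt.ylabel("Accuracy")
plt.title("Accuracy vs. ccp_alpha for training and testing sets")
plt.legend()
plt.show()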

In [None]:
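# Write a Python program to train a Decision Tree Classifier and evaluate its performance using Precision, Recall, and F1-Score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, recall_score, f1_score
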
def train_and_evaluate_decision_tree(data, target):
    """
    Trains a Decision Tree Classifier on a 70/30 train/test split and evaluates
    it on the test set using weighted precision, recall, and F1-score.

    Args:
        data (pd.DataFrame or numpy.ndarray): The feature data.
        target (pd.Series or numpy.ndarray): The target variable.

    Returns:
        tuple: A tuple containing the trained model and a dictionary of metrics.
    """

    # Split data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.3, random_state=42)

    # Create a Decision Tree Classifier
    dt_classifier = DecisionTreeClassifier(random_state=42)

    # Train the model
    dt_classifier.fit(X_train, y_train)

    # Make predictions on the test set
    y_pred = dt_classifier.predict(X_test)

    # Calculate precision, recall, and F1-score
    precision = precision_score(y_test, y_pred, average='weighted')  # Use 'weighted' for multi-class
    recall = recall_score(y_test, y_pred, average='weighted')  # Use 'weighted' for multi-class
    f1 = f1_score(y_test, y_pred, average='weighted')  # Use 'weighted' for multi-class

    # Store metrics in a dictionary
    metrics = {
        "Precision": precision,
        "Recall": recall,
        "F1-Score": f1
    }

    return dt_classifier, metrics



In [None]:
# Write a Python program to train a Decision Tree Classifier and evaluate its performance using Precision, Recall, and F1-Score
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris

# Load the iris dataset (replace with your own data if needed)
iris = load_iris()
X = iris.data
y = iris.target
feature_names = iris.feature_names
target_names = iris.target_names


# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
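
The cell above stops after the split. A minimal sketch of how the evaluation could continue is given below, assuming the iris data and the same split settings. It uses `classification_report` for per-class precision, recall, and F1, which is a swapped-in convenience rather than the three separate metric calls used earlier.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load iris and reproduce the 70/30 split used above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# Train the classifier and predict on the held-out test set
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Per-class precision, recall, and F1, plus the confusion matrix
report = classification_report(y_test, y_pred, target_names=iris.target_names)
cm = confusion_matrix(y_test, y_pred)
weighted_f1 = f1_score(y_test, y_pred, average="weighted")

print(report)
print("Confusion matrix (rows = true class, columns = predicted class):")
print(cm)
print(f"Weighted F1-score: {weighted_f1:.4f}")
```

The confusion matrix shows which classes are confused with each other, which the averaged scores alone do not reveal.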




