### 1. What is a Decision Tree, and how does it work?

A Decision Tree is a machine learning model that splits data into smaller subsets using conditions, like a flowchart. Each internal node checks a condition on a feature, each branch represents an outcome of that check, and each leaf node gives the final prediction. At every node, the algorithm picks the feature (and threshold) that best separates the target values according to an impurity measure, then repeats the process on each resulting subset.
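To make the flowchart structure concrete, here is a minimal sketch that fits a shallow tree on Iris and prints its rules as text. It assumes scikit-learn's `export_text` helper; the name `demo_tree` and the choice of `max_depth=2` are only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rules short enough to read.
demo_tree = DecisionTreeClassifier(max_depth=2, random_state=0)
demo_tree.fit(iris.data, iris.target)

# Each indented line is a condition on one feature; "class:" lines are leaves.
rules = export_text(demo_tree, feature_names=list(iris.feature_names))
print(rules)
```

Reading the output from top to bottom follows one path from the root to a leaf.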

### 2. What are impurity measures in Decision Trees?

Impurity measures quantify how mixed the classes are within a node. They help decide which feature to split on: the purer (less mixed) the child nodes produced by a split, the better the split. Common impurity measures are Gini Impurity and Entropy.

### 3. What is the mathematical formula for Gini Impurity?

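**Gini = 1 - Σᵢ pᵢ²**
Where `pᵢ` is the proportion of samples in the node that belong to class `i`. It is the probability that a randomly chosen sample would be mislabeled if it were labeled at random according to the node's class distribution. A pure node has a Gini of 0.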
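The formula can be checked by hand with a few lines of plain Python. This is a minimal sketch; the helper name `gini_impurity` and the toy label lists are made up for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


print(gini_impurity(["a", "a", "a", "a"]))  # pure node -> 0.0
print(gini_impurity(["a", "a", "b", "b"]))  # two classes, evenly mixed -> 0.5
```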

### 4. What is the mathematical formula for Entropy?

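**Entropy = - Σᵢ pᵢ log₂(pᵢ)**
Where `pᵢ` is the proportion of samples that belong to class `i`. It measures the amount of uncertainty or randomness in the data. Classes with `pᵢ = 0` contribute nothing to the sum, so a pure node has an entropy of 0.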
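A small sketch of the same calculation in plain Python. The helper name `entropy` and the toy labels are assumptions for illustration; only classes that actually occur are counted, so the `log₂(0)` case never arises.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return sum(-(count / n) * math.log2(count / n) for count in counts.values())


print(entropy(["a", "a", "a", "a"]))  # pure node: no uncertainty
print(entropy(["a", "a", "b", "b"]))  # two equally likely classes: 1 bit
```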

### 5. What is Information Gain, and how is it used in Decision Trees?

Information Gain tells us how much 'information' a feature gives about the target variable. It is the reduction in entropy after a dataset is split: the parent node's entropy minus the size-weighted average entropy of the child nodes. Decision Trees use it to pick the feature that reduces uncertainty the most.
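As a rough illustration, the sketch below computes Information Gain for two candidate splits of the same parent node. The helper names and the toy "yes"/"no" labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return sum(-(count / n) * math.log2(count / n) for count in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes"] * 5 + ["no"] * 5

# A split that separates the classes completely.
perfect_split = [["yes"] * 5, ["no"] * 5]
# A split whose children are just as mixed as the parent.
useless_split = [["yes", "yes", "no", "no"],
                 ["yes", "yes", "yes", "no", "no", "no"]]

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

The first split removes all uncertainty, so its gain equals the parent's entropy; the second leaves the class mix unchanged, so its gain is zero.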

### 6. What is the difference between Gini Impurity and Entropy?

Both measure impurity, but Gini is simpler and slightly faster to calculate because it avoids logarithms. Entropy comes from information theory. In practice, both usually produce very similar trees; Gini is the default criterion in scikit-learn's `DecisionTreeClassifier`.

### 7. What is the mathematical explanation behind Decision Trees?

Mathematically, Decision Trees split the data at each node based on a condition that maximizes a metric (like Information Gain or Gini reduction). The algorithm keeps doing this until it reaches pure nodes or hits stopping conditions like max depth or min samples.
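The search at a single node can be sketched as follows: try every midpoint between consecutive sorted values of one numeric feature and keep the threshold with the lowest weighted Gini. This is a simplified illustration, not how any particular library implements it; the function names and the toy measurements are made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # baseline: impurity of the unsplit node
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold fits between identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = ((lo + hi) / 2, weighted)
    return best


# Toy data: one numeric feature and two classes.
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 5.1]
species = ["setosa", "setosa", "setosa",
           "versicolor", "versicolor", "versicolor"]

threshold, impurity = best_threshold(petal_length, species)
print(threshold, impurity)  # 3.0 0.0
```

A full tree builder would repeat this over every feature, split the data on the winner, and recurse on each side until a stopping condition is met.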

### 8. What is Pre-Pruning in Decision Trees?

Pre-Pruning means stopping the tree from growing too big during the building process. It uses conditions like max depth, min samples per node, etc., to avoid overfitting early.
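A minimal sketch of pre-pruning with scikit-learn, comparing an unrestricted tree with one limited by `max_depth` and `min_samples_leaf`. The specific limits (3 and 5) are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X_demo, y_demo = load_iris(return_X_y=True)

# Unrestricted tree: grows until every leaf is pure.
full = DecisionTreeClassifier(random_state=0).fit(X_demo, y_demo)

# Pre-pruned tree: growth stops at depth 3, and no leaf may hold fewer than 5 samples.
pre_pruned = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=5, random_state=0
).fit(X_demo, y_demo)

print("Full tree:   depth", full.get_depth(), "leaves", full.get_n_leaves())
print("Pre-pruned:  depth", pre_pruned.get_depth(), "leaves", pre_pruned.get_n_leaves())
```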

### 9. What is Post-Pruning in Decision Trees?

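Post-Pruning means first growing a full tree and then cutting back (removing) branches that contribute little to performance. A common method is cost complexity pruning, which scikit-learn exposes through the `ccp_alpha` parameter. It helps improve generalization by reducing overfitting.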

### 10. What is the difference between Pre-Pruning and Post-Pruning?

Pre-Pruning stops the tree from growing too much while it's being built. Post-Pruning cuts back the tree after it's fully grown. Both aim to reduce overfitting, but they work at different stages.

### 11. What is a Decision Tree Regressor?

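A Decision Tree Regressor is a type of Decision Tree used for predicting continuous values instead of classes. Splits are chosen to reduce the spread of the target values (for example, the mean squared error) rather than class impurity, and each leaf predicts the average of the training targets that fall into it.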
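The leaf-average behavior is easy to see with a one-split tree on toy data. This is a small sketch; the arrays and the name `stump` are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Two well-separated groups of points on a single feature.
X_toy = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_toy = np.array([5.0, 6.0, 7.0, 20.0, 22.0, 24.0])

# max_depth=1 allows exactly one split, so the tree has two leaves.
stump = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X_toy, y_toy)

# Each prediction is the mean target of the leaf the sample lands in:
# mean(5, 6, 7) = 6 and mean(20, 22, 24) = 22.
preds = stump.predict(np.array([[2.5], [11.5]]))
print(preds)
```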

### 12. What are the advantages and disadvantages of Decision Trees?

**Advantages:** Easy to understand, interpret, and visualize. Needs little data preparation; for example, feature scaling is not required.

**Disadvantages:** Can overfit, especially with deep trees. Sensitive to small changes in the data. Predictions are piecewise constant, so trees are not good at capturing smooth patterns.

### 13. How does a Decision Tree handle missing values?

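How missing values are handled depends on the implementation. Classic CART uses surrogate splits, and some implementations send missing values down whichever branch gives the better gain. scikit-learn does not implement surrogate splits: recent releases (1.3 onward) let `DecisionTreeClassifier` and `DecisionTreeRegressor` accept NaN values directly, while older versions require imputing missing values before fitting.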
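When the library version at hand does not accept missing values in trees, a common workaround is to impute before fitting. The sketch below assumes scikit-learn's `SimpleImputer` with a median strategy inside a pipeline; the toy arrays are made up.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy data with a missing entry (NaN) in each column.
X_missing = np.array([
    [1.0, 2.0],
    [2.0, np.nan],
    [1.5, 3.0],
    [8.0, 9.0],
    [np.nan, 8.0],
    [9.0, 7.5],
])
y_missing = np.array([0, 0, 0, 1, 1, 1])

# The imputer replaces each NaN with its column's median before the tree sees the data.
pipe = make_pipeline(
    SimpleImputer(strategy="median"),
    DecisionTreeClassifier(random_state=0),
)
pipe.fit(X_missing, y_missing)

# New samples with missing entries go through the same imputation step.
print(pipe.predict(np.array([[np.nan, 8.5]])))
```

Keeping the imputer inside the pipeline means the medians are learned from the training data only and reused at prediction time.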

### 14. How does a Decision Tree handle categorical features?

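Decision Trees can split categorical features by testing individual categories or groups of categories. Whether this happens automatically depends on the implementation: some tree libraries handle categories natively, but scikit-learn's trees expect numeric input, so categorical columns have to be encoded first (for example with one-hot or ordinal encoding).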
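A minimal sketch of the encoding step with scikit-learn, assuming a one-hot encoding of a single categorical column. The color data and the name `cat_pipe` are hypothetical.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy data: one categorical feature, where "green" is the only class-0 color.
X_cat = [["red"], ["red"], ["green"], ["green"], ["blue"], ["blue"]]
y_cat = [1, 1, 0, 0, 1, 1]

# One-hot encoding turns each category into its own 0/1 column,
# which the tree can then split on like any numeric feature.
cat_pipe = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    DecisionTreeClassifier(random_state=0),
)
cat_pipe.fit(X_cat, y_cat)

print(cat_pipe.predict([["green"], ["blue"]]))
```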

### 15. What are some real-world applications of Decision Trees?

Decision Trees are used in areas like medical diagnosis, fraud detection, loan approval, customer segmentation, and even game development. They're popular in business because their decisions are easy to explain to non-technical stakeholders.

### Q16. Train a Decision Tree Classifier on the Iris dataset and print the model accuracy

In [None]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target
# Fix random_state so the split (and the reported accuracy) is reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

### Q17. Train a Decision Tree Classifier using Gini and print feature importances

In [None]:
model = DecisionTreeClassifier(criterion='gini')
model.fit(X_train, y_train)
print("Feature Importances:", model.feature_importances_)

### Q18. Train a Decision Tree Classifier using Entropy and print accuracy

In [None]:
model = DecisionTreeClassifier(criterion='entropy')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

### Q19. Train a Decision Tree Regressor on housing dataset and evaluate with MSE

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

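# Use housing-specific names so the Iris split from Q16 (X_train, y_train, ...)
# is not overwritten; the classifier cells below reuse it.
data = fetch_california_housing()
X_h, y_h = data.data, data.target
X_h_train, X_h_test, y_h_train, y_h_test = train_test_split(X_h, y_h, random_state=42)

reg = DecisionTreeRegressor(random_state=42)
reg.fit(X_h_train, y_h_train)
y_h_pred = reg.predict(X_h_test)
print("MSE:", mean_squared_error(y_h_test, y_h_pred))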

### Q20. Train a Decision Tree Classifier and visualize using graphviz

In [None]:
from sklearn.tree import export_graphviz
import graphviz

model = DecisionTreeClassifier()
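# Fit on the Iris arrays explicitly: the feature and class names passed below belong to Iris.
model.fit(iris.data, iris.target)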

dot_data = export_graphviz(model, out_file=None, feature_names=iris.feature_names, class_names=iris.target_names, filled=True)
graph = graphviz.Source(dot_data)
graph.render("iris_tree", view=True)

### Q21. Train a tree with max_depth=3 and compare with full tree

In [None]:
model1 = DecisionTreeClassifier(max_depth=3)
model2 = DecisionTreeClassifier()

model1.fit(X_train, y_train)
model2.fit(X_train, y_train)

acc1 = accuracy_score(y_test, model1.predict(X_test))
acc2 = accuracy_score(y_test, model2.predict(X_test))

print("Max Depth 3 Accuracy:", acc1)
print("Full Tree Accuracy:", acc2)

### Q22. Use min_samples_split=5 and compare with default

In [None]:
tree_default = DecisionTreeClassifier()
tree_custom = DecisionTreeClassifier(min_samples_split=5)

tree_default.fit(X_train, y_train)
tree_custom.fit(X_train, y_train)

print("Default Accuracy:", accuracy_score(y_test, tree_default.predict(X_test)))
print("min_samples_split=5 Accuracy:", accuracy_score(y_test, tree_custom.predict(X_test)))

### Q23. Apply feature scaling and compare accuracy

In [None]:
from sklearn.preprocessing import StandardScaler

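# Split once so both models see exactly the same train/test rows.
X_tr, X_te, y_tr, y_te = train_test_split(iris.data, iris.target, random_state=42)

# Fit the scaler on the training rows only, then apply it to both sets.
scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)
X_te_scaled = scaler.transform(X_te)

clf1 = DecisionTreeClassifier(random_state=42).fit(X_tr, y_tr)
clf2 = DecisionTreeClassifier(random_state=42).fit(X_tr_scaled, y_tr)

# Tree splits are threshold comparisons, so scaling should leave accuracy essentially unchanged.
print("Without Scaling Accuracy:", accuracy_score(y_te, clf1.predict(X_te)))
print("With Scaling Accuracy:", accuracy_score(y_te, clf2.predict(X_te_scaled)))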

### Q24. Train Decision Tree Classifier using One-vs-Rest (OvR) strategy

In [None]:
from sklearn.multiclass import OneVsRestClassifier

ovr_model = OneVsRestClassifier(DecisionTreeClassifier())
ovr_model.fit(X_train, y_train)
y_pred = ovr_model.predict(X_test)
print("OvR Accuracy:", accuracy_score(y_test, y_pred))

### Q25. Train Decision Tree Classifier and display feature importance scores

In [None]:
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

for name, score in zip(iris.feature_names, model.feature_importances_):
    print(f"{name}: {score}")

### Q26. Train Decision Tree Regressor with max_depth=5 and compare with unrestricted

In [None]:
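# Load and split the housing data here so this cell does not depend on
# which dataset X_train currently holds.
X_h, y_h = fetch_california_housing(return_X_y=True)
X_h_train, X_h_test, y_h_train, y_h_test = train_test_split(X_h, y_h, random_state=42)

reg1 = DecisionTreeRegressor(max_depth=5, random_state=42)
reg2 = DecisionTreeRegressor(random_state=42)

reg1.fit(X_h_train, y_h_train)
reg2.fit(X_h_train, y_h_train)

mse1 = mean_squared_error(y_h_test, reg1.predict(X_h_test))
mse2 = mean_squared_error(y_h_test, reg2.predict(X_h_test))

print("Depth 5 MSE:", mse1)
print("Unrestricted MSE:", mse2)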

### Q27. Apply Cost Complexity Pruning (CCP) and visualize effect on accuracy

In [None]:
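import matplotlib.pyplot as plt

# Effective alphas of the subtrees produced while pruning the fully grown tree.
path = DecisionTreeClassifier(random_state=42).cost_complexity_pruning_path(X_train, y_train)
ccp_alphas = path.ccp_alphas

# Refit one tree per alpha and record its test accuracy.
acc = []
for alpha in ccp_alphas:
    model = DecisionTreeClassifier(ccp_alpha=alpha, random_state=42)
    model.fit(X_train, y_train)
    acc.append(accuracy_score(y_test, model.predict(X_test)))

plt.plot(ccp_alphas, acc, marker="o", drawstyle="steps-post")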
plt.xlabel("ccp_alpha")
plt.ylabel("Accuracy")
plt.title("Effect of CCP on Accuracy")
plt.show()

### Q28. Evaluate performance using Precision, Recall, and F1-Score

In [None]:
from sklearn.metrics import precision_score, recall_score, f1_score

model = DecisionTreeClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Precision:", precision_score(y_test, y_pred, average='macro'))
print("Recall:", recall_score(y_test, y_pred, average='macro'))
print("F1-Score:", f1_score(y_test, y_pred, average='macro'))

### Q29. Visualize confusion matrix using seaborn

In [None]:
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

model = DecisionTreeClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()

### Q30. Use GridSearchCV to tune max_depth and min_samples_split

In [None]:
from sklearn.model_selection import GridSearchCV

param_grid = {
    'max_depth': [3, 5, 10, None],
    'min_samples_split': [2, 5, 10]
}

grid = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
grid.fit(X_train, y_train)

print("Best Params:", grid.best_params_)
print("Best Score:", grid.best_score_)