1. What is Information Gain, and how is it used in Decision Trees?
- Information Gain (IG) is a measure used in Decision Trees to decide which feature to split on at each node. It tells us how much uncertainty (impurity) in the data is reduced after splitting on a particular attribute.
How is Information Gain used in Decision Trees?
1- Start with the full dataset at the root.
2- Calculate entropy of the dataset.
3- For each feature:
->Split the data based on that feature.
->Compute the weighted entropy after the split.
->Calculate Information Gain.
4- Choose the feature with the highest Information Gain.
5- Repeat the process recursively for child nodes until stopping criteria are met.
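
The steps above can be sketched in plain Python. This is a minimal illustration, not a full tree builder: the toy rows, the feature names ("outlook", "windy") and the helper functions are hypothetical and only show how the gain is computed for one node.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Parent entropy minus the weighted entropy of the child nodes."""
    groups = {}
    for row, label in zip(rows, labels):
        # Group the labels by the value this row takes for the feature.
        groups.setdefault(row[feature], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Hypothetical toy dataset: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

# Steps 3 and 4: compute the gain for every feature and keep the best one.
gains = {f: information_gain(rows, labels, f) for f in ("outlook", "windy")}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

Here the split on "outlook" produces two pure child nodes, so it removes all of the parent's entropy, while "windy" leaves both children as mixed as the parent.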

2. What is the difference between Gini Impurity and Entropy?
Hint: Directly compares the two main impurity measures, highlighting strengths,
weaknesses, and appropriate use cases.
- Both Gini Impurity and Entropy are measures of node impurity used in Decision Tree classifiers to decide the best split. They quantify how mixed the classes are in a dataset, but they differ in formulation, interpretation, and usage.
Strength of Gini Impurity:-
-> Faster to compute (no logarithms)
-> Works well for large datasets
Weakness of Gini Impurity:-
-> Slightly less sensitive to changes in class probabilities than Entropy, although in practice both usually choose similar splits

Strength of Entropy:-
-> More theoretically grounded (information theory)
-> Better when fine-grained splits matter
Weakness of Entropy:-
-> Slower due to logarithmic calculations
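
To make the comparison concrete, the sketch below computes both measures from their standard definitions, Gini = 1 - sum(p_i^2) and Entropy = -sum(p_i * log2(p_i)), for a few two-class distributions. The function names and the example probabilities are chosen only for illustration.

```python
from math import log2


def gini(probs):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)


def entropy(probs):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)), skipping p_i = 0."""
    return sum(-p * log2(p) for p in probs if p > 0)


# Compare both measures for a balanced, a skewed and a pure node.
for p in (0.5, 0.9, 1.0):
    dist = [p, 1 - p]
    print(f"p={p}: gini={gini(dist):.3f}, entropy={entropy(dist):.3f}")
```

Both measures are zero for a pure node and largest for a balanced one; for two classes Gini peaks at 0.5 and Entropy at 1.0.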

3. What is Pre-Pruning in Decision Trees?
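- Pre-pruning (also called early stopping) is a technique used in Decision Trees to stop the tree from growing too deep during training, in order to prevent overfitting. Typical stopping rules are a maximum depth, a minimum number of samples required to split a node or to form a leaf, and a minimum impurity decrease per split.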
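
As a rough illustration, scikit-learn exposes pre-pruning through constructor parameters of DecisionTreeClassifier such as max_depth and min_samples_leaf. The values used here (3 and 5) are arbitrary choices for the sketch, not tuned recommendations.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Unrestricted tree: grows until its leaves are pure.
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pre-pruned tree: growth stops early (illustrative limits, not tuned).
pruned_tree = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=5, random_state=42
).fit(X_train, y_train)

for name, tree in (("full", full_tree), ("pre-pruned", pruned_tree)):
    print(name, "depth:", tree.get_depth(), "leaves:", tree.get_n_leaves(),
          "test accuracy:", tree.score(X_test, y_test))
```

Comparing the depth, the number of leaves and the test accuracy of the two trees shows how much simpler the pre-pruned model is and whether that simplicity costs any accuracy on this split.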

4. What is a Support Vector Machine (SVM)?
- A Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression tasks. Its main goal is to find the best decision boundary (hyperplane) that separates data points of different classes with the maximum margin.
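
A small sketch of the maximum-margin idea, assuming scikit-learn's SVC with a linear kernel. The four 2-D points are made up so that the result is easy to check by hand, and the large C value is chosen to approximate a hard margin.

```python
import math

from sklearn.svm import SVC

# Hypothetical toy data: two points per class, lying on the line y = x.
X = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]]
y = [0, 0, 1, 1]

# A large C penalizes margin violations heavily (close to a hard margin).
clf = SVC(kernel="linear", C=1e3).fit(X, y)

# For a linear kernel the hyperplane is w . x + b = 0,
# and the margin width is 2 / ||w||.
w = clf.coef_[0]
margin = 2.0 / math.hypot(w[0], w[1])

print("Support vectors:", clf.support_vectors_)
print("Margin width:", margin)
print("Predictions:", clf.predict([[0.5, 0.5], [3.5, 3.5]]))
```

Only the two innermost points, (1, 1) and (3, 3), should define the boundary; the margin width is the distance between them, about 2.83.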

5. What is the Kernel Trick in SVM?
- The Kernel Trick is a technique used in Support Vector Machines (SVMs) that allows them to handle non-linearly separable data by implicitly mapping the input data into a higher-dimensional feature space, where a linear separation becomes possible.
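
The "implicit mapping" can be checked numerically. The sketch below uses the degree-2 polynomial kernel K(x, z) = (x . z)^2 on 2-D inputs, for which an explicit feature map phi is known; the function names and sample vectors are illustrative only.

```python
from math import sqrt


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit 3-D feature map matching the degree-2 polynomial kernel."""
    x1, x2 = x
    return (x1 * x1, sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Kernel value computed in the original 2-D space."""
    return dot(x, z) ** 2


x = (1.0, 2.0)
z = (3.0, 4.0)

explicit = dot(phi(x), phi(z))   # map to 3-D first, then take the dot product
implicit = poly_kernel(x, z)     # never leaves the 2-D input space

print(explicit, implicit)
```

Both routes give the same number, but the kernel never builds the higher-dimensional vectors. That saving is what makes kernels such as RBF practical, since their feature space is infinite-dimensional.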

6. What is the Naïve Bayes classifier, and why is it called "Naïve"?
- The Naïve Bayes classifier is a supervised, probabilistic machine learning algorithm based on Bayes’ Theorem. It is widely used for classification tasks, especially in text classification and spam filtering, because it is simple, fast, and effective.
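Why is it called "Naïve"?
It is called “Naïve” because it makes a strong simplifying assumption: all features are conditionally independent of each other given the class label. This assumption rarely holds exactly in real data, but the classifier often performs well anyway.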
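
Under this independence assumption the posterior is proportional to the prior times the product of the per-feature likelihoods. The sketch below shows that calculation with made-up probabilities; the priors, the word likelihoods and the helper function are hypothetical, and for simplicity only the words that are present are scored.

```python
# Hypothetical class priors P(class).
priors = {"spam": 0.4, "ham": 0.6}

# Hypothetical per-word likelihoods P(word | class).
likelihoods = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.5},
}


def posterior(words):
    """Return P(class | words), treating words as independent given the class."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for word in words:
            # The "naive" step: multiply the likelihoods as if independent.
            score *= likelihoods[cls][word]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


result = posterior(["free", "meeting"])
print(result)
```

The unnormalized scores are 0.4 * 0.8 * 0.1 = 0.032 for spam and 0.6 * 0.1 * 0.5 = 0.030 for ham, so the message is classified as spam by a narrow margin.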

7. Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve
Bayes, and Bernoulli Naïve Bayes?
- Gaussian Naïve Bayes:- Used when the input features are continuous and assumed to follow a Gaussian (normal) distribution within each class. Each feature is modeled by a per-class mean and variance.

- Multinomial Naïve Bayes:- Used when the features represent counts or frequencies of events, such as how many times each word occurs in a document.

It is especially popular in text classification problems such as spam detection and document categorization.

- Bernoulli Naïve Bayes:- Used when the features are binary (0 or 1), that is, they indicate the presence or absence of a feature.

It is commonly used in text classification when we care about whether a word appears or not, not how many times it appears.
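
The three variants map to three scikit-learn classes: GaussianNB, MultinomialNB and BernoulliNB. The sketch below fits each one on a tiny made-up dataset of the matching type; the numbers are hypothetical and chosen only so that the classes are clearly separated.

```python
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = [0, 0, 1, 1]

# Continuous measurements -> Gaussian Naive Bayes.
X_continuous = [[1.0, 2.1], [1.2, 1.9], [5.0, 6.1], [5.2, 5.9]]
gaussian_pred = GaussianNB().fit(X_continuous, y).predict(
    [[1.1, 2.0], [5.1, 6.0]]
)

# Word counts per document -> Multinomial Naive Bayes.
X_counts = [[3, 0], [2, 0], [0, 3], [0, 2]]
multinomial_pred = MultinomialNB().fit(X_counts, y).predict([[4, 0], [0, 4]])

# Word present / absent -> Bernoulli Naive Bayes.
X_binary = [[1, 0], [1, 0], [0, 1], [0, 1]]
bernoulli_pred = BernoulliNB().fit(X_binary, y).predict([[1, 0], [0, 1]])

print(gaussian_pred, multinomial_pred, bernoulli_pred)
```

The choice between them is driven by the feature type rather than by the task: the same text corpus can be fed to MultinomialNB as counts or to BernoulliNB as presence/absence flags.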

In [3]:
#Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and evaluate accuracy.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score


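# Load the dataset (features X, binary target y)
data = load_breast_cancer()
X = data.data
y = data.target

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train the Gaussian Naive Bayes classifier
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Predict on the test set and measure accuracy
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print("Accuracy of Gaussian Naive Bayes:", accuracy)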




Accuracy of Gaussian Naive Bayes: 0.9736842105263158


In [4]:
#Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import pandas as pd

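# Load the dataset and keep the feature names for the report
data = load_breast_cancer()
X = data.data
y = data.target
feature_names = data.feature_names

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train a decision tree that uses Gini impurity to choose splits
model = DecisionTreeClassifier(criterion='gini', random_state=42)
model.fit(X_train, y_train)

# Importances are normalized to sum to 1 across all features
importances = model.feature_importances_

# Tabulate the importances, most important feature first
feature_importance_df = pd.DataFrame({
    "Feature": feature_names,
    "Importance": importances
}).sort_values(by="Importance", ascending=False)

print(feature_importance_df)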


                    Feature  Importance
7       mean concave points    0.691420
27     worst concave points    0.065651
1              mean texture    0.058478
20             worst radius    0.052299
22          worst perimeter    0.051494
19  fractal dimension error    0.018554
21            worst texture    0.017445
17     concave points error    0.015931
13               area error    0.011983
24         worst smoothness    0.009233
16          concavity error    0.006276
14         smoothness error    0.001237
2            mean perimeter    0.000000
3                 mean area    0.000000
12          perimeter error    0.000000
11            texture error    0.000000
10             radius error    0.000000
9    mean fractal dimension    0.000000
6            mean concavity    0.000000
8             mean symmetry    0.000000
4           mean smoothness    0.000000
5          mean compactness    0.000000
0               mean radius    0.000000
15        compactness error    0.000000


In [5]:
#Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies.
# Import required libraries
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler


data = load_wine()
X = data.data
y = data.target


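# Standardize features, since SVMs are sensitive to feature scale.
# Note: the scaler is fit on the full dataset before the split, which lets
# test-set statistics influence training; fitting it on X_train only and
# then transforming X_test is the safer practice.
scaler = StandardScaler()
X = scaler.fit_transform(X)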


X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


svm_linear = SVC(kernel='linear', random_state=42)
svm_linear.fit(X_train, y_train)
y_pred_linear = svm_linear.predict(X_test)
accuracy_linear = accuracy_score(y_test, y_pred_linear)


svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train, y_train)
y_pred_rbf = svm_rbf.predict(X_test)
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)


print("Accuracy with Linear Kernel:", accuracy_linear)
print("Accuracy with RBF Kernel:", accuracy_rbf)



Accuracy with Linear Kernel: 0.9722222222222222
Accuracy with RBF Kernel: 1.0
