#**Supervised Classification: Decision Trees, SVM, and Naive Bayes | Assignment**

#**Question 1: What is Information Gain, and how is it used in Decision Trees?**

#**Answer:**

Information Gain is a metric used in Decision Trees to measure how well a feature separates the data into different classes. It is based on the concept of Entropy, which measures the level of impurity or uncertainty in a dataset.

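Information Gain calculates the reduction in entropy after splitting the dataset on a particular feature: $IG(S, A) = H(S) - \sum_{v} \frac{|S_v|}{|S|} H(S_v)$, where $H$ is entropy and $S_v$ is the subset of $S$ in which feature $A$ takes the value $v$. At each node, the feature that provides the highest Information Gain is selected for the split.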

In simple words, Information Gain helps the Decision Tree decide which feature is the best to split the data so that the classes become as pure as possible.
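The calculation described above can be sketched in a few lines. This is a minimal illustration, not a library implementation: the helper names `entropy` and `information_gain` and the toy "play" data are hypothetical, chosen so that one feature separates the classes perfectly and the other not at all.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()
    return float(np.sum(probs * np.log2(1.0 / probs)))

def information_gain(feature, labels):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    feature = np.asarray(feature)
    labels = np.asarray(labels)
    weighted_children = 0.0
    for value in np.unique(feature):
        mask = feature == value
        # mask.mean() is the fraction of samples that fall into this child
        weighted_children += mask.mean() * entropy(labels[mask])
    return entropy(labels) - weighted_children

# Hypothetical toy data (assumption, for illustration only)
play = np.array(["yes", "yes", "no", "no"])
outlook = np.array(["sunny", "sunny", "rainy", "rainy"])  # splits classes perfectly
windy = np.array([True, False, True, False])              # tells us nothing

ig_outlook = information_gain(outlook, play)
ig_windy = information_gain(windy, play)
print("IG(outlook):", ig_outlook)
print("IG(windy):", ig_windy)
```

A tree built on this data would split on `outlook` first, since it has the larger gain.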

#**Question 2: What is the difference between Gini Impurity and Entropy?**

#**Answer:**

Gini Impurity and Entropy are both measures used to evaluate the impurity of a node in Decision Trees.

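Gini Impurity measures the probability that a randomly chosen data point would be incorrectly classified if it were labeled at random according to the class distribution in the node. It is slightly cheaper to compute because it involves no logarithms, and it is the criterion commonly used in CART (Classification and Regression Trees).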

Entropy measures the level of disorder or randomness in the data. It comes from information theory and is used in ID3 and C4.5 algorithms.

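Both are zero for a pure node and largest when the classes are evenly mixed, and in practice they usually select similar splits. Gini Impurity is simpler and slightly faster to compute, whereas Entropy connects directly to Information Gain from information theory.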
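To make the comparison concrete, here is a small sketch that evaluates both measures on a few class-probability vectors. The function names `gini` and `entropy` are illustrative helpers, not scikit-learn API.

```python
import numpy as np

def gini(p):
    """Gini impurity of a class-probability vector: 1 - sum(p_i^2)."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - np.sum(p ** 2))

def entropy(p):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

# From a pure node to an evenly mixed binary node
for p in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    print(p, "gini =", round(gini(p), 3), "entropy =", round(entropy(p), 3))
```

Both measures rise as the node becomes more mixed; they differ mainly in scale and in the cost of the logarithm.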

#**Question 3: What is Pre-Pruning in Decision Trees?**

#**Answer:**

Pre-Pruning is a technique used to control the growth of a Decision Tree during the training process in order to prevent overfitting. Instead of allowing the tree to grow fully, pre-pruning applies certain stopping conditions early.

Common pre-pruning methods include limiting the maximum depth of the tree, specifying the minimum number of samples required to split a node, or setting a minimum threshold for impurity reduction.

By restricting tree growth, pre-pruning reduces model complexity and usually improves generalization on unseen data. The stopping conditions still need tuning: if they are too strict, the tree stops before it has captured real structure and underfits.
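In scikit-learn, the stopping conditions listed above correspond to the `max_depth`, `min_samples_split`, and `min_impurity_decrease` parameters of `DecisionTreeClassifier`. The sketch below compares an unrestricted tree with a pre-pruned one; the specific values (3, 10, 0.01) and the Iris dataset are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Unrestricted tree: grows until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Pre-pruned tree: growth stops as soon as any condition is hit
pruned_tree = DecisionTreeClassifier(
    max_depth=3,                 # limit on tree depth
    min_samples_split=10,        # a node needs at least 10 samples to be split
    min_impurity_decrease=0.01,  # a split must reduce impurity by at least this much
    random_state=0,
).fit(X_train, y_train)

for name, tree in [("full", full_tree), ("pre-pruned", pruned_tree)]:
    print(name, "depth:", tree.get_depth(), "leaves:", tree.get_n_leaves(),
          "test accuracy:", tree.score(X_test, y_test))
```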

#**Question 4: Write a Python program to train a Decision Tree Classifier using Gini Impurity and print feature importances.**

#**Answer:**

In [None]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Train Decision Tree with Gini Impurity
model = DecisionTreeClassifier(criterion='gini')
model.fit(X, y)

# Print feature importances
print("Feature Importances:", model.feature_importances_)


Feature Importances: [0.01333333 0.01333333 0.05072262 0.92261071]


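This program trains a Decision Tree using the Gini criterion and displays the importance of each feature in making decisions. The importances sum to 1, and in the output above the last feature (petal width) accounts for about 92% of the total impurity reduction.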
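The raw array is easier to read when each value is paired with its feature name. A small variation, assuming the same Iris data, uses `data.feature_names`; the fixed `random_state=0` is an addition here so that repeated runs give the same tree.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
model = DecisionTreeClassifier(criterion="gini", random_state=0)
model.fit(data.data, data.target)

# Pair each importance with its feature name and sort from most to least important
ranked = sorted(
    zip(data.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name:20s} {importance:.3f}")
```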

#**Question 5: What is a Support Vector Machine (SVM)?**

#**Answer:**

Support Vector Machine (SVM) is a supervised learning algorithm mainly used for classification tasks. It works by finding an optimal decision boundary, known as a hyperplane, that best separates data points of different classes.

SVM aims to maximize the margin, which is the distance between the hyperplane and the closest data points from each class, called support vectors. By maximizing this margin, SVM achieves better generalization and robustness, especially in high-dimensional datasets.
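The margin and support vectors can be inspected directly on a fitted linear SVM. This is a sketch on hypothetical 2-D toy data; the very large `C` is an assumption used to approximate a hard-margin SVM, and the margin width is computed as `2 / ||w||` from the learned weight vector.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable clusters
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations
svm = SVC(kernel="linear", C=1e6).fit(X, y)

w = svm.coef_[0]        # normal vector of the separating hyperplane
b = svm.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", svm.support_vectors_)
print("Margin width:", margin_width)
```

Only the points closest to the boundary appear in `support_vectors_`; moving any other point slightly would leave the hyperplane unchanged.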

#**Question 6: What is the Kernel Trick in SVM?**

#**Answer:**

The Kernel Trick is a technique used in SVM to classify non-linearly separable data. Instead of separating data directly in the original feature space, the kernel trick implicitly maps the data into a higher-dimensional space where linear separation becomes possible.

Popular kernels include Linear, Polynomial, and Radial Basis Function (RBF). The key advantage of the kernel trick is that it performs this transformation without explicitly computing the higher-dimensional features, making SVM computationally efficient and powerful.
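A small numeric check illustrates the idea for one specific case: the degree-2 polynomial kernel $k(x, z) = (x \cdot z)^2$ in two dimensions. The explicit feature map `phi` below is the standard one for this kernel; the function names and the sample vectors are illustrative.

```python
import numpy as np

def phi(x):
    """Explicit 3-D feature map whose inner product equals (x . z)^2 for 2-D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def poly_kernel(x, z):
    """The same inner product, computed without leaving the original 2-D space."""
    return float(np.dot(x, z) ** 2)

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = float(np.dot(phi(x), phi(z)))  # map first, then take the dot product
implicit = poly_kernel(x, z)              # kernel trick: never builds phi(x)
print("explicit:", explicit, "implicit:", implicit)
```

Both routes give the same number, but the kernel version never constructs the higher-dimensional vectors. For the RBF kernel the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.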

#**Question 7: Write a Python program to train Linear and RBF SVM classifiers on the Wine dataset and compare accuracies.**

#**Answer:**

In [None]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
data = load_wine()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Linear SVM
linear_svm = SVC(kernel='linear')
linear_svm.fit(X_train, y_train)
linear_pred = linear_svm.predict(X_test)

# RBF SVM
rbf_svm = SVC(kernel='rbf')
rbf_svm.fit(X_train, y_train)
rbf_pred = rbf_svm.predict(X_test)

# Compare accuracies
print("Linear SVM Accuracy:", accuracy_score(y_test, linear_pred))
print("RBF SVM Accuracy:", accuracy_score(y_test, rbf_pred))


Linear SVM Accuracy: 0.9722222222222222
RBF SVM Accuracy: 0.6388888888888888


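This code compares the performance of Linear and RBF kernels on the same dataset. Because the train/test split is not seeded, the exact accuracies change from run to run. The RBF kernel scores far lower here most likely because the Wine features are on very different scales and were not standardized, so the distances used by the RBF kernel are dominated by the largest-valued features.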
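A minimal sketch of checking the effect of feature scaling on the RBF kernel. It uses 5-fold cross-validation rather than a single split so the comparison does not depend on one random partition; wrapping `StandardScaler` and `SVC` in a pipeline is one common setup, not the only one.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# RBF SVM on raw features versus standardized features
unscaled_rbf = SVC(kernel="rbf")
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# The pipeline refits the scaler inside each fold, which avoids test-set leakage
unscaled_acc = cross_val_score(unscaled_rbf, X, y, cv=5).mean()
scaled_acc = cross_val_score(scaled_rbf, X, y, cv=5).mean()

print("RBF SVM, raw features:   ", unscaled_acc)
print("RBF SVM, scaled features:", scaled_acc)
```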

#**Question 8: What is the Naïve Bayes classifier, and why is it called "Naïve"?**

#**Answer:**

Naïve Bayes is a probabilistic classification algorithm based on Bayes’ Theorem. It assumes that all features are independent of each other given the class label.

This assumption is usually unrealistic in real-world data, which is why the algorithm is called “Naïve”. Despite this limitation, Naïve Bayes is simple, fast, and often effective in applications such as spam detection, sentiment analysis, and document classification.
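The independence assumption is what makes the computation so simple: the class score is the prior multiplied by one likelihood per feature. The sketch below works this out by hand for a spam example. All probabilities are hypothetical numbers chosen only to illustrate the arithmetic, and `posterior` is an illustrative helper.

```python
# Hypothetical priors and per-word likelihoods (assumptions, not real data)
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.30, "meeting": 0.02},
    "ham": {"free": 0.03, "meeting": 0.20},
}

def posterior(words):
    """Return P(class | words) under the conditional-independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word in words:
            # Naive step: multiply per-word likelihoods as if words were independent
            score *= likelihoods[label][word]
        scores[label] = score
    total = sum(scores.values())  # normalizing constant from Bayes' Theorem
    return {label: score / total for label, score in scores.items()}

only_free = posterior(["free"])
free_and_meeting = posterior(["free", "meeting"])
print(only_free)
print(free_and_meeting)
```

With these numbers, a message containing only "free" leans toward spam, while adding "meeting" tips it toward ham.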

#**Question 9: Explain the differences between Gaussian, Multinomial, and Bernoulli Naïve Bayes.**

#**Answer:**

Gaussian Naïve Bayes is used when the input features are continuous and are assumed to follow a normal distribution within each class. It is commonly applied to measurement data, such as medical and scientific datasets.

Multinomial Naïve Bayes is suitable for discrete count data, such as word frequencies in text classification tasks.

Bernoulli Naïve Bayes is designed for binary features, where features represent the presence or absence of a characteristic.

Each variant is selected based on the nature of the input data.
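As a rough illustration of matching the variant to the data type, the sketch below fits each scikit-learn class on a tiny hypothetical dataset of the kind it expects. The arrays are made up for the example, and predictions are made on the training rows only to show the API.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])

# Continuous measurements -> GaussianNB
X_continuous = np.array([[5.1, 3.5], [4.9, 3.0], [6.7, 3.1], [6.3, 2.9]])

# Word counts per document -> MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])

# Word present or absent -> BernoulliNB
X_binary = (X_counts > 0).astype(int)

gaussian_pred = GaussianNB().fit(X_continuous, y).predict(X_continuous)
multinomial_pred = MultinomialNB().fit(X_counts, y).predict(X_counts)
bernoulli_pred = BernoulliNB().fit(X_binary, y).predict(X_binary)

print("GaussianNB:   ", gaussian_pred)
print("MultinomialNB:", multinomial_pred)
print("BernoulliNB:  ", bernoulli_pred)
```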

#**Question 10: Train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and evaluate accuracy.**

#**Answer:**

In [None]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train Gaussian Naïve Bayes
model = GaussianNB()
model.fit(X_train, y_train)

# Evaluate accuracy
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))


Accuracy: 0.9473684210526315
