**THEORY**

Question 1: What is Information Gain, and how is it used in Decision Trees?
Answer:
Information Gain is a measure used in Decision Trees to decide which feature to split on at each node. It is based on entropy, which quantifies the uncertainty (impurity) of the class labels in a dataset. Information Gain is the entropy of the parent node minus the weighted average entropy of the child subsets produced by the split, where each subset is weighted by its share of the samples.
At each node, the tree chooses the feature with the highest Information Gain because it gives the largest reduction in uncertainty. This produces purer child nodes, which generally helps classification accuracy.
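
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the helper names `entropy` and `information_gain` and the toy labels are hypothetical, chosen only to show one perfect split and one uninformative split.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy data: 4 "yes" and 4 "no" labels, split in two different ways
parent = ["yes"] * 4 + ["no"] * 4
pure_split = [["yes"] * 4, ["no"] * 4]  # each child holds a single class
mixed_split = [["yes", "yes", "no", "no"], ["yes", "yes", "no", "no"]]  # children as mixed as the parent

print("Gain of pure split:", information_gain(parent, pure_split))
print("Gain of mixed split:", information_gain(parent, mixed_split))
```

The pure split removes all uncertainty, so its gain equals the parent entropy; the mixed split leaves the class proportions unchanged, so its gain is zero.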

Question 2: What is the difference between Gini Impurity and Entropy?
Answer:
Gini Impurity and Entropy are both used to measure impurity in a dataset while constructing Decision Trees, but they differ in calculation and interpretation. Gini Impurity measures the probability that a randomly selected data point would be incorrectly classified if it were randomly labeled according to the class distribution. It is slightly cheaper to compute because it avoids logarithms.
Entropy measures the disorder in a dataset using logarithms of the class probabilities. For a two-class problem, Entropy (in bits) ranges from 0 for a pure node to 1 for an evenly mixed node, while Gini Impurity ranges from 0 to 0.5. Entropy is somewhat more sensitive to changes in the class distribution, so it may occasionally choose different, more balanced splits, but it is slightly more expensive to compute.

In practice, Gini Impurity is a common default (it is the default criterion of scikit-learn's DecisionTreeClassifier) and is often preferred for large datasets because of its speed, while Entropy is chosen when an information-theoretic interpretation of the splits is wanted. Both usually produce similar results in terms of model accuracy.
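
To make the comparison concrete, here is a small sketch that evaluates both measures on a few two-class distributions. The function names `gini_impurity` and `entropy_from_probs` and the chosen probabilities are illustrative assumptions, not part of any library.

```python
import math

def gini_impurity(probs):
    """Gini Impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probs)

def entropy_from_probs(probs):
    """Entropy in bits; classes with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Two-class distributions, from evenly mixed (0.5) to pure (1.0)
for p in (0.5, 0.7, 0.9, 1.0):
    dist = [p, 1 - p]
    print(f"p={p}: gini={gini_impurity(dist):.3f}, entropy={entropy_from_probs(dist):.3f}")
```

Both measures peak for the evenly mixed distribution and fall to zero for a pure one, which is why they tend to rank candidate splits similarly.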

Question 3: What is Pre-Pruning in Decision Trees?
Answer:
Pre-pruning is a technique used to control the growth of a Decision Tree during the training phase itself. It prevents the tree from becoming too complex by setting stopping conditions such as maximum tree depth, minimum number of samples required to split a node, or minimum samples required at a leaf node.

The main purpose of pre-pruning is to avoid overfitting, where the tree learns noise from the training data instead of meaningful patterns. By limiting tree growth early, pre-pruning improves model generalization and reduces computation time, although excessive pruning may lead to underfitting.
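
The stopping conditions mentioned above map onto constructor parameters of scikit-learn's `DecisionTreeClassifier`. The sketch below compares an unrestricted tree with a pre-pruned one on the Iris dataset; the specific limits (`max_depth=3`, `min_samples_split=10`, `min_samples_leaf=5`) are illustrative values, not tuned recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree: grows until leaves are pure or cannot be split further
full_tree = DecisionTreeClassifier(random_state=42).fit(X, y)

# Pre-pruned tree: the limits below are illustrative, not tuned
pruned_tree = DecisionTreeClassifier(
    max_depth=3,           # maximum tree depth
    min_samples_split=10,  # minimum samples required to split a node
    min_samples_leaf=5,    # minimum samples required at a leaf
    random_state=42,
).fit(X, y)

print("Full tree depth / leaves:", full_tree.get_depth(), full_tree.get_n_leaves())
print("Pruned tree depth / leaves:", pruned_tree.get_depth(), pruned_tree.get_n_leaves())
```

In a real workflow these limits would normally be chosen by cross-validation rather than fixed by hand.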

Question 5: What is a Support Vector Machine (SVM)?
Answer:
A Support Vector Machine is a supervised machine learning algorithm used for classification and regression tasks. It works by finding an optimal hyperplane that separates data points of different classes with the maximum possible margin. The data points that lie closest to the hyperplane are known as support vectors and play a crucial role in defining the decision boundary.
SVM is effective in high-dimensional spaces and performs well even when the number of features is large compared to the number of samples.
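
As a rough illustration of the hyperplane and its support vectors, the sketch below fits a linear SVM on a tiny, made-up 2-D dataset (the six points are hypothetical) and inspects the fitted attributes that scikit-learn exposes for a linear kernel.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable clusters in 2-D
X = np.array([[1, 1], [1, 2], [2, 1], [4, 4], [4, 5], [5, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel="linear", C=1.0)
svm.fit(X, y)

# Points closest to the decision boundary
print("Support vectors:\n", svm.support_vectors_)

# For a linear kernel the hyperplane is w . x + b = 0
print("Hyperplane weights w:", svm.coef_[0])
print("Hyperplane bias b:", svm.intercept_[0])
print("Margin width 2/||w||:", 2 / np.linalg.norm(svm.coef_[0]))
```

Only the support vectors determine the boundary; removing any of the other points and refitting would leave the hyperplane unchanged.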

Question 6: What is the Kernel Trick in SVM?
Answer:
The Kernel Trick is a technique used in Support Vector Machines to handle non-linearly separable data. It works by implicitly mapping the input data into a higher-dimensional feature space where a linear separation becomes possible. The mapping is never computed explicitly: a kernel function returns the inner product of two points in that space directly from their original coordinates, which keeps the computation efficient.

Common kernel functions include linear, polynomial, and radial basis function (RBF) kernels.
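
The idea of computing in the higher-dimensional space without building its coordinates can be checked numerically. The sketch below uses the homogeneous polynomial kernel of degree 2 on 2-D inputs; the helper names `explicit_map` and `poly_kernel` and the two sample vectors are assumptions made for this illustration.

```python
import numpy as np

def explicit_map(v):
    """Explicit degree-2 feature map for a 2-D vector (x1, x2)."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def poly_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: (a . b)^2."""
    return np.dot(a, b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

# Both routes give the same inner product; only the first builds 3-D coordinates
print("Explicit mapping then dot product:", np.dot(explicit_map(a), explicit_map(b)))
print("Kernel on the original vectors:   ", poly_kernel(a, b))
```

The kernel route needs only one dot product in the original space, which is what makes kernels such as RBF practical even though their implicit feature space is infinite-dimensional.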

Question 8: What is the Naïve Bayes classifier, and why is it called "Naïve"?
Answer:
Naïve Bayes is a probabilistic classification algorithm based on Bayes’ Theorem. It calculates the probability of a data point belonging to a particular class by assuming that all features are conditionally independent given the class label.
It is called "Naïve" because this assumption of feature independence is rarely true in real-world data. Despite this simplification, Naïve Bayes is fast to train and often accurate in applications such as text classification and spam filtering.
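
The independence assumption means the class score is simply the prior multiplied by one likelihood per feature. The sketch below works through this for a tiny spam filter; all probabilities and the `posterior` helper are hypothetical values invented for the example.

```python
# Hypothetical probabilities for a tiny spam filter (illustrative values only)
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": {"free": 0.8, "meeting": 0.1},  # P(word | spam)
    "ham": {"free": 0.1, "meeting": 0.5},   # P(word | ham)
}

def posterior(words):
    """Normalized P(class | words) under the naive independence assumption."""
    scores = {}
    for label in prior:
        score = prior[label]
        for word in words:
            score *= likelihood[label][word]  # multiply per-feature likelihoods
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

print(posterior(["free"]))
print(posterior(["free", "meeting"]))
```

Adding the word "meeting" pulls the posterior back toward "ham", because each feature contributes its own independent factor.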

Question 9: Explain the differences between Gaussian, Multinomial, and Bernoulli Naïve Bayes.
Answer:
Gaussian Naïve Bayes is used when the input features are continuous and assumed to follow a normal distribution within each class, for example numeric measurements in medical or scientific datasets. Multinomial Naïve Bayes is suitable for discrete data, especially count-based features such as word frequencies in text classification problems. Bernoulli Naïve Bayes is designed for binary features where each value represents the presence or absence of a feature, making it useful for binary text data.
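
The three variants share the same scikit-learn interface and differ mainly in the kind of feature matrix they expect. The sketch below fits each one on a tiny, made-up dataset; the arrays and query points are hypothetical and only meant to show which input type goes with which class.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Continuous measurements -> GaussianNB
X_continuous = np.array([[5.1, 3.5], [4.9, 3.0], [6.7, 3.1], [6.3, 2.5]])
gaussian_pred = GaussianNB().fit(X_continuous, y).predict([[5.0, 3.2]])

# Word counts per document -> MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 2, 2]])
multinomial_pred = MultinomialNB().fit(X_counts, y).predict([[1, 0, 0]])

# Word presence/absence (0 or 1) -> BernoulliNB
X_binary = (X_counts > 0).astype(int)
bernoulli_pred = BernoulliNB().fit(X_binary, y).predict([[1, 0, 0]])

print("GaussianNB prediction:", gaussian_pred)
print("MultinomialNB prediction:", multinomial_pred)
print("BernoulliNB prediction:", bernoulli_pred)
```

Note that `BernoulliNB` also penalizes the absence of a word, whereas `MultinomialNB` only uses the counts of words that appear.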

**PRACTICAL**

In [1]:
# Question 4: train a Decision Tree with Gini Impurity on Iris and print the feature importances
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load the dataset
data = load_iris()
X = data.data
y = data.target

# Train Decision Tree using Gini Impurity
dt_model = DecisionTreeClassifier(criterion='gini', random_state=42)
dt_model.fit(X, y)

# Print feature importances
print("Feature Importances:")
for name, importance in zip(data.feature_names, dt_model.feature_importances_):
    print(name, ":", importance)


Feature Importances:
sepal length (cm) : 0.013333333333333329
sepal width (cm) : 0.0
petal length (cm) : 0.5640559581320451
petal width (cm) : 0.4226107085346215


In [2]:
# Question 7: compare linear and RBF kernel SVMs on the Wine dataset
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
wine = load_wine()
X = wine.data
y = wine.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Linear Kernel SVM
linear_svm = SVC(kernel='linear')
linear_svm.fit(X_train, y_train)
linear_predictions = linear_svm.predict(X_test)
linear_accuracy = accuracy_score(y_test, linear_predictions)

# RBF Kernel SVM
rbf_svm = SVC(kernel='rbf')
rbf_svm.fit(X_train, y_train)
rbf_predictions = rbf_svm.predict(X_test)
rbf_accuracy = accuracy_score(y_test, rbf_predictions)

print("Linear SVM Accuracy:", linear_accuracy)
print("RBF SVM Accuracy:", rbf_accuracy)


Linear SVM Accuracy: 0.9814814814814815
RBF SVM Accuracy: 0.7592592592592593
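
One plausible reason for the gap between the two kernels above is feature scale: the Wine features span very different numeric ranges, and the RBF kernel depends on distances between points. The sketch below is a possible follow-up experiment, not part of the original exercise. It assumes standardizing the features with `StandardScaler` inside a pipeline is acceptable, and reuses the same split settings.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Assumption: standardize features before the RBF kernel;
# the scaler is fitted on the training split only, inside the pipeline
scaled_rbf_svm = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
scaled_rbf_svm.fit(X_train, y_train)
scaled_rbf_accuracy = scaled_rbf_svm.score(X_test, y_test)

print("Scaled RBF SVM Accuracy:", scaled_rbf_accuracy)
```

Comparing this number with the unscaled RBF result indicates how much of the difference is due to preprocessing rather than to the kernel itself.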


In [3]:
# Question 10: train Gaussian Naive Bayes on the Breast Cancer dataset and report test accuracy
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train Gaussian Naive Bayes model
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Evaluate accuracy
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print("Model Accuracy:", accuracy)


Model Accuracy: 0.9415204678362573
