1. What is Information Gain, and how is it used in Decision Trees?

->Information Gain (IG) is a key concept in machine learning, particularly in Decision Tree algorithms. It measures the reduction in uncertainty or entropy achieved by splitting a dataset based on a specific feature.

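Information Gain is calculated as the difference between the Entropy of the parent node (before the split) and the weighted average Entropy of the child nodes (after the split): $IG(S, A) = H(S) - \sum_{v} \frac{|S_v|}{|S|} H(S_v)$, where $S_v$ is the subset of $S$ that falls into child $v$. When building a Decision Tree, the algorithm evaluates the candidate features at each node and splits on the one with the highest Information Gain.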
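
The calculation described above can be sketched in a few lines of plain Python. This is only an illustration: the labels and the split below are hypothetical toy data, and `entropy` and `information_gain` are helper names introduced here, not part of any library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy labels: 4 "yes" and 4 "no" before the split.
parent = ["yes"] * 4 + ["no"] * 4
# A hypothetical split that sends mostly "yes" left and mostly "no" right.
left = ["yes"] * 3 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 3

ig = information_gain(parent, [left, right])
print(f"Information gain: {ig:.4f}")  # prints 0.1887
```

A split that left both children as mixed as the parent would give a gain of 0, while a split into two pure children would recover the full parent entropy of 1 bit.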

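2. What is the difference between Gini Impurity and Entropy?

->Gini Impurity and Entropy are both measures of impurity (disorder) in a set of labels, and they are the two most common criteria Decision Tree algorithms use to choose the split at a node. For class proportions $p_i$, Gini Impurity is $1 - \sum_i p_i^2$ and Entropy is $-\sum_i p_i \log_2 p_i$. Gini is the default criterion in CART and avoids the logarithm, so it is slightly cheaper to compute; Entropy underlies the Information Gain used by ID3 and C4.5 (C4.5 normalizes it into a gain ratio). For two classes, Gini peaks at 0.5 and Entropy at 1.0; in practice the two usually select similar splits.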
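
To make the comparison concrete, here is a minimal sketch that evaluates both measures on a few class distributions, using the standard definitions (Gini as one minus the sum of squared class proportions, Entropy as the negative sum of each proportion times its base-2 logarithm). The function names and the example distributions are assumptions chosen for illustration.

```python
import math

def gini_from_probs(probs):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in probs)

def entropy_from_probs(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical two-class distributions, from pure to maximally mixed.
for probs in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    g = gini_from_probs(probs)
    h = entropy_from_probs(probs)
    print(f"{probs}: gini={g:.3f}, entropy={h:.3f}")
```

Both measures are 0 for a pure node and reach their maximum for an even split, which is why they tend to rank candidate splits similarly.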

3. What is Pre-Pruning in Decision Trees?

->Pre-Pruning in Decision Trees, also known as Early Stopping, is a technique used to prevent the decision tree from growing too large during the training phase itself, thereby mitigating the risk of overfitting.

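Unlike post-pruning, which cuts back a fully grown tree, pre-pruning imposes constraints that are checked before each split, such as a maximum depth, a minimum number of samples required to split a node or to form a leaf, or a minimum impurity decrease. If a candidate split violates any of these constraints, the node is not split and becomes a leaf.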
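
As a rough illustration, scikit-learn exposes pre-pruning through constructor parameters of `DecisionTreeClassifier` such as `max_depth` and `min_samples_leaf`. The sketch below compares an unconstrained tree with a constrained one on the Iris data; the specific limits (depth 2, at least 5 samples per leaf) are arbitrary values picked for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X_iris, y_iris = load_iris(return_X_y=True)

# No constraints: the tree grows until every leaf is pure.
full_tree = DecisionTreeClassifier(random_state=42).fit(X_iris, y_iris)

# Pre-pruned: stop at depth 2 and require at least 5 samples in every leaf.
# These limits are illustrative assumptions.
pruned_tree = DecisionTreeClassifier(
    max_depth=2, min_samples_leaf=5, random_state=42
).fit(X_iris, y_iris)

print("Unconstrained: depth", full_tree.get_depth(), "leaves", full_tree.get_n_leaves())
print("Pre-pruned:    depth", pruned_tree.get_depth(), "leaves", pruned_tree.get_n_leaves())
```

In practice these limits are usually tuned with cross-validation rather than fixed by hand.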



In [1]:
#4. Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
import pandas as pd

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Train Decision Tree using Gini criterion
clf = DecisionTreeClassifier(criterion='gini', random_state=42)
clf.fit(X, y)

# Feature importances
importances = pd.Series(clf.feature_importances_, index=iris.feature_names)
print("Feature Importances:")
print(importances.sort_values(ascending=False))

Feature Importances:
petal length (cm)    0.564056
petal width (cm)     0.422611
sepal length (cm)    0.013333
sepal width (cm)     0.000000
dtype: float64


5. What is a Support Vector Machine (SVM)?

->A Support Vector Machine (SVM) is a powerful and flexible supervised machine learning algorithm primarily used for classification (though it can also be used for regression).

Its core principle is to find the optimal hyperplane that distinctly separates the data points of different classes in an N-dimensional space. The "optimal" hyperplane is the one that achieves the maximum margin between the classes, which leads to better generalization on unseen data.
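
The maximum-margin idea can be inspected directly on a small example. The sketch below fits a linear `SVC` to hypothetical, linearly separable 2-D points and derives the margin width from the learned weight vector as $2 / \lVert w \rVert$. The toy data and the large `C` (used here to approximate a hard margin) are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable clusters in 2-D.
X_toy = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
                  [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard-margin SVM (almost no slack allowed).
svm_toy = SVC(kernel="linear", C=1e6).fit(X_toy, y_toy)

# For a linear kernel the hyperplane is w . x + b = 0.
w = svm_toy.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:")
print(svm_toy.support_vectors_)
print(f"Margin width: {margin_width:.3f}")
```

Only the support vectors (the points lying on the margin boundaries) determine the hyperplane; moving any other point slightly would not change the fitted model.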

6. What is the Kernel Trick in SVM?

->The Kernel Trick is a fundamental mathematical concept in Support Vector Machines (SVMs) that allows the algorithm to effectively classify non-linearly separable data without explicitly transforming the data into a high-dimensional space.

The "trick" is that the SVM's optimization problem depends on the data only through dot products (inner products) between pairs of points. A kernel function returns the value that dot product would have in a higher-dimensional feature space, computed directly from the original inputs. This is much cheaper than calculating the coordinates of every data point in the new space, and for some kernels (such as RBF) that space is infinite-dimensional, so an explicit mapping is not possible at all.
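
A small numeric check can make this concrete. For 2-D inputs, the homogeneous degree-2 polynomial kernel $K(a, b) = (a \cdot b)^2$ equals the ordinary dot product after the explicit mapping $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. The sketch below verifies this for one pair of hypothetical vectors; `phi` and `poly_kernel` are names introduced here for illustration.

```python
import numpy as np

def phi(v):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly_kernel(a, b):
    """Homogeneous degree-2 polynomial kernel: K(a, b) = (a . b)^2."""
    return np.dot(a, b) ** 2

# Hypothetical example vectors.
a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

explicit = np.dot(phi(a), phi(b))  # map to 3-D first, then take the dot product
implicit = poly_kernel(a, b)       # stay in 2-D and apply the kernel

print(f"Explicit mapping: {explicit:.1f}")
print(f"Kernel function:  {implicit:.1f}")
```

Both routes give 121, but the kernel route never constructs the 3-D vectors. The saving grows quickly with the input dimension and the polynomial degree.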


In [2]:
#7. Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=42)

# Linear Kernel
svm_linear = SVC(kernel='linear', random_state=42)
svm_linear.fit(X_train, y_train)
linear_acc = accuracy_score(y_test, svm_linear.predict(X_test))

# RBF Kernel
svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train, y_train)
rbf_acc = accuracy_score(y_test, svm_rbf.predict(X_test))

print(f"Linear Kernel Accuracy: {linear_acc:.4f}")
print(f"RBF Kernel Accuracy: {rbf_acc:.4f}")

Linear Kernel Accuracy: 1.0000
RBF Kernel Accuracy: 0.8056


8. What is the Naïve Bayes classifier, and why is it called "Naïve"?

->The Naïve Bayes classifier is a family of simple, yet effective, probabilistic classifiers used for classification tasks (like spam detection, sentiment analysis, and text classification). It is a supervised machine learning algorithm based on Bayes' Theorem.

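The classifier works by calculating the probability that a given sample belongs to each class (the posterior probability) from the class prior and the probabilities of the sample's individual features. It then predicts the class with the maximum posterior probability.

It is called "Naïve" because it assumes that all features are conditionally independent of one another given the class, so the joint likelihood factorizes as $P(x_1, \dots, x_n|C) = \prod_i P(x_i|C)$. This assumption rarely holds exactly in real data, but it makes the model very cheap to train and it often works well anyway.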
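
The posterior calculation can be traced by hand on a tiny example. The sketch below scores a one-word message for a spam filter by multiplying each class prior by the per-word likelihoods and then normalizing. All of the probabilities are made-up numbers for illustration, and `posterior` is a hypothetical helper, not a library function.

```python
# Hypothetical class priors and per-word likelihoods P(word | class).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.30, "meeting": 0.05},
    "ham": {"free": 0.02, "meeting": 0.20},
}

def posterior(words):
    """Return P(class | words), treating words as independent given the class."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for word in words:
            score *= likelihoods[cls][word]  # multiply per-feature likelihoods
        scores[cls] = score
    total = sum(scores.values())  # normalizing constant (the evidence)
    return {cls: s / total for cls, s in scores.items()}

post = posterior(["free"])
prediction = max(post, key=post.get)  # class with the maximum posterior
print(post, "->", prediction)
```

Here the unnormalized scores are 0.4 × 0.30 = 0.12 for spam and 0.6 × 0.02 = 0.012 for ham, so the message is classified as spam with a posterior of about 0.91.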

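9. Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes

->The Gaussian, Multinomial, and Bernoulli variants of Naïve Bayes differ in the distribution they assume for each feature given the class, $P(x_i|C)$, and therefore in the type of data they suit. Gaussian Naïve Bayes assumes each continuous feature is normally distributed within a class. Multinomial Naïve Bayes models non-negative counts, such as word frequencies in a document. Bernoulli Naïve Bayes models binary features, such as whether a word is present or absent, and explicitly penalizes the absence of a feature.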
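
As a rough sketch of how the data type drives the choice of variant, the example below fits scikit-learn's `GaussianNB`, `MultinomialNB`, and `BernoulliNB` to matching kinds of features: continuous values, counts, and binary indicators. The tiny arrays are hypothetical toy data invented for the example.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y_demo = np.array([0, 0, 1, 1])

# Hypothetical toy features, one array per data type.
X_continuous = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 4.2]])  # real-valued
X_counts = np.array([[3, 0], [2, 0], [0, 4], [0, 3]])                      # e.g. word counts
X_binary = (X_counts > 0).astype(int)                                      # presence/absence

models = {
    "GaussianNB": (GaussianNB(), X_continuous),
    "MultinomialNB": (MultinomialNB(), X_counts),
    "BernoulliNB": (BernoulliNB(), X_binary),
}

predictions = {}
for name, (model, X_demo) in models.items():
    predictions[name] = model.fit(X_demo, y_demo).predict(X_demo)
    print(f"{name}: {predictions[name]}")
```

Feeding the wrong data type to a variant usually either raises an error (for example, negative values in `MultinomialNB`) or silently discards information (for example, `BernoulliNB` binarizes its inputs by default).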




In [3]:
#10. Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and evaluate accuracy.

from sklearn.datasets import load_breast_cancer
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Train model
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Evaluate
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"GaussianNB Accuracy: {accuracy:.4f}")



GaussianNB Accuracy: 0.9737
