Q-1. What is Information Gain, and how is it used in Decision Trees?

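Ans- Information Gain (IG) measures the reduction in entropy (impurity) achieved by splitting a node on a given feature. When a decision tree is built with entropy as its criterion, the candidate splits at each node are compared and the one with the highest Information Gain is chosen.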

It is calculated as:

\[ IG = H(Parent) - \sum_{j} \frac{N_j}{N} H(Child_j) \]

where:

- \( H(Parent) \) is the entropy of the parent node.
- \( H(Child_j) \) is the entropy of child node \( j \).
- \( N_j \) is the number of instances in child \( j \).
- \( N \) is the total number of instances in the parent node.
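
The formula above can be checked on a small example. The following is a minimal sketch in plain Python; the helper names `entropy` and `information_gain` and the toy labels are assumptions made for illustration, not part of any library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """H(Parent) minus the size-weighted average entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Toy parent node with 5 "yes" and 5 "no" labels
parent = ["yes"] * 5 + ["no"] * 5

# Split A separates the classes perfectly; split B barely changes the mix
split_a = [["yes"] * 5, ["no"] * 5]
split_b = [["yes"] * 3 + ["no"] * 2, ["yes"] * 2 + ["no"] * 3]

ig_a = information_gain(parent, split_a)
ig_b = information_gain(parent, split_b)
print("IG of split A:", ig_a)
print("IG of split B:", ig_b)
```

Split A yields the larger gain, so a tree using this criterion would prefer it over split B.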

Q-2. What is the difference between Gini Impurity and Entropy?

| Measure | Gini Impurity | Entropy |
|---------|--------------|---------|
| Formula | \( 1 - \sum p_i^2 \) | \( -\sum p_i \log_2 p_i \) |
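| Interpretation | Probability of misclassifying a randomly chosen sample that is labeled according to the class distribution | Average uncertainty (in bits) about a sample's class |
| Computation | Faster (no logarithm) | Slower (requires a logarithm) |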
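
To make the two formulas in the table concrete, here is a small sketch that evaluates both on a few class distributions. The function names `gini_from_probs` and `entropy_from_probs` are hypothetical helpers introduced for this illustration.

```python
import math

def gini_from_probs(probs):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)

def entropy_from_probs(probs):
    """Entropy in bits: -sum(p_i * log2(p_i)), skipping zero probabilities."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# A pure node, a mostly pure node, and a maximally mixed two-class node
for probs in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    print(probs, "gini:", round(gini_from_probs(probs), 3),
          "entropy:", round(entropy_from_probs(probs), 3))
```

Both measures are zero for a pure node and reach their maximum for the evenly mixed one; for two classes that maximum is 0.5 for Gini and 1 bit for entropy.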


Q-3. What is Pre-Pruning in Decision Trees?  
- **Pre-Pruning** (early stopping) halts tree growth during training, before the tree fits the training data perfectly, in order to reduce overfitting.
- It uses **stopping criteria** like:  
  - Minimum number of samples per node.  
  - Maximum tree depth.  
  - Minimum impurity decrease.
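
In scikit-learn, the stopping criteria listed above correspond to the `min_samples_leaf` (or `min_samples_split`), `max_depth`, and `min_impurity_decrease` parameters of `DecisionTreeClassifier`. The sketch below compares an unrestricted tree with a pre-pruned one; the specific threshold values are arbitrary choices for illustration, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Tree grown without any stopping criteria
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruned tree: growth stops early when any criterion is hit
pruned_tree = DecisionTreeClassifier(
    max_depth=3,                 # maximum tree depth
    min_samples_leaf=5,          # minimum number of samples per leaf
    min_impurity_decrease=0.01,  # minimum impurity decrease required to split
    random_state=0,
).fit(X, y)

print("Unrestricted: depth", full_tree.get_depth(), "leaves", full_tree.get_n_leaves())
print("Pre-pruned:   depth", pruned_tree.get_depth(), "leaves", pruned_tree.get_n_leaves())
```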

In [1]:
# Practical question
'''
Q-4. Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances.
'''
'''
Answer:-4
'''
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Loading the dataset
iris = load_iris()
X, y = iris.data, iris.target

# Splitting data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Training Decision Tree with Gini impurity
# (no random_state is set on the classifier, so the importances can vary between runs)
clf = DecisionTreeClassifier(criterion='gini')
clf.fit(X_train, y_train)

# Print feature importances
print("Feature Importances:", clf.feature_importances_)

Feature Importances: [0.         0.01667014 0.40593501 0.57739485]


Q-5. What is a Support Vector Machine (SVM)?

A Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification, regression, and outlier detection. For classification, it finds the hyperplane that separates the classes with the largest margin, where the margin is the distance between the hyperplane and the nearest data points from each class. Those nearest points are called support vectors.

Example: If you have a dataset of emails labeled as "spam" or "not spam," an SVM can be trained to classify new emails into these categories by finding the best separating hyperplane between the two classes.
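
The ideas of hyperplane, margin, and support vectors can be inspected directly on a fitted model. This is a minimal sketch on six made-up 2D points (the data and the large value of `C`, chosen to approximate a hard margin, are assumptions for illustration). For a linear kernel, the margin width is 2 / ||w||, where w is the weight vector in `coef_`.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters (made-up data)
X = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C penalizes margin violations heavily, approximating a hard margin
clf = SVC(kernel="linear", C=1e3)
clf.fit(X, y)

w = clf.coef_[0]                         # normal vector of the hyperplane
b = clf.intercept_[0]                    # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)   # distance between the two margin boundaries

print("Support vectors:\n", clf.support_vectors_)
print("Margin width:", margin_width)
print("Predictions:", clf.predict([[0.5, 0.5], [3.5, 3.5]]))
```

Only the points closest to the boundary appear in `support_vectors_`; moving any of the other points (without crossing the margin) would leave the hyperplane unchanged.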

Q-6. What is the Kernel Trick in SVM?

The Kernel Trick is a technique used in SVM to handle non-linear classification and regression problems. A kernel function implicitly maps the input data into a higher-dimensional feature space where a linear separator can be found: it computes the inner product of the transformed features directly from the original inputs, without explicitly performing the transformation.

Common kernel functions include:

- Linear Kernel
- Polynomial Kernel
- Radial Basis Function (RBF) Kernel

Example: In a dataset where the classes are not linearly separable in the original feature space, the Kernel Trick can be used to map the data into a higher-dimensional space where a linear hyperplane can separate the classes.
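
The claim that a kernel computes an inner product in the transformed space without performing the transformation can be verified numerically. The sketch below uses the homogeneous degree-2 polynomial kernel K(x, z) = (x · z)^2 on 2D inputs, whose explicit feature map is phi(x) = (x1^2, sqrt(2)·x1·x2, x2^2). The function names `phi` and `poly_kernel` and the sample vectors are assumptions for illustration.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2D input."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2, computed in the input space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))  # inner product after mapping to 3D
implicit = poly_kernel(x, z)       # same value, no mapping needed

print("Explicit mapping:", explicit)
print("Kernel function: ", implicit)
```

The two values agree, yet the kernel never builds the 3D vectors. For kernels such as RBF the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.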

In [2]:
# Practical question
'''
Q-7. Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies.
'''
'''
Answer:-7
'''
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

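# Load the Wine dataset
wine = datasets.load_wine()
X, y = wine.data, wine.target

# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train and evaluate an SVM with a linear kernel
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
y_pred_linear = svm_linear.predict(X_test)
accuracy_linear = accuracy_score(y_test, y_pred_linear)

# Train and evaluate an SVM with an RBF kernel
# (features are not standardized here; the RBF kernel is sensitive to feature scale)
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
y_pred_rbf = svm_rbf.predict(X_test)
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)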

print(f'Linear Kernel Accuracy: {accuracy_linear:.2f}')
print(f'RBF Kernel Accuracy: {accuracy_rbf:.2f}')

Linear Kernel Accuracy: 1.00
RBF Kernel Accuracy: 0.81


Q-8. What is the Naive Bayes classifier, and why is it called "Naive"?

The Naive Bayes classifier is a probabilistic machine learning algorithm based on Bayes' Theorem and used for classification tasks. It is called "naive" because it assumes that the features are conditionally independent given the class label. This assumption rarely holds exactly in real data, but it simplifies the computation and makes the algorithm efficient and easy to implement.

Example: In text classification, the Naive Bayes classifier can be used to classify emails as "spam" or "not spam" by calculating the probability of each class given the words in the email and assuming that the presence of each word is independent of the others.
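
The spam example can be worked through by hand. In the sketch below, every number (the class priors and the per-word likelihoods) is a made-up value chosen for illustration, and `posterior` is a hypothetical helper. Under the independence assumption, the score for each class is the prior multiplied by the likelihood of each word, and the scores are then normalized.

```python
# Hypothetical probabilities, chosen by hand for illustration only
priors = {"spam": 0.4, "not spam": 0.6}
word_likelihoods = {
    "spam": {"free": 0.30, "meeting": 0.02},      # P(word | spam)
    "not spam": {"free": 0.03, "meeting": 0.20},  # P(word | not spam)
}

def posterior(words):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word in words:
            score *= word_likelihoods[label][word]  # multiply per-word likelihoods
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

result = posterior(["free"])
print(result)
```

With these numbers, an email containing "free" is assigned to "spam" because 0.4 × 0.30 is larger than 0.6 × 0.03.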

Q-9. Explain the differences between Gaussian Naive Bayes, Multinomial Naive Bayes, and Bernoulli Naive Bayes.

- **Gaussian Naive Bayes**
  - Assumes that the features follow a normal (Gaussian) distribution.
  - Suitable for continuous data.
- **Multinomial Naive Bayes**
  - Assumes that the features follow a multinomial distribution.
  - Suitable for discrete data, such as word counts in text classification.
- **Bernoulli Naive Bayes**
  - Assumes that the features follow a Bernoulli distribution (binary/boolean values).
  - Suitable for binary data, such as the presence or absence of words in text classification.

Example: Gaussian Naive Bayes can be used for classifying Iris flowers based on continuous features like petal length and width. Multinomial Naive Bayes can be used for classifying documents based on word counts. Bernoulli Naive Bayes can be used for classifying documents based on the presence or absence of specific words.
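
The three variants map to `GaussianNB`, `MultinomialNB`, and `BernoulliNB` in `sklearn.naive_bayes`. The sketch below fits each one on a tiny made-up dataset of the matching type; the arrays are assumptions for illustration and far too small for real use.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Continuous measurements -> Gaussian Naive Bayes
X_continuous = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 4.2]])

# Word counts per document -> Multinomial Naive Bayes
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])

# Word presence/absence per document -> Bernoulli Naive Bayes
X_binary = (X_counts > 0).astype(int)

gnb = GaussianNB().fit(X_continuous, y)
mnb = MultinomialNB().fit(X_counts, y)
bnb = BernoulliNB().fit(X_binary, y)

print("GaussianNB:   ", gnb.predict([[1.1, 2.0]]))
print("MultinomialNB:", mnb.predict([[2, 0, 1]]))
print("BernoulliNB:  ", bnb.predict([[0, 1, 1]]))
```

The choice between the variants follows from the feature type rather than from the task: the same documents can be fed to `MultinomialNB` as counts or to `BernoulliNB` as presence indicators.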

In [3]:
# Practical question
'''
Q-10. Write a Python program to train a Gaussian Naive Bayes classifier on the Breast Cancer dataset and evaluate accuracy.
'''
'''
Answer:-10
'''
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

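# Load the Breast Cancer dataset
cancer = datasets.load_breast_cancer()
X, y = cancer.data, cancer.target

# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Gaussian Naive Bayes and predict on the test set
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)

# Evaluate accuracy on the held-out data
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')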

Accuracy: 0.97
