# Theoretical Questions

Q1. What is Information Gain, and how is it used in Decision Trees?  
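- **Information Gain (IG)** measures the **reduction in impurity** (entropy) achieved by a split.
- When growing a Decision Tree, the algorithm evaluates the candidate splits at each node and picks the one with the highest IG.
- It is calculated as: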

  \[
  IG = H(Parent) - \sum_{j} \frac{N_j}{N} H(Child_j)
  \]

- where:  
  - \( H(Parent) \) is the entropy of the parent node.  
  - \( H(Child_j) \) is the entropy of child node \( j \).  
  - \( N_j \) is the number of instances in child \( j \).  
  - \( N \) is the total number of instances.
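
The formula above can be checked by hand on a small example. The sketch below is a minimal illustration, assuming a made-up node with ten labels and one candidate split; the helper names `entropy` and `information_gain` are hypothetical and not taken from any library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """H(parent) minus the size-weighted average entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy node: 5 "yes" and 5 "no" labels, split into two children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

ig = information_gain(parent, [left, right])
print(f"IG = {ig:.4f}")  # prints "IG = 0.2781"
```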

-----------------------------------------------------------------

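Q2. What is the difference between Gini Impurity and Entropy?  

| Aspect | Gini Impurity | Entropy |
|--------|---------------|---------|
| Formula | \( 1 - \sum_i p_i^2 \) | \( -\sum_i p_i \log_2 p_i \) |
| Interpretation | Probability of misclassifying a randomly drawn sample that is labeled according to the node's class distribution | Average uncertainty (in bits) about the class of a sample in the node |
| Range (two classes) | 0 to 0.5 | 0 to 1 |
| Computation | Slightly faster (no logarithm) | Slightly slower (requires a logarithm) |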
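
To see how the two measures behave side by side, the sketch below evaluates both formulas on a few two-class distributions. The distributions are made up for illustration, and the function names `gini` and `entropy` are hypothetical helpers. Both measures are zero for a pure node and largest for an evenly mixed one.

```python
from math import log2


def gini(probs):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)


def entropy(probs):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)), skipping p_i = 0."""
    return sum(-p * log2(p) for p in probs if p > 0)


# Hypothetical two-class distributions, from pure to perfectly mixed.
for probs in [(1.0, 0.0), (0.9, 0.1), (0.7, 0.3), (0.5, 0.5)]:
    print(f"p={probs}: gini={gini(probs):.3f}, entropy={entropy(probs):.3f}")
```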

-----------------------------------------------------------------

Q3. What is Pre-Pruning in Decision Trees?  
- **Pre-Pruning** (early stopping) halts tree growth during training, before the tree fits the training data perfectly, in order to reduce overfitting.  
- It uses **stopping criteria** like:  
  - Minimum number of samples per node.  
  - Maximum tree depth.  
  - Minimum impurity decrease.
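
In scikit-learn, these stopping criteria correspond to constructor arguments of `DecisionTreeClassifier`. The sketch below is one possible configuration, not a recommended setting: the threshold values are arbitrary assumptions chosen only to make the effect visible on the Iris dataset.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree: grows until leaves are pure or cannot be split further.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruned tree: each argument is one stopping criterion (values are illustrative).
pruned_tree = DecisionTreeClassifier(
    max_depth=3,                 # maximum tree depth
    min_samples_leaf=5,          # minimum number of samples per leaf
    min_impurity_decrease=0.01,  # minimum impurity decrease required to split
    random_state=0,
).fit(X, y)

print("full:  ", full_tree.get_depth(), "levels,", full_tree.get_n_leaves(), "leaves")
print("pruned:", pruned_tree.get_depth(), "levels,", pruned_tree.get_n_leaves(), "leaves")
```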

-----------------------------------------------------------------

Q5. What is a Support Vector Machine (SVM)?
- A Support Vector Machine (SVM) is a supervised machine learning algorithm that classifies data by finding the hyperplane that separates the classes with the largest margin, that is, the greatest distance to the nearest training points of each class (the support vectors). It can be used for both classification and regression, and it is especially effective for high-dimensional classification problems such as image and text recognition.
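
Once a linear SVM is trained, prediction reduces to checking which side of the hyperplane a point falls on. The sketch below assumes hypothetical, hand-picked parameters `w` and `b` rather than trained ones, and uses the standard result that the margin between the hyperplanes \( w \cdot x + b = \pm 1 \) has width \( 2 / \lVert w \rVert \).

```python
from math import sqrt

# Hypothetical parameters of an already-trained linear SVM in 2-D.
w = (2.0, -1.0)  # normal vector of the separating hyperplane
b = -1.0         # offset


def decision_function(x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_function(x) >= 0 else -1


# Width of the margin between the hyperplanes w.x + b = +1 and w.x + b = -1.
margin_width = 2.0 / sqrt(sum(wi * wi for wi in w))

print(predict((3.0, 1.0)), predict((0.0, 2.0)))  # prints "1 -1"
print(f"margin width = {margin_width:.4f}")      # prints "margin width = 0.8944"
```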

-----------------------------------------------------------------

Q6. What is the Kernel Trick in SVM?
- The kernel trick is a method used in Support Vector Machines (SVM) to perform non-linear classification by implicitly mapping data into a higher-dimensional space in which the classes may become linearly separable. Instead of explicitly computing the coordinates in this space, a kernel function computes the dot product between pairs of mapped points directly from the original inputs, which is far cheaper when the feature space is very large or infinite-dimensional (as with the RBF kernel).
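
A small numerical check makes the idea concrete. The sketch below uses the homogeneous degree-2 polynomial kernel on 2-D points as an illustrative choice; `phi`, `dot`, and `poly_kernel` are hypothetical helper names. The kernel value computed from the original inputs matches the dot product computed after the explicit mapping.

```python
from math import sqrt


def phi(x):
    """Explicit degree-2 feature map for a 2-D point (x1, x2)."""
    x1, x2 = x
    return (x1 * x1, sqrt(2) * x1 * x2, x2 * x2)


def dot(u, v):
    """Plain dot product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))  # dot product computed in the 3-D feature space
implicit = poly_kernel(x, z)    # same value computed from the 2-D inputs only

print(explicit, implicit)  # both are 121, up to floating-point rounding
```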

-----------------------------------------------------------------

Q8. What is the Naïve Bayes classifier, and why is it called "Naïve"?
- The Naïve Bayes classifier is a simple probabilistic machine learning algorithm for classification that is based on Bayes' Theorem. It is called "naïve" because it makes a strong, often unrealistic, assumption that all features are conditionally independent of each other given the class.
- A probabilistic classifier: It calculates the probability of each class given an observed set of features.

- Based on Bayes' Theorem: It uses Bayes' theorem to update the probability of a class based on new evidence (the features).

- Efficient and fast: This method is computationally efficient, especially for high-dimensional data like text, due to its simplifying assumption.

**Why it's called "naïve"**

- Assumption of independence: The "naïve" part of the name comes from its core assumption that, given the class, the value of one feature tells us nothing about the value of any other feature.

- Unrealistic in practice: In real-world data, features are often not independent. For example, in a spam email, the words "free" and "winner" are not independent of each other.

- Performance despite the assumption: Even with this strong, unrealistic assumption, the algorithm often performs very well in practice. The independence assumption simplifies the calculations, and even when it does not hold, the class with the highest estimated probability is often still the correct one, although the probability values themselves can be poorly calibrated.
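
Under the independence assumption, the score of a class is its prior multiplied by one likelihood term per feature. The sketch below illustrates this with the "free" and "winner" example; all probabilities are made-up values for illustration, and `log_score` and `predict` are hypothetical helper names. It scores only the words that are present, which is a simplification.

```python
from math import log

# Hypothetical, made-up probabilities for a two-word vocabulary.
prior = {"spam": 0.4, "ham": 0.6}
p_word_given_class = {
    "spam": {"free": 0.8, "winner": 0.6},
    "ham": {"free": 0.1, "winner": 0.05},
}


def log_score(words, label):
    """log P(label) + sum of log P(word | label): the naive factorization."""
    score = log(prior[label])
    for word in words:
        score += log(p_word_given_class[label][word])
    return score


def predict(words):
    """Pick the class with the highest (unnormalized) log posterior."""
    return max(prior, key=lambda label: log_score(words, label))


print(predict(["free", "winner"]))  # prints "spam"
```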

-----------------------------------------------------------------

Q9. Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes.

**Gaussian Naïve Bayes:**

- Assumes features follow a normal (Gaussian) distribution

- Suitable for continuous data like numerical values

- Uses the mean and standard deviation of the feature for each class to calculate the likelihood

**Multinomial Naïve Bayes:**

- Assumes features are discrete counts

- Suitable for data like word frequencies in text documents

- Uses the multinomial distribution to model the likelihood of each feature value given a class

**Bernoulli Naïve Bayes:**

- Assumes features are binary (either present or absent)

- Useful for situations where each feature is either "on" or "off", such as whether a given word occurs in a document

- Uses the Bernoulli distribution to model the likelihood of a feature being present given a class

-----------------------------------------------------------------
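
The three variants differ only in how they model the likelihood of a feature given a class. The sketch below writes out one per-class likelihood function for each variant; the function names and parameter values are hypothetical, chosen only for illustration, and smoothing is left out for brevity.

```python
from math import exp, pi, sqrt


def gaussian_likelihood(x, mean, var):
    """Gaussian NB: normal density of a continuous feature value x."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def multinomial_likelihood(counts, word_probs):
    """Multinomial NB: product of p_i ** count_i (constant factor omitted)."""
    result = 1.0
    for count, p in zip(counts, word_probs):
        result *= p ** count
    return result


def bernoulli_likelihood(present, feature_probs):
    """Bernoulli NB: p_i if feature i is present, (1 - p_i) if it is absent."""
    result = 1.0
    for flag, p in zip(present, feature_probs):
        result *= p if flag else 1.0 - p
    return result


# Hypothetical per-class parameters, chosen only for illustration.
print(gaussian_likelihood(5.0, mean=5.0, var=1.0))
print(multinomial_likelihood([2, 0, 1], [0.5, 0.3, 0.2]))
print(bernoulli_likelihood([1, 0, 1], [0.5, 0.3, 0.2]))
```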

# Practical Questions

In [None]:
# Practical question
'''
Q4. Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances.
'''
'''
Answer:-17
'''
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Loading the dataset
iris = load_iris()
X, y = iris.data, iris.target

# Splitting the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

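# Training Decision Tree with Gini impurity ('gini' is also scikit-learn's default criterion).
# No random_state is set, so ties between equally good splits may be broken differently across runs.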
clf = DecisionTreeClassifier(criterion='gini')
clf.fit(X_train, y_train)

# Print feature importances
print("Feature Importances:", clf.feature_importances_)


Feature Importances: [0.         0.01667014 0.90614339 0.07718647]


In [None]:
'''
Q7. Write a Python program to train two SVM classifiers with Linear and RBF
kernels on the Wine dataset, then compare their accuracies.

Hint: Use SVC(kernel='linear') and SVC(kernel='rbf'), then compare accuracy scores after fitting on the same dataset.
(Include your Python code and output in the code box below.)
'''
'''
Answer:-7
'''
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the Wine dataset
wine = load_wine()
X = wine.data
y = wine.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVM with a Linear Kernel
svm_linear = SVC(kernel='linear', random_state=42)
svm_linear.fit(X_train, y_train)
y_pred_linear = svm_linear.predict(X_test)
accuracy_linear = accuracy_score(y_test, y_pred_linear)

# Train SVM with an RBF Kernel (features are left unscaled here; the RBF kernel is sensitive to feature scale)
svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train, y_train)
y_pred_rbf = svm_rbf.predict(X_test)
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)

# Print the accuracies
print(f"Accuracy of SVM with Linear Kernel: {accuracy_linear:.4f}")
print(f"Accuracy of SVM with RBF Kernel: {accuracy_rbf:.4f}")

Accuracy of SVM with Linear Kernel: 0.9815
Accuracy of SVM with RBF Kernel: 0.7593
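
One plausible reason for the lower RBF accuracy above is that the Wine features are on very different scales and are not standardized before fitting. The sketch below is an assumption-laden follow-up, not part of the original answer: it reuses the same split and compares the RBF kernel with and without a `StandardScaler` step, so the effect of scaling can be checked directly.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Baseline: RBF kernel on the raw, unscaled features.
raw_rbf = SVC(kernel="rbf").fit(X_train, y_train)
acc_raw = accuracy_score(y_test, raw_rbf.predict(X_test))

# Same kernel inside a pipeline; the scaler is fitted on the training split only.
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_train, y_train)
acc_scaled = accuracy_score(y_test, scaled_rbf.predict(X_test))

print(f"RBF, raw features:          {acc_raw:.4f}")
print(f"RBF, standardized features: {acc_scaled:.4f}")
```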


In [None]:
'''
Q10. Breast Cancer Dataset
Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer
dataset and evaluate accuracy.
Hint: Use GaussianNB() from sklearn.naive_bayes and the Breast Cancer dataset from
sklearn.datasets.
(Include your Python code and output in the code box below.)
'''
'''
Answer:-10
'''
# Import necessary libraries
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# 1. Load the Breast Cancer dataset
cancer_data = load_breast_cancer()
X = cancer_data.data
y = cancer_data.target

# 2. Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# 3. Initialize and train the Gaussian Naive Bayes classifier
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# 4. Make predictions on the test set
y_pred = gnb.predict(X_test)

# 5. Evaluate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)

# Print the results
print(f"Number of training samples: {X_train.shape[0]}")
print(f"Number of testing samples: {X_test.shape[0]}")
print(f"Accuracy of the Gaussian Naive Bayes model: {accuracy:.4f}")

Number of training samples: 398
Number of testing samples: 171
Accuracy of the Gaussian Naive Bayes model: 0.9415
