Question 1: What is Information Gain, and how is it used in Decision Trees?
-  Information Gain (IG): It measures the reduction in uncertainty (entropy) after splitting a dataset based on an attribute.


- Use in Decision Trees:\
In Decision Trees, IG helps select the best attribute for splitting at each node. The attribute with the highest IG is chosen, ensuring maximum reduction in impurity.
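
As a rough illustration of the definition above, the sketch below computes IG as the parent entropy minus the size-weighted entropy of the child subsets. The helper names (`label_entropy`, `information_gain`) and the toy `play`/`windy`/`humid` arrays are made up for this example and are not part of any library.

```python
import numpy as np

def label_entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))

def information_gain(labels, feature_values):
    """Parent entropy minus the size-weighted entropy of each child subset."""
    labels = np.asarray(labels)
    feature_values = np.asarray(feature_values)
    weighted_child_entropy = 0.0
    for value in np.unique(feature_values):
        mask = feature_values == value
        # mask.mean() is the fraction of samples that fall into this child
        weighted_child_entropy += mask.mean() * label_entropy(labels[mask])
    return label_entropy(labels) - weighted_child_entropy

# Hypothetical toy data: 'windy' separates the classes, 'humid' does not.
play = np.array(["yes", "yes", "no", "no"])
windy = np.array([0, 0, 1, 1])
humid = np.array([0, 1, 0, 1])

ig_windy = information_gain(play, windy)
ig_humid = information_gain(play, humid)
print("IG(windy):", ig_windy)
print("IG(humid):", ig_humid)
```

On this toy data a tree would split on `windy`, since it has the higher IG.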



Question 2: What is the difference between Gini Impurity and Entropy?
- Definition:

 **Entropy**:

  Measures the amount of disorder or uncertainty in the dataset.

**Gini Impurity**:

 Measures the probability of incorrectly classifying a randomly chosen element if it were labeled at random according to the class distribution in the node.
- Computation:

 Entropy involves logarithms, making it slightly more computationally expensive.

Gini is simpler and faster to compute since it only requires squaring probabilities.

- Interpretation:

 Entropy comes from information theory and quantifies the expected information needed to classify a sample.

 Gini focuses on classification error probability, giving a more direct measure of impurity.

- Strengths:

 Entropy: Theoretically grounded, useful when interpretability in terms of information gain is important.

Gini: Computationally efficient and often slightly faster to train; in practice it usually produces trees very similar to those built with entropy.

- Weaknesses:

Entropy can be slower due to logarithmic calculations.

Gini, like entropy-based information gain, can bias splits toward attributes with many distinct values.
- Use Cases:

 Entropy: Preferred in academic or theoretical contexts where information theory is emphasized.

Gini: Commonly used in practice because of speed and simplicity.
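
To make the comparison concrete, here is a minimal sketch that evaluates both measures on a few two-class distributions. The function names `gini_impurity` and `entropy_from_probs` are chosen for this example only.

```python
import numpy as np

def gini_impurity(probs):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    probs = np.asarray(probs, dtype=float)
    return float(1.0 - np.sum(probs ** 2))

def entropy_from_probs(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    probs = np.asarray(probs, dtype=float)
    nonzero = probs[probs > 0]  # treat 0 * log2(0) as 0
    return float(-np.sum(nonzero * np.log2(nonzero)))

# Compare both measures from a perfectly mixed node to a pure node.
for p in (0.5, 0.7, 0.9, 1.0):
    dist = [p, 1 - p]
    print(f"p={p}: gini={gini_impurity(dist):.3f}, entropy={entropy_from_probs(dist):.3f}")
```

Both measures peak for the evenly mixed node and drop to zero for the pure node; they differ mainly in scale and in the shape of the curve between those extremes.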


Question 3: What is Pre-Pruning in Decision Trees?
- Definition:

Pre-pruning, also called early stopping, is a technique used in Decision Trees to halt the growth of the tree before it becomes overly complex. Instead of allowing the tree to grow fully and then trimming it, pre-pruning sets constraints during training (for example, a maximum depth or a minimum number of samples per leaf) to prevent overfitting.
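
A minimal sketch of pre-pruning with scikit-learn, assuming the Iris dataset and arbitrarily chosen limits (`max_depth=3`, `min_samples_split=10`, `min_samples_leaf=5`). The specific values are illustrative, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unconstrained tree: grows until its stopping rules can no longer split.
full_tree = DecisionTreeClassifier(random_state=42).fit(X, y)

# Pre-pruned tree: growth stops early once any constraint is hit.
pruned_tree = DecisionTreeClassifier(
    max_depth=3,           # never go deeper than 3 levels
    min_samples_split=10,  # do not split nodes with fewer than 10 samples
    min_samples_leaf=5,    # every leaf must keep at least 5 samples
    random_state=42,
).fit(X, y)

print("Unconstrained depth / leaves:", full_tree.get_depth(), full_tree.get_n_leaves())
print("Pre-pruned depth / leaves:", pruned_tree.get_depth(), pruned_tree.get_n_leaves())
```

In practice these limits are usually tuned with cross-validation rather than fixed by hand.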


In [1]:
"""Question 4: Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances (practical).
Hint: Use criterion='gini' in DecisionTreeClassifier and access .feature_importances_.
(Include your Python code and output in the code box below.)
"""
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Create Decision Tree model using Gini Impurity
dt_model = DecisionTreeClassifier(criterion='gini', random_state=42)
dt_model.fit(X, y)

# Print feature importances
print("Feature Importances:")
print(dt_model.feature_importances_)


Feature Importances:
[0.01333333 0.         0.56405596 0.42261071]


Question 5: What is a Support Vector Machine (SVM)?
- A Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression that works by finding an optimal hyperplane which separates data points of different classes with the maximum possible margin.

Question 6: What is the Kernel Trick in SVM?

- The Kernel Trick is a mathematical technique used in Support Vector Machines (SVMs) to handle data that is not linearly separable. Instead of explicitly transforming the data into a higher-dimensional space, the kernel trick computes the similarity between data points in that space using a kernel function.
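
The idea can be checked numerically on a small case. The sketch below assumes a degree-2 homogeneous polynomial kernel on 2-D inputs; `phi` and `poly_kernel` are names invented for this illustration.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: K(a, b) = (a . b)^2."""
    return float(np.dot(a, b) ** 2)

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

# Route 1: map both points into 3-D space, then take the dot product there.
explicit = float(np.dot(phi(a), phi(b)))
# Route 2: stay in the original 2-D space and apply the kernel function.
implicit = poly_kernel(a, b)

print("Explicit mapping:", explicit)
print("Kernel function: ", implicit)
```

Both routes give the same similarity value, but the kernel never builds the higher-dimensional vectors. For kernels such as RBF, whose implicit feature space is infinite-dimensional, only the second route is feasible.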


In [3]:
"""
Question 7: Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies.
Hint: Use SVC(kernel='linear') and SVC(kernel='rbf'), then compare accuracy scores after fitting
on the same dataset.
(Include your Python code and output in the code box below.)
"""

# Import required libraries
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the Wine dataset
wine = load_wine()
X, y = wine.data, wine.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train SVM with Linear Kernel
svm_linear = SVC(kernel='linear', random_state=42)
svm_linear.fit(X_train, y_train)
y_pred_linear = svm_linear.predict(X_test)
acc_linear = accuracy_score(y_test, y_pred_linear)

# Train SVM with RBF Kernel
svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train, y_train)
y_pred_rbf = svm_rbf.predict(X_test)
acc_rbf = accuracy_score(y_test, y_pred_rbf)

# Print accuracies
print("Linear Kernel Accuracy:", acc_linear)
print("RBF Kernel Accuracy:", acc_rbf)

Linear Kernel Accuracy: 0.9814814814814815
RBF Kernel Accuracy: 0.7592592592592593
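
One plausible reason for the lower RBF score is that the Wine features sit on very different scales (proline, for instance, is in the hundreds to thousands while most other features are below 30), and the RBF kernel depends on raw distances between samples. The sketch below repeats the RBF experiment with standardized features; the pipeline and the name `acc_scaled_rbf` are additions for illustration, not part of the original exercise.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Standardize each feature (fit on the training split only), then apply the RBF SVM.
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scaled_rbf.fit(X_train, y_train)

acc_scaled_rbf = scaled_rbf.score(X_test, y_test)
print("RBF Kernel Accuracy (scaled features):", acc_scaled_rbf)
```

Putting the scaler inside the pipeline keeps test-set statistics out of training, so the comparison with the unscaled run stays fair.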


Question 8: What is the Naïve Bayes classifier, and why is it called "Naïve"?
-  Definition:

The Naïve Bayes classifier is a probabilistic machine learning algorithm based on Bayes’ Theorem. It is widely used for classification tasks, especially in text analysis, spam detection, and sentiment analysis.

- Why it is called "Naïve":

The algorithm assumes all features are independent of each other given the class label.

In reality, features often have correlations.

The name “Naïve” refers to this simplifying independence assumption, which rarely holds exactly. Even so, the model often performs surprisingly well in practice.
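
Written out, the independence assumption lets the class posterior factor into one term per feature. The notation here is assumed for illustration: $y$ is the class label and $x_1, \dots, x_n$ are the features.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Each $P(x_i \mid y)$ is estimated separately from the training data, which is why the model is cheap to train even with many features.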



Question 9: Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes.
-  Gaussian Naïve Bayes (GNB):

 Assumes that features follow a normal (Gaussian) distribution.

Suitable for continuous data (e.g., height, weight, sensor readings).

 Example use case: Medical diagnosis where features are continuous measurements.

- Multinomial Naïve Bayes (MNB):

 Assumes that features represent discrete counts or frequencies.

 Commonly used in text classification (word counts, term frequencies).

Example use case: Spam detection using word frequency in emails.

- Bernoulli Naïve Bayes (BNB):

Assumes that features are binary (0/1), representing presence or absence of a feature.

 Suitable for data where attributes are yes/no, true/false, or present/absent.

 Example use case: Document classification based on whether a word appears or not.
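
The sketch below pairs each variant with the kind of data it expects. The tiny arrays are hypothetical and exist only to show the matching between data type and model, not to measure performance.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

labels = np.array([0, 0, 1, 1])  # hypothetical class labels

# Continuous measurements -> GaussianNB
X_continuous = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 4.2]])
# Non-negative word counts -> MultinomialNB
X_counts = np.array([[3, 0], [2, 1], [0, 4], [1, 3]])
# Presence/absence flags -> BernoulliNB
X_binary = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

pred_gaussian = GaussianNB().fit(X_continuous, labels).predict(X_continuous)
pred_multinomial = MultinomialNB().fit(X_counts, labels).predict(X_counts)
pred_bernoulli = BernoulliNB().fit(X_binary, labels).predict(X_binary)

print("GaussianNB:   ", pred_gaussian)
print("MultinomialNB:", pred_multinomial)
print("BernoulliNB:  ", pred_bernoulli)
```

Note that `MultinomialNB` requires non-negative feature values, and `BernoulliNB` binarizes its input at a threshold (0.0 by default), so feeding it raw counts discards how often a word occurs.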



In [4]:
"""Question 10: Breast Cancer Dataset
Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer
dataset and evaluate accuracy.
Hint: Use GaussianNB() from sklearn.naive_bayes and the Breast Cancer dataset from
sklearn.datasets.
(Include your Python code and output in the code box below.)
"""
# Import required libraries
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load the Breast Cancer dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train Gaussian Naïve Bayes classifier
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Make predictions
y_pred = gnb.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Accuracy: 0.9415204678362573
