#1. What is Information Gain, and how is it used in Decision Trees?

-> Information Gain is a metric used in decision trees to measure how well a feature splits the data into meaningful classes. It is based on the reduction in entropy (uncertainty) after splitting the dataset.

Entropy measures impurity or disorder in data.

    Information Gain = Entropy(before split) − Weighted Entropy(after split)

A feature with higher Information Gain is more useful for classification and is selected as a split node in the decision tree. ID3 uses Information Gain directly to choose the best attribute at each step, while C4.5 uses Gain Ratio, a normalized form of Information Gain that penalizes attributes with many distinct values.
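The formula above can be checked by hand on a small example. The following is a minimal sketch in plain Python; the helper names `entropy` and `information_gain` and the toy labels are made up for illustration and are not part of any library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy data: 4 "yes" and 4 "no" labels, split in two different ways.
parent = ["yes"] * 4 + ["no"] * 4
perfect_split = [["yes"] * 4, ["no"] * 4]            # each child is pure
useless_split = [["yes", "yes", "no", "no"]] * 2     # each child mirrors the parent

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
print("Gain (perfect split):", gain_perfect)
print("Gain (useless split):", gain_useless)
```

The split that produces pure children recovers the full parent entropy as gain, while the split that leaves the class mix unchanged gains nothing, which is why a tree would prefer the first feature.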

#2. Difference between Gini Impurity and Entropy


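| Feature | Gini Impurity | Entropy |
| --- | --- | --- |
| Definition | Measures probability of incorrect classification | Measures amount of uncertainty |
| Formula | $1 - \sum_i p_i^2$ | $-\sum_i p_i \log_2 p_i$ |
| Range | 0 to 0.5 for two classes (at most $1 - 1/k$ for $k$ classes) | 0 to 1 for two classes (at most $\log_2 k$ for $k$ classes) |
| Speed | Faster to compute (no logarithm) | More computationally expensive (requires logarithms) |
| Used In | CART (Classification and Regression Trees) | ID3 / C4.5 Decision Trees |
| Preferred When | Speed and simplicity matter | An information-theoretic measure of purity matters |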
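To make the comparison concrete, here is a small sketch that evaluates both measures on a few class distributions. The functions `gini` and `entropy` are illustrative helpers written for this example, not library calls.

```python
import math

def gini(probs):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

distributions = {
    "pure node": [1.0, 0.0],
    "even two-class node": [0.5, 0.5],
    "even three-class node": [1 / 3, 1 / 3, 1 / 3],
}

for name, probs in distributions.items():
    print(f"{name}: gini={gini(probs):.4f}, entropy={entropy(probs):.4f}")
```

Both measures are zero for a pure node and largest when the classes are evenly mixed. With more than two classes the maxima grow beyond the two-class values of 0.5 and 1.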


#Q3. What is Pre-Pruning in Decision Trees?
-> Pre-pruning (early stopping) stops the tree from growing too deep during training by applying constraints such as:

- Maximum depth
- Minimum samples per leaf or node
- Minimum information gain
- Maximum number of leaves

It reduces overfitting by stopping unnecessary splits, which improves model generalization and training speed.
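In scikit-learn these constraints correspond to constructor parameters of `DecisionTreeClassifier`. The sketch below compares an unconstrained tree with a pre-pruned one on Iris. The specific limits are arbitrary values chosen for illustration, and `min_impurity_decrease` is used here as the closest analogue of a minimum information gain (it thresholds the weighted impurity decrease of a split).

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Baseline: the tree grows until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pre-pruned tree: illustrative limits, not tuned values
pruned_tree = DecisionTreeClassifier(
    max_depth=3,                 # maximum depth
    min_samples_leaf=5,          # minimum samples per leaf
    min_impurity_decrease=0.01,  # minimum impurity reduction required to split
    max_leaf_nodes=8,            # maximum number of leaves
    random_state=42,
).fit(X_train, y_train)

for name, tree in [("unconstrained", full_tree), ("pre-pruned", pruned_tree)]:
    print(f"{name}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}, "
          f"test accuracy={tree.score(X_test, y_test):.3f}")
```

In practice the limits would usually be chosen by cross-validation rather than fixed by hand.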





In [1]:
#Q4: Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the feature importances.


from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load dataset
data = load_iris()
X, y = data.data, data.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision Tree model with Gini impurity
model = DecisionTreeClassifier(criterion='gini', random_state=42)
model.fit(X_train, y_train)

# Print accuracy
print("Accuracy:", model.score(X_test, y_test))

# Print feature importances
print("Feature Importances:")
for feature, importance in zip(data.feature_names, model.feature_importances_):
    print(f"{feature}: {importance:.4f}")


Accuracy: 1.0
Feature Importances:
sepal length (cm): 0.0000
sepal width (cm): 0.0167
petal length (cm): 0.9061
petal width (cm): 0.0772


#Q5: What is a Support Vector Machine (SVM)?

->SVM is a supervised ML algorithm used for classification and regression. It finds a maximum-margin hyperplane that best separates classes in feature space.

Key ideas:

- Maximizes margin between classes
- Uses support vectors (critical points near boundary)
- Works well in high-dimensional data
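The key ideas above can be seen on a tiny example. This is a minimal sketch on a made-up, linearly separable dataset. The large `C` value is an assumption used to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated clusters in 2-D
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C leaves almost no slack, approximating a hard-margin SVM
clf = SVC(kernel="linear", C=1e3).fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:")
print(clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points closest to the boundary end up as support vectors. Moving any of the other points (without crossing the margin) would leave the hyperplane unchanged.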

#Q6: What is the Kernel Trick in SVM?

-> The Kernel Trick allows SVM to handle non-linear datasets by implicitly mapping data into a higher-dimensional space. A kernel function computes inner products in that space directly from the original features, so the transformation itself is never computed.

Popular kernels:

-> Linear

-> Polynomial

-> RBF (Radial Basis Function)

-> Sigmoid

It enables SVM to classify data that is not linearly separable.
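A small numeric check shows what "without explicitly computing the transformation" means. The sketch below uses the homogeneous degree-2 polynomial kernel on 2-D inputs. The feature map `phi` is written out only for this illustration and is never needed by the kernel itself.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (3-D output)."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def poly_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: (a . b)^2."""
    return np.dot(a, b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

explicit = np.dot(phi(a), phi(b))  # map to 3-D first, then take the inner product
implicit = poly_kernel(a, b)       # same value, computed in the original 2-D space

print("Explicit mapping:", explicit)
print("Kernel function: ", implicit)
```

Both routes give the same number, but the kernel never builds the higher-dimensional vectors. For the RBF kernel the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.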

In [2]:
#Q7: Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies.

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Dataset
data = load_wine()
X, y = data.data, data.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Linear SVM
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
pred_linear = svm_linear.predict(X_test)

# RBF SVM
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
pred_rbf = svm_rbf.predict(X_test)

# Print accuracies
print("Linear Kernel Accuracy:", accuracy_score(y_test, pred_linear))
print("RBF Kernel Accuracy:", accuracy_score(y_test, pred_rbf))



Linear Kernel Accuracy: 1.0
RBF Kernel Accuracy: 0.8055555555555556
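One likely reason for the gap above is that the Wine features are on very different scales, and the RBF kernel depends on Euclidean distances between samples. The following sketch repeats the comparison with standardized features. Wrapping `StandardScaler` and `SVC` in a pipeline is an addition to the original program, not part of it.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scores = {}
for kernel in ("linear", "rbf"):
    # The scaler is fitted on the training split only, then applied to the test split
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    model.fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)
    print(f"{kernel} kernel accuracy (scaled features):", scores[kernel])
```

Putting the scaler inside the pipeline keeps test-set statistics out of training, which matters once cross-validation is added.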


# Q8: What is the Naïve Bayes classifier, and why is it called "Naïve"?

-> Naïve Bayes is a probabilistic classifier based on Bayes' Theorem.
It assumes features are independent given the class label; this assumption is often unrealistic, hence the term "Naïve".

Advantages:

- Fast and efficient
- Works well with text (spam filtering, sentiment analysis)
- Performs well on high-dimensional data
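The independence assumption is what makes the computation so cheap: the class score is the prior multiplied by one likelihood per feature. Here is a minimal sketch in plain Python. All probabilities are made-up numbers for a toy spam filter, and `posterior` is an illustrative helper.

```python
# Hypothetical class priors and per-word likelihoods P(word | class)
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.5},
}

def posterior(words):
    """P(class | words), treating words as independent given the class."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word in words:
            # The "naive" step: multiply per-word likelihoods independently
            score *= likelihoods[label][word]
        scores[label] = score
    total = sum(scores.values())  # normalize so the posteriors sum to 1
    return {label: score / total for label, score in scores.items()}

result = posterior(["free"])
print(result)
print(posterior(["free", "meeting"]))
```

Seeing "free" alone pushes the message strongly toward spam. Adding "meeting" pulls the estimate back, because each word contributes its own independent factor.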



#Q9: Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes

| Model              | Data Type             | Use Case                             |
| ------------------ | --------------------- | ------------------------------------ |
| **Gaussian NB**    | Continuous features   | Iris dataset, medical data           |
| **Multinomial NB** | Count-based features  | Text classification, word counts     |
| **Bernoulli NB**   | Binary features (0/1) | Binary text features, spam filtering |
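The table can be illustrated by pairing each variant with the kind of feature matrix it expects. This is a minimal sketch using scikit-learn's three estimators. The tiny arrays are made-up data, and scoring on the training rows is only a sanity check, not an evaluation.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])

# Hypothetical toy features, one matrix per data type
X_continuous = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 4.2]])  # real-valued
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])          # word counts
X_binary = (X_counts > 0).astype(int)                                      # word present or absent

models = {
    "GaussianNB": (GaussianNB(), X_continuous),
    "MultinomialNB": (MultinomialNB(), X_counts),
    "BernoulliNB": (BernoulliNB(), X_binary),
}

predictions = {}
for name, (model, X) in models.items():
    model.fit(X, y)
    predictions[name] = model.predict(X)
    print(name, predictions[name])
```

The binary matrix is derived from the count matrix, which mirrors the usual choice in text classification between counting words and only recording their presence.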




In [3]:
#Q10: Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and evaluate accuracy.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Gaussian Naive Bayes model
model = GaussianNB()
model.fit(X_train, y_train)

# Prediction & accuracy
pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, pred))



Accuracy: 0.9736842105263158
