# PCA + Logistic Regression vs PCA + SVM (Iris Dataset)

This notebook compares **Logistic Regression** and **SVM** after applying **PCA** on the **Iris dataset**.

**Goal:** Analyze how reducing the four Iris features to two principal components with PCA affects the performance of each classifier.


## Step 1: Load the Iris Dataset

In [None]:
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
df.head()

## Step 2: Apply PCA (2 Components)

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

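# Separate the four measurement columns from the class label.
X = df.drop('target', axis=1)
y = df['target']

# Standardize each feature to zero mean and unit variance;
# PCA is sensitive to feature scale.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Project the standardized features onto the first two principal components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)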
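
Before training on two components, it helps to check how much of the original variance they retain. The cell below is a minimal sketch of that check; it rebuilds the scaled features so it can run on its own, and the name `pca_check` is introduced here only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Rebuild the standardized features so this cell runs on its own.
X_scaled_check = StandardScaler().fit_transform(load_iris().data)

# Fit a separate two-component PCA (illustrative name, not the notebook's `pca`).
pca_check = PCA(n_components=2).fit(X_scaled_check)

# Fraction of total variance captured by each retained component.
explained = pca_check.explained_variance_ratio_
print("Explained variance ratio:", explained)
print("Cumulative explained variance:", explained.sum())
```

A cumulative value close to 1 suggests that little information is lost by dropping the remaining components.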

## Step 3: Train-Test Split

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_pca, y, test_size=0.2, random_state=42)
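
Steps 2 and 3 fit the scaler and PCA on all rows before splitting, so the test rows influence the transforms. One possible alternative, sketched below under the assumption that a stricter evaluation is wanted, is to split the raw features first and wrap the preprocessing in a scikit-learn `Pipeline`. The names `pipe`, `Xr_train`, and `Xr_test` are illustrative and do not appear elsewhere in this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_raw, y_raw = load_iris(return_X_y=True)

# Split the raw features first so the test rows stay unseen.
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_raw, y_raw, test_size=0.2, random_state=42
)

# The pipeline fits the scaler and PCA on the training rows only,
# then reuses those fitted transforms at prediction time.
pipe = make_pipeline(StandardScaler(), PCA(n_components=2), LogisticRegression())
pipe.fit(Xr_train, yr_train)

pipe_acc = pipe.score(Xr_test, yr_test)
print("Pipeline test accuracy:", pipe_acc)
```

On a dataset as small and clean as Iris the difference is likely to be minor, but the pattern carries over to cases where leakage matters more.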

## Step 4: Train Logistic Regression and SVM

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

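# Logistic Regression with scikit-learn's default settings.
log_model = LogisticRegression()
log_model.fit(X_train, y_train)

# probability=True enables predict_proba (used in Step 6);
# random_state makes the internally cross-validated probability estimates reproducible.
svm_model = SVC(probability=True, random_state=42)
svm_model.fit(X_train, y_train)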

## Step 5: Evaluate Models

In [None]:
from sklearn.metrics import accuracy_score, classification_report

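# Predict class labels for the held-out test set.
log_preds = log_model.predict(X_test)
svm_preds = svm_model.predict(X_test)

# Report overall accuracy and per-class precision, recall, and F1,
# labelled with the species names instead of 0/1/2.
print("Logistic Regression Accuracy:", accuracy_score(y_test, log_preds))
print(classification_report(y_test, log_preds, target_names=iris.target_names))

print("SVM Accuracy:", accuracy_score(y_test, svm_preds))
print(classification_report(y_test, svm_preds, target_names=iris.target_names))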
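
The evaluation above only covers models trained on the PCA features, so on its own it does not show what PCA changes. A rough way to see the effect is to score each classifier with and without the PCA step. The sketch below uses 5-fold cross-validation rather than the single split above (an assumption made here to reduce split-to-split noise); `cv_results` and the loop variables are illustrative names.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_iris(return_X_y=True)

models = {"Logistic Regression": LogisticRegression, "SVM": SVC}
cv_results = {}

for name, Model in models.items():
    for use_pca in (False, True):
        # Always standardize; add the two-component PCA only when requested.
        steps = [StandardScaler()]
        if use_pca:
            steps.append(PCA(n_components=2))
        steps.append(Model())

        # Mean accuracy over 5 stratified folds.
        scores = cross_val_score(make_pipeline(*steps), X_all, y_all, cv=5)
        cv_results[(name, use_pca)] = scores.mean()
        print(f"{name:20s} PCA={use_pca!s:5s} mean accuracy={scores.mean():.3f}")
```

Comparing the `PCA=False` and `PCA=True` rows for each model gives a direct view of what the dimensionality reduction costs or gains.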

## Step 6: Realistic Prediction Query

**Example:** A sample with sepal length 5.1 cm, sepal width 3.5 cm, petal length 1.4 cm, and petal width 0.2 cm.

In [None]:
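# Build the sample as a DataFrame with the training column names so that
# the scaler does not warn about missing feature names.
sample = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=X.columns)

# Apply the same fitted scaler and PCA used for the training data.
sample_scaled = scaler.transform(sample)
sample_pca = pca.transform(sample_scaled)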

log_prob = log_model.predict_proba(sample_pca)[0]
svm_prob = svm_model.predict_proba(sample_pca)[0]

print("Logistic Regression Probabilities:", log_prob)
print("SVM Probabilities:", svm_prob)
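
The probability arrays above are ordered by class label (0, 1, 2), which is hard to read without the species names. The sketch below shows one way to pair each probability with its name. To stay runnable on its own it fits a stand-in pipeline (`demo_model`, an illustrative name) on all rows, so its numbers will not match the models trained above exactly.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris_data = load_iris()

# Stand-in model so the cell runs on its own (fitted on all rows for brevity).
demo_model = make_pipeline(StandardScaler(), PCA(n_components=2), LogisticRegression())
demo_model.fit(iris_data.data, iris_data.target)

query = np.array([[5.1, 3.5, 1.4, 0.2]])
probs = demo_model.predict_proba(query)[0]

# classes_ gives the label order of the probability columns.
for label, p in zip(demo_model.classes_, probs):
    print(f"{iris_data.target_names[label]:12s} {p:.3f}")

predicted_name = iris_data.target_names[demo_model.classes_[np.argmax(probs)]]
print("Predicted species:", predicted_name)
```

The same `zip` over `classes_` and the probability row should work with `log_model` and `svm_model` from Step 4, using `sample_pca` as the input.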