## Concept of Kernel PCA:

Kernel PCA is an extension of standard Principal Component Analysis (PCA) that uses a kernel function to implicitly map the data into a higher-dimensional feature space and then performs PCA there. This lets it capture non-linear patterns, which makes it particularly useful when the relationships between features are non-linear.

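In contrast to standard PCA, which relies on linear projections of the original features, Kernel PCA only needs pairwise kernel evaluations between samples (the kernel trick), so the higher-dimensional mapping is never computed explicitly. In that space, structure that is non-linear in the original features may become linearly separable.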
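To make the contrast concrete, here is a minimal sketch on a synthetic dataset of two concentric circles. The dataset, the value `gamma=3`, and the use of training accuracy as a quick check are all assumptions chosen for illustration; `gamma` was picked by hand for this toy data and is not a general recommendation.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression

# Two concentric circles: no straight line separates the two classes.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Project with linear PCA and with RBF Kernel PCA (gamma chosen by hand).
X_pca = PCA(n_components=2).fit_transform(X)
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=3).fit_transform(X)

# Training accuracy of a linear classifier on each representation.
acc_pca = LogisticRegression().fit(X_pca, y).score(X_pca, y)
acc_kpca = LogisticRegression().fit(X_kpca, y).score(X_kpca, y)
print(f"linear PCA: {acc_pca:.2f}   kernel PCA: {acc_kpca:.2f}")
```

Linear PCA only rotates the circles, so a linear classifier on its output still cannot tell them apart, whereas the RBF projection tends to pull the inner and outer circle apart along one of the leading components.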

### Steps to Apply Kernel PCA:

#### Preprocess Data:
- Load the dataset.
- Handle categorical variables using label encoding.
- Scale the features using standardization.

#### Apply Kernel PCA:
- Choose an appropriate kernel (e.g., RBF, polynomial, etc.).
- Apply Kernel PCA for dimensionality reduction.

#### Train Multiple Classifiers:
- Use various classifiers (e.g., Logistic Regression, SVM, KNN, etc.) on the reduced data.

#### Evaluate Performance:
- Evaluate the accuracy of each classifier and display the results in a tabular format.


### Code Implementation:

```python
import pandas as pd
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Step 1: Load the dataset
dataset = pd.read_csv('prep.csv')

# Step 2: Preprocess the data
y = dataset.iloc[:, -1].values          # Target variable (last column)
features = dataset.iloc[:, :-1].copy()  # Features (all other columns)

# Encode categorical columns if needed. The dtype is checked per DataFrame
# column: after `.values`, a mixed-type table becomes one object array in
# which every column looks categorical.
for col in features.columns:
    if not pd.api.types.is_numeric_dtype(features[col]):
        features[col] = LabelEncoder().fit_transform(features[col])
X = features.values

# Step 3: Feature scaling
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # Zero mean, unit variance per feature

# Step 4: Apply Kernel PCA for non-linear dimensionality reduction
# Other kernels such as 'poly' or 'sigmoid' can be used instead of 'rbf'.
kernel_pca = KernelPCA(n_components=2, kernel='rbf')
X_kpca = kernel_pca.fit_transform(X_scaled)

# Step 5: Split the data into training and test sets
# Note: the scaler and Kernel PCA above were fitted on all rows, so the
# test rows have already influenced the transformation.
X_train, X_test, y_train, y_test = train_test_split(X_kpca, y, test_size=0.2, random_state=42)

# Step 6: Initialize classifiers
models = {
    'Logistic': LogisticRegression(),
    'SVMl': SVC(kernel='linear'),
    'SVMnl': SVC(kernel='rbf'),
    'KNN': KNeighborsClassifier(n_neighbors=5),
    'Naive': GaussianNB(),
    'Decision': DecisionTreeClassifier(),
    'Random': RandomForestClassifier()
}

# Step 7: Fit each model and record its accuracy on the test set
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = accuracy_score(y_test, y_pred)

# Step 8: Convert results to a one-row DataFrame and print
results_df = pd.DataFrame([results], index=['Kernel PCA'])
print("Model Performance with Kernel PCA:")
print(results_df)
```

```
Model Performance with Kernel PCA:
            Logistic    SVMl  SVMnl    KNN  Naive  Decision  Random
Kernel PCA     0.975  0.9625  0.975  0.975  0.975    0.9875   0.975
```


## Model Performance with Kernel PCA:

| Method (test accuracy) | Logistic Regression | SVM (Linear) | SVM (RBF) | KNN    | Naive Bayes | Decision Tree | Random Forest |
|------------------------|---------------------|--------------|-----------|--------|-------------|---------------|---------------|
| **Kernel PCA**         | 0.975               | 0.9625       | 0.975     | 0.975  | 0.975       | 0.9875        | 0.975         |


The code above trains each classifier on the Kernel PCA features and reports its test accuracy in a single table. Each figure comes from one 80/20 split (`test_size=0.2`, `random_state=42`), so small differences between classifiers should not be over-interpreted.
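The listing above fits `StandardScaler` and `KernelPCA` on all rows before calling `train_test_split`, so the test rows influence the transformation. One possible way to avoid this, sketched below, is to split first and wrap the steps in a scikit-learn `Pipeline`. Since `prep.csv` is not available here, the sketch assumes scikit-learn's built-in breast cancer dataset as a stand-in, and the step names (`scale`, `kpca`, `clf`) are arbitrary labels.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): replace with the features/target from prep.csv.
X, y = load_breast_cancer(return_X_y=True)

# Split first, so the test rows never influence the scaler or Kernel PCA.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("kpca", KernelPCA(n_components=2, kernel="rbf")),
    ("clf", LogisticRegression()),
])

# All three steps are fitted on the training rows only.
pipe.fit(X_train, y_train)

# The test rows are transformed with the fitted steps, then scored.
test_accuracy = pipe.score(X_test, y_test)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Swapping `LogisticRegression()` for any other classifier from the `models` dictionary keeps the same leak-free structure.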


## Explanation of the Code:

### Preprocessing:

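- **`LabelEncoder`** converts categorical features into integer codes. The codes impose an arbitrary order on the categories, which distance-based methods will treat as meaningful.
- **Feature scaling** with `StandardScaler` standardizes each feature to zero mean and unit variance. This matters for kernel methods because kernels such as RBF depend on distances between samples, so an unscaled feature with a large range would dominate.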
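A minimal sketch of these two preprocessing steps on a small made-up table (the column names and values are hypothetical). It decides which columns to encode from the DataFrame's per-column dtypes rather than from a NumPy array, where a mixed-type table would report a single dtype for every column.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical data: one numeric and one categorical feature.
df = pd.DataFrame({
    "age": [25, 40, 33, 58],
    "smoker": ["no", "yes", "no", "yes"],
})

# Encode only the non-numeric columns, one fresh encoder per column.
for col in df.columns:
    if not pd.api.types.is_numeric_dtype(df[col]):
        df[col] = LabelEncoder().fit_transform(df[col])

# Standardize every column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(df)
print(df)
print(X_scaled.round(2))
```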

### Kernel PCA:

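- **KernelPCA(n_components=2, kernel='rbf')** performs non-linear dimensionality reduction down to two components. The RBF kernel is used here, but other kernels such as `'poly'` and `'sigmoid'` are also available. Conceptually, the kernel maps the data to a higher-dimensional space and PCA is performed there; in practice the computation works on the kernel matrix of pairwise similarities between samples. Because `gamma` is not set, scikit-learn uses its default of `1 / n_features`.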
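To show what "PCA in the transformed space" amounts to, the sketch below reproduces the RBF projection by hand: build the kernel matrix, centre it, and take its leading eigenvectors. The toy dataset and `gamma = 2.0` are assumptions for illustration, and the comparison with `KernelPCA` is a sanity check, not a description of scikit-learn's internal code path.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.metrics.pairwise import rbf_kernel

X, _ = make_moons(n_samples=100, noise=0.05, random_state=0)
gamma = 2.0  # illustrative value

# 1. Kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
K = rbf_kernel(X, gamma=gamma)

# 2. Centre K, which corresponds to centring the data in feature space.
n = K.shape[0]
one_n = np.full((n, n), 1.0 / n)
K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

# 3. Eigen-decomposition; pick the two largest eigenvalues.
eigvals, eigvecs = np.linalg.eigh(K_centered)
top = np.argsort(eigvals)[::-1][:2]

# 4. Projection of the training points: eigenvector * sqrt(eigenvalue).
Z_manual = eigvecs[:, top] * np.sqrt(eigvals[top])

Z_sklearn = KernelPCA(n_components=2, kernel="rbf", gamma=gamma).fit_transform(X)

# Eigenvectors are only defined up to sign, so compare absolute values.
matches = np.allclose(np.abs(Z_manual), np.abs(Z_sklearn), atol=1e-6)
print("manual projection matches KernelPCA:", matches)
```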

### Model Training:

- **Multiple classifiers** are trained on the training portion (`X_train`) of the reduced dataset `X_kpca`, which is the result of the Kernel PCA transformation.

### Evaluation:

- **Each classifier’s performance** is evaluated using accuracy. The results are stored in the `results` dictionary, which is then converted into a pandas DataFrame for easy display.
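Accuracy from a single split can vary with the choice of `random_state`. As a hedged alternative, the sketch below reports the mean and standard deviation of accuracy over 5-fold cross-validation. It assumes the built-in breast cancer dataset as a stand-in for `prep.csv` and uses only three of the classifiers to keep the example short.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in data (assumption)

models = {
    "Logistic": LogisticRegression(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision": DecisionTreeClassifier(random_state=0),
}

rows = {}
for name, model in models.items():
    # Scaling and Kernel PCA are refitted inside every fold.
    pipe = make_pipeline(
        StandardScaler(), KernelPCA(n_components=2, kernel="rbf"), model
    )
    scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
    rows[name] = {"mean_accuracy": scores.mean(), "std_accuracy": scores.std()}

# One row per classifier, one column per statistic.
cv_results = pd.DataFrame.from_dict(rows, orient="index")
print(cv_results.round(3))
```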


## Key Points:

### Kernel Choice:
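- Kernel PCA uses a kernel function to implicitly map the data into a higher-dimensional feature space. The RBF kernel is a common choice because it can capture non-linear relationships; polynomial and sigmoid kernels are alternatives worth trying. Note that scikit-learn's `KernelPCA` defaults to `kernel='linear'`, which is equivalent to standard PCA, so the kernel has to be set explicitly.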
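One way to experiment with kernels systematically is to treat the kernel and its parameters as hyperparameters and let cross-validation pick among them. The sketch below assumes a synthetic two-circles dataset, arbitrary step names, and a small hand-picked grid; none of these come from the original workflow.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("kpca", KernelPCA(n_components=2)),
    ("clf", LogisticRegression()),
])

# Each dictionary is searched separately, so kernel-specific
# parameters are only combined with the kernel they belong to.
param_grid = [
    {"kpca__kernel": ["linear"]},
    {"kpca__kernel": ["poly"], "kpca__degree": [2, 3]},
    {"kpca__kernel": ["rbf"], "kpca__gamma": [0.1, 1.0, 5.0]},
]

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The winning configuration depends on the data, so the grid should be adapted to the dataset at hand.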

### Dimensionality Reduction:
- The number of components (`n_components`) controls how many dimensions are kept. Two components, as used here, are convenient for plotting, but more may be needed to preserve enough structure for classification, so it is best treated as a hyperparameter to tune.

### Model Performance:
- After applying Kernel PCA, you can train different classifiers on the transformed data and evaluate their performance.
