<a href="https://colab.research.google.com/github/pejmanrasti/From_Shallow_to_Deep/blob/main/05_Dimention_Reduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook demonstrates the impact of dimensionality reduction using **Principal Component Analysis (PCA)** on both classification and regression tasks. It compares model performance (accuracy for classification, mean squared error for regression) and computational cost (time taken for training and prediction) with and without PCA.

We use two popular datasets: the MNIST handwritten digit dataset for classification and the California housing dataset for regression. For each dataset, we train a model (a Support Vector Classifier for MNIST and Linear Regression for California housing) with and without applying PCA beforehand. PCA projects the high-dimensional data onto a lower-dimensional space while retaining most of the variance, which can reduce computational cost and sometimes improves performance.
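To make the idea of "retaining most of the variance" concrete, here is a minimal NumPy-only sketch of PCA via the singular value decomposition. The helper name `pca_svd` and the synthetic data are illustrative assumptions, not part of the notebook's experiments; scikit-learn's `PCA` is what the cells below actually use.

```python
import numpy as np

def pca_svd(X, n_components):
    """Illustrative helper: project X onto its top principal components."""
    # PCA operates on mean-centered data
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured along each direction
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio

rng = np.random.default_rng(0)
# Synthetic data: the third feature is almost a linear combination of the first two
base = rng.normal(size=(500, 2))
third = base[:, 0] + base[:, 1] + 0.01 * rng.normal(size=500)
X_demo = np.column_stack([base, third])

X_demo_reduced, demo_ratio = pca_svd(X_demo, n_components=2)
print("Reduced shape:", X_demo_reduced.shape)
print("Fraction of variance kept by 2 components:", demo_ratio.sum())
```

Because the third feature is nearly redundant, two components keep almost all of the variance, which is the situation in which PCA loses very little information.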

In [None]:
import numpy as np
import time
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.datasets import mnist

# Load MNIST dataset from Keras
print("Loading the MNIST dataset from Keras...")
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Flatten the 28x28 images into vectors of size 784
X_train = X_train.reshape(X_train.shape[0], -1)
X_test = X_test.reshape(X_test.shape[0], -1)

# Normalize pixel values to [0, 1]
X_train = X_train / 255.0
X_test = X_test / 255.0

# Scenario 1: Classification without dimensionality reduction
print("Running classification without dimensionality reduction...")
start_time = time.time()
clf_no_reduction = SVC(random_state=42)
clf_no_reduction.fit(X_train, y_train)
y_pred_no_reduction = clf_no_reduction.predict(X_test)
time_no_reduction = time.time() - start_time
accuracy_no_reduction = accuracy_score(y_test, y_pred_no_reduction)

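# Scenario 2: Classification with PCA for dimensionality reduction
# Note: the timer below covers only SVC training and prediction, not the PCA step itself.
print("Applying PCA for dimensionality reduction...")
pca = PCA(n_components=50, random_state=42)  # Reduce to 50 dimensions; seed fixed for reproducibility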
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

print("Running classification with dimensionality reduction...")
start_time = time.time()
clf_with_reduction = SVC(random_state=42)
clf_with_reduction.fit(X_train_pca, y_train)
y_pred_with_reduction = clf_with_reduction.predict(X_test_pca)
time_with_reduction = time.time() - start_time
accuracy_with_reduction = accuracy_score(y_test, y_pred_with_reduction)

# Display the results
print("\nResults:")
print(f"Accuracy without PCA: {accuracy_no_reduction:.4f}, Time taken: {time_no_reduction:.2f} seconds")
print(f"Accuracy with PCA: {accuracy_with_reduction:.4f}, Time taken: {time_with_reduction:.2f} seconds")

# Visualize the results
labels = ["Without PCA", "With PCA"]
accuracy = [accuracy_no_reduction, accuracy_with_reduction]
time_taken = [time_no_reduction, time_with_reduction]

plt.figure(figsize=(12, 5))

# Accuracy plot
plt.subplot(1, 2, 1)
sns.barplot(x=labels, y=accuracy)
plt.title("Accuracy Comparison")
plt.ylabel("Accuracy")
plt.ylim(0.8, 1)

# Time plot
plt.subplot(1, 2, 2)
sns.barplot(x=labels, y=time_taken)
plt.title("Time Comparison")
plt.ylabel("Time (seconds)")

plt.tight_layout()
plt.show()

# Visualizing PCA effect on data (first two components)
plt.figure(figsize=(10, 6))
plt.scatter(X_train_pca[:, 0], X_train_pca[:, 1], c=y_train, cmap="viridis", s=5)
plt.colorbar(label="Digit Label")
plt.title("PCA Reduced Data Visualization (First 2 Components)")
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.show()

In [None]:
import numpy as np
import time
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load the California housing dataset
print("Loading the California housing dataset...")
housing = fetch_california_housing()
X, y = housing.data, housing.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


# Scenario 1: Regression without dimensionality reduction
print("Running regression without dimensionality reduction...")
start_time = time.time()
reg_no_reduction = LinearRegression()
reg_no_reduction.fit(X_train, y_train)
y_pred_no_reduction = reg_no_reduction.predict(X_test)
time_no_reduction = time.time() - start_time
mse_no_reduction = mean_squared_error(y_test, y_pred_no_reduction)

# Scenario 2: Regression with PCA for dimensionality reduction
print("Applying PCA for dimensionality reduction...")
pca = PCA(n_components=3)  # Reduce to 3 dimensions
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

print("Running regression with dimensionality reduction...")
start_time = time.time()
reg_with_reduction = LinearRegression()
reg_with_reduction.fit(X_train_pca, y_train)
y_pred_with_reduction = reg_with_reduction.predict(X_test_pca)
time_with_reduction = time.time() - start_time
mse_with_reduction = mean_squared_error(y_test, y_pred_with_reduction)

# Display the results
print("\nResults:")
print(f"Mean Squared Error without PCA: {mse_no_reduction:.4f}, Time taken: {time_no_reduction:.4f} seconds")
print(f"Mean Squared Error with PCA: {mse_with_reduction:.4f}, Time taken: {time_with_reduction:.4f} seconds")

# Visualize the results
labels = ["Without PCA", "With PCA"]
mse = [mse_no_reduction, mse_with_reduction]
time_taken = [time_no_reduction, time_with_reduction]

plt.figure(figsize=(12, 5))

# MSE plot
plt.subplot(1, 2, 1)
sns.barplot(x=labels, y=mse)
plt.title("Mean Squared Error Comparison")
plt.ylabel("MSE")

# Time plot
plt.subplot(1, 2, 2)
sns.barplot(x=labels, y=time_taken)
plt.title("Time Comparison")
plt.ylabel("Time (seconds)")

plt.tight_layout()
plt.show()

# Exercise 1: Varying PCA Components
Modify the n_components parameter in PCA for both the classification and regression examples.

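Experiment with different values (e.g., 10, 20, 100, 700), or pass a float between 0 and 1 (e.g., `n_components=0.95`) so that PCA keeps just enough components to explain that fraction of the variance. For the MNIST dataset, observe how the accuracy and time taken change.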
Analyze the impact of different numbers of components on the model's performance and time efficiency.
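One possible starting point is a small sweep over `n_components`. The sketch below is an assumption-laden illustration: the helper `sweep_pca_components` is hypothetical, and it uses scikit-learn's bundled 8x8 digits dataset (`load_digits`, 64 features) as a fast stand-in for MNIST, so the component values are scaled down accordingly.

```python
import time
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def sweep_pca_components(X_tr, X_te, y_tr, y_te, component_values):
    """Hypothetical helper: fit PCA + SVC for each n_components value."""
    rows = []
    for n in component_values:
        start = time.time()
        pca = PCA(n_components=n, random_state=42)
        X_tr_pca = pca.fit_transform(X_tr)
        X_te_pca = pca.transform(X_te)
        clf = SVC(random_state=42).fit(X_tr_pca, y_tr)
        acc = accuracy_score(y_te, clf.predict(X_te_pca))
        rows.append({
            "requested": n,
            "kept": int(pca.n_components_),  # actual number of components retained
            "accuracy": acc,
            "seconds": time.time() - start,  # includes PCA, training and prediction
        })
    return rows

# Stand-in data (assumption): 8x8 digits instead of 28x28 MNIST
digits = load_digits()
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    digits.data / 16.0, digits.target, test_size=0.2, random_state=42
)

# A float in (0, 1) asks PCA to keep enough components for that variance ratio
sweep = sweep_pca_components(Xd_train, Xd_test, yd_train, yd_test, [5, 10, 20, 40, 0.95])
for row in sweep:
    print(row)
```

For MNIST itself, the same loop can be run with the notebook's `X_train`, `X_test`, `y_train`, `y_test` and larger component values; expect the SVC to take much longer there.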

# Exercise 2:  Alternative Dimensionality Reduction Technique (t-SNE)
Replace PCA with t-SNE (t-distributed Stochastic Neighbor Embedding) for dimensionality reduction in the MNIST classification example.

Compare the results (accuracy, time) of t-SNE with PCA. Analyze the differences in their performance, focusing on how well each preserves local neighborhood structures.

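**Note:** t-SNE is mostly used for visualization, but you can experiment with it as a dimensionality reduction method. It is computationally more expensive than PCA, and scikit-learn's `TSNE` has no `transform` method, so a fitted embedding cannot be applied directly to new test data.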
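Since `TSNE` only offers `fit_transform`, one workaround (an assumption of this sketch, not the only option) is to embed a subsample of all images at once and split into train and test afterwards. The labels are never shown to t-SNE, but the test images do influence the embedding, so the resulting accuracy is not directly comparable to the PCA pipeline. The bundled 8x8 digits dataset is used here as a small stand-in for MNIST to keep the runtime short.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): first 500 images of the 8x8 digits dataset
digits = load_digits()
X_small = digits.data[:500] / 16.0
y_small = digits.target[:500]

# Embed every sample in 2D in a single call; there is no separate transform step
tsne = TSNE(n_components=2, init="pca", random_state=42)
X_embedded = tsne.fit_transform(X_small)

# Split the embedded points only after the embedding has been computed
Xe_train, Xe_test, ye_train, ye_test = train_test_split(
    X_embedded, y_small, test_size=0.2, random_state=42
)

clf_tsne = SVC(random_state=42).fit(Xe_train, ye_train)
tsne_accuracy = accuracy_score(ye_test, clf_tsne.predict(Xe_test))
print(f"Accuracy on the t-SNE embedding: {tsne_accuracy:.4f}")
```

On the full MNIST training set this approach becomes slow, so subsampling (for example a few thousand images) is a reasonable way to run the comparison.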

# Exercise 3:  Applying LDA to MNIST
Apply Linear Discriminant Analysis (LDA) to reduce the dimensionality of the MNIST dataset before classification.

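Compare the accuracy and computational time with the previous methods (PCA and t-SNE). LDA is a supervised method: it uses class labels during dimensionality reduction, and it can produce at most (number of classes minus 1) components, which is 9 for the ten MNIST digits.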

Does LDA's use of class labels improve classification accuracy, and how much slower is it?

LDA is often better for classification tasks because it maximizes class separability.
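A minimal sketch of LDA as a supervised reduction step is shown below. It assumes scikit-learn's `LinearDiscriminantAnalysis` and, to stay fast, uses the bundled 8x8 digits dataset as a stand-in for MNIST; the variable names are illustrative.

```python
import time
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): 8x8 digits instead of 28x28 MNIST
digits = load_digits()
Xl_train, Xl_test, yl_train, yl_test = train_test_split(
    digits.data / 16.0, digits.target, test_size=0.2, random_state=42
)

start = time.time()
# With 10 classes, LDA can return at most 9 discriminant components
lda = LinearDiscriminantAnalysis(n_components=9)
# Unlike PCA, LDA needs the labels when fitting
Xl_train_lda = lda.fit_transform(Xl_train, yl_train)
Xl_test_lda = lda.transform(Xl_test)

clf_lda = SVC(random_state=42).fit(Xl_train_lda, yl_train)
lda_accuracy = accuracy_score(yl_test, clf_lda.predict(Xl_test_lda))
lda_seconds = time.time() - start

print(f"Accuracy with LDA: {lda_accuracy:.4f}, Time taken: {lda_seconds:.2f} seconds")
```

Pixels that are constant across images may trigger a collinearity warning from LDA; the transformation is still computed.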

# Exercise 4: Feature Scaling and Dimensionality Reduction
Experiment with different feature scaling methods (e.g., MinMaxScaler, StandardScaler) before applying PCA or other dimensionality reduction techniques.
Evaluate how feature scaling affects the performance of dimensionality reduction methods (accuracy and training time).  For example:

**from sklearn.preprocessing import StandardScaler**

**scaler = StandardScaler()**

**X_train_scaled = scaler.fit_transform(X_train)**

**X_test_scaled = scaler.transform(X_test)**
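The effect of scaling is easiest to see on data whose features have very different ranges. The sketch below is an illustration under stated assumptions: it uses scikit-learn's bundled wine dataset (`load_wine`) rather than the notebook's datasets, and wraps each variant in a pipeline so the scaler and PCA are fitted on the training split only.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): wine features span very different numeric ranges
wine = load_wine()
Xw_train, Xw_test, yw_train, yw_test = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=42, stratify=wine.target
)

pipelines = {
    "no scaling": make_pipeline(PCA(n_components=2), SVC(random_state=42)),
    "StandardScaler": make_pipeline(StandardScaler(), PCA(n_components=2), SVC(random_state=42)),
    "MinMaxScaler": make_pipeline(MinMaxScaler(), PCA(n_components=2), SVC(random_state=42)),
}

scaling_results = {}
for name, pipe in pipelines.items():
    pipe.fit(Xw_train, yw_train)
    scaling_results[name] = {
        "accuracy": pipe.score(Xw_test, yw_test),
        # Share of total variance captured by the first principal component
        "first_pc_share": pipe.named_steps["pca"].explained_variance_ratio_[0],
    }

for name, res in scaling_results.items():
    print(f"{name}: accuracy={res['accuracy']:.4f}, first PC share={res['first_pc_share']:.4f}")
```

Without scaling, the first component is dominated by whichever feature has the largest numeric range, which is why PCA on unscaled data can discard useful information. The same pattern applies to the California housing features.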

# Exercise 5:  Kernel PCA
Use `KernelPCA` instead of standard PCA. Experiment with different kernels (linear, rbf, poly) and observe their impact on MNIST classification. Compare results against those from regular PCA.

Kernel PCA can capture non-linear relationships in the data.
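A possible starting point for comparing kernels is sketched below. Kernel PCA builds an n_samples by n_samples kernel matrix, so running it on all 60,000 MNIST training images is usually impractical; this sketch assumes the bundled 8x8 digits dataset as a small stand-in, and 20 components is an arbitrary illustrative choice.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): 8x8 digits, pixel values normalized to [0, 1]
digits = load_digits()
Xk_train, Xk_test, yk_train, yk_test = train_test_split(
    digits.data / 16.0, digits.target, test_size=0.2, random_state=42
)

kpca_results = {}
for kernel in ["linear", "rbf", "poly"]:
    # Kernel hyperparameters (gamma, degree, coef0) are left at their defaults
    kpca = KernelPCA(n_components=20, kernel=kernel, random_state=42)
    Xk_train_kpca = kpca.fit_transform(Xk_train)
    Xk_test_kpca = kpca.transform(Xk_test)

    clf_kpca = SVC(random_state=42).fit(Xk_train_kpca, yk_train)
    kpca_results[kernel] = accuracy_score(yk_test, clf_kpca.predict(Xk_test_kpca))

for kernel, acc in kpca_results.items():
    print(f"KernelPCA ({kernel}): accuracy={acc:.4f}")
```

With the linear kernel, Kernel PCA is equivalent to standard PCA, which makes it a useful sanity check before trying the non-linear kernels or tuning `gamma` and `degree`.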
