### 🔻 Dimensionality Reduction with PCA
In this notebook, we demonstrate how to use **PCA (Principal Component Analysis)** to reduce the number of features in a dataset.

We will:
- Load the dataset
- Select numeric features
- Standardize the features
- Visualize the data before PCA
- Apply PCA using `scikit-learn`
- Visualize the data after PCA and check the explained variance

**Goal:** Simplify high-dimensional data while retaining as much information as possible.

In [None]:
# 📦 Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

In [None]:
# 📥 Load dataset
url = 'https://raw.githubusercontent.com/Dr-AlaaKhamis/ISE518/refs/heads/main/5_Datafication/data/historical/historical_record.csv'
df = pd.read_csv(url)
df.columns = df.columns.str.strip()
df.head()

In [None]:
# 🔢 Select numeric features
features = ['temperature', 'vibration', 'humidity', 'pressure', 'energy_consumption', 'predicted_remaining_life']
X = df[features]
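
If the column names are not known in advance, the numeric columns can also be picked by dtype instead of by a hand-written list. The following is a minimal sketch on a made-up toy frame; the `machine_id` column and all values are hypothetical and are not taken from the dataset above.

```python
import pandas as pd

# Hypothetical toy frame: values and the 'machine_id' column are invented
# for illustration; only the numeric column names echo the notebook's features.
toy = pd.DataFrame({
    "machine_id": ["A", "B", "C"],
    "temperature": [70.1, 68.4, 72.9],
    "vibration": [0.02, 0.05, 0.03],
    "humidity": [40, 42, 39],
})

# Keep only integer and float columns; the string column is dropped.
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
X_toy = toy[numeric_cols]
print(numeric_cols)
```

An explicit list, as used in this notebook, is still preferable when some numeric columns (IDs, timestamps stored as integers) should not enter PCA.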

In [None]:
# ⚖️ Standardize the features: PCA is variance-based, so unscaled features with large ranges would dominate
scaler = StandardScaler()
X_scaled = pd.DataFrame(scaler.fit_transform(X), columns=features)
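
To see why standardization matters, here is a small sketch on synthetic data (not the dataset above). Two independent features are drawn on very different scales; the variable names and the scale values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Two independent synthetic features on very different scales (made-up numbers).
small = rng.normal(loc=0.0, scale=1.0, size=500)
large = rng.normal(loc=0.0, scale=1000.0, size=500)
X_demo = np.column_stack([small, large])

# Without scaling, the first component is almost entirely the large-scale feature.
raw_ratio = PCA(n_components=2).fit(X_demo).explained_variance_ratio_

# After standardization each feature has mean 0 and unit variance,
# so neither feature dominates purely because of its units.
X_demo_scaled = StandardScaler().fit_transform(X_demo)
scaled_ratio = PCA(n_components=2).fit(X_demo_scaled).explained_variance_ratio_

print("Explained variance ratio (raw):   ", raw_ratio)
print("Explained variance ratio (scaled):", scaled_ratio)
```

On the raw data the first ratio should be very close to 1, while on the scaled data the two ratios should be roughly balanced, since the features are independent.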

#### 📊 Visualization Before PCA 

In [None]:
# 📊 Pairplot (multi-axis visualization)
sns.pairplot(X_scaled, corner=True, plot_kws={'alpha': 0.5})
plt.suptitle("Multi-Axis Visualization Before PCA", y=1.02)
plt.show()


#### 🔻 Apply PCA
We'll reduce the data from 6 dimensions to 2 dimensions for visualization.

In [None]:
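# Reduce to 2 dimensions so the result can be plotted as a 2D scatter
pca = PCA(n_components=2)
# Alternative: PCA(n_components=0.90) keeps enough components to explain 90% of the variance,
# but the resulting number of components depends on the data and may not be 2.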

X_pca = pca.fit_transform(X_scaled)
print("Number of components:", pca.n_components_)
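
For intuition about what `fit_transform` computes, the projection can be reproduced by hand with NumPy. This is a sketch under stated assumptions, not scikit-learn's actual implementation: it uses synthetic data as a stand-in for `X_scaled`, and it takes the eigen-decomposition route, whereas scikit-learn may use an SVD internally.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Synthetic stand-in for standardized data: 200 samples, 6 correlated features.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X_demo = latent @ mixing + 0.1 * rng.normal(size=(200, 6))

# 1. Center each feature at zero.
X_centered = X_demo - X_demo.mean(axis=0)

# 2. Eigen-decompose the sample covariance matrix (eigh returns ascending order).
cov = np.cov(X_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3. Project the centered data onto the top two eigenvectors.
X_manual = X_centered @ eigvecs[:, :2]

# Compare with scikit-learn. The sign of each component is arbitrary,
# so the comparison uses absolute values.
pca_demo = PCA(n_components=2)
X_sklearn = pca_demo.fit_transform(X_demo)
matches = np.allclose(np.abs(X_manual), np.abs(X_sklearn), atol=1e-6)
print("Manual projection matches scikit-learn up to sign:", matches)
```

The eigenvalues in `eigvals[:2]` correspond to `pca_demo.explained_variance_`, which is what links the projection to the explained-variance figures.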

#### 📉 Visualization After PCA
Now we plot the transformed data in 2D using PCA components.

In [None]:
# Plot after PCA
plt.figure(figsize=(6, 5))
plt.scatter(X_pca[:, 0], X_pca[:, 1], alpha=0.6, color='tomato')
plt.title('Data After PCA Reduction')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.grid(True)
plt.show()

#### 📈 Explained Variance
How much information did we keep by reducing to 2 components?

In [None]:
# Variance explained by each component
print('Explained variance ratio:', pca.explained_variance_ratio_)
print('Total variance retained:', pca.explained_variance_ratio_.sum())
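
A common way to decide how many components to keep is to look at the cumulative explained variance and pick the smallest number that reaches a target. The sketch below does this on synthetic data; the 0.90 threshold and the variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Synthetic data: 6 features driven mostly by 2 latent factors plus small noise.
latent = rng.normal(size=(300, 2))
X_demo = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(300, 6))

# Fit with all components, then accumulate the explained variance ratios.
pca_full = PCA().fit(X_demo)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the threshold.
threshold = 0.90
k = int(np.searchsorted(cumulative, threshold) + 1)

# Passing a float in (0, 1) as n_components asks scikit-learn to make the same choice.
pca_auto = PCA(n_components=threshold).fit(X_demo)

print("Cumulative explained variance:", np.round(cumulative, 3))
print("Components needed (manual):", k)
print("Components chosen by PCA(n_components=0.90):", pca_auto.n_components_)
```

If the chosen number is not 2, a 2D scatter plot shows only part of the retained variance, which is worth stating next to the plot.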

#### ✅ Summary
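- PCA projected the six standardized features onto two principal components.
- The 2D scatter plot makes it easier to spot clusters or other structure that is hard to see across six axes.
- Fewer input dimensions can simplify downstream models and speed up training, at the cost of the variance that is not retained.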
