### 📊 Feature Scaling: Normalization and Standardization
In this notebook, we'll demonstrate how to apply:
- 🔢 **Min-Max Normalization**
- 📏 **Z-score Standardization (Standard Scaling)**

We'll use different libraries to perform these transformations:
- 🐼 `pandas`
- ⚙️ `scikit-learn`
- 💡 (Optional) `scipy` and custom methods

The dataset is `historical_record.csv` from Dr. Alaa Khamis's ISE518 course.

Let's get started!

In [1]:
# 📦 Import required libraries
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

In [2]:
# 📥 Load the dataset
url = 'https://raw.githubusercontent.com/Dr-AlaaKhamis/ISE518/refs/heads/main/5_Datafication/data/historical/historical_record.csv'
df = pd.read_csv(url)
df.columns = df.columns.str.strip()  # remove extra spaces from column names
df.head()

Unnamed: 0,timestamp,machine_id,temperature,vibration,humidity,pressure,energy_consumption,machine_status,anomaly_flag,predicted_remaining_life,failure_type,downtime_risk,maintenance_required
0,2025-01-01 00:00:00,39,78.61,28.65,79.96,3.73,2.16,1,0,106,Normal,0.0,0
1,2025-01-01 00:01:00,29,68.19,57.28,35.94,3.64,0.69,1,0,320,Normal,0.0,0
2,2025-01-01 00:02:00,15,98.94,50.2,72.06,1.0,2.49,1,1,19,Normal,1.0,1
3,2025-01-01 00:03:00,43,90.91,37.65,30.34,3.15,4.96,1,1,10,Normal,1.0,1
4,2025-01-01 00:04:00,8,72.32,40.69,56.71,2.68,0.63,2,0,65,Vibration Issue,0.0,1


#### 🎯 Columns for Scaling
We'll apply scaling to the following **numerical features**:
- `temperature`
- `vibration`
- `humidity`
- `pressure`
- `energy_consumption`
- `predicted_remaining_life`

In [3]:
# 🔢 Select numeric columns for scaling
numeric_cols = ['temperature', 'vibration', 'humidity', 'pressure', 'energy_consumption', 'predicted_remaining_life']
df_numeric = df[numeric_cols]
df_numeric.head()

Unnamed: 0,temperature,vibration,humidity,pressure,energy_consumption,predicted_remaining_life
0,78.61,28.65,79.96,3.73,2.16,106
1,68.19,57.28,35.94,3.64,0.69,320
2,98.94,50.2,72.06,1.0,2.49,19
3,90.91,37.65,30.34,3.15,4.96,10
4,72.32,40.69,56.71,2.68,0.63,65


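#### 🐼 Min-Max Normalization using `pandas`
Min-max normalization rescales each column to the range [0, 1], where $x_{min}$ and $x_{max}$ are that column's minimum and maximum.

**Formula:**
$ x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} $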

In [4]:
# 🐼 Min-Max Normalization using Pandas
df_minmax_pandas = (df_numeric - df_numeric.min()) / (df_numeric.max() - df_numeric.min())
df_minmax_pandas.head()

Unnamed: 0,temperature,vibration,humidity,pressure,energy_consumption,predicted_remaining_life
0,0.498437,0.349454,0.9992,0.6825,0.368889,0.210843
1,0.377822,0.568187,0.1188,0.66,0.042222,0.640562
2,0.733765,0.514096,0.8412,0.0,0.442222,0.036145
3,0.640815,0.418214,0.0068,0.5375,0.991111,0.018072
4,0.425628,0.441439,0.5342,0.42,0.028889,0.128514
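
As a quick sanity check of the formula, here is a minimal sketch of a custom min-max helper applied to toy values (not taken from `historical_record.csv`). The function name `min_max_scale` and the zero-range guard are illustrative assumptions; the notebook's one-liner above would produce `NaN` for a column whose values are all identical.

```python
import pandas as pd

def min_max_scale(frame: pd.DataFrame) -> pd.DataFrame:
    """Rescale every column to [0, 1]; constant columns become 0.0."""
    col_min = frame.min()
    col_range = frame.max() - col_min
    # A constant column has range 0; use 1 instead to avoid 0/0 -> NaN.
    safe_range = col_range.where(col_range != 0, 1.0)
    return (frame - col_min) / safe_range

# Toy values chosen for easy mental arithmetic (hypothetical data).
toy = pd.DataFrame({
    "pressure": [1.0, 3.0, 5.0],
    "constant": [7.0, 7.0, 7.0],
})
scaled = min_max_scale(toy)
print(scaled)
```

With a minimum of 1.0 and a maximum of 5.0, the value 3.0 maps to (3 - 1) / (5 - 1) = 0.5.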


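#### 🐼 Z-score Standardization using `pandas`
**Formula:**
$ x_{z} = \frac{x - \mu}{\sigma} $

Here $\mu$ is the column mean and $\sigma$ is the column standard deviation. Note that `DataFrame.std()` computes the *sample* standard deviation by default (`ddof=1`, dividing by $n - 1$).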

In [5]:
# 🐼 Z-score Standardization using Pandas (std() defaults to ddof=1)
df_zscore_pandas = (df_numeric - df_numeric.mean()) / df_numeric.std()
df_zscore_pandas.head()

Unnamed: 0,temperature,vibration,humidity,pressure,energy_consumption,predicted_remaining_life
0,0.358295,-1.425535,1.729095,0.63311,-0.452331,-0.854768
1,-0.680393,0.484986,-1.319813,0.555012,-1.58496,0.571299
2,2.384834,0.012527,1.181926,-1.735861,-0.198067,-1.434525
3,1.584386,-0.824952,-1.707679,0.129812,1.705059,-1.494499
4,-0.268706,-0.622088,0.118756,-0.278033,-1.63119,-1.127987
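
The choice of divisor in the standard deviation changes the z-scores slightly. The sketch below uses toy values (not from the dataset) to show the difference between the pandas default (`ddof=1`) and the population form (`ddof=0`), which is the form `StandardScaler` uses.

```python
import pandas as pd

# Toy values, not taken from historical_record.csv.
s = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

sample_std = s.std()            # pandas default: ddof=1 (divide by n - 1)
population_std = s.std(ddof=0)  # divide by n, as StandardScaler does

z_sample = (s - s.mean()) / sample_std
z_population = (s - s.mean()) / population_std

print(sample_std, population_std)
print(z_sample.iloc[0], z_population.iloc[0])
```

The two results differ by a constant factor of $\sqrt{n / (n - 1)}$, which approaches 1 as the number of rows grows, so on a large dataset the gap only shows up in the last few decimals.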


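#### ⚙️ Min-Max Normalization using `scikit-learn`
`MinMaxScaler` with its default `feature_range=(0, 1)` applies the same formula, so the output matches the pandas result above.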

In [6]:
# ⚙️ Min-Max Normalization using scikit-learn
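scaler = MinMaxScaler()  # default feature_range=(0, 1)
# fit_transform returns a NumPy array, so restore the column names and index
df_minmax_sk = pd.DataFrame(scaler.fit_transform(df_numeric), columns=numeric_cols, index=df_numeric.index)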
df_minmax_sk.head()

Unnamed: 0,temperature,vibration,humidity,pressure,energy_consumption,predicted_remaining_life
0,0.498437,0.349454,0.9992,0.6825,0.368889,0.210843
1,0.377822,0.568187,0.1188,0.66,0.042222,0.640562
2,0.733765,0.514096,0.8412,0.0,0.442222,0.036145
3,0.640815,0.418214,0.0068,0.5375,0.991111,0.018072
4,0.425628,0.441439,0.5342,0.42,0.028889,0.128514
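
One practical difference from the pandas one-liner is that a fitted scaler stores the learned minimum and maximum, so it can be reused on new rows and inverted. The following is a minimal sketch on toy arrays (hypothetical values, not the notebook's data); the names `train` and `new` are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy values, not taken from historical_record.csv.
train = np.array([[1.0], [3.0], [5.0]])
new = np.array([[2.0], [6.0]])

scaler = MinMaxScaler()   # default feature_range=(0, 1)
scaler.fit(train)         # learns min=1.0 and max=5.0 from `train` only

scaled_new = scaler.transform(new)               # reuses the learned min/max
restored = scaler.inverse_transform(scaled_new)  # maps back to original units

print(scaler.data_min_, scaler.data_max_)
print(scaled_new.ravel())
print(restored.ravel())
```

Values outside the fitted range, such as 6.0 here, land outside [0, 1] because clipping is off by default.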


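#### ⚙️ Z-score Standardization using `scikit-learn`
`StandardScaler` divides by the *population* standard deviation (`ddof=0`), whereas `DataFrame.std()` defaults to `ddof=1`. This is why the values below differ from the pandas result in the last few decimal places.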

In [7]:
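# ⚙️ Z-score Standardization using scikit-learn
scaler = StandardScaler()  # uses the population standard deviation (ddof=0)
# fit_transform returns a NumPy array, so restore the column names and index
df_zscore_sk = pd.DataFrame(scaler.fit_transform(df_numeric), columns=numeric_cols, index=df_numeric.index)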
df_zscore_sk.head()

Unnamed: 0,temperature,vibration,humidity,pressure,energy_consumption,predicted_remaining_life
0,0.358297,-1.425542,1.729103,0.633113,-0.452333,-0.854773
1,-0.680397,0.484988,-1.319819,0.555015,-1.584968,0.571302
2,2.384846,0.012528,1.181932,-1.73587,-0.198068,-1.434532
3,1.584394,-0.824956,-1.707688,0.129813,1.705067,-1.494507
4,-0.268707,-0.622091,0.118757,-0.278034,-1.631198,-1.127992
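
For the optional `scipy` route mentioned in the introduction, `scipy.stats.zscore` is one possibility. The sketch below uses toy values (not from the dataset) and checks that three approaches agree once they all use `ddof=0`; the DataFrame `toy` and its values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.preprocessing import StandardScaler

# Toy values, not taken from historical_record.csv.
toy = pd.DataFrame({
    "temperature": [60.0, 70.0, 80.0, 90.0],
    "pressure": [1.0, 2.0, 4.0, 5.0],
})

z_sklearn = StandardScaler().fit_transform(toy)
z_scipy = stats.zscore(toy.to_numpy(), axis=0, ddof=0)         # column-wise
z_pandas = ((toy - toy.mean()) / toy.std(ddof=0)).to_numpy()   # override ddof=1

print(np.allclose(z_sklearn, z_scipy))
print(np.allclose(z_sklearn, z_pandas))
```

Passing `ddof=0` to `DataFrame.std()` is what brings the pandas version in line with the other two.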
