Practical: Implement feature scaling (standardization and normalization) and a log
transformation on a numerical dataset.
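
As a rough reference for what the three operations in this practical compute, the formulas can be sketched in plain Python. The helper names (`standardize`, `min_max`, `log1p_all`) and the toy column are illustrative assumptions and are not part of the notebook. The population standard deviation is used because `StandardScaler` divides by the ddof=0 estimate, and the 0-to-1 range matches the default `feature_range` of `MinMaxScaler`.

```python
import math
import statistics


def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # ddof=0, the estimate StandardScaler uses
    # Note: a constant column has std == 0 and would divide by zero here.
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum maps to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def log1p_all(values):
    """Element-wise log(1 + x); only defined for x > -1."""
    return [math.log1p(v) for v in values]


# Hypothetical toy column, not taken from the diabetes data.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

z_scores = standardize(sample)
scaled = min_max(sample)
logged = log1p_all(sample)

print(z_scores)
print(scaled)
print(logged)
```

The scikit-learn scalers apply the same arithmetic column by column. They also store the fitted statistics, so the identical transform can be reused on new data.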

In [1]:
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler, MinMaxScaler, PowerTransformer
import pandas as pd
import numpy as np

# Load the built-in diabetes dataset from scikit-learn
data = load_diabetes()
df = pd.DataFrame(data.data, columns=data.feature_names)

# Standardization (z-score): rescale each feature to zero mean and unit variance
scaler_standard = StandardScaler()
df_standardized = pd.DataFrame(scaler_standard.fit_transform(df), columns=df.columns)

# Normalization (min-max scaling): rescale each feature to the range [0, 1]
scaler_minmax = MinMaxScaler()
df_normalized = pd.DataFrame(scaler_minmax.fit_transform(df), columns=df.columns)

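# Log transformation (for skewed data): log1p(x) = log(1 + x) is only defined
# for x > -1. That holds here because every diabetes feature is mean-centered
# and scaled to unit sum of squares, so all values lie strictly inside (-1, 1).
df_transformed = np.log1p(df)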

print("\nOriginal Data (first 5 rows):\n", df.head())
print("\nStandardized Data:\n", df_standardized.head())
print("\nNormalized Data:\n", df_normalized.head())
print("\nLog Transformed Data:\n", df_transformed.head())



Original Data (first 5 rows):
         age       sex       bmi        bp        s1        s2        s3  \
0  0.038076  0.050680  0.061696  0.021872 -0.044223 -0.034821 -0.043401   
1 -0.001882 -0.044642 -0.051474 -0.026328 -0.008449 -0.019163  0.074412   
2  0.085299  0.050680  0.044451 -0.005670 -0.045599 -0.034194 -0.032356   
3 -0.089063 -0.044642 -0.011595 -0.036656  0.012191  0.024991 -0.036038   
4  0.005383 -0.044642 -0.036385  0.021872  0.003935  0.015596  0.008142   

         s4        s5        s6  
0 -0.002592  0.019907 -0.017646  
1 -0.039493 -0.068332 -0.092204  
2 -0.002592  0.002861 -0.025930  
3  0.034309  0.022688 -0.009362  
4 -0.002592 -0.031988 -0.046641  

Standardized Data:
         age       sex       bmi        bp        s1        s2        s3  \
0  0.800500  1.065488  1.297088  0.459841 -0.929746 -0.732065 -0.912451   
1 -0.039567 -0.938537 -1.082180 -0.553505 -0.177624 -0.402886  1.564414   
2  1.793307  1.065488  0.934533 -0.119214 -0.958674 -0.718897 -0.68