# Demo 2: Scaling Features Using StandardScaler and MinMaxScaler from Scikit-learn


## **Scenario: Loan Eligibility Prediction**

A banking institution wants to develop a machine learning model to predict whether a loan applicant is eligible for a loan. The dataset contains customer financial details, such as income, loan amount, credit score, and debt-to-income ratio. However, these features have different scales:

* Income is in dollars and runs from tens to hundreds of thousands (e.g., 30,000 to 150,000).

* Loan Amount ranges from a few thousand to hundreds of thousands.

* Credit Score is typically between 300 and 850.

* Debt-to-Income Ratio is a decimal between 0 and 1.

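Many machine learning models, especially distance-based methods and models trained with gradient descent, perform poorly when features have very different scales, because the features with the largest magnitudes dominate the result. Tree-based models are largely unaffected. Scaling puts the features on a comparable footing, so that each one can contribute fairly and optimization converges more reliably.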
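
To illustrate why scale matters, here is a minimal sketch using three hypothetical applicants (the values are invented for illustration and are not taken from the dataset). On raw values, Euclidean distance is driven almost entirely by income, so applicant A looks closer to B than to C even though B has a much weaker credit profile. After standardization the ordering flips.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical applicants: [Income, LoanAmount, CreditScore, DebtToIncomeRatio]
applicants = np.array([
    [50_000.0, 20_000.0, 700.0, 0.30],  # A
    [51_000.0, 21_000.0, 400.0, 0.90],  # B: similar income, much weaker credit profile
    [60_000.0, 20_000.0, 700.0, 0.30],  # C: same credit profile as A, higher income
])

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(u - v))

# Distances on the raw values: income and loan amount swamp everything else
raw_ab = euclidean(applicants[0], applicants[1])
raw_ac = euclidean(applicants[0], applicants[2])

# Distances after standardizing each column
scaled = StandardScaler().fit_transform(applicants)
scaled_ab = euclidean(scaled[0], scaled[1])
scaled_ac = euclidean(scaled[0], scaled[2])

print(f"Raw:    d(A,B)={raw_ab:.1f}  d(A,C)={raw_ac:.1f}")
print(f"Scaled: d(A,B)={scaled_ab:.2f}  d(A,C)={scaled_ac:.2f}")
```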

## **Objective**
* Apply StandardScaler (Z-score standardization) to rescale each feature to mean 0 and variance 1. This shifts and rescales the values but does not change the shape of the distribution.

* Apply MinMaxScaler (Min-Max normalization) to rescale each feature to the range 0 to 1, preserving the relative spacing between values.

* Compare the effect of different scaling techniques on data distribution and machine learning performance.
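
For reference, the two transformations named above can be written per feature as follows. Here $\mu$ and $\sigma$ denote the column mean and the population standard deviation, and $x_{\min}$, $x_{\max}$ the column minimum and maximum. These symbols are introduced here for illustration and are not part of the original notebook.

$$
z = \frac{x - \mu}{\sigma}
\qquad\qquad
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
$$

For example, a credit score of 575 in a column spanning 300 to 850 maps to $x' = (575 - 300) / (850 - 300) = 0.5$.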

In [None]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

In [None]:
# Load the dataset
df = pd.read_csv("loan_eligibility_dataset.csv")

# Display first few rows to understand the dataset
print("Initial Dataset:\n", df.head())

In [None]:
# Selecting numerical columns for scaling
numerical_features = ["Income", "LoanAmount", "CreditScore", "DebtToIncomeRatio"]

In [None]:
# Extracting only numerical features for scaling
df_numerical = df[numerical_features]

In [None]:
# Initialize StandardScaler
standard_scaler = StandardScaler()

# Apply StandardScaler transformation
df_standard_scaled = standard_scaler.fit_transform(df_numerical)

# Convert the NumPy array back to a DataFrame, keeping column names and the original row index
df_standard_scaled = pd.DataFrame(df_standard_scaled, columns=numerical_features, index=df_numerical.index)

In [None]:
# Display scaled data
print("\nStandard Scaled Data (Z-score normalization):\n", df_standard_scaled.head())
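
As a sanity check, the StandardScaler output can be reproduced by hand. The sketch below uses a small hypothetical sample in place of `loan_eligibility_dataset.csv` (the values are invented) and confirms that each scaled column has mean 0 and standard deviation 1. Note that StandardScaler uses the population standard deviation, which corresponds to `ddof=0` in pandas.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample standing in for the real dataset
sample = pd.DataFrame({
    "Income": [30_000, 60_000, 90_000, 150_000],
    "LoanAmount": [5_000, 40_000, 120_000, 300_000],
    "CreditScore": [320, 580, 700, 845],
    "DebtToIncomeRatio": [0.10, 0.35, 0.50, 0.80],
})

scaled = StandardScaler().fit_transform(sample)

# Same computation by hand: subtract the column mean, divide by the population std
manual = (sample - sample.mean()) / sample.std(ddof=0)

matches_manual = np.allclose(scaled, manual.to_numpy())
column_means = scaled.mean(axis=0)
column_stds = scaled.std(axis=0)

print("Matches manual z-score:", matches_manual)
print("Column means:", np.round(column_means, 6))
print("Column stds: ", np.round(column_stds, 6))
```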

In [None]:
# Initialize MinMaxScaler
minmax_scaler = MinMaxScaler()

# Apply MinMaxScaler transformation
df_minmax_scaled = minmax_scaler.fit_transform(df_numerical)

# Convert the NumPy array back to a DataFrame, keeping column names and the original row index
df_minmax_scaled = pd.DataFrame(df_minmax_scaled, columns=numerical_features, index=df_numerical.index)

In [None]:
# Display scaled data
print("\nMin-Max Scaled Data (Range 0-1):\n", df_minmax_scaled.head())
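
The Min-Max result can be checked in the same way. This sketch again uses a hypothetical sample (invented values) rather than the real dataset. It reproduces the scaling by hand, confirms that every column spans exactly 0 to 1, and shows that `inverse_transform` recovers the original values.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical sample standing in for the real dataset
sample = pd.DataFrame({
    "Income": [30_000, 60_000, 90_000, 150_000],
    "LoanAmount": [5_000, 40_000, 120_000, 300_000],
    "CreditScore": [320, 580, 700, 845],
    "DebtToIncomeRatio": [0.10, 0.35, 0.50, 0.80],
})

scaler = MinMaxScaler()
scaled = scaler.fit_transform(sample)

# Same computation by hand: (x - min) / (max - min), column by column
manual = (sample - sample.min()) / (sample.max() - sample.min())
matches_manual = np.allclose(scaled, manual.to_numpy())

# Map the scaled values back to the original units
restored = scaler.inverse_transform(scaled)
round_trip_ok = np.allclose(restored, sample.to_numpy())

print("Matches manual min-max:", matches_manual)
print("Column minima:", scaled.min(axis=0))
print("Column maxima:", scaled.max(axis=0))
print("Round trip recovers original:", round_trip_ok)
```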

In [None]:
# Save Standard Scaled dataset
df_standard_scaled.to_csv("standard_scaled_loan_data.csv", index=False)

# Save Min-Max Scaled dataset
df_minmax_scaled.to_csv("minmax_scaled_loan_data.csv", index=False)

print("\nScaled datasets saved as 'standard_scaled_loan_data.csv' and 'minmax_scaled_loan_data.csv'")
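
The cells above fit each scaler on the full dataset, which is fine for exploring the transformations. If the scaled features later feed a model that is evaluated on held-out data, a common practice is to fit the scaler on the training split only and reuse its statistics for the test split. The following is a minimal sketch of that pattern, assuming randomly generated data in place of the real file. The split ratio and random seeds are arbitrary choices, not part of the original demo.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Randomly generated stand-in for the loan dataset (hypothetical values)
rng = np.random.default_rng(0)
n_rows = 200
sample = pd.DataFrame({
    "Income": rng.uniform(30_000, 150_000, n_rows),
    "LoanAmount": rng.uniform(5_000, 300_000, n_rows),
    "CreditScore": rng.uniform(300, 850, n_rows),
    "DebtToIncomeRatio": rng.uniform(0, 1, n_rows),
})

train, test = train_test_split(sample, test_size=0.25, random_state=0)

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learn mean and std from training rows only
test_scaled = scaler.transform(test)        # reuse those statistics, no refitting

print("Train shape:", train_scaled.shape, "Test shape:", test_scaled.shape)
print("Test column means (close to 0, but not exactly):", np.round(test_scaled.mean(axis=0), 3))
```

The test-set means are not exactly 0 because the scaler never saw those rows, which is the intended behavior.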