# Day 1: 🔍 Understanding Feature Drift & Data Drift

## 🎯 Goal: Understand what data drift and feature drift are, why they break ML models in production, and how to detect them.

# 🧠 Why Is This Critical?
Once your model is deployed, it works with live data. If that incoming data changes over time, it no longer resembles the data the model was trained on, and accuracy can drop. This change is called drift.

# 🔁 What Is Drift?
| Type              | Definition                                                              | Example                                                 |
| ----------------- | ----------------------------------------------------------------------- | ------------------------------------------------------- |
| **Feature Drift** | A feature's distribution changes between training & production          | "ApplicantIncome" becomes higher in real-time data      |
| **Target Drift**  | The distribution of target variable changes                             | Fewer loans are being approved in recent months         |
| **Concept Drift** | The relationship between X and y changes (data behaves differently now) | Good income ≠ loan approval anymore due to new policies |

# 📊 Real-Life Example
    You trained a model in 2023 on:
    • Income ~ ₹35,000 average
    • Loan approvals: 70%

    Now in 2025:
    • Income ~ ₹55,000 average
    • Loan approvals: 50%
🔥 Your model still predicts approvals the old way, but the rules have changed: that's drift!


# 📦 Key Concepts
1. Feature Distribution Shift
- Measured using the KS test, Population Stability Index (PSI), or Jensen-Shannon distance
- You compare training vs. current values of each feature

2. Target Drift
- Detected by comparing the class proportions of the target in training vs. live data:
```python
train["Loan_Status"].value_counts(normalize=True)
real_time["Loan_Status"].value_counts(normalize=True)
```
3. Concept Drift
- Harder to detect
- Requires monitoring model performance over time (accuracy, precision, etc.)
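
To make the PSI idea concrete, here is a minimal sketch using only the standard library. It is not taken from any particular library: the function name `psi`, the quantile binning, the `eps` floor, and the synthetic income numbers are all assumptions made for illustration. A commonly cited rule of thumb reads PSI below 0.1 as stable and above 0.25 as a major shift, but treat those cut-offs as conventions, not guarantees.

```python
import math
import random


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges are quantiles of `expected` (the training sample), so each
    bin holds roughly the same share of training rows.
    """
    ordered = sorted(expected)
    # Interior quantile cut points: n_bins - 1 edges.
    edges = [ordered[int(len(ordered) * i / n_bins)] for i in range(1, n_bins)]

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Number of edges at or below v = index of the bin v falls into.
            idx = sum(1 for e in edges if v >= e)
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    exp_shares = bin_shares(expected)
    act_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_shares, act_shares))


# Synthetic incomes (hypothetical numbers echoing the example above).
rng = random.Random(0)
train_income = [rng.gauss(35_000, 8_000) for _ in range(5_000)]
same_income = [rng.gauss(35_000, 8_000) for _ in range(5_000)]
shifted_income = [rng.gauss(55_000, 8_000) for _ in range(5_000)]

print(f"PSI, no shift:   {psi(train_income, same_income):.3f}")
print(f"PSI, mean shift: {psi(train_income, shifted_income):.3f}")
```

Binning on the training quantiles keeps the reference shares near 1/n_bins each, which makes the index less sensitive to sparsely populated bins than fixed-width binning.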

# Day 2: Coding Drift Detection Using Evidently

## Basic setup

In [None]:
# Download the dataset from Kaggle (opendatasets prompts for Kaggle credentials)
!pip install opendatasets

Collecting opendatasets
  Downloading opendatasets-0.1.22-py3-none-any.whl.metadata (9.2 kB)
Downloading opendatasets-0.1.22-py3-none-any.whl (15 kB)
Installing collected packages: opendatasets
Successfully installed opendatasets-0.1.22


In [None]:
import pandas as pd
import numpy as np
import opendatasets as od
import seaborn as sns
import matplotlib.pyplot as plt
import sklearn
import joblib

In [None]:
od.download("https://www.kaggle.com/datasets/taweilo/loan-approval-classification-data?select=loan_data.csv")


Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: hemantk777
Your Kaggle Key: ··········
Dataset URL: https://www.kaggle.com/datasets/taweilo/loan-approval-classification-data
Downloading loan-approval-classification-data.zip to ./loan-approval-classification-data


100%|██████████| 751k/751k [00:00<00:00, 391MB/s]







In [None]:
df=pd.read_csv("/content/loan-approval-classification-data/loan_data.csv")

In [None]:
df.head()

Unnamed: 0,person_age,person_gender,person_education,person_income,person_emp_exp,person_home_ownership,loan_amnt,loan_intent,loan_int_rate,loan_percent_income,cb_person_cred_hist_length,credit_score,previous_loan_defaults_on_file,loan_status
0,22.0,female,Master,71948.0,0,RENT,35000.0,PERSONAL,16.02,0.49,3.0,561,No,1
1,21.0,female,High School,12282.0,0,OWN,1000.0,EDUCATION,11.14,0.08,2.0,504,Yes,0
2,25.0,female,High School,12438.0,3,MORTGAGE,5500.0,MEDICAL,12.87,0.44,3.0,635,No,1
3,23.0,female,Bachelor,79753.0,0,RENT,35000.0,MEDICAL,15.23,0.44,2.0,675,No,1
4,24.0,male,Master,66135.0,1,RENT,35000.0,MEDICAL,14.27,0.53,4.0,586,No,1


## Evidently setup

In [None]:
!pip install -U evidently


In [None]:
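# Simulate "reference" vs "current" data with a positional 70/30 split.
# Note: slicing does not inject drift by itself; any drift the report finds
# comes from how the rows happen to be ordered in the CSV.
split = int(len(df) * 0.7)
ref_data = df[:split].copy()   # reference (old/training) data
cur_data = df[split:].copy()   # current ("production") data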

In [None]:
# Build Drift Report
from evidently import Report
from evidently.presets import DataDriftPreset

# Create a Report with the DataDrift preset
report = Report([DataDriftPreset()])

# Run the report; first dataset = current, second = reference
result = report.run(current_data=cur_data, reference_data=ref_data)

# Save to HTML
#result.save_html("loan_drift_report.html")
result  # View in Colab
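
As a cross-check that is independent of Evidently, the two-sample Kolmogorov-Smirnov statistic for a single column can be computed by hand. The sketch below is a plain-Python illustration, and `ks_statistic` is a hypothetical helper name. It returns only the statistic (the largest gap between the two empirical CDFs), not a p-value; `scipy.stats.ks_2samp` provides both if SciPy is available.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the maximum gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Toy samples; with the notebook's data you could pass
# ref_data["person_income"] and cur_data["person_income"] instead.
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

A value near 0 means the two samples have almost the same distribution; a value near 1 means they barely overlap.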

# Day 3: 🔄 Concept Drift & Train-Test Leakage

## Explanation: Concept Drift
# 🧠 What is Concept Drift?
- Concept drift happens when the relationship between the input features (X) and the target variable (y) changes over time.
- The mapping the model learned goes stale: the same feature values no longer lead to the same outcome.
- Example: the lender's rule for "who gets a loan" changes.

# 📉 Why Concept Drift is Dangerous
- Model performance drops even if input looks similar.
- You won’t detect it just by looking at data drift.
- It needs model performance monitoring over time.

# 🛡️ How to Handle Concept Drift

| Strategy                       | Description                                                           |
| ------------------------------ | --------------------------------------------------------------------- |
| **Monitor model accuracy**     | Track performance on new data                                         |
| **Use rolling retraining**     | Retrain model periodically using most recent data                     |
| **Use online learning models** | Models that adapt continuously (e.g., SGDClassifier)                  |
| **Detect drift explicitly**    | Tools like **Evidently**, **River**, or custom statistical monitoring |
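
The first strategy in the table, monitoring accuracy on new data, can be sketched as a rolling window over labelled predictions. Everything here is an assumption for illustration: the class name `RollingAccuracyMonitor`, the window size, the baseline value, and the allowed drop are hypothetical and would need tuning for a real system, where true labels often arrive with a delay.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent `window` labelled predictions."""

    def __init__(self, baseline_accuracy, window=500, max_drop=0.05):
        self.baseline = baseline_accuracy
        self.max_drop = max_drop
        self.hits = deque(maxlen=window)  # 1 = correct, 0 = wrong

    def update(self, y_true, y_pred):
        self.hits.append(int(y_true == y_pred))

    @property
    def accuracy(self):
        return sum(self.hits) / len(self.hits) if self.hits else None

    def drift_suspected(self):
        # Only judge once the window is full, to avoid noisy early alarms.
        if len(self.hits) < self.hits.maxlen:
            return False
        return self.accuracy < self.baseline - self.max_drop


# Tiny demo with a hypothetical baseline and a window of 4 predictions.
monitor = RollingAccuracyMonitor(baseline_accuracy=0.93, window=4, max_drop=0.05)
for y_true, y_pred in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.update(y_true, y_pred)
print(monitor.accuracy, monitor.drift_suspected())
```

A sustained alarm from a monitor like this, while the feature distributions look unchanged, is the typical signature of concept drift and a reasonable trigger for retraining.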

In [None]:
# Compare Target Distribution
print("Train Target Distribution:")
print(ref_data['loan_status'].value_counts(normalize=True))


print("\nProduction Target Distribution:")
print(cur_data['loan_status'].value_counts(normalize=True))

Train Target Distribution:
loan_status
0    0.781834
1    0.218166
Name: proportion, dtype: float64

Production Target Distribution:
loan_status
0    0.768313
1    0.231687
Name: proportion, dtype: float64


## Detecting Train-Test Leakage
🧠 If any feature correlates very strongly with the target (e.g. |r| above 0.9, or exactly 1.0), it may be leaking the label.
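
The heuristic above can be written as a small check that flags any feature whose absolute correlation with the target reaches a threshold. This is a sketch on toy data in plain Python; `pearson`, `flag_leaky_features`, the 0.9 threshold, and the `approved_flag` column are hypothetical. A high correlation is only a hint: a flagged feature still needs a manual look at how and when it is recorded.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def flag_leaky_features(columns, target, threshold=0.9):
    """Return {feature: correlation} for features with |r| >= threshold."""
    flagged = {}
    for name, values in columns.items():
        r = pearson(values, target)
        if abs(r) >= threshold:
            flagged[name] = round(r, 3)
    return flagged


# Toy data: `approved_flag` is a hypothetical column that copies the target.
target = [1, 0, 1, 1, 0, 0, 1, 0]
columns = {
    "approved_flag": [1, 0, 1, 1, 0, 0, 1, 0],
    "loan_int_rate": [16.0, 11.1, 12.9, 15.2, 10.5, 9.8, 14.3, 13.0],
}
print(flag_leaky_features(columns, target))
```

Using the absolute value matters: a feature with a correlation near -1 is just as suspicious as one near +1.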

In [None]:
# Check correlation with target
correlation = df.corr(numeric_only=True)['loan_status'].sort_values(ascending=False)
print(correlation)

loan_status                   1.000000
loan_percent_income           0.384880
loan_int_rate                 0.332005
loan_amnt                     0.107714
credit_score                 -0.007647
cb_person_cred_hist_length   -0.014851
person_emp_exp               -0.020481
person_age                   -0.021476
person_income                -0.135808
Name: loan_status, dtype: float64


In [None]:
# Check correlation of all features (including categoricals) with the target
leak_test = df.copy()

# One-hot encode categorical columns so they can enter the correlation
leak_test_encoded = pd.get_dummies(leak_test, drop_first=True)

correlations = leak_test_encoded.corr()['loan_status'].sort_values(ascending=False)
print("Top correlated features with Loan_Status:")
# head(10) shows only the largest positive values; also inspect .tail(10)
# (or sort by absolute value) to catch strong negative correlations
print(correlations.head(10))


Top correlated features with Loan_Status:
loan_status                    1.000000
loan_percent_income            0.384880
loan_int_rate                  0.332005
person_home_ownership_RENT     0.255239
loan_amnt                      0.107714
loan_intent_MEDICAL            0.065195
loan_intent_HOMEIMPROVEMENT    0.033838
person_home_ownership_OTHER    0.013645
person_education_Bachelor      0.004728
person_education_Doctorate     0.001833
Name: loan_status, dtype: float64


## Preventing Pipeline Leakage

In [None]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# Features: all encoded columns except the target
X = leak_test_encoded.drop(columns='loan_status')
y = leak_test_encoded['loan_status']

# Split first, so the test rows stay unseen by every fitted step
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
# use_label_encoder is no longer used by recent XGBoost versions, so it is omitted
xgb = XGBClassifier(n_estimators=100, eval_metric='logloss', random_state=42)

# The scaler lives inside the pipeline, so it is fit on X_train only
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('model', xgb)
])

pipe.fit(X_train, y_train)
print("Pipeline Accuracy:", pipe.score(X_test, y_test))

Pipeline Accuracy: 0.9335555555555556
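
Why does putting the scaler inside the pipeline matter? Calling `pipe.fit(X_train, y_train)` learns the scaling statistics from the training rows only and then reuses them on the test rows. The plain-Python sketch below illustrates the difference with made-up incomes; `fit_scaler` and `transform` are hypothetical helpers that mimic standard scaling, not scikit-learn functions.

```python
import statistics


def fit_scaler(values):
    """Learn mean and (population) standard deviation from the given rows only."""
    return statistics.fmean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Apply previously learned statistics to any rows."""
    return [(v - mean) / std for v in values]


# Hypothetical incomes; the last two rows play the role of the test set.
incomes = [12_000, 25_000, 40_000, 66_000, 72_000, 80_000]
train, test = incomes[:4], incomes[4:]

# Correct: statistics come from the training rows alone.
train_mean, train_std = fit_scaler(train)
test_scaled = transform(test, train_mean, train_std)

# Leaky: statistics computed on all rows let test values shape the features.
leaky_mean, leaky_std = fit_scaler(incomes)

print(f"train-only mean: {train_mean:.0f}, all-rows mean: {leaky_mean:.0f}")
```

The two means differ because the test rows pulled the second one upward, which is exactly the information a model should not see before evaluation.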


# Day 4: 🔁 Model Versioning & Experiment Tracking with MLflow

In [None]:
# Install MLflow
!pip install mlflow --quiet


In [None]:
# Example setup for the MLflow runs (loan dataset)
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd

# df was loaded in the basic setup; one-hot encode its categorical columns
df = pd.get_dummies(df, drop_first=True)

X_1 = df.drop('loan_status', axis=1)
y_1 = df['loan_status']
x_train, x_test, y_train, y_test = train_test_split(X_1, y_1, test_size=0.2, stratify=y_1, random_state=42)

# Scaled copies for scale-sensitive models; the scaler is fit on the training rows only.
# Note: the runs below fit on the unscaled x_train / x_test.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(x_train)
X_test_scaled = scaler.transform(x_test)

In [None]:
# Set MLflow tracking to local folder
mlflow.set_tracking_uri("file:/content/mlruns")
mlflow.set_experiment("loan_approval_experiment")


2025/07/23 18:22:11 INFO mlflow.tracking.fluent: Experiment with name 'loan_approval_experiment' does not exist. Creating a new experiment.


<Experiment: artifact_location='file:///content/mlruns/229885467265834607', creation_time=1753294931659, experiment_id='229885467265834607', last_update_time=1753294931659, lifecycle_stage='active', name='loan_approval_experiment', tags={}>

In [None]:
# End any previous run if still active
mlflow.end_run()

In [None]:
# Train + Log with MLflow
with mlflow.start_run(run_name="RandomForest"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(x_train, y_train)

    y_pred = model.predict(x_test)
    acc = accuracy_score(y_test, y_pred)

    mlflow.log_param("model_type", "RandomForest")
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)

    mlflow.sklearn.log_model(model, "model")

    print(f"✅ Model trained. Accuracy: {acc:.4f}")




✅ Model trained. Accuracy: 0.9286


In [None]:
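# Try model 2: logistic regression.
# It is fit on the unscaled features, which likely explains the lbfgs convergence
# warning below; fitting on X_train_scaled / X_test_scaled or raising max_iter
# are the remedies the warning itself suggests.
with mlflow.start_run(run_name="LogisticRegression"):
    model = LogisticRegression()
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    acc = accuracy_score(y_test, y_pred)
    mlflow.log_param("model_type", "LogisticRegression")
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")
    print(f"✅ Model trained. Accuracy: {acc:.4f}")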


lbfgs failed to converge (status=1):
STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression



✅ Model trained. Accuracy: 0.8229


In [None]:
# ✅ View results from MLflow experiments
experiment = mlflow.get_experiment_by_name("loan_approval_experiment")
runs_df = mlflow.search_runs(experiment_ids=[experiment.experiment_id])

# Show key results
runs_df[['run_id', 'params.model_type', 'params.n_estimators', 'metrics.accuracy']]


Unnamed: 0,run_id,params.model_type,params.n_estimators,metrics.accuracy
0,6b6417bfe72a4bfc9f5b3869b14c382a,LogisticRegression,,0.822889
1,ddf78fc488014094943e3174d64d88bc,LogisticRegression,,0.928556
2,f4cbc69900cb4d5f8cbc64bc9761b54e,RandomForest,100.0,0.928556
3,feb11ed8c1c34f2f97b26b0bf6fbc0f0,LogisticRegression,,0.928556
4,f4833e2124634686bedd6ef1a6253574,RandomForest,100.0,0.928556
5,7008c4a858be4d7eb1e7dcc4753b6feb,,,
6,f6145cfae7e645d482e5be19e54bf420,,,


In [None]:
# Sort runs by accuracy, best first
runs_df.sort_values("metrics.accuracy", ascending=False)

Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.accuracy,params.model_type,params.n_estimators,tags.mlflow.runName,tags.mlflow.source.name,tags.mlflow.source.type,tags.mlflow.user
1,ddf78fc488014094943e3174d64d88bc,229885467265834607,FINISHED,file:///content/mlruns/229885467265834607/ddf7...,2025-07-23 18:39:54.048000+00:00,2025-07-23 18:40:00.418000+00:00,0.928556,LogisticRegression,,LogisticRegression,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root
3,feb11ed8c1c34f2f97b26b0bf6fbc0f0,229885467265834607,FINISHED,file:///content/mlruns/229885467265834607/feb1...,2025-07-23 18:30:13.165000+00:00,2025-07-23 18:39:43.777000+00:00,0.928556,LogisticRegression,,illustrious-bug-735,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root
2,f4cbc69900cb4d5f8cbc64bc9761b54e,229885467265834607,FINISHED,file:///content/mlruns/229885467265834607/f4cb...,2025-07-23 18:39:45.558000+00:00,2025-07-23 18:39:54.031000+00:00,0.928556,RandomForest,100.0,RandomForest,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root
4,f4833e2124634686bedd6ef1a6253574,229885467265834607,FINISHED,file:///content/mlruns/229885467265834607/f483...,2025-07-23 18:23:27.248000+00:00,2025-07-23 18:23:35.876000+00:00,0.928556,RandomForest,100.0,nervous-fowl-431,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root
0,6b6417bfe72a4bfc9f5b3869b14c382a,229885467265834607,FINISHED,file:///content/mlruns/229885467265834607/6b64...,2025-07-23 18:41:35.930000+00:00,2025-07-23 18:41:42.370000+00:00,0.822889,LogisticRegression,,LogisticRegression,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root
5,7008c4a858be4d7eb1e7dcc4753b6feb,229885467265834607,FAILED,file:///content/mlruns/229885467265834607/7008...,2025-07-23 18:23:19.274000+00:00,2025-07-23 18:23:25.228000+00:00,,,,adaptable-fox-584,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root
6,f6145cfae7e645d482e5be19e54bf420,229885467265834607,FAILED,file:///content/mlruns/229885467265834607/f614...,2025-07-23 18:22:48.385000+00:00,2025-07-23 18:22:48.396000+00:00,,,,whimsical-finch-729,/usr/local/lib/python3.11/dist-packages/colab_...,LOCAL,root


In [None]:
# ✅ Export MLflow results to CSV
runs_df.to_csv("/content/mlflow_results.csv", index=False)
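
Once the runs are exported, the best run can be picked without MLflow at hand. The sketch below assumes the CSV keeps the `status`, `run_id`, and `metrics.accuracy` columns seen in the sorted output above, and that failed runs have an empty accuracy cell; `best_finished_run` is a hypothetical helper name.

```python
import csv


def best_finished_run(rows, metric="metrics.accuracy"):
    """Return the FINISHED row with the highest metric, or None if there is none.

    `rows` is any iterable of dicts, e.g. a csv.DictReader over the export.
    """
    best = None
    for row in rows:
        # Skip failed runs and rows where the metric cell is empty.
        if row.get("status") != "FINISHED" or not row.get(metric):
            continue
        if best is None or float(row[metric]) > float(best[metric]):
            best = row
    return best


# Usage with the exported file (path taken from the cell above):
# with open("/content/mlflow_results.csv", newline="") as handle:
#     print(best_finished_run(csv.DictReader(handle)))

# Small in-memory demo with made-up run ids.
demo_rows = [
    {"run_id": "a", "status": "FINISHED", "metrics.accuracy": "0.82"},
    {"run_id": "b", "status": "FAILED", "metrics.accuracy": ""},
    {"run_id": "c", "status": "FINISHED", "metrics.accuracy": "0.93"},
]
print(best_finished_run(demo_rows)["run_id"])
```

When several runs tie on the metric, this version keeps the first one it encounters.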

# Day 5: 🚀 Deploy Your Model as an API Using Flask

In [None]:
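# train model.ipynb: created in Jupyter; trains and saves the full pipeline
import os

import joblib
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# df must be the raw loan DataFrame read from loan_data.csv (before get_dummies)

# Separate target
X = df.drop("loan_status", axis=1)
y = df["loan_status"]

# Column categories (could also be inferred from dtypes)
#categorical_cols = X.select_dtypes(include="object").columns.tolist()
#numerical_cols = X.select_dtypes(exclude="object").columns.tolist()

# The column lists are fixed once, at training time, so the saved pipeline
# always expects the same input schema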
categorical_cols = [
    "person_gender", "person_education", "person_home_ownership",
    "loan_intent", "previous_loan_defaults_on_file"
]

numerical_cols = [
    "person_age", "person_income", "person_emp_exp",
    "loan_amnt", "loan_int_rate", "loan_percent_income",
    "cb_person_cred_hist_length", "credit_score"
]


# Split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

# Define transformers
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), numerical_cols),
    ("cat", OneHotEncoder(drop="first", handle_unknown='ignore'), categorical_cols)
])

# Create pipeline
pipeline = Pipeline([
    ("preprocess", preprocessor),
    ("model", LogisticRegression())
])

# Train
pipeline.fit(x_train, y_train)

# Save full pipeline (includes scaler + encoder + model)
save_path = r"D:\Machine learning\Machine-Learning-Projects-Showcase-\AdvancedML & Feature Drift\deploy\deployment"
os.makedirs(save_path, exist_ok=True)
joblib.dump(pipeline, os.path.join(save_path, "loan_pipeline.pkl"))


In [None]:
# app.py: written in Colab; save this cell as a standalone .py file

from flask import Flask, request, render_template
import joblib
import pandas as pd

app = Flask(__name__)
pipeline = joblib.load("loan_pipeline.pkl")

# 🏠 Route to load HTML form
@app.route('/')
def home():
    return render_template('index.html')  # renders HTML form

# 📤 Route to handle prediction from form
@app.route('/predict', methods=['POST'])
def predict():
    try:
        data = request.form.to_dict()  # Get form data as dictionary

        # Convert numeric fields; form values always arrive as strings
        for key in data:
            try:
                data[key] = float(data[key])
            except ValueError:
                pass  # Leave as string for categorical columns

        df = pd.DataFrame([data])  # Single row DataFrame
        prediction = pipeline.predict(df)
        result = int(prediction[0])
        return render_template('index.html', prediction=result)

    except Exception as e:
        return render_template('index.html', prediction=f"Error: {e}")

if __name__ == '__main__':
    app.run(debug=True)
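
The numeric conversion inside `predict()` can be pulled out into a small helper and checked without starting Flask. The sketch below is one way to do it, assuming the numeric fields are exactly those listed in `numerical_cols` in the training cell; `NUMERIC_FIELDS` and `coerce_form_values` are hypothetical names, not part of the app above.

```python
NUMERIC_FIELDS = {
    "person_age", "person_income", "person_emp_exp",
    "loan_amnt", "loan_int_rate", "loan_percent_income",
    "cb_person_cred_hist_length", "credit_score",
}


def coerce_form_values(form):
    """Convert known numeric fields to float; leave categorical fields as strings.

    Raises ValueError naming the field if a numeric field is not a number.
    """
    cleaned = {}
    for key, value in form.items():
        if key in NUMERIC_FIELDS:
            try:
                cleaned[key] = float(value)
            except ValueError:
                raise ValueError(f"{key} must be a number, got {value!r}") from None
        else:
            cleaned[key] = value
    return cleaned


form = {"person_age": "22", "person_gender": "female", "credit_score": "561"}
print(coerce_form_values(form))
```

Compared with trying `float()` on every field, an explicit list means a mistyped income surfaces as a clear error message instead of reaching the pipeline as a string.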


## Open terminal inside the /deployment/ folder:
    python app.py  #run app file

Running on http://127.0.0.1:5000/

## Test

In [None]:
# Test client: run it on the same machine as the Flask app, since the URL is localhost
import requests
from bs4 import BeautifulSoup

url = "http://127.0.0.1:5000/predict"

In [None]:
# First row of df.head(); expected prediction: 1
sample_input = {
  "person_age": 22.0,
  "person_gender": "female",
  "person_education": "Master",
  "person_income": 71948.0,
  "person_emp_exp": 0,
  "person_home_ownership": "RENT",
  "loan_amnt": 35000.0,
  "loan_intent": "PERSONAL",
  "loan_int_rate": 16.02,
  "loan_percent_income": 0.49,
  "cb_person_cred_hist_length": 3.0,
  "credit_score": 561,
  "previous_loan_defaults_on_file": "No"
}

response = requests.post(url, data=sample_input)

soup = BeautifulSoup(response.text, "html.parser")
prediction = soup.find("h3")  # or whatever tag your HTML uses
print("Prediction from Web App:", prediction.text if prediction else "Not Found")

# Day 6: Build an interactive Streamlit Web App

In [None]:
# 📄 app.py

import streamlit as st
import joblib
import numpy as np
import pandas as pd

In [None]:
# Load the full pipeline
pipeline = joblib.load("loan_pipeline.pkl")

In [None]:
# streamlit setup
st.title("🏦 Loan Approval Prediction App")
st.markdown("🔍 Enter loan application details:")

# Input Fields
person_age = st.number_input("👤 Age", min_value=18, max_value=100, value=30)
person_gender = st.selectbox("👩‍🦰 Gender", ["female", "male"])
person_education = st.selectbox("🎓 Education", ['Master', 'High School', 'Bachelor', 'Associate', 'Doctorate', "Other"])
person_income = st.number_input("💰 Income", min_value=0.0, step=1000.0)
person_emp_exp = st.number_input("📊 Employment Experience (Years)", min_value=0, max_value=50)
person_home_ownership = st.selectbox("🏠 Home Ownership", ["RENT", "OWN", "MORTGAGE", "OTHER"])

loan_amnt = st.number_input("💸 Loan Amount", min_value=1000.0, step=1000.0)
loan_intent = st.selectbox("📄 Loan Intent", ["EDUCATION", "MEDICAL", "VENTURE", "PERSONAL", "HOMEIMPROVEMENT", "DEBTCONSOLIDATION"])
loan_int_rate = st.number_input("📈 Interest Rate (%)", min_value=0.0, max_value=100.0, step=0.1)
loan_percent_income = st.number_input("📉 Loan % of Income", min_value=0.0, max_value=1.0, step=0.01)

cb_person_cred_hist_length = st.number_input("📜 Credit History Length (Years)", min_value=0.0, max_value=100.0, step=0.1)
credit_score = st.number_input("📊 Credit Score", min_value=300, max_value=850, step=1)

previous_loan_defaults_on_file = st.selectbox("📂 Previous Loan Defaults?", ["Yes", "No"])


In [None]:
# Predict button
if st.button("🔮 Predict Loan Approval"):
    input_dict = {
        "person_age": person_age,
        "person_gender": person_gender,
        "person_education": person_education,
        "person_income": person_income,
        "person_emp_exp": person_emp_exp,
        "person_home_ownership": person_home_ownership,
        "loan_amnt": loan_amnt,
        "loan_intent": loan_intent,
        "loan_int_rate": loan_int_rate,
        "loan_percent_income": loan_percent_income,
        "cb_person_cred_hist_length": cb_person_cred_hist_length,
        "credit_score": credit_score,
        "previous_loan_defaults_on_file": previous_loan_defaults_on_file
    }

    # Convert to DataFrame
    input_df = pd.DataFrame([input_dict])

    # Predict
    prediction = pipeline.predict(input_df)[0]

    if prediction == 1:
        st.success("✅ Loan Approved!")
    else:
        st.error("❌ Loan Not Approved.")
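
In the rows shown by `df.head()` earlier, `loan_percent_income` matches `loan_amnt / person_income` rounded to two decimals. Assuming that holds across the dataset (worth verifying before relying on it), the app could derive the value instead of asking the user for it, which keeps the three inputs consistent. `derive_loan_percent_income` is a hypothetical helper, not part of the app above.

```python
def derive_loan_percent_income(loan_amnt, person_income):
    """Loan amount as a fraction of income, rounded to two decimals."""
    if person_income <= 0:
        raise ValueError("person_income must be positive")
    return round(loan_amnt / person_income, 2)


# Values from the first two rows of df.head() shown earlier.
print(derive_loan_percent_income(35000.0, 71948.0))
print(derive_loan_percent_income(1000.0, 12282.0))
```

Note that the derived value can exceed 1.0 when the loan is larger than the income, while the input widget above caps the field at 1.0.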

## ✅ Steps: Run the App
- Open a terminal in this folder and run:

```
streamlit run app.py
```
- Visit: 📍 http://localhost:8501

**You’ll see a full web UI for your ML model, no HTML/CSS needed! 🤯**