# Business Rules & Decision Logic

This notebook implements the final stage of the loan approval pipeline.
While a machine learning (ML) model provides a statistical estimate of credit risk, financial institutions must also enforce "Hard Rules" (business logic) for regulatory compliance and risk management. Here those rules are applied on top of the model predictions, and a triggered rule always overrides the model.

In [1]:
import pandas as pd
import numpy as np
import joblib
import os

In [2]:
# Load preprocessed test data
test_df = pd.read_csv("../data/processed/test_data.csv")
X_test = test_df.drop("approval", axis=1)
y_test = test_df["approval"]

# Load both models trained in the previous step
lr_model = joblib.load("../results/logistic_model.pkl")
rf_model = joblib.load("../results/random_forest_model.pkl")
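
The rule function defined later in this notebook reads features with `row.get(name, 0)`, so a column that is missing or renamed during preprocessing would silently disable the corresponding rule instead of raising an error. A minimal sketch of a guard, assuming the rule inputs are named `credit_score`, `debt`, `income`, and `prior_default_1` as in that function. `REQUIRED_RULE_COLUMNS` and `missing_rule_columns` are hypothetical names introduced here, and the column list is toy data:

```python
# Hypothetical helper: report which rule inputs are absent from the feature table.
REQUIRED_RULE_COLUMNS = ("credit_score", "debt", "income", "prior_default_1")

def missing_rule_columns(columns, required=REQUIRED_RULE_COLUMNS):
    """Return the required rule columns that do not appear in `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]

# Toy column list for illustration only; in the notebook this would be X_test.columns.
example_columns = ["income", "credit_score", "debt", "age"]
missing = missing_rule_columns(example_columns)
print("Missing rule columns:", missing)
```

Running such a check before applying the rules would make it clear whether a rule that never fires is inactive because of the data or because its column is absent.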

In [3]:
def apply_regulatory_rules(row, ml_prediction):
    """
    Separates hard business constraints from ML predictions.
    This function is designed to be easily exported to a production script.
    
    Returns:
        tuple: (final_decision, reason)
    """
    # Note: Features like 'income' and 'credit_score' are standardized (z-scores).
    # If a feature is roughly normally distributed, values below -1.0
    # correspond to about the bottom 16% of applicants.
    
    # Rule 1: Minimum Credit Score Floor
    if row.get("credit_score", 0) < -1.5:
        return 0, "Rejected: Credit score below minimum threshold."
    
    # Rule 2: Debt-to-Income Safety Net
    # (Assuming high 'debt' and low 'income' values after scaling)
    if row.get("debt", 0) > 2.0 and row.get("income", 0) < -0.5:
        return 0, "Rejected: High debt-to-income ratio."
    
    # Rule 3: Anti-Fraud / Prior Default
    # Check if prior_default_1 exists (One-Hot Encoded)
    if row.get("prior_default_1", 0) == 1:
        return 0, "Rejected: History of prior default."

    # Final Decision: If no hard rules are triggered, follow the ML Model
    decision = ml_prediction
    reason = "Approved: Meets all criteria." if decision == 1 else "Rejected: ML Model risk assessment."
    
    return decision, reason
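
The percentile comment in the function above can be checked under an explicit assumption: if a standardized feature is roughly normally distributed, the share of applicants below a z-score threshold is given by the standard normal CDF. The sketch below uses only the standard library. The normality assumption is introduced here for illustration, and the real share depends on the actual feature distribution; `share_below` is a hypothetical helper name.

```python
import math

def share_below(z):
    """Standard normal CDF: fraction of a normal population with z-score below `z`."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Thresholds used by the rules above: -1.5 (credit score) and -0.5 (income),
# plus the -1.0 reference point from the comment.
for threshold in (-1.0, -1.5, -0.5):
    print(f"z < {threshold}: {share_below(threshold):.1%}")
```

Under that assumption, -1.0 corresponds to roughly the bottom 16% and the -1.5 floor used in Rule 1 to roughly the bottom 7%.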

In [4]:
# --- 3. Run Decision Pipeline ---

def generate_decisions(model, X_data):
    """Generates a dataframe of final decisions combining ML and Rules."""
    ml_preds = model.predict(X_data)
    
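    results = []
    # Pair each row with its prediction instead of indexing by position
    for (_, row), ml_pred in zip(X_data.iterrows(), ml_preds):
        final_dec, reason = apply_regulatory_rules(row, ml_pred)
        results.append({
            "ML_Prediction": ml_pred,
            "Final_Decision": final_dec,
            "Decision_Reason": reason
        })

    # Reuse the input index so the results align with X_data on assignment
    return pd.DataFrame(results, index=X_data.index)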
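
The row-by-row loop above is easy to read but can be slow on large tables. A vectorized alternative, sketched here under the assumption that all four rule columns are present in the frame, evaluates each rule as a boolean mask and applies the masks in reverse order so that the first rule wins, which mirrors the early-return order of `apply_regulatory_rules`. The function name `generate_decisions_vectorized` and the toy data are illustrative only.

```python
import numpy as np
import pandas as pd

def generate_decisions_vectorized(X_data, ml_preds):
    """Apply the three hard rules as boolean masks; fall back to the ML prediction."""
    ml_preds = np.asarray(ml_preds)
    # (mask, reason) pairs listed in rule priority order.
    rules = [
        ((X_data["credit_score"] < -1.5).to_numpy(),
         "Rejected: Credit score below minimum threshold."),
        (((X_data["debt"] > 2.0) & (X_data["income"] < -0.5)).to_numpy(),
         "Rejected: High debt-to-income ratio."),
        ((X_data["prior_default_1"] == 1).to_numpy(),
         "Rejected: History of prior default."),
    ]
    # Start from the ML-based reason, then overwrite with rule reasons.
    reasons = np.where(
        ml_preds == 1,
        "Approved: Meets all criteria.",
        "Rejected: ML Model risk assessment.",
    )
    rule_hit = np.zeros(len(X_data), dtype=bool)
    # Iterate in reverse so the highest-priority rule is written last and wins.
    for mask, text in reversed(rules):
        reasons = np.where(mask, text, reasons)
        rule_hit |= mask
    return pd.DataFrame(
        {
            "ML_Prediction": ml_preds,
            "Final_Decision": np.where(rule_hit, 0, ml_preds),
            "Decision_Reason": reasons,
        },
        index=X_data.index,
    )

# Toy, already-scaled data for illustration only.
toy = pd.DataFrame({
    "credit_score": [-2.0, 0.3, 0.1, 0.5, -2.0],
    "debt": [0.0, 2.5, 0.2, 0.1, 3.0],
    "income": [0.0, -1.0, 0.4, 0.2, -1.0],
    "prior_default_1": [0, 0, 1, 0, 1],
})
toy_preds = [1, 1, 0, 1, 1]
toy_results = generate_decisions_vectorized(toy, toy_preds)
print(toy_results)
```

The last toy row triggers all three rules and is reported with the credit score reason, matching the order in which the original function returns.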

In [5]:
# Generate results for both models to see the impact
lr_results = generate_decisions(lr_model, X_test)
rf_results = generate_decisions(rf_model, X_test)

In [6]:
# --- 4. Analyze & Compare Impacts ---

print("Impact of Business Rules on Logistic Regression:")
print(lr_results["Decision_Reason"].value_counts())
print("\nImpact of Business Rules on Random Forest:")
print(rf_results["Decision_Reason"].value_counts())

Impact of Business Rules on Logistic Regression:
Decision_Reason
Rejected: ML Model risk assessment.    17
Approved: Meets all criteria.          16
Name: count, dtype: int64

Impact of Business Rules on Random Forest:
Decision_Reason
Rejected: ML Model risk assessment.    17
Approved: Meets all criteria.          16
Name: count, dtype: int64
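
In the output above, every decision reason is one of the two ML-based messages, so none of the three hard rules fired on these 33 test rows, and both models show the same counts. A small helper can make that check explicit by counting the rows where a rule fired and the rows where the final decision differs from the ML prediction. These two numbers can differ, because a rule may fire on a row the model had already rejected. This is a sketch on toy data; `count_overrides` is a hypothetical name, and it assumes the column names produced by `generate_decisions`.

```python
import pandas as pd

# The two reasons that come from the ML branch of apply_regulatory_rules.
ML_REASONS = {
    "Approved: Meets all criteria.",
    "Rejected: ML Model risk assessment.",
}

def count_overrides(results):
    """Count rows where a hard rule fired and rows where it changed the decision."""
    rule_fired = ~results["Decision_Reason"].isin(ML_REASONS)
    changed = results["ML_Prediction"] != results["Final_Decision"]
    return {
        "rows": len(results),
        "rule_fired": int(rule_fired.sum()),
        "decision_changed": int(changed.sum()),
    }

# Toy results for illustration only.
toy_results = pd.DataFrame({
    "ML_Prediction": [1, 0, 1, 0],
    "Final_Decision": [1, 0, 0, 0],
    "Decision_Reason": [
        "Approved: Meets all criteria.",
        "Rejected: ML Model risk assessment.",
        "Rejected: History of prior default.",
        "Rejected: Credit score below minimum threshold.",
    ],
})
summary = count_overrides(toy_results)
print(summary)
```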


In [7]:
# --- 5. Export for Final Evaluation ---

# We will use the Random Forest results as our 'Champion' model decisions
final_df = X_test.copy()
final_df["Actual_Approval"] = y_test.values
final_df["ML_Prediction"] = rf_results["ML_Prediction"]
final_df["Final_Decision"] = rf_results["Final_Decision"]
final_df["Decision_Reason"] = rf_results["Decision_Reason"]

In [8]:

# Also store the actual outcomes under the original 'approval' column name
# (same values as Actual_Approval)
final_df["approval"] = y_test.values

# Save to results
os.makedirs("../results", exist_ok=True)
final_df.to_csv("../results/loan_decisions_with_rules.csv", index=False)

print("\nFinal decisions saved to '../results/loan_decisions_with_rules.csv'")
final_df[["ML_Prediction", "Final_Decision", "Decision_Reason"]].head(10)


Final decisions saved to '../results/loan_decisions_with_rules.csv'


   ML_Prediction  Final_Decision                      Decision_Reason
0              0               0  Rejected: ML Model risk assessment.
1              1               1        Approved: Meets all criteria.
2              0               0  Rejected: ML Model risk assessment.
3              0               0  Rejected: ML Model risk assessment.
4              0               0  Rejected: ML Model risk assessment.
5              1               1        Approved: Meets all criteria.
6              0               0  Rejected: ML Model risk assessment.
7              1               1        Approved: Meets all criteria.
8              0               0  Rejected: ML Model risk assessment.
9              0               0  Rejected: ML Model risk assessment.
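
The exported file holds both the actual outcome and the two decision columns, so a downstream evaluation could compare how often each one matches the label. A minimal sketch on toy data, assuming the column names written above (`Actual_Approval`, `ML_Prediction`, `Final_Decision`); `agreement_rates` is a hypothetical helper, not part of the pipeline.

```python
import pandas as pd

def agreement_rates(df, actual="Actual_Approval"):
    """Fraction of rows where each decision column equals the actual outcome."""
    return {
        col: float((df[col] == df[actual]).mean())
        for col in ("ML_Prediction", "Final_Decision")
    }

# Toy data for illustration only; real values would come from
# loan_decisions_with_rules.csv.
toy = pd.DataFrame({
    "Actual_Approval": [1, 0, 0, 1],
    "ML_Prediction":   [1, 1, 0, 1],
    "Final_Decision":  [1, 0, 0, 1],
})
rates = agreement_rates(toy)
print(rates)
```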
