## Model Deployment
**Objective:**  
- Load the trained model and preprocessing objects.  
- Define functions for preprocessing input data and making predictions.  
- Map predicted default probabilities to credit scores.  
- Test the model with sample input data.  

### Import Libraries

In [175]:
import joblib
import pandas as pd
import numpy as np

### Load the Model and Preprocessing Objects

In [176]:

model = joblib.load('../models/credit_model.pkl')
scaler = joblib.load('../models/scaler.pkl')
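
A scaler applies its fitted statistics by column position, so it can help to confirm the expected column order right after loading. The sketch below assumes the scaler is a scikit-learn transformer that was fitted on a DataFrame, in which case it exposes `feature_names_in_`. `check_scaler_columns` is a hypothetical helper, not part of this notebook, and the stub object only stands in for the loaded scaler.

```python
from types import SimpleNamespace


def check_scaler_columns(scaler, columns):
    """Return True if `columns` matches the order the scaler was fitted on.

    Hypothetical helper. If the scaler does not record feature names
    (for example, it was fitted on a NumPy array), the check cannot be
    made and None is returned.
    """
    fitted = getattr(scaler, "feature_names_in_", None)
    if fitted is None:
        return None
    return list(fitted) == list(columns)


# Stand-in for the loaded scaler (assumption: for illustration only).
stub_scaler = SimpleNamespace(
    feature_names_in_=["age", "DebtRatio", "MonthlyIncome"]
)

print(check_scaler_columns(stub_scaler, ["age", "DebtRatio", "MonthlyIncome"]))  # True
print(check_scaler_columns(stub_scaler, ["DebtRatio", "age", "MonthlyIncome"]))  # False
```

With the real artifact, the second argument would be the list of numerical columns passed to `scaler.transform`.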

### Prepare Sample Input Data

#### Define the Input Schema

In [177]:
feature_names = [
    'RevolvingUtilizationOfUnsecuredLines',
    'age',
    'DebtRatio',
    'MonthlyIncome',
    'NumberOfTime30-59DaysPastDueNotWorse',
    'NumberOfOpenCreditLinesAndLoans',
    'NumberOfTimes90DaysLate',
    'NumberRealEstateLoansOrLines',
    'NumberOfTime60-89DaysPastDueNotWorse',
    'NumberOfDependents',
    'TotalPastDue'  # derived in preprocess_input(); callers do not supply it
]
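
When a DataFrame is built from a dict with an explicit `columns` list, keys that are absent from the dict become NaN rather than raising an error, so a typo in a field name can go unnoticed. A minimal sketch of an up-front check follows. `REQUIRED_FIELDS` and `find_missing_fields` are hypothetical names, and the list assumes that `TotalPastDue` is computed during preprocessing rather than supplied by the caller.

```python
# Raw fields a caller is expected to provide (TotalPastDue is derived later).
REQUIRED_FIELDS = [
    'RevolvingUtilizationOfUnsecuredLines',
    'age',
    'DebtRatio',
    'MonthlyIncome',
    'NumberOfTime30-59DaysPastDueNotWorse',
    'NumberOfOpenCreditLinesAndLoans',
    'NumberOfTimes90DaysLate',
    'NumberRealEstateLoansOrLines',
    'NumberOfTime60-89DaysPastDueNotWorse',
    'NumberOfDependents',
]


def find_missing_fields(input_data, required=REQUIRED_FIELDS):
    """Hypothetical helper: list required keys absent from the input dict."""
    return [name for name in required if name not in input_data]


incomplete = {'age': 45, 'DebtRatio': 0.3}
missing = find_missing_fields(incomplete)
print(len(missing))  # 8
```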

### Define Preprocessing and Prediction Functions

#### Preprocessing Function

In [178]:
def preprocess_input(input_data):
    """
    Preprocess the input data for prediction.

    Args:
        input_data (dict): User input data as a dictionary.

    Returns:
        DataFrame: Preprocessed input data ready for prediction.
    """
    # Convert the input dictionary to a DataFrame
    input_df = pd.DataFrame([input_data], columns=feature_names)

    # Feature Engineering
    # Create 'TotalPastDue'
    input_df['TotalPastDue'] = (
        input_df['NumberOfTime30-59DaysPastDueNotWorse'] +
        input_df['NumberOfTime60-89DaysPastDueNotWorse'] +
        input_df['NumberOfTimes90DaysLate']
    )

    # Handle missing 'MonthlyIncome'
    input_df['MonthlyIncomeMissing'] = input_df['MonthlyIncome'].isna().astype(int)

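    # Fill missing 'MonthlyIncome' with 0. Assign the result back instead of
    # calling fillna(inplace=True) on the column, which pandas deprecates.
    input_df['MonthlyIncome'] = input_df['MonthlyIncome'].fillna(0)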

    # Binning 'age' into 'AgeGroup'
    bins = [0, 30, 50, 100]
    labels = ['Young', 'Adult', 'Senior']
    input_df['AgeGroup'] = pd.cut(input_df['age'], bins=bins, labels=labels)

    # One-Hot Encoding
    input_df = pd.get_dummies(input_df, columns=['AgeGroup'], drop_first=True)

    # Define final feature names after preprocessing
    final_feature_names = [
        'RevolvingUtilizationOfUnsecuredLines',
        'age',
        'NumberOfTime30-59DaysPastDueNotWorse',
        'DebtRatio',
        'MonthlyIncome',
        'NumberOfOpenCreditLinesAndLoans',
        'NumberOfTimes90DaysLate',
        'NumberRealEstateLoansOrLines',
        'NumberOfTime60-89DaysPastDueNotWorse',
        'NumberOfDependents',
        'MonthlyIncomeMissing',
        'TotalPastDue',
        'AgeGroup_Adult',
        'AgeGroup_Senior',
    ]

    # Reorder columns to match training data
    input_df = input_df.reindex(columns=final_feature_names, fill_value=0)

    # Apply scaling to numerical features
    numerical_features = [
        'RevolvingUtilizationOfUnsecuredLines',
        'age',
        'DebtRatio',
        'MonthlyIncome',
        'NumberOfTime30-59DaysPastDueNotWorse',
        'NumberOfOpenCreditLinesAndLoans',
        'NumberOfTimes90DaysLate',
        'NumberRealEstateLoansOrLines',
        'NumberOfTime60-89DaysPastDueNotWorse',
        'NumberOfDependents',
        'TotalPastDue'
    ]

    input_df[numerical_features] = scaler.transform(input_df[numerical_features])

    return input_df
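
`pd.cut` uses right-closed intervals by default, so the bin edges above assign age 30 to `Young` and age 50 to `Adult`, while ages outside (0, 100] receive no label. Because `Young` is the dropped baseline, an unlabeled row ends up with both dummy columns at zero, the same encoding as `Young`. A small sketch of that behavior, using made-up ages:

```python
import pandas as pd

ages = pd.Series([0, 30, 31, 50, 51, 100, 101])  # illustrative values only
groups = pd.cut(ages, bins=[0, 30, 50, 100], labels=['Young', 'Adult', 'Senior'])

# Ages 0 and 101 fall outside (0, 100] and are left unlabeled (NaN).
print(groups.tolist())

# 'Young' is the dropped baseline; unlabeled rows also encode as all zeros.
dummies = pd.get_dummies(groups, prefix='AgeGroup', drop_first=True)
print(list(dummies.columns))  # ['AgeGroup_Adult', 'AgeGroup_Senior']
```

If out-of-range ages can occur in production input, validating `age` before binning avoids silently treating them as `Young`.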

#### Probability to Credit Score Function

In [179]:
def probability_to_score(probability, base_point=600, pdo=50, base_odds=50):
    """
    Convert probability of default to a credit score using logistic scaling.
    
    Args:
        probability (float): Predicted probability of default (between 0 and 1).
        base_point (int): Base score corresponding to the base odds.
        pdo (int): Points to Double the Odds.
        base_odds (float): Base odds of non-default (odds at the base score).
        
    Returns:
        int: Calculated credit score.
    """
    
    # Ensure probability is within bounds
    probability = np.clip(probability, 1e-6, 1 - 1e-6)
    
    # Calculate odds
    odds = (1 - probability) / probability
    
    # Calculate Factor and Offset
    factor = pdo / np.log(2)
    offset = base_point - factor * np.log(base_odds)
    
    # Calculate score
    score = offset + factor * np.log(odds)
    
    return int(round(score))
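
Two properties of this scaling are easy to check by hand: the base odds map to the base score, and each doubling of the non-default odds adds `pdo` points. The sketch below restates the same formula in terms of odds using only the standard library. `score_from_odds` is a hypothetical name introduced for illustration.

```python
import math


def score_from_odds(odds, base_point=600, pdo=50, base_odds=50):
    """Same scaling as above, written in terms of non-default odds."""
    factor = pdo / math.log(2)
    offset = base_point - factor * math.log(base_odds)
    return offset + factor * math.log(odds)


print(round(score_from_odds(50)))   # 600: base odds map to the base score
print(round(score_from_odds(100)))  # 650: doubling the odds adds pdo points
print(round(score_from_odds(25)))   # 550: halving the odds subtracts pdo points
```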

#### Prediction Function

In [180]:
def calculate_credit_score(input_data, base_point=600, pdo=50, base_odds=50):
    """
    Process input data, predict probability of default, and calculate credit score.
    
    Args:
        input_data (dict): User input data as a dictionary.
        base_point (int): Base score corresponding to the base odds.
        pdo (int): Points to Double the Odds.
        base_odds (float): Base odds of non-default.
    
    Returns:
        dict: Results containing probability of default and credit score.
    """
    # Preprocess the input data
    input_df = preprocess_input(input_data)
    
    # Make prediction
    probability = model.predict_proba(input_df)[:, 1][0]  # Probability of default
    
    # Calculate credit score using logistic scaling
    credit_score = probability_to_score(
        probability, base_point=base_point, pdo=pdo, base_odds=base_odds
    )
    
    # Prepare result
    result = {
        'Probability of Default': round(probability, 6),
        'Credit Score': credit_score
    }
    
    return result

### Test the Prediction Function

#### Sample Input Data

In [181]:
sample_input = {
    'RevolvingUtilizationOfUnsecuredLines': 0.5,
    'age': 45,
    'NumberOfTime30-59DaysPastDueNotWorse': 1,
    'DebtRatio': 0.3,
    'MonthlyIncome': 5000,
    'NumberOfOpenCreditLinesAndLoans': 5,
    'NumberOfTimes90DaysLate': 0,
    'NumberRealEstateLoansOrLines': 1,
    'NumberOfTime60-89DaysPastDueNotWorse': 0,
    'NumberOfDependents': 2,
}
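
The sample above supplies every field. To see how an unknown income flows through the missing-value handling in `preprocess_input`, the sketch below applies the same two steps to a made-up record. It assumes the missing value is passed as NaN.

```python
import pandas as pd

# Illustrative record with unknown income (assumption: missing is passed as NaN).
record = {'age': 45, 'MonthlyIncome': float('nan')}
df = pd.DataFrame([record])

# Flag first, then fill, so the indicator still records that income was absent.
df['MonthlyIncomeMissing'] = df['MonthlyIncome'].isna().astype(int)
df['MonthlyIncome'] = df['MonthlyIncome'].fillna(0)

print(df.loc[0, 'MonthlyIncome'], df.loc[0, 'MonthlyIncomeMissing'])
```

The order matters: filling before flagging would leave `MonthlyIncomeMissing` at 0 for every row.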

#### Calculate Credit Score

In [182]:
result = calculate_credit_score(sample_input)

print(f"Probability of Default: {result['Probability of Default']}")
print(f"Calculated Credit Score: {result['Credit Score']}")

Probability of Default: 0.1331620067358017
Calculated Credit Score: 453
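
The reported score can be reproduced from the reported probability with the default parameters, and the mapping can be inverted to recover an approximate probability from a score. The sketch below uses only the standard library. `score_to_probability` is a hypothetical helper that ignores the rounding and clipping applied in `probability_to_score`.

```python
import math


def score_to_probability(score, base_point=600, pdo=50, base_odds=50):
    """Hypothetical inverse of probability_to_score (ignores rounding and clipping)."""
    factor = pdo / math.log(2)
    offset = base_point - factor * math.log(base_odds)
    odds = math.exp((score - offset) / factor)  # non-default odds
    return 1 / (1 + odds)


p = 0.1331620067358017  # probability reported above
factor = 50 / math.log(2)
offset = 600 - factor * math.log(50)
score = offset + factor * math.log((1 - p) / p)

print(round(score))                         # 453
print(round(score_to_probability(453), 3))  # 0.133
```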


