# Part B: Loan Risk Calculator

In this part, we implement functions that compute a risk score for each loan in the dataset `main_loan_base.csv` and classify that score as LOW, MEDIUM, or HIGH.

**Formula:**
```
risk_score = (missed_repayments * 2) + (loan_amount / collateral_value) + (interest / 2)
```

**Classification:**
- Score < 15 → LOW
- Score 15–25 (inclusive) → MEDIUM
- Score > 25 → HIGH
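
To make the arithmetic concrete, here is a minimal sketch that applies the formula and thresholds exactly as stated above to a single loan. The helper names `risk_score` and `risk_level` and the input values are hypothetical, chosen only for illustration; they are not taken from the dataset.

```python
def risk_score(missed_repayments, loan_amount, collateral_value, interest):
    """Apply the formula as written above (no rescaling of any term)."""
    return (missed_repayments * 2) + (loan_amount / collateral_value) + (interest / 2)


def risk_level(score):
    """Map a score to LOW / MEDIUM / HIGH using the stated thresholds."""
    if score < 15:
        return "LOW"
    if score <= 25:
        return "MEDIUM"
    return "HIGH"


# Hypothetical loan: 3 missed repayments, 50,000 borrowed against
# 10,000 of collateral, at 12% interest.
example = risk_score(missed_repayments=3, loan_amount=50_000,
                     collateral_value=10_000, interest=12)
print(example, risk_level(example))  # 6 + 5.0 + 6.0 -> 17.0 MEDIUM
```

The three terms contribute 6, 5, and 6 respectively, so the loan lands in the MEDIUM band.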


In [7]:
import pandas as pd

### Loading the dataset

In [8]:
df = pd.read_csv('datasets/main_loan_base.csv')
df.head()

Unnamed: 0,loan_acc_num,customer_name,customer_address,loan_type,loan_amount,collateral_value,cheque_bounces,number_of_loans,missed_repayments,vintage_in_months,tenure_years,interest,monthly_emi,disbursal_date,default_date
0,LN79307711,Aarna Sura,"09/506, Anand Path, Ongole 646592",Consumer-Durable,21916,4929.47,200,9.17,20638.94,20648.86,1.7,10.1,1012.32,2019-04-14,2020-07-31
1,LN88987787,Amira Konda,"11, Dhaliwal Circle\nRaichur 659460",Two-Wheeler,121184,10254.5,200,6.67,134551.04,134560.46,1.97,11.8,5693.24,2015-04-14,2016-07-30
2,LN78096023,Eshani Khosla,H.No. 31\nAtwal Street\nKatihar-037896,Car,487036,116183.86,200,9.81,490405.74,490410.18,2.43,14.6,16788.02,2015-01-10,2015-04-18
3,LN56862431,Divij Kala,"766, Gulati Marg\nPudukkottai-051396",Two-Wheeler,52125,10310.05,200,9.96,46195.89,46197.25,1.61,9.6,2395.69,2018-02-07,2018-09-13
4,LN77262680,Vaibhav Bir,"55/73, Sachdev Marg\nDharmavaram-332966",Consumer-Durable,8635,1051.25,200,9.01,7813.47,7819.9,1.64,9.6,396.87,2014-12-25,2016-02-20


In [None]:
# Compute the risk score for one row of the DataFrame.
# Note: missed_repayments is divided by 1000 (scaling) because the raw values
# in this dataset are large; this deviates from the formula stated above.
def calculate_risk_score(row):
    # Guard explicitly: NumPy floats give inf, not ZeroDivisionError, on division by zero
    if row['collateral_value'] == 0:
        return None
    return (
        ((row['missed_repayments'] / 1000) * 2)
        + (row['loan_amount'] / row['collateral_value'])
        + (row['interest'] / 2)
    )

# Classify a score into a risk level.
def classify_loan_risk(score):
    # pd.isna covers both None and NaN (pandas stores None as NaN in a float column)
    if pd.isna(score):
        return "INVALID"
    elif score < 15:
        return "LOW"
    elif score <= 25:
        return "MEDIUM"
    else:
        return "HIGH"

In [63]:
df['risk_score'] = df.apply(calculate_risk_score, axis=1)
df['risk_level'] = df['risk_score'].apply(classify_loan_risk)
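
The row-wise `apply` above is easy to read but can be slow on large frames. As an alternative, the same scoring and classification could be sketched with column-wise operations. This assumes NumPy is available and uses a small hypothetical frame `demo` with the same column names as the dataset. It keeps the notebook's division of `missed_repayments` by 1000 and treats zero collateral as invalid.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame; the values are made up for illustration only.
demo = pd.DataFrame({
    "missed_repayments": [2000.0, 9000.0, 500.0, 100.0],
    "loan_amount": [10000, 50000, 8000, 2000],
    "collateral_value": [5000.0, 2500.0, 0.0, 1000.0],
    "interest": [10.0, 12.0, 9.0, 40.0],
})

# Replace zero collateral with NaN so the ratio becomes NaN rather than inf.
collateral = demo["collateral_value"].replace(0, np.nan)
demo["risk_score"] = (
    (demo["missed_repayments"] / 1000) * 2
    + demo["loan_amount"] / collateral
    + demo["interest"] / 2
)

# np.select picks the first matching condition, so order matters here.
score = demo["risk_score"]
demo["risk_level"] = np.select(
    [score.isna(), score < 15, score <= 25],
    ["INVALID", "LOW", "MEDIUM"],
    default="HIGH",
)
print(demo[["risk_score", "risk_level"]])
```

Because the conditions are evaluated in order, a missing score is labeled INVALID before any threshold comparison is made.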

### Sample Output

In [67]:
cols = ['loan_acc_num', 'loan_amount', 'collateral_value', 'missed_repayments',
        'interest', 'risk_score', 'risk_level']
print(df[cols].sample(10, random_state=79))

      loan_acc_num  loan_amount  collateral_value  missed_repayments  \
36901   LN15252116        76323           1195.44           51797.05   
29718   LN37366445        68902          11370.79          111183.19   
21659   LN87581935        76801          10726.08           56096.75   
8172    LN52579019      1870228         559802.35         2514449.00   
430     LN99410639       361728           9324.40          352295.97   
5967    LN44313121         3237            961.61            1570.40   
42493   LN85672805       177803          51363.38           75723.32   
24400   LN34530347       114128           6366.60          156124.09   
40152   LN44772564         8429           1792.96            4695.68   
41969   LN79362062        85869           4531.46           72519.21   

       interest   risk_score risk_level  
36901      14.4   174.639211       HIGH  
29718       9.2   233.025941       HIGH  
21659      11.1   124.903711       HIGH  
8172       14.0  5039.238872       HIGH