# **Title:** Credit Card Default Prediction

### **Objective:**
Predicting credit card defaults is crucial for financial institutions to mitigate risk and make informed lending decisions. The objective of this project is to develop a reliable machine learning model that predicts whether a credit card holder is likely to default on their payments, based on income, age, loan amount, and the loan-to-income ratio.

# **Data Sources:**
The data set consists of 2000 samples from each of two categories, although the cell outputs in this notebook were produced on a five-row extract.
It contains five variables:

*   Income
*   Age
*   Loan
*   Loan to Income (engineered feature)
*   Default
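
The engineered feature can be illustrated with a small sketch. It assumes that "Loan to Income" is simply the loan amount divided by income, which is consistent with the values in the `head()` output of this notebook but is not stated explicitly. The frame name `sample` is hypothetical.

```python
import pandas as pd

# Two rows copied from the head() output of this notebook.
sample = pd.DataFrame({
    "Income": [66155.9251, 34415.15397],
    "Loan": [8106.532131, 6564.745018],
})

# Assumed definition of the engineered feature: loan divided by income.
sample["Loan to income"] = sample["Loan"] / sample["Income"]
print(sample.round(6))
```

The computed ratios agree with the `Loan to income` column to six decimal places, which supports the assumed definition.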

In [None]:
# Step 1 : import library
import pandas as pd

In [None]:
# Step 2 : import data
default = pd.read_csv('/Credit Card Default.csv')

In [None]:
default.head()


        Income        age         Loan  Loan to income  Default
0   66155.9251  59.017015  8106.532131        0.122537        0
1  34415.15397  48.117153  6564.745018        0.190752        0
2  57317.17006  63.108049  8020.953296         0.13994        0
3   42709.5342  45.751972   6103.64226        0.142911        0
4  66952.68885  18.584336  8770.099235         0.13099        1


In [None]:
default.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Income          5 non-null      float64
 1   age             5 non-null      float64
 2   Loan            5 non-null      float64
 3   Loan to income  5 non-null      float64
 4   Default         5 non-null      int64  
dtypes: float64(4), int64(1)
memory usage: 328.0 bytes


In [None]:
default.describe()

             Income        age         Loan  Loan to income   Default
count           5.0        5.0          5.0             5.0       5.0
mean   53510.094436  46.915705  7513.194388        0.145426       0.2
std    14460.143977  17.421954  1126.506989        0.026567  0.447214
min     34415.15397  18.584336   6103.64226        0.122537       0.0
25%      42709.5342  45.751972  6564.745018         0.13099       0.0
50%     57317.17006  48.117153  8020.953296         0.13994       0.0
75%      66155.9251  59.017015  8106.532131        0.142911       0.0
max     66952.68885  63.108049  8770.099235        0.190752       1.0


In [None]:
# Count of each category
default['Default'].value_counts()

0    4
1    1
Name: Default, dtype: int64

In [None]:
# Step 3 : define target (y) and features (X)

In [None]:
default.columns

Index(['Income', 'age', 'Loan ', 'Loan to income', 'Default'], dtype='object')

In [None]:
y = default['Default']

In [None]:
X = default.drop(['Default'],axis=1)

In [None]:
# Step 4 : train test split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, train_size=0.7, random_state=2529)

In [None]:
# check shape of train and test sample
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((3, 4), (2, 4), (3,), (2,))
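
Because the `Default` classes are imbalanced, a plain random split can leave the test set with no defaults at all. One common remedy, sketched here on hypothetical synthetic data (`X_demo` and `y_demo` are not part of this notebook), is to pass `stratify=` to `train_test_split` so that both parts keep roughly the same class ratio. Stratification needs at least two samples per class, so it would not work on a five-row sample containing a single default.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 100 rows, 20 of them labelled as defaults.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = np.array([1] * 20 + [0] * 80)

# stratify=y_demo keeps the share of defaults similar in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, train_size=0.7, random_state=2529, stratify=y_demo
)

# Share of defaults in the training and test parts.
print(y_tr.mean(), y_te.mean())
```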

In [None]:
# Step 5 : select model
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()

In [None]:
# Step 6 : train or fit model
model.fit(X_train,y_train)

In [None]:
model.intercept_

array([-0.00044628])

In [None]:
model.coef_

array([[-1.43846594e-04, -2.43664993e-01,  2.21621126e-03,
        -6.18666460e-05]])
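
The intercept and coefficients define a linear score that the logistic (sigmoid) function turns into a default probability. The sketch below shows that relationship on hypothetical toy data (`X_toy`, `y_toy`, and `clf` are illustrative names, not objects from this notebook) by recomputing the probability by hand and comparing it with `predict_proba`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, only to make the sketch runnable.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(40, 4))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X_toy, y_toy)

# Linear score z = intercept + coef . x, then the sigmoid function.
z = clf.intercept_ + X_toy @ clf.coef_.ravel()
manual_proba = 1.0 / (1.0 + np.exp(-z))

# Compare with the class-1 probability reported by scikit-learn.
print(np.allclose(manual_proba, clf.predict_proba(X_toy)[:, 1]))
```

A positive coefficient raises the predicted probability of class 1 as the feature grows, and a negative one lowers it. Since the features here are on very different scales, coefficient magnitudes are not directly comparable.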

In [None]:
# Step 7 : predict on the test set
y_pred = model.predict(X_test)

In [None]:
y_pred

array([0, 0])

In [None]:
# Step 8 : model accuracy
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

In [None]:
confusion_matrix(y_test,y_pred)

array([[2]])
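
The matrix above is 1x1 because only class 0 appears in both `y_test` and `y_pred`. A minimal sketch of one way to always get the full 2x2 layout is to pass the `labels` parameter of `confusion_matrix`. The lists `y_true_demo` and `y_pred_demo` are hypothetical stand-ins that mirror the two test rows.

```python
from sklearn.metrics import confusion_matrix

# Stand-ins for a test set in which only class 0 occurs.
y_true_demo = [0, 0]
y_pred_demo = [0, 0]

# Listing both classes forces a 2x2 matrix (rows: true, columns: predicted).
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1])
print(cm)
# [[2 0]
#  [0 0]]
```

The empty second row makes it visible that no defaults were present in the test data, so the model's ability to detect them was never measured.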

In [None]:
accuracy_score(y_test,y_pred)

1.0

In [None]:
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00         2

    accuracy                           1.00         2
   macro avg       1.00      1.00      1.00         2
weighted avg       1.00      1.00      1.00         2
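
A perfect score on a test set without any defaults says little about how well defaults are detected. The sketch below uses hypothetical labels (`y_true_imb`, `y_pred_all_zero`) to show how a model that always predicts "no default" can reach high accuracy while having zero recall on the default class, which is why per-class recall is worth checking alongside accuracy.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 8 non-defaults, 2 defaults.
y_true_imb = [0] * 8 + [1] * 2
# A trivial model that always predicts "no default".
y_pred_all_zero = [0] * 10

acc = accuracy_score(y_true_imb, y_pred_all_zero)
# Recall for class 1: share of actual defaults that were caught.
rec = recall_score(y_true_imb, y_pred_all_zero, pos_label=1)
print(acc, rec)  # 0.8 0.0
```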

