The project entails building a model in Python that predicts whether a loan applicant is likely to be a defaulter or a non-defaulter. The independent variables include checking account balance, credit history, purpose, and loan amount, among others. Ensemble models such as Bagging, AdaBoost, Gradient Boosting, XGBoost, and Random Forest will be used for the modelling.
Kindly take note of the libraries and models below:
import warnings
warnings.filterwarnings("ignore")
IMPORT THE FOLLOWING LIBRARIES:
import pandas as pd
import numpy as np
from sklearn import metrics
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
# from sklearn.feature_extraction.text import CountVectorizer  # not used: the decision tree (DT) does not take strings as input for the model fitting step
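The imports above cover a single decision tree only, while the introduction names several ensemble models. As a minimal sketch, the scikit-learn ensembles could be collected as shown here. The helper name `build_ensemble_models` and the seed value are illustrative assumptions, not part of the original project, and all models use default hyperparameters.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)

RANDOM_STATE = 1  # assumed seed, chosen only to make runs repeatable


def build_ensemble_models(random_state=RANDOM_STATE):
    """Return a name -> unfitted classifier mapping (hypothetical helper)."""
    return {
        "Bagging": BaggingClassifier(random_state=random_state),
        "AdaBoost": AdaBoostClassifier(random_state=random_state),
        "GradientBoosting": GradientBoostingClassifier(random_state=random_state),
        "RandomForest": RandomForestClassifier(random_state=random_state),
        # XGBoost lives in the separate `xgboost` package; if it is installed,
        # xgboost.XGBClassifier could be added to this mapping in the same way.
    }


models = build_ensemble_models()
print(sorted(models))
```

Keeping the models in one mapping makes it straightforward to fit and compare them in a single loop later on.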
STEPS TAKEN IN THIS PROJECT:
- Data Collection
- Data Preprocessing
- Exploratory Data Analysis (EDA)
- Feature Engineering
- Data Split (Test and Train)
- Model Selection
- Model Evaluation
- Model Fine-Tuning
- Final Model Testing
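The later steps in the list above (data split, model selection, fine-tuning, and final testing) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the data is synthetic (generated with `make_classification` as a stand-in for the loan dataset), and the parameter grid, the 70/30 split, and the choice of recall as the tuning metric are all assumed for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the loan data (assumption): label 1 plays the
# role of "defaulter" and is the minority class.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.7, 0.3], random_state=1
)

# Data split: hold out 30% for final testing, preserving the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Model selection and fine-tuning: cross-validated grid search on the
# training set only. The grid values are illustrative.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=1),
    param_grid,
    scoring="recall",
    cv=3,
)
search.fit(X_train, y_train)

# Final model testing: evaluate the tuned model once on the held-out set.
y_pred = search.best_estimator_.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
test_recall = recall_score(y_test, y_pred)
print("Best parameters:", search.best_params_)
print("Test accuracy:", round(test_accuracy, 3))
print("Test recall:", round(test_recall, 3))
```

Recall is used here on the assumption that missing a defaulter costs more than flagging a non-defaulter; a different metric can be substituted through the `scoring` argument.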