Loan Repayment Prediction
A comparison of automated feature engineering using Featuretools and manual feature engineering for the Home Credit Default Risk machine learning competition currently running on Kaggle.
The notebooks are as follows:
Manual Loan Repayment.ipynb
Automated Loan Repayment.ipynb
Featuretools on Dask.ipynb
Semi-Automated Loan Repayment.ipynb
utils.py contains a number of useful helper functions, and random_search.py in the scripts directory was used for the random search implementation. To generate the final feature matrix, use the Featuretools on Dask notebook or run the ft.py script. Depending on your system, the notebook can run in a few hours, while the script takes nearly a full day.
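For readers unfamiliar with random search, the general idea can be sketched as follows. This is a minimal, generic illustration and not the contents of random_search.py: the function name `random_search`, the `param_space` values, and `toy_objective` are all hypothetical stand-ins. In practice the objective would be something like a cross-validated model score.

```python
import random


def random_search(objective, param_space, n_iter=20, seed=0):
    """Sample n_iter random configurations and return the best one found."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one candidate value per hyperparameter, independently.
        params = {name: rng.choice(values) for name, values in param_space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space (assumed for illustration only).
param_space = {
    "learning_rate": [0.01, 0.05, 0.1],
    "num_leaves": [16, 32, 64],
}


def toy_objective(params):
    # Stand-in for a real validation score; higher is better.
    return -abs(params["learning_rate"] - 0.05) - abs(params["num_leaves"] - 32) / 100


best_params, best_score = random_search(toy_objective, param_space, n_iter=50, seed=0)
print(best_params, best_score)
```

Unlike a grid search, the number of evaluations here is fixed by `n_iter` rather than by the size of the search space, which is why the approach is often used when each evaluation is expensive.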
The data can be downloaded here.
To run the notebooks, place the following data files in the
The HomeCredit_columns_description.csv file may be helpful, as it contains the data descriptions.