
Loan Repayment Prediction

A comparison of automated feature engineering using Featuretools and manual feature engineering for the Home Credit Default Risk machine learning competition currently running on Kaggle.


The notebooks are as follows:

  1. Manual Loan Repayment.ipynb
  2. Automated Loan Repayment.ipynb
  3. Featuretools on Dask.ipynb
  4. Semi-Automated Loan Repayment.ipynb
  5. Feature Selection.ipynb
  6. Results.ipynb

The repository also contains a number of useful helper functions, and a script in the scripts directory was used for the random search implementation. To generate the final feature matrix, use the Featuretools on Dask notebook or run the script. The script takes nearly a full day to run, while the notebook can run in a few hours, depending on your system.


The data can be downloaded from the Home Credit Default Risk competition page on Kaggle.

To run the notebooks, place the following data files in the input directory: application_train.csv, application_test.csv, bureau.csv, bureau_balance.csv, POS_CASH_balance.csv, credit_card_balance.csv, previous_application.csv, and installments_payments.csv. The HomeCredit_columns_description.csv file may be helpful as it contains the data descriptions.
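
Before starting a long-running notebook, it can help to confirm that all of the data files listed above are in place. The following is a minimal sketch, not part of the repository: the function name `missing_input_files` is hypothetical, and the default path `input` assumes the check is run from the repository root (adjust it if you run from another directory).

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = [
    "application_train.csv",
    "application_test.csv",
    "bureau.csv",
    "bureau_balance.csv",
    "POS_CASH_balance.csv",
    "credit_card_balance.csv",
    "previous_application.csv",
    "installments_payments.csv",
]


def missing_input_files(input_dir="input"):
    """Return the required file names that are not present in input_dir.

    Hypothetical helper; "input" is assumed to be relative to the
    repository root.
    """
    input_path = Path(input_dir)
    return [name for name in REQUIRED_FILES if not (input_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_input_files()
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All required data files are present.")
```

An empty list means every required file was found. The optional HomeCredit_columns_description.csv file is deliberately left out of the check.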
