LandingClubLoanPrediction

Introduction

LendingClub is a US peer-to-peer lending company headquartered in San Francisco, California. It was the first peer-to-peer lender to register its offerings as securities with the Securities and Exchange Commission (SEC) and to offer loan trading on a secondary market. At its height, LendingClub was the world's largest peer-to-peer lending platform.

Data

Data merge and preliminary cleaning

1. Data collection, import and concatenation.ipynb

This notebook includes:

  1. Download the raw data from LendingClub.
  2. Concatenate the data from each year/quarter into one dataset.
  3. Drop features with more than 27% missing data. This also reduces memory use and speeds up computation.
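The concatenation and missing-data filter described above can be sketched as follows. This is a minimal illustration with pandas, not the notebook's actual code: the function name `concat_and_drop_sparse`, the toy frames, and the column names are assumptions made for the example; only the 27% threshold comes from the description.

```python
import pandas as pd


def concat_and_drop_sparse(frames, max_missing=0.27):
    """Stack per-period frames, then drop columns whose share of
    missing values is greater than max_missing (hypothetical helper)."""
    combined = pd.concat(frames, ignore_index=True, sort=False)
    # isna().mean() gives the fraction of missing values per column
    missing_share = combined.isna().mean()
    keep = missing_share[missing_share <= max_missing].index
    return combined[keep]


# Toy stand-ins for two quarterly files (illustrative values only)
q1 = pd.DataFrame({"loan_amnt": [1000, 2000], "desc": ["home", None]})
q2 = pd.DataFrame({"loan_amnt": [3000, 4000], "desc": [None, None]})

merged = concat_and_drop_sparse([q1, q2])
print(merged.columns.tolist())
```

Here `desc` is missing in 3 of 4 rows, so it is dropped, while `loan_amnt` is kept.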

Get to know all the features

2. Feature explore

This notebook includes:

  1. Explore the meaning of every feature from LendingClub and divide the features into two categories: borrower-related and loan-related.
  2. Drop a few more unrelated features once the dataset is better understood.
  3. Encode the target feature: classify each loan as good or bad based on its loan_status.
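One way the target encoding in step 3 might look is sketched below. The mapping of statuses to good or bad is an assumption for illustration (the README does not say which loan_status values fall in which class), and `encode_target` and the `bad_loan` column are hypothetical names.

```python
import pandas as pd

# Assumed grouping of statuses; the notebook may use a different one.
BAD_STATUSES = ["Charged Off", "Default"]
GOOD_STATUSES = ["Fully Paid"]


def encode_target(df):
    """Keep loans with a known outcome and add a binary bad_loan column
    (1 = bad loan, 0 = good loan)."""
    known = df[df["loan_status"].isin(BAD_STATUSES + GOOD_STATUSES)].copy()
    known["bad_loan"] = known["loan_status"].isin(BAD_STATUSES).astype(int)
    return known


loans = pd.DataFrame(
    {"loan_status": ["Fully Paid", "Charged Off", "Current", "Default"]}
)
labeled = encode_target(loans)
print(labeled)
```

In this sketch, loans whose outcome is not yet known (for example "Current") are left out rather than assigned to either class.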

Missing data imputation

3. Missing data imputation

This notebook includes:

  1. Split the data into training and test sets.
  2. Impute the missing data, depending on the category of each feature.
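The two steps above can be sketched with pandas alone. This is an assumed strategy, not necessarily the notebook's: numeric columns are filled with the training median and other columns with the training mode, and the fill values are learned from the training set only so that no information leaks from the test set. All function and column names are hypothetical.

```python
import pandas as pd


def split_train_test(df, test_frac=0.25, seed=0):
    """Randomly hold out test_frac of the rows as a test set."""
    test = df.sample(frac=test_frac, random_state=seed)
    train = df.drop(test.index)
    return train, test


def fit_fill_values(train):
    """Learn one fill value per column from the training set only.
    Assumes every column has at least one non-missing training value."""
    fills = {}
    for col in train.columns:
        if pd.api.types.is_numeric_dtype(train[col]):
            fills[col] = train[col].median()
        else:
            fills[col] = train[col].mode(dropna=True).iloc[0]
    return fills


# Toy data with a few gaps (illustrative values only)
data = pd.DataFrame({
    "annual_inc": [40000.0, None, 55000.0, 62000.0,
                   48000.0, 51000.0, None, 70000.0],
    "home_ownership": ["RENT", "OWN", None, "RENT",
                       "MORTGAGE", "RENT", "OWN", "RENT"],
})

train, test = split_train_test(data)
fills = fit_fill_values(train)
train_filled = train.fillna(value=fills)
test_filled = test.fillna(value=fills)
```

Splitting before imputing matters here: the test rows are filled with statistics computed from the training rows, never the other way around.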

EDA (Exploratory data analysis)

4. Categorical variable encode: Explore each categorical feature, and encode it as an ordered variable if its order matters
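As an illustration of encoding a feature whose order matters, the sketch below maps loan grades to integers using an ordered pandas categorical. Treating `grade` (A to G) as the ordered feature is an assumption for the example, and `encode_ordinal` is a hypothetical helper.

```python
import pandas as pd

# Assumed ordering: A is the best grade, G the worst.
GRADE_ORDER = ["A", "B", "C", "D", "E", "F", "G"]


def encode_ordinal(series, ordered_levels):
    """Map each value to its position in ordered_levels.
    Values not in ordered_levels (including missing ones) become -1."""
    cat = pd.Categorical(series, categories=ordered_levels, ordered=True)
    return pd.Series(cat.codes, index=series.index, name=series.name)


grades = pd.Series(["B", "A", "G", "C"], name="grade")
encoded = encode_ordinal(grades, GRADE_ORDER)
print(encoded.tolist())
```

Features without a natural order would need a different scheme, such as one-hot encoding, since integer codes would imply a ranking that does not exist.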

To be continued
