Analysing and Building Financial Risk System for Bondora (P2P Lending) using Supervised Learning Classification Algorithms

Srimathy-S/Financial-Risk-System

Bondora P2P Lending Analysis and Prediction

Project Overview

This project analyzes and predicts loan defaults for Bondora, a leading European peer-to-peer (P2P) lending company. The workflow preprocesses the dataset, converts the loan status into a binary target, and builds binary classification models, with the aim of providing actionable insights that reduce financial risk for lenders and borrowers.


Project Goals

  • Clean and preprocess the dataset to handle missing values, duplicates, and outliers.
  • Create a binary target variable (Default or Not Default) based on loan statuses.
  • Encode categorical variables and transform the data into a model-ready format.
  • Perform exploratory data analysis (EDA) to understand key features influencing loan defaults.
  • Build predictive models to assess default risks.

Dataset Details

The dataset contains historical loan data from Bondora, including:

  • Loan statuses
  • Borrower details
  • Loan amount and interest rates
  • Payment history and more

Data Link

The data required for preprocessing, cleaning and labeling tasks can be downloaded from the following Google Drive link:

Download Data

Target Variable

The Status column is transformed into a binary variable:

  • 1 (Default): Includes statuses like "Charged Off," "Late," and "Defaulted."
  • 0 (Not Default): Includes statuses like "Fully Paid" and "Current."

Data Preprocessing Steps

1. Data Cleaning

  • Handle missing values using appropriate imputation techniques.
  • Remove duplicates and standardize column names.
  • Convert columns (e.g., date columns) to their appropriate formats.
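
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `clean_loans`, the sample column names, and the choice of median/mode imputation are assumptions made for the example.

```python
import pandas as pd


def clean_loans(df: pd.DataFrame, date_cols=()) -> pd.DataFrame:
    """Standardize column names, drop duplicates, parse dates, impute gaps.

    `date_cols` uses the standardized (lower-case, underscored) names.
    """
    out = df.copy()

    # Standardize column names: trim, lower-case, spaces -> underscores.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]

    # Remove exact duplicate rows.
    out = out.drop_duplicates().reset_index(drop=True)

    # Convert date columns; values that cannot be parsed become NaT.
    for col in date_cols:
        out[col] = pd.to_datetime(out[col], errors="coerce")

    # Assumed imputation rule: median for numeric columns,
    # most frequent value for everything else (dates are left alone).
    for col in out.columns:
        if col in date_cols or not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            mode = out[col].mode(dropna=True)
            if not mode.empty:
                out[col] = out[col].fillna(mode.iloc[0])
    return out


# Hypothetical sample data, not taken from the Bondora dataset.
raw = pd.DataFrame({
    "Loan Date": ["2020-01-15", "2020-01-15", "2020-03-01", "not a date"],
    "Amount": [1000.0, 1000.0, None, 3000.0],
    "Country": ["EE", "EE", None, "FI"],
})
clean = clean_loans(raw, date_cols=["loan_date"])
```

Median and mode imputation are only one reasonable default; columns with many missing values may be better dropped or imputed with a domain-specific rule.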

2. Target Variable Transformation

  • Map the Status column into binary values:
    • 1 for loan defaults.
    • 0 for non-defaults.
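
A small sketch of this mapping, assuming the status labels listed under Target Variable. The dictionary `STATUS_TO_TARGET` and the function `to_binary_target` are illustrative names, and the actual labels in the downloaded data should be checked before relying on this grouping.

```python
import pandas as pd

# Assumed grouping, taken from the statuses named in this README.
STATUS_TO_TARGET = {
    "charged off": 1,
    "late": 1,
    "defaulted": 1,
    "fully paid": 0,
    "current": 0,
}


def to_binary_target(status: pd.Series) -> pd.Series:
    """Map loan statuses to 1 (default) / 0 (not default).

    Statuses not in the mapping become <NA> so they can be
    inspected instead of being silently assigned to a class.
    """
    normalized = status.str.strip().str.lower()
    return normalized.map(STATUS_TO_TARGET).astype("Int64")


# Hypothetical example values.
status = pd.Series(["Late", " Fully Paid", "Current", "Charged Off", "Unknown"])
target = to_binary_target(status)
```

Leaving unknown statuses as missing, rather than defaulting them to 0, makes it easier to spot labels the mapping does not yet cover.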

3. Data Encoding

  • Apply label encoding for binary categorical columns.
  • Use one-hot encoding for multi-category columns.
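
One way to express both encoding rules with pandas is sketched below. The helper `encode_categoricals` and the sample columns are hypothetical; the rule "exactly two distinct values means binary" is an assumption made for the example.

```python
import pandas as pd


def encode_categoricals(df: pd.DataFrame) -> pd.DataFrame:
    """Label-encode two-valued columns, one-hot encode the other categoricals."""
    out = df.copy()

    # Treat every non-numeric, non-datetime column as categorical.
    cat_cols = [
        c for c in out.columns
        if not pd.api.types.is_numeric_dtype(out[c])
        and not pd.api.types.is_datetime64_any_dtype(out[c])
    ]
    binary_cols = [c for c in cat_cols if out[c].nunique(dropna=True) == 2]
    multi_cols = [c for c in cat_cols if c not in binary_cols]

    # Label encoding: categories are sorted, so they map to codes 0 and 1.
    # Missing values would receive the code -1.
    for col in binary_cols:
        out[col] = out[col].astype("category").cat.codes

    # One-hot encoding: one 0/1 indicator column per category.
    return pd.get_dummies(out, columns=multi_cols, dtype=int)


# Hypothetical sample data.
loans = pd.DataFrame({
    "gender": ["M", "F", "F", "M"],
    "country": ["EE", "FI", "ES", "EE"],
    "amount": [100, 200, 300, 400],
})
encoded = encode_categoricals(loans)
```

Here `gender` becomes a single 0/1 column, while `country` expands into `country_EE`, `country_ES`, and `country_FI`.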

4. Outlier Detection and Handling

  • Detect outliers in numeric columns using statistical methods.
  • Cap extreme values to the 1st and 99th percentiles.
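
The sketch below illustrates both steps. The README does not name the statistical method used for detection, so the 1.5 × IQR rule here is an assumption; the capping follows the 1st/99th percentile rule stated above. Function names are illustrative.

```python
import pandas as pd


def iqr_outlier_mask(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed detection rule)."""
    q1, q3 = s.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (s < q1 - k * iqr) | (s > q3 + k * iqr)


def cap_outliers(df: pd.DataFrame, lower: float = 0.01, upper: float = 0.99,
                 columns=None) -> pd.DataFrame:
    """Clip numeric columns to their [lower, upper] quantiles."""
    out = df.copy()
    if columns is None:
        columns = out.select_dtypes(include="number").columns
    for col in columns:
        lo, hi = out[col].quantile([lower, upper])
        out[col] = out[col].clip(lower=lo, upper=hi)
    return out


# Hypothetical data: one obviously extreme value.
amounts = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 1000.0])
flags = iqr_outlier_mask(amounts)

# Hypothetical data: the values 1..100, capped at the 1st/99th percentiles.
frame = pd.DataFrame({"amount": [float(v) for v in range(1, 101)]})
capped = cap_outliers(frame)
```

Capping keeps every row in the dataset, which matters when defaults are rare and dropping rows would discard positive examples.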

Modeling and Analysis

The processed dataset will be used to build classification models (e.g., Logistic Regression, Random Forest, XGBoost) to predict loan defaults. The models will be evaluated on metrics like:

  • Accuracy
  • Precision
  • Recall
  • F1 Score
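
To make these metrics concrete, the following sketch computes them directly from the confusion-matrix counts for a binary default label. The function name and the example labels are made up for illustration; in practice `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` from `sklearn.metrics` compute the same quantities.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for labels in {0, 1}.

    1 is treated as the positive class (default).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-defaults cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed defaults

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 3 true positives, 3 true negatives,
# 1 false positive and 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

When defaults are a small share of all loans, accuracy can look high even if most defaults are missed, which is why precision, recall, and F1 are reported alongside it.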

How to Use This Repository

  1. Clone the repository:
    git clone https://github.com/Technocolabs100/Analysin-and-Building-Financial-Risk-System-For-P2P-Lending.git
  2. Install the required Python libraries:
    pip install -r requirements.txt

Technologies Used

  • Programming Language: Python
  • Libraries and Tools:
    • Pandas, NumPy (Data Preprocessing)
    • Scikit-learn (Modeling)
    • Matplotlib, Seaborn (Visualization)
    • Jupyter Notebook
    • Power BI and Tableau Dashboards (Visualization)

Contributing

We welcome contributions! If you'd like to contribute:

  1. Fork this repository.
  2. Create a new branch.
  3. Commit your changes and push to your branch.
  4. Create a pull request, and we’ll review it.

License

This project is licensed under the MIT License. See the LICENSE file for details.
