# Financial risk modeling of a European P2P lending platform

## Project Summary

### Abstract


This project models credit risk in peer-to-peer (P2P) lending. The data comes from Bondora, a leading European P2P lending platform, and is a pool of both defaulted and non-defaulted loans from the period between 1 March 2009 and 27 January 2020. It comprises borrowers' demographic and financial information as well as loan transactions.

In P2P lending, loans are typically uncollateralized, and lenders seek higher returns as compensation for the financial risk they take. They also have to make decisions under information asymmetry that works in favor of the borrowers. To decide rationally, lenders want to minimize the default risk of each lending decision and realize a return that compensates for that risk.


### Background and Problem Understanding


Peer-to-peer lending has attracted considerable attention in recent years, largely because it offers a novel way of connecting borrowers and lenders. As with other innovative approaches to doing business, though, there is more to it than that. One might ask, for example, what makes peer-to-peer lending so different from, or perhaps so much better than, working with a bank, and why it has become popular in many parts of the world.

For investors, peer-to-peer (P2P) lending offers an attractive way to diversify portfolios and potentially enhance long-term performance. By investing through a P2P platform, they gain exposure to an asset class that is often described as having held up in both good times and bad.

Default risk has long been a central concern in P2P lending, because it reflects how borrowers actually behave once a loan is issued. Loans are typically uncollateralized, and lenders seek higher returns as compensation for the financial risk they take. They also have to make decisions under information asymmetry that works in favor of the borrowers. To decide rationally, lenders want to minimize the default risk of each lending decision and realize a return that compensates for that risk.



**Reasons why a loan could be rejected:**

* Credit score was too low
* Debt-to-income ratio was too high
* Tried to borrow too much
* Income was insufficient or unstable
* Didn’t meet the basic requirements
* Missing information on the application
* Loan purpose didn’t meet the lender’s criteria





### Importing The Data

#### Importing The Dependencies

In [1]:
import pandas as pd

**The data is available from two sources**

* [IEEE DataPort](https://ieee-dataport.org/open-access/bondora-peer-peer-lending-data)
* [Kaggle](https://www.kaggle.com/datasets/sid321axn/bondora-peer-to-peer-lending-loan-data)

**For up-to-date data, there is also the official [Bondora website](https://www.bondora.com/en/public-reports)**

**Importing the data**

* We created a [GitHub repository](https://github.com/HaniiAtef/Financial-Risk-Modeling-of-a-European-P2P-Lending-Platform) so the data can be loaded directly from a link, with no manual upload.
* The [Kaggle](https://www.kaggle.com/datasets/sid321axn/bondora-peer-to-peer-lending-loan-data) API can also be used to fetch the data.
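
The Kaggle route could look roughly like the sketch below. It is not part of the original notebook: `download_bondora` and the `data` folder are hypothetical names, and it assumes the `kaggle` package is installed and API credentials are configured. The dataset slug is taken from the Kaggle URL above; the file names inside the archive are not assumed here.

```python
from pathlib import Path

# Dataset slug copied from the Kaggle URL in the list above.
DATASET = "sid321axn/bondora-peer-to-peer-lending-loan-data"


def download_bondora(dest: str = "data") -> Path:
    """Download and unzip the Kaggle dataset into `dest` (hypothetical helper)."""
    # Imported lazily because some versions of the kaggle package
    # try to authenticate as soon as the package is imported.
    from kaggle.api.kaggle_api_extended import KaggleApi

    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)

    api = KaggleApi()
    api.authenticate()  # expects ~/.kaggle/kaggle.json or KAGGLE_USERNAME / KAGGLE_KEY
    api.dataset_download_files(DATASET, path=str(target), unzip=True)
    return target
```

Calling `download_bondora()` would place the unzipped files in `data/`, after which the CSV can be read with `pd.read_csv` in the same way as the GitHub link.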

In [2]:
url = 'https://media.githubusercontent.com/media/HaniiAtef/Financial-Risk-Modeling-of-a-European-P2P-Lending-Platform/main/Bondora_raw.csv'
df = pd.read_csv(url)  # read the CSV from the URL into df; this may take a while because the file is large (about 111 MB)
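
Wide CSV files with mixed column types can make `pd.read_csv` emit a `DtypeWarning` under its default chunked type inference. The sketch below is an optional variant rather than the notebook's own code: `load_loans` and `DATE_COLUMNS` are hypothetical names, and treating these three columns as dates is an assumption based on their names. It passes `low_memory=False` so that column types are inferred from the whole file, and parses the date columns on load. A two-row in-memory sample stands in for the real file so the example runs offline.

```python
import io

import pandas as pd

# Columns assumed to hold dates (names taken from the dataset's column list).
DATE_COLUMNS = ["ReportAsOfEOD", "ListedOnUTC", "BiddingStartedOn"]


def load_loans(source) -> pd.DataFrame:
    """Read the loan CSV from a path, URL, or file-like object (hypothetical helper)."""
    # low_memory=False infers each column's dtype from the whole file at once,
    # at the cost of higher peak memory use.
    return pd.read_csv(source, low_memory=False, parse_dates=DATE_COLUMNS)


# Tiny stand-in for the real 111 MB file.
sample = io.StringIO(
    "ReportAsOfEOD,ListedOnUTC,BiddingStartedOn,LoanNumber\n"
    "2020-01-27,2009-06-11 16:40:39,2009-06-11 16:40:39,659\n"
    "2020-01-27,2009-06-10 15:48:57,2009-06-10 15:48:57,654\n"
)
sample_df = load_loans(sample)
print(sample_df.dtypes)
```

With the real data, the call would be `load_loans(url)`.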


In [3]:
df.head()

| | ReportAsOfEOD | LoanId | LoanNumber | ListedOnUTC | BiddingStartedOn | BidsPortfolioManager | BidsApi | BidsManual | UserName | NewCreditCustomer | ... | PreviousEarlyRepaymentsCountBeforeLoan | GracePeriodStart | GracePeriodEnd | NextPaymentDate | NextPaymentNr | NrOfScheduledPayments | ReScheduledOn | PrincipalDebtServicingCost | InterestAndPenaltyDebtServicingCost | ActiveLateLastPaymentCategory |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-01-27 | F0660C80-83F3-4A97-8DA0-9C250112D6EC | 659 | 2009-06-11 16:40:39 | 2009-06-11 16:40:39 | 0 | 0 | 115.041 | KARU | True | ... | 0 | | | | | | | 0.0 | 0.0 | |
| 1 | 2020-01-27 | 978BB85B-1C69-4D51-8447-9C240104A3A2 | 654 | 2009-06-10 15:48:57 | 2009-06-10 15:48:57 | 0 | 0 | 140.6057 | koort681 | False | ... | 0 | | | | | | | 0.0 | 0.0 | |
| 2 | 2020-01-27 | EA44027E-7FA7-4BB2-846D-9C1F013C8A22 | 641 | 2009-06-05 19:12:29 | 2009-06-05 19:12:29 | 0 | 0 | 319.558 | 0ie | True | ... | 0 | | | | | | | 0.0 | 0.0 | 180+ |
| 3 | 2020-01-27 | CE67AD25-2951-4BEE-96BD-9C2700C61EF4 | 668 | 2009-06-13 12:01:20 | 2009-06-13 12:01:20 | 0 | 0 | 57.5205 | Alyona | True | ... | 0 | | | | | | | 0.0 | 0.0 | |
| 4 | 2020-01-27 | 9408BF8C-B159-4D6A-9D61-9C2400A986E3 | 652 | 2009-06-10 10:17:13 | 2009-06-10 10:17:13 | 0 | 0 | 319.5582 | Kai | True | ... | 0 | | | | | | | 0.0 | 0.0 | 180+ |


In [4]:
df.shape

(134529, 112)

In [5]:
df.describe()

| | LoanNumber | BidsPortfolioManager | BidsApi | BidsManual | ApplicationSignedHour | ApplicationSignedWeekday | VerificationType | LanguageCode | Age | Gender | ... | InterestAndPenaltyBalance | NoOfPreviousLoansBeforeLoan | AmountOfPreviousLoansBeforeLoan | PreviousRepaymentsBeforeLoan | PreviousEarlyRepaymentsBefoleLoan | PreviousEarlyRepaymentsCountBeforeLoan | NextPaymentNr | NrOfScheduledPayments | PrincipalDebtServicingCost | InterestAndPenaltyDebtServicingCost |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 134529.0 | 134529.0 | 134529.0 | 134529.0 | 134529.0 | 134529.0 | 134484.0 | 134529.0 | 134529.0 | 134484.0 | ... | 134529.0 | 134529.0 | 134529.0 | 91368.0 | 58026.0 | 134529.0 | 97788.0 | 97788.0 | 59129.0 | 59129.0 |
| mean | 944939.2 | 966.452876 | 29.111664 | 559.33259 | 13.37464 | 3.907908 | 2.817257 | 2.827874 | 40.819295 | 0.442097 | ... | 701.567107 | 1.48762 | 2868.652401 | 928.395548 | 320.743805 | 0.069903 | 5.178795 | 50.126795 | 5.264702 | 89.851455 |
| std | 478673.8 | 1355.686016 | 150.159148 | 750.360512 | 4.992375 | 1.726192 | 1.407908 | 1.959802 | 12.348693 | 0.636083 | ... | 2514.595572 | 2.396148 | 4507.046575 | 2042.348751 | 1561.799076 | 0.359461 | 7.674427 | 12.51953 | 57.800582 | 287.449052 |
| min | 37.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | ... | -2.66 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 25% | 620679.0 | 155.0 | 0.0 | 96.0 | 10.0 | 2.0 | 1.0 | 1.0 | 31.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 36.0 | 0.0 | 0.0 |
| 50% | 923597.0 | 465.0 | 0.0 | 317.0 | 13.0 | 4.0 | 4.0 | 3.0 | 40.0 | 0.0 | ... | 0.0 | 1.0 | 396.3541 | 197.98 | 0.0 | 0.0 | 3.0 | 60.0 | 0.0 | 0.0 |
| 75% | 1311025.0 | 1218.0 | 5.0 | 729.0 | 17.0 | 5.0 | 4.0 | 4.0 | 50.0 | 1.0 | ... | 202.9 | 2.0 | 4250.0 | 780.95 | 0.0 | 0.0 | 7.0 | 60.0 | 0.0 | 17.33 |
| max | 1855339.0 | 10625.0 | 7570.0 | 10630.0 | 23.0 | 7.0 | 4.0 | 22.0 | 77.0 | 2.0 | ... | 64494.77 | 25.0 | 53762.0 | 34077.42 | 48100.0 | 11.0 | 60.0 | 72.0 | 3325.33 | 5295.29 |
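
The `count` row above is not the same for every column (for example 91368 for `PreviousRepaymentsBeforeLoan` against 134529 for `LoanNumber`), which points to missing values. A minimal sketch of how the missing share per column could be tabulated follows. `missing_summary` is a hypothetical helper, and the toy frame only borrows column names from the dataset; its values are made up for illustration.

```python
import numpy as np
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values for columns that have any."""
    missing = frame.isna().sum()
    summary = pd.DataFrame(
        {
            "missing": missing,
            "missing_pct": (missing / len(frame) * 100).round(2),
        }
    )
    # Keep only columns with at least one missing value, most affected first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy data for illustration only; not taken from the Bondora file.
toy = pd.DataFrame(
    {
        "LoanNumber": [659, 654, 641, 668],
        "NextPaymentNr": [np.nan, np.nan, 3.0, np.nan],
        "PrincipalDebtServicingCost": [0.0, np.nan, 0.0, 0.0],
    }
)
toy_summary = missing_summary(toy)
print(toy_summary)
```

On the real data, `missing_summary(df)` would list every column with gaps, which helps when deciding which columns to drop or impute.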


In [10]:
for col in df.columns:
    print(col)


ReportAsOfEOD
LoanId
LoanNumber
ListedOnUTC
BiddingStartedOn
BidsPortfolioManager
BidsApi
BidsManual
UserName
NewCreditCustomer
LoanApplicationStartedDate
LoanDate
ContractEndDate
FirstPaymentDate
MaturityDate_Original
MaturityDate_Last
ApplicationSignedHour
ApplicationSignedWeekday
VerificationType
LanguageCode
Age
DateOfBirth
Gender
Country
AppliedAmount
Amount
Interest
LoanDuration
MonthlyPayment
County
City
UseOfLoan
Education
MaritalStatus
NrOfDependants
EmploymentStatus
EmploymentDurationCurrentEmployer
EmploymentPosition
WorkExperience
OccupationArea
HomeOwnershipType
IncomeFromPrincipalEmployer
IncomeFromPension
IncomeFromFamilyAllowance
IncomeFromSocialWelfare
IncomeFromLeavePay
IncomeFromChildSupport
IncomeOther
IncomeTotal
ExistingLiabilities
LiabilitiesTotal
RefinanceLiabilities
DebtToIncome
FreeCash
MonthlyPaymentDay
ActiveScheduleFirstPaymentReached
PlannedPrincipalTillDate
PlannedInterestTillDate
LastPaymentOn
CurrentDebtDaysPrimary
DebtOccuredOn
CurrentDebtDaysSecondary