# Peer-to-Peer (P2P) Loan Data Analysis

### **Project Objective**
This project analyzes a dataset of peer-to-peer (P2P) loans to understand loan characteristics, model financial risk, and evaluate the effectiveness of the platform's internal credit rating system. The analysis serves as a portfolio piece demonstrating skills in data exploration, statistical modeling, risk assessment, and reporting in R.

---

### **Dataset**
The analysis uses the `p2ploans.csv` dataset, which contains information on 5,000 loans. Key variables include:
* `id`: Unique loan identifier.
* `interest_rate`: The loan's interest rate.
* `yearly_payment`: The fixed annual payment made by the borrower.
* `dti_ratio`: The borrower's debt-to-income ratio.
* `maturity`: The duration of the loan in years.
* `internal_rating`: The platform's proprietary credit rating (AA, A, B, C, D, E, HR).
* `risk_free`: The risk-free rate (in percent), typically based on government securities.
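
As a rough illustration of how the variables above might be checked after loading, here is a minimal sketch in base R. The helper `prepare_loans()` is hypothetical and not part of the project. The sketch assumes the file sits in the working directory and that the ratings run from AA (best) to HR (riskiest) in the order listed above.

```r
# Columns named in the dataset description.
expected_cols <- c("id", "interest_rate", "yearly_payment", "dti_ratio",
                   "maturity", "internal_rating", "risk_free")
# Assumed ordering of the rating scale, from best to riskiest.
rating_levels <- c("AA", "A", "B", "C", "D", "E", "HR")

# Hypothetical helper: stops if a column is missing, then stores
# internal_rating as an ordered factor.
prepare_loans <- function(df) {
  missing_cols <- setdiff(expected_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  df$internal_rating <- factor(df$internal_rating,
                               levels = rating_levels, ordered = TRUE)
  df
}

# With the real file (assumed to be in the working directory):
# loans <- prepare_loans(read.csv("p2ploans.csv"))

# Tiny made-up data frame, used only to show the call.
demo <- data.frame(
  id = 1:3,
  interest_rate = c(6.5, 12.0, 24.9),
  yearly_payment = c(1200, 800, 450),
  dti_ratio = c(0.15, 0.30, 0.45),
  maturity = c(3, 5, 3),
  internal_rating = c("AA", "C", "HR"),
  risk_free = c(2, 2, 2)
)
demo <- prepare_loans(demo)
print(table(demo$internal_rating))
```

Storing the rating as an ordered factor keeps the seven groups in a consistent order in plots and model output.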

---

### **Analysis Overview**
The project is broken down into four key analytical tasks:

1.  **Descriptive Analysis of Loan Payments:**
    * Investigated the statistical distribution of yearly payments using descriptive statistics (mean, median, skewness) and histograms.
    * Determined that a triangular distribution is most suitable for modeling these payments.

2.  **Borrower Default Risk Modeling:**
    * Calculated the expected value of loan payments considering default probabilities.
    * Modeled the number of defaults across the entire loan portfolio with a binomial distribution to obtain its expected value and variance.
    * Compared the risk profiles of independent vs. correlated default scenarios.

3.  **Evaluation of the Internal Rating System:**
    * Used linear regression to measure how well the platform's internal ratings explain variations in interest rates.
    * Concluded that the rating system is highly effective: the ratings explain over 96% of the variation in interest rates (R² above 0.96).

4.  **Monte Carlo Simulation of Portfolio Losses:**
    * Ran a 1,000-iteration Monte Carlo simulation to model the distribution of total payments from the two riskiest loan groups (E and HR).
    * Calculated the mean and standard deviation of the simulated total payment, along with the 95% Value at Risk (VaR), to quantify potential losses.
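
The modeling steps in tasks 2 to 4 can be sketched in a few lines of base R. This is an illustrative outline, not the project's actual code. It runs on synthetic data, and the default probability, the no-recovery rule for defaulted loans, and the loss-based VaR definition are all assumptions made for the example.

```r
set.seed(42)  # make the sketch reproducible

# Synthetic stand-in for the loan table; these are NOT the project's data.
ratings <- c("AA", "A", "B", "C", "D", "E", "HR")
n_per_rating <- 100
loans <- data.frame(
  internal_rating = rep(ratings, each = n_per_rating),
  yearly_payment = runif(length(ratings) * n_per_rating, min = 500, max = 5000),
  stringsAsFactors = FALSE
)
# Assumed relationship: the rate rises by 3 points per rating step, plus noise.
loans$interest_rate <- 5 + 3 * match(loans$internal_rating, ratings) +
  rnorm(nrow(loans), sd = 0.5)

# Task 3: regress the interest rate on the rating (treated as categorical)
# and read off R-squared, the share of rate variability explained.
loans$rating_factor <- factor(loans$internal_rating, levels = ratings)
fit <- lm(interest_rate ~ rating_factor, data = loans)
r_squared <- summary(fit)$r.squared

# Task 2: binomial model for n independent loans with a common default
# probability p. The value of p is hypothetical.
risky <- loans[loans$internal_rating %in% c("E", "HR"), ]
n_risky <- nrow(risky)
p_default <- 0.15
expected_defaults <- n_risky * p_default
variance_defaults <- n_risky * p_default * (1 - p_default)

# Task 4: Monte Carlo simulation of total payments from the E and HR loans.
# Assumption: a defaulted loan pays nothing in the simulated year.
n_iter <- 1000
total_payments <- replicate(n_iter, {
  defaulted <- rbinom(n_risky, size = 1, prob = p_default)
  sum(risky$yearly_payment[defaulted == 0])
})

# Loss = promised total minus simulated total. VaR is taken here as the
# 95th percentile of that loss (one common convention among several).
promised_total <- sum(risky$yearly_payment)
losses <- promised_total - total_payments
mc_summary <- c(
  mean_payment = mean(total_payments),
  sd_payment = sd(total_payments),
  var_95 = unname(quantile(losses, probs = 0.95))
)

print(round(r_squared, 3))
print(c(expected = expected_defaults, variance = variance_defaults))
print(round(mc_summary, 2))
```

Because the defaults are drawn independently, this corresponds to the independent scenario only. A correlated scenario would need a shared risk factor or a similar mechanism, which the sketch does not attempt.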

---

### **Tools and Libraries**
* **Language:** R
* **Key Libraries:** `ggplot2` (for visualizations), `dplyr` (for data manipulation), `moments` (for skewness/kurtosis)
* **Reporting:** R Markdown
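
To confirm that the libraries listed above are available before running the analysis, a small check along these lines may help. It is a sketch that only reports what is missing and leaves installation to the reader.

```r
# Packages named in the list above.
required_pkgs <- c("ggplot2", "dplyr", "moments")

# requireNamespace() returns FALSE (without attaching anything) when a
# package is not installed.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "),
          ". They can be installed with install.packages().")
} else {
  message("All required packages are installed.")
}
```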

---

### **How to Use**
The complete analysis, including all code, narrative, and output, is contained in the **`P2P_Loan_Analysis.ipynb`** file. A pre-rendered version is also available in **`P2P_Loan_Analysis.html`** for easy viewing in any web browser.