_________________________________________________________
## Title : Boxing Fighter Pay - Multiple Regression

- Author: Izaan Khudadad
- Email : ikhudada@charlotte.edu
- Affiliation: University of North Carolina at Charlotte

Categories:

- Correlation
- Multiple Regression
- Summary Statistics 
- Linear Regression 
_________________________________________________________

### Introduction

In this activity, you will use data compiled by Peter Anderson from the Nevada State Athletic Commission (NSAC) and the California State Athletic Commission (CSAC). The dataset covers professional boxing fights held between 2009 and 2017 and includes individual fighters, their opponents, fight characteristics, broadcasting networks, and fighter earnings.

Each observation represents a single fighter in a given bout, forming a panel structure that enables tracking of fighters over time.

Using this data, you will examine how various variables correlate with a fighter's purse and learn how to assess the quality of a regression model.

By the end of the activity you should be able to: 
1. Use Python to create models based on more than two predictors. 
2. Understand what Root Mean Squared Error (RMSE) is in the context of multiple regression.
3. Use residual plots to assess the quality of linear regression models.
4. Use a variety of Python libraries, such as Matplotlib, NumPy, and Seaborn, in creating models.
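
The first two objectives can be sketched in a few lines. The example below is a minimal illustration, assuming NumPy is installed. It fits a model with three predictors by ordinary least squares and computes RMSE, which is the square root of the mean squared residual and is expressed in the same units as the response. The data are randomly generated (not taken from the boxing dataset), and all variable names are placeholders; the linked notebook may fit its models with a different library.

```python
import numpy as np

# Synthetic data for illustration only: 200 observations, 3 predictors.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
true_coefs = np.array([0.5, -0.3, 1.2])  # assumed values used to simulate y
y = 2.0 + X @ true_coefs + rng.normal(scale=0.5, size=n)

# Prepend a column of ones so the model includes an intercept.
X_design = np.column_stack([np.ones(n), X])

# Ordinary least squares: coefs[0] is the intercept, coefs[1:] the slopes.
coefs, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# RMSE: typical size of a prediction error, in the units of y.
predictions = X_design @ coefs
residuals = y - predictions
rmse = np.sqrt(np.mean(residuals ** 2))

print("coefficients:", coefs.round(2))
print(f"RMSE: {rmse:.3f}")
```

A lower RMSE means predictions sit closer to the observed values, but it is only comparable between models fit to the same response on the same scale (for example, `Purse` versus `lnRPurse` would give RMSE values in different units).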


### Data

The dataset includes over 4,600 fight entries and more than 1,200 unique professional boxers. Each row represents one fighter in one fight.

[Boxing Fighter Pay xlsx](https://github.com/izaan-khudadad/Boxing_Fighter_Pay/blob/main/Boxing_Pay_data.xlsx)

[Boxing Fighter Pay CSV](https://github.com/izaan-khudadad/Boxing_Fighter_Pay/blob/main/Boxing_Pay_data%20(1).csv)
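
Once the CSV is downloaded, a quick check of the row count and the number of unique boxers helps confirm that the file loaded as expected. The sketch below is a minimal example, assuming pandas is installed and that the column names match the Variable Descriptions table. The `summarize_fights` helper and the three sample rows are made up for illustration and are not taken from the dataset.

```python
import io

import pandas as pd


def summarize_fights(source):
    """Load fight records and report basic counts (hypothetical helper).

    `source` can be a file path or any file-like object accepted by
    pandas.read_csv.
    """
    df = pd.read_csv(source)
    return {
        "rows": len(df),                          # one row per fighter per fight
        "unique_boxers": df["Boxer"].nunique(),   # distinct fighter names
        "median_purse": float(df["Purse"].median()),
    }


# Made-up rows for illustration only; pass the path of the downloaded CSV instead.
sample_csv = io.StringIO(
    "Boxer,Purse,Wins,Losses\n"
    '"Doe, John",5000,10,2\n'
    '"Doe, John",8000,11,2\n'
    '"Roe, Rick",1500,3,1\n'
)

summary = summarize_fights(sample_csv)
print(summary)
```

Because each boxer can appear in several rows, the number of unique boxers should be smaller than the number of rows when the real file is loaded.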

<details>
<summary><b>Variable Descriptions</b></summary>

| Variable | Description |
|--------------------|-----------------------------------------------------------------------------|
| Boxer              | Name of the boxer (Last, First)                                             |
| Date               | Date of the fight (YYYY.MM.DD)                                              |
| Venue              | Location where the fight took place                                         |
| Purse              | Reported purse (fighter's earnings) in USD                                 |
| lnRPurse           | Natural logarithm of the purse (for regression use)                         |
| weight             | Weight of the boxer (in pounds)                                             |
| Age                | Age of the boxer at the time of the fight                                   |
| Wins               | Number of professional wins prior to the fight                              |
| Losses             | Number of professional losses prior to the fight                            |
| KO                 | Number of professional knockout wins prior to the fight                     |
| W-Title            | Indicator for world title bout (1 = yes, 0 = no)                            |
| PPV                | Fight was on Pay-Per-View (1 = yes, 0 = no)                                 |
| ESPN               | Fight broadcast on ESPN (1 = yes, 0 = no)                                   |
| HBO                | Fight broadcast on HBO (1 = yes, 0 = no)                                    |
| FOX                | Fight broadcast on FOX (1 = yes, 0 = no)                                    |
| TopRank            | Top Rank as the promoter (1 = yes, 0 = no)                                  |
| GoldenBoy          | Golden Boy Promotions as the promoter (1 = yes, 0 = no)                     |
| RDS                | Scheduled number of rounds                                                  |
| Y2009–Y2017        | Year indicator columns for 2009 through 2017 (1 = yes, 0 = no per year)     |

</details>


**Data Source**

[Mendeley Data](https://data.mendeley.com/datasets/vpbsd5bryy/1)

### Material
[Multiple Linear Regression, Boxing Purse Data](https://github.com/izaan-khudadad/Boxing_Fighter_Pay/blob/main/MultipleRegressionBoxing.ipynb)



### Conclusion
In the provided material, students explored how to use multiple linear regression to model boxing data. By building, evaluating, and interpreting regression models, students practiced core data science skills such as:
- Selecting and preparing predictor variables
- Understanding how each predictor influences the response
- Evaluating model performance using RMSE
- Diagnosing fit using residual plots
- Reading and interpreting a full regression summary
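
The residual-plot diagnostic listed above can be sketched as follows. This is a minimal example, assuming NumPy and Matplotlib are installed; the data are synthetic, and the output filename `residuals.png` is a placeholder. For a well-specified linear model, the residuals should scatter around zero with no visible pattern against the fitted values.

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

# Synthetic data for illustration only: a linear trend plus random noise.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=150)
y = 1.0 + 0.8 * x + rng.normal(scale=1.0, size=150)

# Fit a straight line; np.polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(x, y, deg=1)
fitted = intercept + slope * x
residuals = y - fitted

# Residuals vs. fitted values, with a reference line at zero.
fig, ax = plt.subplots()
ax.scatter(fitted, residuals, alpha=0.6)
ax.axhline(0, color="red", linestyle="--")
ax.set_xlabel("Fitted values")
ax.set_ylabel("Residuals")
ax.set_title("Residuals vs. fitted values")
fig.savefig("residuals.png")
```

A curved band of points suggests a missing nonlinear term, and a funnel shape suggests that the error variance changes with the fitted value; a log-transformed response is one common remedy for the latter.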
