# 🏆 Predicting the NBA Finals Winner

Brayden Stach, Julian Loutzenhiser, Katherine Nunn

---

# 🛠️ Imports

In [2]:
import pandas as pd

---

# 📝 Project Plan

We will analyze team data to build a power ranking of every team in the playoffs. We will then compare specific matchups using this ranking and use predictive modeling to estimate which team is most likely to win the championship.

1) 📥 Load NBA team data
2) 🧹 Clean the dataset (remove any empty columns / rows)
3) 🏆 Use an all-around statistic such as SRS (Simple Rating System) to build a power ranking based on the regular season
4) 📈 Calculate how much better or worse each team has performed in the playoffs, on average, over the last ~5 years
5) 🧠 Combine each team's playoff adjustment (the % by which it does better or worse in the playoffs) with its power ranking score to produce a playoff power ranking
6) 🥇 Compare the teams' playoff power rankings against each other to predict who will win

This simple plan is a good starting point. We will probably need additional data, such as player data, to make the predictions more accurate. Ideally, we would also account for midseason trades by giving more weight to games played in the second half of the season. It may also be important to consider injuries, team age, playoff experience, home-court advantage, and other factors that might influence a team's ability to win.
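
Steps 3 to 6 of the plan can be sketched in a few lines of pandas. This is only a rough illustration, not our final method: the SRS values are copied from the team table later in this notebook, while the `playoff_boost` column and the `predict_winner` helper are hypothetical names, and the boost values are made-up placeholders until we compute the real playoff trends.

```python
import pandas as pd

# SRS values are taken from the team table in this notebook.
# playoff_boost is a MADE-UP placeholder (+0.10 means "10% better in the playoffs").
teams = pd.DataFrame({
    "Team": ["Oklahoma City Thunder", "Cleveland Cavaliers", "Boston Celtics"],
    "SRS": [12.70, 8.81, 8.28],
    "playoff_boost": [0.00, -0.05, 0.10],  # hypothetical values, not real results
})

# Step 5: scale regular-season strength by the playoff adjustment.
teams["playoff_score"] = teams["SRS"] * (1 + teams["playoff_boost"])

# Steps 3 and 5: order teams from strongest to weakest adjusted score.
ranking = teams.sort_values("playoff_score", ascending=False).reset_index(drop=True)


def predict_winner(ranking: pd.DataFrame, team_a: str, team_b: str) -> str:
    """Step 6 (simplest possible rule): the team with the higher score wins."""
    scores = ranking.set_index("Team")["playoff_score"]
    return team_a if scores[team_a] >= scores[team_b] else team_b


print(ranking[["Team", "playoff_score"]])
print(predict_winner(ranking, "Cleveland Cavaliers", "Boston Celtics"))
```

One caveat with the multiplicative form: it only behaves sensibly for teams with a positive SRS. For a team with a negative SRS, a positive boost would push the score further down, so an additive adjustment in SRS points may turn out to be the safer choice.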

---

# 🎯 Research Questions

**Main Research Question:**
- ➡️ Can we predict the winner of the 2025 NBA Finals based on team regular season performance, adjusted for playoff success trends?

**Related Research Questions:**
1. 📈 How well does the Simple Rating System (SRS) predict playoff success compared to regular season wins alone?
2. 🏀 If we need to add player data to answer the main research question, a fun bonus question would be: based on this season's stats, who deserves the MVP award?
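
One possible way to approach related question 1 is to compare how closely each predictor's ordering of teams matches the ordering by playoff success. The sketch below uses a rank correlation (Pearson correlation of the ranks, which is the Spearman coefficient). The `SRS` and `W` values come from the team table in this notebook, but `playoff_wins_placeholder` is a made-up column used only to make the example run, so the resulting numbers say nothing about real teams.

```python
import pandas as pd


def rank_correlation(x: pd.Series, y: pd.Series) -> float:
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    return x.rank().corr(y.rank())


sample = pd.DataFrame({
    "Team": [
        "Oklahoma City Thunder", "Cleveland Cavaliers", "Boston Celtics",
        "Minnesota Timberwolves", "Houston Rockets",
    ],
    "SRS": [12.70, 8.81, 8.28, 5.15, 4.97],  # from the team table
    "W": [68, 64, 61, 49, 52],               # from the team table
    # PLACEHOLDER values, not real playoff results.
    "playoff_wins_placeholder": [5, 4, 3, 2, 1],
})

srs_corr = rank_correlation(sample["SRS"], sample["playoff_wins_placeholder"])
wins_corr = rank_correlation(sample["W"], sample["playoff_wins_placeholder"])

print(f"SRS vs playoff wins:  {srs_corr:.2f}")
print(f"Wins vs playoff wins: {wins_corr:.2f}")
```

With real playoff results across several seasons, the predictor with the higher correlation would be the better guide to playoff success under this measure.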

---

# 📊 Data Description

Our dataset contains information about NBA teams' performance during the regular season.  
It includes 31 columns and one row per team, plus a `League Average` row that we drop after loading.

The key features are:

| Column Name   | Description |
|---------------|-------------|
| Rk            | Rank of the team in the source table (rows are sorted by SRS, highest first) |
| Team          | Name of the NBA team |
| Age           | Average age of players on the team |
| W             | Total wins during the regular season |
| L             | Total losses during the regular season |
| PW            | Pythagorean wins (estimated wins based on points scored and allowed) |
| PL            | Pythagorean losses |
| MOV           | Margin of victory (average point differential per game) |
| SOS           | Strength of schedule (relative difficulty of opponents); 0 is average, and larger values mean a tougher schedule |
| SRS           | Simple Rating System (MOV adjusted for SOS), a measure of overall team strength and a very good all-in-one statistic; the column header in our CSV is `SRS▼` |
| ORtg          | Offensive Rating (points scored per 100 possessions) |
| DRtg          | Defensive Rating (points allowed per 100 possessions) |
| NRtg          | Net Rating (Offensive Rating - Defensive Rating, overall team efficiency) |
| Pace          | Number of possessions per 48 minutes (how fast the team plays) |
| FTr           | Free Throw Rate (free throw attempts per field goal attempt) |
| 3PAr          | Three-Point Attempt Rate (share of field goal attempts that are three-pointers) |
| TS%           | True Shooting Percentage (adjusted shooting efficiency including free throws and threes) |
| **Offensive Four Factors** |  |
| eFG%          | Effective field goal percentage (adjusted for 3-pointers being worth more) |
| TOV%          | Turnover percentage (percentage of possessions ending in a turnover) |
| ORB%          | Offensive rebounding percentage (offensive rebounds per available opportunity) |
| FT/FGA        | Free throws made per field goal attempt |
| **Defensive Four Factors** |  |
| eFG%.1        | Opponent effective field goal percentage |
| TOV%.1        | Opponent turnover percentage |
| DRB%          | Defensive rebounding percentage (defensive rebounds per available opportunity) |
| FT/FGA.1      | Opponent free throws per field goal attempt |
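
Two of the definitions in the table can be checked directly against the data. The sketch below assumes that SRS is simply MOV + SOS and that Net Rating is ORtg - DRtg, and tests both on three rows copied from the team table in this notebook. The `SRS_check` and `NRtg_check` column names are our own.

```python
import pandas as pd

# Values copied from the first three rows of the team table.
rows = pd.DataFrame({
    "Team": ["Oklahoma City Thunder", "Cleveland Cavaliers", "Boston Celtics"],
    "MOV": [12.87, 9.54, 9.11],
    "SOS": [-0.16, -0.73, -0.83],
    "SRS": [12.70, 8.81, 8.28],
    "ORtg": [120.3, 121.7, 120.6],
    "DRtg": [107.5, 112.2, 111.1],
    "NRtg": [12.8, 9.5, 9.5],
})

# Recompute both metrics from their components.
rows["SRS_check"] = rows["MOV"] + rows["SOS"]
rows["NRtg_check"] = rows["ORtg"] - rows["DRtg"]

# Largest absolute difference between the published and recomputed values.
srs_gap = (rows["SRS"] - rows["SRS_check"]).abs().max()
nrtg_gap = (rows["NRtg"] - rows["NRtg_check"]).abs().max()

print(rows[["Team", "SRS", "SRS_check", "NRtg", "NRtg_check"]])
print(f"max SRS gap: {srs_gap:.3f}, max NRtg gap: {nrtg_gap:.3f}")
```

Any small gaps are consistent with rounding in the published figures, which suggests SRS already carries the information in MOV and SOS combined.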



---

**General Observations:**
- 🛠️ There are a few unnamed columns (`Unnamed: 17`, `Unnamed: 22`, `Unnamed: 27`) that appear to contain only missing values (NaN) and may need to be dropped.
- 🏟️ Columns related to attendance and arenas (`Arena`, `Attend.`, `Attend./G`) are unlikely to be relevant to predicting playoff outcomes and could also be dropped during cleaning. We could also drop other columns we end up not using, since SRS is a combined metric that does a lot of the work for us.
- 📈 Important metrics for our project include **SRS**, **MOV**, and **SOS**, which relate directly to team strength and game performance.
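
The cleanup suggested by these observations might look like the sketch below. It is a minimal example run on a two-row toy frame rather than the full CSV. The `clean_team_table` helper and the `Playoffs` column are hypothetical names, and treating the trailing `*` on team names as a playoff marker is an assumption about the data source that we should confirm.

```python
import numpy as np
import pandas as pd


def clean_team_table(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a tidied copy of the raw team table."""
    df = raw.copy()
    # Remove the sort-arrow glyph from headers, e.g. 'SRS▼' -> 'SRS'.
    df.columns = [str(col).replace("▼", "").strip() for col in df.columns]
    # Drop columns that contain only NaN (the 'Unnamed: N' spacer columns).
    df = df.dropna(axis="columns", how="all")
    # Drop arena and attendance columns; ignore any that are already missing.
    df = df.drop(columns=["Arena", "Attend.", "Attend./G"], errors="ignore")
    # ASSUMPTION: a trailing '*' marks a team that made the playoffs.
    df["Playoffs"] = df["Team"].str.endswith("*")
    df["Team"] = df["Team"].str.rstrip("*").str.strip()
    return df


# Toy input that mimics the shape of the raw table (values from this notebook).
raw = pd.DataFrame({
    "Team": ["Oklahoma City Thunder*", "Cleveland Cavaliers*"],
    "SRS▼": [12.70, 8.81],
    "Unnamed: 17": [np.nan, np.nan],
    "Arena": ["Paycom Center", "Rocket Arena"],
    "Attend.": [754832, 796712],
    "Attend./G": [17973, 19432],
})

clean = clean_team_table(raw)
print(clean)
```

Note that `dropna(how="all")` removes every fully empty column, not only the unnamed ones, so it is worth checking the column list before and after on the real data.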


In [12]:
# Load the data
df = pd.read_csv('teamdata.csv', skiprows=1)

# Drop the league average row
df = df[df['Team'] != 'League Average']

# Reset the index after dropping
df = df.reset_index(drop=True)

# Preview the cleaned table (head(32) is enough to show all 30 teams)
df.head(32)

Unnamed: 0,Rk,Team,Age,W,L,PW,PL,MOV,SOS,SRS▼,ORtg,DRtg,NRtg,Pace,FTr,3PAr,TS%,Unnamed: 17,eFG%,TOV%,ORB%,FT/FGA,Unnamed: 22,eFG%.1,TOV%.1,DRB%,FT/FGA.1,Unnamed: 27,Arena,Attend.,Attend./G
0,1.0,Oklahoma City Thunder*,24.8,68.0,14.0,68,14,12.87,-0.16,12.7,120.3,107.5,12.8,100.0,0.22,0.419,0.593,,0.56,10.3,24.2,0.18,,0.513,14.9,74.6,0.211,,Paycom Center,754832,17973
1,2.0,Cleveland Cavaliers*,26.6,64.0,18.0,62,20,9.54,-0.73,8.81,121.7,112.2,9.5,99.8,0.241,0.457,0.607,,0.578,11.6,25.9,0.187,,0.528,12.6,74.8,0.181,,Rocket Arena,796712,19432
2,3.0,Boston Celtics*,28.9,61.0,21.0,62,20,9.11,-0.83,8.28,120.6,111.1,9.5,95.7,0.212,0.536,0.591,,0.561,10.8,25.7,0.169,,0.522,11.6,76.0,0.154,,TD Garden,785396,19156
3,4.0,Minnesota Timberwolves*,27.2,49.0,33.0,53,29,5.0,0.15,5.15,116.6,111.5,5.1,97.3,0.249,0.455,0.588,,0.554,13.0,25.8,0.196,,0.532,13.2,75.1,0.178,,Target Center,772249,18835
4,5.0,Houston Rockets*,24.9,52.0,30.0,52,30,4.51,0.45,4.97,115.3,110.8,4.5,98.6,0.242,0.384,0.553,,0.523,11.8,31.7,0.178,,0.528,12.8,76.2,0.186,,Toyota Center,716853,17484
5,6.0,Los Angeles Clippers*,29.7,50.0,32.0,53,29,4.66,0.18,4.84,115.1,110.3,4.8,97.5,0.251,0.387,0.589,,0.554,13.4,24.4,0.2,,0.536,13.7,77.5,0.189,,Intuit Dome,679593,16575
6,7.0,Memphis Grizzlies*,24.7,48.0,34.0,52,30,4.85,-0.06,4.79,117.7,113.0,4.7,103.3,0.249,0.406,0.588,,0.554,13.1,28.7,0.196,,0.533,12.9,74.9,0.206,,FedEx Forum,683067,16660
7,8.0,Denver Nuggets*,27.0,50.0,32.0,50,32,3.89,0.08,3.97,119.9,116.0,3.9,99.8,0.259,0.356,0.604,,0.573,12.5,26.7,0.2,,0.542,11.3,74.6,0.173,,Ball Arena,811211,19786
8,9.0,New York Knicks*,27.5,51.0,31.0,51,31,4.1,-0.51,3.59,118.5,114.3,4.2,96.7,0.232,0.382,0.589,,0.556,11.9,26.0,0.186,,0.549,13.1,74.5,0.176,,Madison Square Garden (IV),811794,19800
9,10.0,Golden State Warriors*,28.6,48.0,34.0,49,33,3.3,0.25,3.56,115.0,111.7,3.3,98.7,0.244,0.469,0.568,,0.536,12.3,27.3,0.187,,0.541,14.1,75.6,0.189,,Chase Center,740624,18064
