---

# **1. Import Libraries**

In [1]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup, Comment

---

# **2. Web-Scrape Data From Basketball-Reference**

Basketball-Reference stores player statistics on **individual player pages**, with multiple tables per page. To scale this process across multiple players and keep everything consistent, we:
1.   Define player URLs
2.   Extract all tables
3.   Extract advanced player statistics
4.   Extract per-game player statistics
5.   Loop through players and collect data

---

### **Define Player URLs**
Each Basketball-Reference player page follows a consistent URL structure.

In [2]:
players = {
    "Nikola Jokic": "https://www.basketball-reference.com/players/j/jokicni01.html",
    "Shai Gilgeous-Alexander": "https://www.basketball-reference.com/players/g/gilgesh01.html",
    "Luka Doncic": "https://www.basketball-reference.com/players/d/doncilu01.html",
    "Giannis Antetokounmpo": "https://www.basketball-reference.com/players/a/antetgi01.html",
    "Victor Wembanyama": "https://www.basketball-reference.com/players/w/wembavi01.html"
}

### **Extracting All Tables**
We need to create a function that retrieves every statistics table from a Basketball-Reference player page, including:
*   Tables that are **directly visible in the HTML**
*   Tables that are **hidden inside HTML comments** (which Basketball-Reference uses for playoff and advanced data)
---
### **Why This Step is Necessary**
Basketball-Reference pages are not structured like simple CSV downloads:
*   Many important tables (especially **playoff stats**) are embedded inside **HTML comment blocks**
*   Standard HTML parsing methods will **miss these tables**
*   `pd.read_html(url)` alone is often incomplete or inconsistent

To ensure **no data is missed**, this function explicitly extracts **both visible and commented tables.**

In [3]:
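from io import StringIO  # newer pandas expects file-like input, not a raw HTML string

def get_all_tables(url):
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # fail loudly on an error response instead of parsing an error page
    soup = BeautifulSoup(response.text, "lxml")

    tables = []

    # visible tables
    for table in soup.find_all("table"):
        try:
            tables.append(pd.read_html(StringIO(str(table)))[0])
        except Exception:
            # skip fragments that cannot be parsed into a table
            pass

    # commented tables
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    for c in comments:
        try:
            tables.extend(pd.read_html(StringIO(str(c))))
        except Exception:
            # most comments contain no table at all
            pass

    return tables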
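
To see why the comment pass matters, here is a minimal sketch on a made-up HTML snippet (not real Basketball-Reference markup; the table ids are hypothetical). It uses the built-in `html.parser` backend so it runs without `lxml`, and it only counts tables rather than converting them to DataFrames.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical page: one visible table and one table wrapped in an HTML comment
html = """
<div><table id="per_game"><tr><th>Season</th><th>PTS</th></tr>
<tr><td>2022-23</td><td>24.5</td></tr></table></div>
<!--
<table id="playoffs_per_game"><tr><th>Season</th><th>PTS</th></tr>
<tr><td>2022-23</td><td>30.0</td></tr></table>
-->
"""

soup = BeautifulSoup(html, "html.parser")

# Pass 1: tables that exist in the parsed document tree
visible_ids = [t.get("id") for t in soup.find_all("table")]

# Pass 2: re-parse the text of each comment node and look for tables inside it
commented_ids = []
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
    inner = BeautifulSoup(comment, "html.parser")
    commented_ids.extend(t.get("id") for t in inner.find_all("table"))

print(visible_ids)    # ['per_game']
print(commented_ids)  # ['playoffs_per_game']
```

A single pass over `soup.find_all("table")` sees only the first table, which is the gap the second loop in `get_all_tables` is meant to close.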

---
### **Extracting Advanced Player Statistics**
Traditional box score statistics (points, rebounds, assists) describe what a player produces, but they do not fully capture **how impactful a player is on the court.**

Advanced player statistics are designed to measure **overall contribution, efficiency, and contextual value.**

---

### **Measuring "Impact"**
Impact is not just about scoring volume; it reflects:
*   Efficiency
*   All-around contribution
*   Effect on team success
*   Performance relative to league context

Advanced metrics such as:
*   Player Efficiency Rating (PER) -> summarizes per-minute productivity
*   Win Shares (WS) -> estimates contribution to team wins
*   Box Plus/Minus (BPM) -> measures on-court impact relative to an average player

are specifically designed to capture these dimensions.

Using these metrics allows impact to be evaluated holistically, rather than relying on a single box score category.


---

### **Identifying Peak Impact Seasons**
Advanced statistics are particularly useful for identifying peak seasons, because:
*   They normalize performance across different seasons
*   They reduce bias from team pace or role-specific scoring
*   They highlight efficiency and effectiveness, not just opportunity

By combining multiple advanced metrics into a composite score, the analysis identifies the season where a player's **overall contribution was maximized**, not merely when raw totals were highest.

---

### **Why This Approach is Appropriate for Career Analysis**
Advanced metrics:
*   Allow fair comparison across different career lengths
*   Adjust for changes in role and playing time
*   Provide context beyond traditional counting stats
*   Support season-level trend analysis

This makes them well-suited for analyzing **career trajectories** and **long-term player impact**, rather than isolated game performances.


In [4]:
def get_advanced(player, url):
    tables = pd.read_html(url)

    advanced_tables = []

    for t in tables:
        # Identify advanced tables by structure, not column subset
        if (
            "Season" in t.columns
            and "PER" in t.columns
            and "WS" in t.columns
            and "BPM" in t.columns
        ):
            advanced_tables.append(t.copy())

    if len(advanced_tables) == 0:
        raise ValueError(f"No advanced tables found for {player}")

    # Basketball-Reference order:
    # 0 = Regular Season Advanced
    # 1 = Playoffs Advanced
    reg = advanced_tables[0]
    reg["season_type"] = "Regular Season"

    if len(advanced_tables) > 1:
        po = advanced_tables[1]
        po["season_type"] = "Playoffs"
        df = pd.concat([reg, po], ignore_index=True)
    else:
        df = reg

    # Remove repeated header rows; copy so the assignment below
    # does not write into a slice of the original frame
    df = df[df["Season"] != "Season"].copy()

    df["player"] = player

    return df
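
The column-signature matching used above can be illustrated on toy tables. This is a sketch under assumptions: the three DataFrames are made up, and `has_columns` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

def has_columns(table, required):
    """Return True when every required column name is present (hypothetical helper)."""
    return set(required).issubset(table.columns)

# Made-up stand-ins for the list that pd.read_html would return
tables = [
    pd.DataFrame({"Season": ["2022-23"], "PTS": [24.5], "FG": [9.3], "MP": [33.7]}),
    pd.DataFrame({"Season": ["2022-23"], "PER": [31.5], "WS": [14.9], "BPM": [13.0], "MP": [2323]}),
    pd.DataFrame({"Year": [2023], "Note": ["unrelated table"]}),
]

# Advanced tables carry PER, WS, and BPM alongside Season
advanced = [t for t in tables if has_columns(t, ["Season", "PER", "WS", "BPM"])]

# Per-game tables carry PTS, FG, and MP but no PER
per_game = [
    t for t in tables
    if has_columns(t, ["Season", "PTS", "FG", "MP"]) and "PER" not in t.columns
]

print(len(advanced), len(per_game))  # 1 1
```

Matching on column names avoids depending on a table's position on the page, although the regular-season-before-playoffs ordering is still assumed once more than one table matches.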

---
### **Extracting Per-Game Player Statistics**
Advanced statistics are well-suited for measuring **overall impact and efficiency,** but they do not fully describe **a player's role on the team.** Per-game statistics provide essential context about how that impact was generated. Per-game metrics capture:
*   **Usage and volume** (points, shot attempts)
*   **Playing time** (minutes per game)
*   **Offensive responsibility** (scoring and playmaking load)
*   **Role changes across seasons**

By incorporating per-game statistics, the analysis can distinguish between:
*   A player becoming **more impactful because they became more efficient**
*   A player becoming **more impactful because their role expanded**

This distinction is critical for understanding **career evolution,** not just peak performance.

---

### **Role Evolution vs. Impact**
A player’s career often progresses through phases:
* Early career → limited role, growing efficiency
* Prime years → high usage and high efficiency
* Later years → reduced usage, maintained efficiency

Advanced metrics show how impactful a player was, but per-game stats explain:
* Whether that impact came from **increased volume**
* Or from **better efficiency in a stable or reduced role**

In [5]:
def get_per_game(player, url):
    tables = pd.read_html(url)

    pg_tables = []

    for t in tables:
        if (
            "Season" in t.columns
            and "PTS" in t.columns
            and "FG" in t.columns
            and "MP" in t.columns
            and "PER" not in t.columns
        ):
            pg_tables.append(t.copy())

    if len(pg_tables) == 0:
        raise ValueError(f"No per-game tables found for {player}")

    # Basketball-Reference order:
    # 0 = Regular Season Per Game
    # 1 = Playoffs Per Game
    reg = pg_tables[0]
    reg["season_type"] = "Regular Season"

    if len(pg_tables) > 1:
        po = pg_tables[1]
        po["season_type"] = "Playoffs"
        df = pd.concat([reg, po], ignore_index=True)
    else:
        df = reg

    # Remove repeated header rows; copy to avoid writing into a slice
    df = df[df["Season"] != "Season"].copy()

    df["player"] = player

    return df

---
### **Looping Through Players and Collecting Data**
This block iterates through all selected players and systematically extracts both advanced and per-game statistics for each one using the previously defined functions.

Each Basketball-Reference player page must be scraped individually.
Rather than hard-coding logic for each player, this loop:
*   Ensures **consistent processing** across all players
*   Makes the pipeline **scalable** to additional players
*   Prevents manual errors or mismatched logic
*   Keeps advanced and per-game data **separate but aligned**

In [6]:
advanced_all = []
per_game_all = []

for player, url in players.items():
    print(f"Pulling {player}...")

    adv = get_advanced(player, url)
    pg  = get_per_game(player, url)

    advanced_all.append(adv)
    per_game_all.append(pg)

Pulling Nikola Jokic...
Pulling Shai Gilgeous-Alexander...
Pulling Luka Doncic...
Pulling Giannis Antetokounmpo...
Pulling Victor Wembanyama...


---
# **3. Consolidating Player Data into Master Tables**
This step combines the individual player DataFrames collected during the loop into two unified, season-level datasets:
*   `df_advanced` -> all players' advanced statistics
*   `df_per_game` -> all players' per-game statistics

Up to this point, each player's data was stored separately.
To perform cross-player analysis, trend analysis, etc., the data must be consolidated into shared tables.

In [7]:
df_advanced = pd.concat(advanced_all, ignore_index=True)
df_per_game = pd.concat(per_game_all, ignore_index=True)

---
### **Defining Identifier (Join) Columns**
This defines the **core identifier columns** that uniquely describe each observation in the dataset. These columns establish the **grain** of the data and are used as join keys when merging datasets.

Each row in the final dataset represents:

**One player × one season × one season type × one team**

---
### **Why These Specific Columns Were Chosen**
*   `player` identifies the individual NBA player
*   `Season` represents the NBA season (e.g., 2022-23), which is the primary time dimension.
*   `season_type` distinguishes Regular Season vs. Playoffs
*   `Team` preserves team context, especially important for seasons where a player was traded mid-season

Together, these columns uniquely define each season-level record.

In [8]:
id_cols = ["player", "Season", "season_type", "Team"]
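
One way to check that these columns really define the grain is to count duplicate key combinations. The sketch below uses made-up rows for a hypothetical player traded mid-season; the team codes are placeholders.

```python
import pandas as pd

id_cols = ["player", "Season", "season_type", "Team"]

# Toy rows: the 2022-23 season is split across two teams
toy = pd.DataFrame({
    "player": ["Player A", "Player A", "Player A"],
    "Season": ["2022-23", "2022-23", "2023-24"],
    "season_type": ["Regular Season"] * 3,
    "Team": ["AAA", "BBB", "BBB"],
    "PER": [18.0, 20.0, 22.0],
})

# With Team in the key, every row is unique
dupes_full = int(toy.duplicated(subset=id_cols).sum())

# Without Team, the traded season collides with itself
dupes_no_team = int(toy.duplicated(subset=["player", "Season", "season_type"]).sum())

print(dupes_full, dupes_no_team)  # 0 1
```

A non-zero count at the full grain would mean a later merge on `id_cols` could multiply rows.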

---
### **Prefixing Advanced Statistics Columns**
This systematically renames all advanced-statistics columns by adding an `adv_` prefix, while leaving identifier columns unchanged.

As a result:
*   Identifier columns remain clean and consistent
*   Advanced metrics are clearly labeled and distinguishable


In [9]:
df_advanced_pref = df_advanced.rename(
    columns={
        c: f"adv_{c}"
        for c in df_advanced.columns
        if c not in id_cols
    }
)

---
### **Prefixing Per-Game Statistics Columns**
This systematically renames all **per-game statistics columns** by adding a `pg_` prefix, while leaving identifier columns unchanged.

This clearly distinguishes **per-game role and volume metrics** from advanced efficiency metrics in the final dataset.

In [10]:
df_per_game_pref = df_per_game.rename(
    columns={
        c: f"pg_{c}"
        for c in df_per_game.columns
        if c not in id_cols
    }
)
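
The reason for prefixing shows up when two tables share a non-key column such as `MP`. In this sketch the data is made up and `add_prefix_except` is a hypothetical helper that mirrors the dictionary comprehension used in the two cells above.

```python
import pandas as pd

keys = ["player", "Season"]
adv = pd.DataFrame({"player": ["Player A"], "Season": ["2022-23"], "MP": [2500], "PER": [25.0]})
pg = pd.DataFrame({"player": ["Player A"], "Season": ["2022-23"], "MP": [34.1], "PTS": [27.0]})

# Without prefixes, pandas disambiguates the shared "MP" column with _x/_y suffixes
unprefixed = adv.merge(pg, on=keys)

def add_prefix_except(df, prefix, keep):
    """Prefix every column except those listed in keep (hypothetical helper)."""
    return df.rename(columns={c: f"{prefix}{c}" for c in df.columns if c not in keep})

# With prefixes, the source of each column stays readable after the merge
prefixed = add_prefix_except(adv, "adv_", keys).merge(
    add_prefix_except(pg, "pg_", keys), on=keys
)

print(list(unprefixed.columns))  # ['player', 'Season', 'MP_x', 'PER', 'MP_y', 'PTS']
print(list(prefixed.columns))    # ['player', 'Season', 'adv_MP', 'adv_PER', 'pg_MP', 'pg_PTS']
```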

---
### **Merging Advanced and Per-Game Statistics into a Unified Dataset**
This step merges the advanced and per-game datasets into a single, season-level fact table using the predefined identifier columns.

---
### **Why This Merge is Necessary**
Advanced statistics and per-game statistics capture different aspects of player performance:
*   **Advanced stats** -> efficiency and impact (PER, BPM, WS)
*   **Per-game stats** -> role and volume (PTS, MP, AST)

In [11]:
df_combined = df_advanced_pref.merge(
    df_per_game_pref,
    on=id_cols,
    how="left"
)
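
A left merge keeps every advanced row even when no per-game row matches. The sketch below, on made-up data, also passes `validate="one_to_one"`, an optional pandas safeguard that the notebook itself does not use; it raises an error if either side has duplicate keys.

```python
import pandas as pd

id_cols = ["player", "Season", "season_type", "Team"]

left = pd.DataFrame({
    "player": ["Player A", "Player A"],
    "Season": ["2022-23", "2023-24"],
    "season_type": ["Regular Season", "Regular Season"],
    "Team": ["AAA", "AAA"],
    "adv_PER": [20.0, 22.0],
})
# The per-game side is missing the 2023-24 row
right = pd.DataFrame({
    "player": ["Player A"],
    "Season": ["2022-23"],
    "season_type": ["Regular Season"],
    "Team": ["AAA"],
    "pg_PTS": [25.0],
})

merged = left.merge(right, on=id_cols, how="left", validate="one_to_one")

print(len(merged))                          # 2
print(int(merged["pg_PTS"].isna().sum()))   # 1
```

The unmatched season survives with `NaN` in the per-game columns instead of being dropped.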

In [12]:
df_combined = (
    df_combined
    .sort_values(["player", "Season", "season_type", "Team"])
    .groupby(["player", "Season", "season_type"], as_index=False)
    .first()
)
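
It helps to know exactly what `.first()` does when a season has several team rows: it returns the first non-null value in each column separately, so the collapsed row can combine values from different source rows. The sketch below uses made-up values, and `2TM` stands in for a combined multi-team label.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "player": ["Player A"] * 3,
    "Season": ["2022-23"] * 3,
    "season_type": ["Regular Season"] * 3,
    "Team": ["BBB", "2TM", "AAA"],
    "adv_PER": [21.0, np.nan, 18.0],
    "adv_WS": [3.0, 5.0, 2.0],
})

collapsed = (
    toy
    .sort_values(["player", "Season", "season_type", "Team"])
    .groupby(["player", "Season", "season_type"], as_index=False)
    .first()
)

# "2TM" sorts first, but its adv_PER is missing, so adv_PER comes from the "AAA" row
print(collapsed[["Team", "adv_PER", "adv_WS"]].iloc[0].tolist())  # ['2TM', 18.0, 5.0]
```

If a whole-row pick is wanted instead, `.head(1)` on the grouped frame or `drop_duplicates` on the key columns keeps the first row intact.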

---
# **4. Numeric Conversion**
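Every column other than the identifier columns is converted to a numeric data type. Because `errors="coerce"` is used, any value that cannot be parsed as a number becomes `NaN`, so text-only columns (such as position and awards) end up entirely `NaN` after this step.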

In [18]:
numeric_cols = df_combined.columns.drop(
    ["player", "Season", "season_type", "Team"]
)

df_combined[numeric_cols] = df_combined[numeric_cols].apply(
    pd.to_numeric, errors="coerce"
)
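
The effect of `errors="coerce"` on text columns can be seen on a two-column toy frame. The guard at the end is one possible way to leave text columns untouched; it is an illustration, not what the cell above does.

```python
import pandas as pd

toy = pd.DataFrame({
    "adv_PER": ["25.1", "27.3"],   # numeric values stored as strings
    "adv_Pos": ["C", "PF"],        # genuine text
})

converted = toy.apply(pd.to_numeric, errors="coerce")
print(converted["adv_PER"].tolist())        # [25.1, 27.3]
print(bool(converted["adv_Pos"].isna().all()))  # True: the text column is wiped out

# Possible guard: only overwrite columns where at least one value parsed
parsed_cols = [c for c in toy.columns if converted[c].notna().any()]
safe = toy.copy()
safe[parsed_cols] = converted[parsed_cols]
print(safe["adv_Pos"].tolist())             # ['C', 'PF']
```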

---
# **5. Removing Career Summary Rows**
Non-season summary rows (e.g., "Career", "13 Yrs") that Basketball-Reference includes in its tables are removed.

In [19]:
df_combined = df_combined[
    df_combined["Season"].str.contains("-", na=False)
]

---
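# **6. Filter to Keep Only Valid NBA Season Rows**
A stricter second pass keeps only rows whose `Season` value contains the `YYYY-YY` pattern (e.g., `2022-23`), which drops any remaining rows that contain a hyphen but are not real seasons.
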
In [20]:
df_combined = df_combined[
    df_combined["Season"].str.contains(r"\d{4}-\d{2}", na=False)
]
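
The difference between the two season filters is easiest to see side by side. The labels below are made up for illustration (the third and fourth are not claimed to appear on the site), and `str.fullmatch` is shown only as a stricter alternative.

```python
import pandas as pd

seasons = pd.Series(["2022-23", "Career", "13 Yrs", "2 seasons-DEN", None])

# Section 5 filter: any hyphen qualifies
has_dash = seasons.str.contains("-", na=False)

# Section 6 filter: a four-digit year, a hyphen, and two digits somewhere in the value
is_season = seasons.str.contains(r"\d{4}-\d{2}", na=False)

# Stricter alternative: the whole value must be exactly YYYY-YY
is_exact = seasons.str.fullmatch(r"\d{4}-\d{2}", na=False)

print(has_dash.tolist())   # [True, False, False, True, False]
print(is_season.tolist())  # [True, False, False, False, False]
```

Passing `na=False` makes missing values count as non-matches, so they are filtered out rather than raising an error during boolean indexing.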

---
# **7. Calculating Career Net Rating (Weighted BPM)**
Career Net Rating (Weighted BPM) is a **minutes-weighted average of Box Plus/Minus (BPM)** across a player's seasons, computed separately for each season type.
*   `adv_BPM` measures **per-possession impact**
*   `adv_MP` represents **total playing time**


---

### **Computing Weighted Impact Per Season**
$$
\text{Weighted BPM}_s = \text{BPM}_s \times \text{MP}_s
$$
Where:
* $\text{BPM}_s =$ Box Plus/Minus (per-possession impact)
* $\text{MP}_s =$ Total minutes played in that season

Multiplying the two ensures that:
*   High-impact seasons with **low minutes** don't overstate importance
*   Sustained impact over many minutes is properly rewarded

This creates a fair measure of **total on-court impact.**

---

### **Aggregate Across Seasons (Career Level)**
For each player *p* and season type *t* (Regular Season or Playoffs):
$$
\text{Career Net Rating}_{p,t}
=
\frac{
\sum_{s \in (p,t)} \left( \text{BPM}_s \times \text{MP}_s \right)
}{
\sum_{s \in (p,t)} \text{MP}_s
}
$$

---

### **Why Weighting by Minutes is Necessary**
BPM is a rate statistic. A player who plays 300 minutes should not be treated the same as one who plays 3,000 minutes. Weighting by minutes accounts for availability and durability, prevents small-sample distortion, and produces a more realistic career evaluation.
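
A two-season toy example (made-up numbers) shows how far the weighted and unweighted averages can diverge:

```python
import pandas as pd

# Hypothetical career: one short, spectacular season and one long, solid season
toy = pd.DataFrame({"BPM": [10.0, 2.0], "MP": [300, 3000]})

simple_mean = toy["BPM"].mean()

# (10.0 * 300 + 2.0 * 3000) / (300 + 3000) = 9000 / 3300
weighted_mean = (toy["BPM"] * toy["MP"]).sum() / toy["MP"].sum()

print(simple_mean)              # 6.0
print(round(weighted_mean, 3))  # 2.727
```

The unweighted mean treats the 300-minute season as equal to the 3,000-minute one; the weighted version lands much closer to the BPM the player posted over most of their minutes.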

In [21]:
df_combined["Weighted_BPM"] = df_combined["adv_BPM"] * df_combined["adv_MP"]

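career_net = (
    # Select only the needed columns so the lambda never operates on the
    # grouping keys (avoids the pandas groupby.apply deprecation warning)
    df_combined.groupby(["player", "season_type"])[["Weighted_BPM", "adv_MP"]]
    .apply(lambda x: x["Weighted_BPM"].sum() / x["adv_MP"].sum())
    .reset_index(name="CareerNetRating")
)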

df_combined = df_combined.merge(
    career_net,
    on=["player", "season_type"],
    how="left"
)



---
# **8. Calculating Impact Score and Determining Peak Impact Seasons**
A **composite Impact Score** is created for each player-season by combining multiple advanced performance metrics into a single, comparable value. The Impact Score is calculated by:
*   Ranking each metric **within the dataset** using percentile ranks
*   Summing the percentile ranks across metrics

This produces a normalized measure of **overall season-level impact.**

---

For player *p* in season *s* :
$$\text{ImpactScore}_{p,s}=\text{PctRank}(\text{PER}_{p,s})+\text{PctRank}(\text{WS}_{p,s})+\text{PctRank}(\text{BPM}_{p,s})$$

Where:
* $\text{PctRank}$ is the **percentile rank** of that metric across the dataset (values $\in (0,1]$)
* $\text{PER =}$ Player Efficiency Rating
* $\text{WS =}$ Win Shares
* $\text{BPM =}$ Box Plus/Minus

**Interpretation:**
* Each metric is normalized to the same 0-1 scale
* No single stat dominates due to scale differences
* Higher Impact Score = stronger all-around season impact

---

**Peak Impact Season Indicator:**
The peak impact flag identifies each player's most impactful season, separately for Regular Season and Playoffs: it marks the season where the Impact Score is highest within each player and season type group. If two seasons tie for the maximum, both are flagged.

The result is a boolean indicator:
*   `True` (1) -> peak impact season
*   `False` (0) -> all other seasons

This allows for direct identification of peak seasons, accounts for different career lengths, and separates regular-season and playoff peaks.
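
A small worked example on made-up numbers shows how the percentile ranks add up and how the peak flag is assigned within each group:

```python
import pandas as pd

toy = pd.DataFrame({
    "player": ["A", "A", "B", "B"],
    "season_type": ["Regular Season"] * 4,
    "adv_PER": [20.0, 30.0, 15.0, 25.0],
    "adv_WS": [8.0, 12.0, 4.0, 10.0],
    "adv_BPM": [4.0, 9.0, 1.0, 6.0],
})

# Percentile ranks are computed over all four rows, not per player
toy["ImpactScore"] = (
    toy["adv_PER"].rank(pct=True)
    + toy["adv_WS"].rank(pct=True)
    + toy["adv_BPM"].rank(pct=True)
)

# The peak flag compares each season only against the same player and season type
toy["PeakImpactSeason"] = (
    toy.groupby(["player", "season_type"])["ImpactScore"]
    .transform(lambda x: x == x.max())
)

print(toy["ImpactScore"].tolist())       # [1.5, 3.0, 0.75, 2.25]
print(toy["PeakImpactSeason"].tolist())  # [False, True, False, True]
```

With three metrics the score ranges up to 3.0, and because ranks are dataset-wide, adding or removing players changes every season's score.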


In [22]:
df_combined["ImpactScore"] = (
    df_combined["adv_PER"].rank(pct=True) +
    df_combined["adv_WS"].rank(pct=True) +
    df_combined["adv_BPM"].rank(pct=True)
)

df_combined["PeakImpactSeason"] = (
    df_combined
    .groupby(["player", "season_type"])["ImpactScore"]
    .transform(lambda x: x == x.max())
)

---
# **9. Creating the Age Curve Index**
An **Age Curve Index** measures each season's performance **relative to a player's own peak efficiency.** The index scales each season's PER between 0 and 1 and uses the player's **maximum PER** (by season type) as the reference point.

---

For player *p*, season *s*, and season type *t* (Regular Season or Playoffs):
$$
\text{AgeCurveIndex}_{p,s,t}
=
\frac{\text{PER}_{p,s,t}}
{\max_{s' \in (p,t)} \text{PER}_{p,s',t}}
$$


---

**Interpretation:**
*   `1.0` -> player's peak efficiency season
*   `<1.0` -> efficiency relative to their own peak
* The index is **player-normalized**, not league-normalized

This allows fair comparison of **career trajectories**, even for players with different career lengths (e.g., Giannis vs. Wembanyama).
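
As a quick illustration with made-up PER values for a single hypothetical player:

```python
import pandas as pd

toy = pd.DataFrame({
    "player": ["Player A"] * 3,
    "season_type": ["Regular Season"] * 3,
    "Season": ["2021-22", "2022-23", "2023-24"],
    "adv_PER": [15.0, 30.0, 24.0],
})

# Divide each season's PER by the player's best PER within the same season type
toy["AgeCurveIndex"] = (
    toy.groupby(["player", "season_type"])["adv_PER"]
    .transform(lambda x: x / x.max())
)

print(toy["AgeCurveIndex"].round(2).tolist())  # [0.5, 1.0, 0.8]
```

The index stays between 0 and 1 as long as every PER in the group is positive; a negative PER season would produce a negative index.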

In [23]:
df_combined["AgeCurveIndex"] = (
    df_combined
    .groupby(["player", "season_type"])["adv_PER"]
    .transform(lambda x: x / x.max())
)

---
# **10. Consolidating Duplicate Fields and Data Cleaning**
Duplicated fields that appear in both the advanced and per-game datasets are consolidated into a single canonical column, then redundant and low-value columns are removed.

In [24]:
# Combine Age
df_combined["Age"] = df_combined["adv_Age"].combine_first(df_combined["pg_Age"])

# Combine Position
df_combined["Pos"] = df_combined["adv_Pos"].combine_first(df_combined["pg_Pos"])

# Combine Games
df_combined["G"] = df_combined["adv_G"].combine_first(df_combined["pg_G"])

# Combine Games Started
df_combined["GS"] = df_combined["adv_GS"].combine_first(df_combined["pg_GS"])

# Combine Minutes Played
df_combined["MP"] = df_combined["adv_MP"].combine_first(df_combined["pg_MP"])

In [25]:
cols_to_drop = [
    "adv_Age", "pg_Age",
    "adv_Pos", "pg_Pos",
    "adv_G", "pg_G",
    "adv_GS", "pg_GS",
    "adv_MP", "pg_MP",
    "adv_Awards", "pg_Awards",
    "adv_Lg", "pg_Lg",
    "pg_Trp-Dbl"
]

df_combined = df_combined.drop(
    columns=[c for c in cols_to_drop if c in df_combined.columns]
)

---
### **Standardizing Percentage Column Names**
All columns containing the `%` symbol are renamed so that `%` is replaced with `_pct`, standardizing how percentage-based metrics are represented in the dataset.

For example:
*   `adv_TS%` -> `adv_TS_pct`
*   `adv_USG%` -> `adv_USG_pct`



In [26]:
df_combined = df_combined.rename(
    columns=lambda c: c.replace("%", "_pct")
)

In [27]:
df_combined.head(10)

| | player | Season | season_type | Team | adv_PER | adv_TS_pct | adv_3PAr | adv_FTr | adv_ORB_pct | adv_DRB_pct | ... | Weighted_BPM | CareerNetRating | ImpactScore | PeakImpactSeason | AgeCurveIndex | Age | Pos | G | GS | MP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Giannis Antetokounmpo | 2013-14 | Regular Season | MIL | 10.8 | 0.518 | 0.282 | 0.483 | 4.6 | 16.3 | ... | -4742.5 | 6.849705 | 0.191176 | False | 0.314869 | 19.0 | | 77.0 | 23.0 | 1897.0 |
| 1 | Giannis Antetokounmpo | 2014-15 | Playoffs | MIL | 10.9 | 0.425 | 0.014 | 0.324 | 5.8 | 17.3 | ... | -462.3 | 8.350592 | 0.080882 | False | 0.328313 | 20.0 | | 6.0 | 6.0 | 201.0 |
| 2 | Giannis Antetokounmpo | 2014-15 | Regular Season | MIL | 14.8 | 0.552 | 0.056 | 0.445 | 4.5 | 19.7 | ... | 0.0 | 6.849705 | 0.691176 | False | 0.431487 | 20.0 | | 81.0 | 71.0 | 2541.0 |
| 3 | Giannis Antetokounmpo | 2015-16 | Regular Season | MIL | 18.8 | 0.566 | 0.108 | 0.404 | 4.6 | 20.0 | ... | 5928.3 | 6.849705 | 0.830882 | False | 0.548105 | 21.0 | | 80.0 | 79.0 | 2823.0 |
| 4 | Giannis Antetokounmpo | 2016-17 | Playoffs | MIL | 21.9 | 0.563 | 0.089 | 0.411 | 4.6 | 25.1 | ... | 1603.8 | 8.350592 | 0.602941 | False | 0.659639 | 22.0 | | 6.0 | 6.0 | 243.0 |
| 5 | Giannis Antetokounmpo | 2016-17 | Regular Season | MIL | 26.1 | 0.599 | 0.143 | 0.486 | 5.9 | 22.6 | ... | 20768.5 | 6.849705 | 1.617647 | False | 0.760933 | 22.0 | | 80.0 | 80.0 | 2845.0 |
| 6 | Giannis Antetokounmpo | 2017-18 | Playoffs | MIL | 26.6 | 0.62 | 0.116 | 0.455 | 3.7 | 23.5 | ... | 1988.0 | 8.350592 | 0.985294 | False | 0.801205 | 23.0 | | 7.0 | 7.0 | 280.0 |
| 7 | Giannis Antetokounmpo | 2017-18 | Regular Season | MIL | 27.3 | 0.598 | 0.1 | 0.457 | 6.7 | 25.3 | ... | 17087.2 | 6.849705 | 1.602941 | False | 0.795918 | 23.0 | | 75.0 | 75.0 | 2756.0 |
| 8 | Giannis Antetokounmpo | 2018-19 | Playoffs | MIL | 26.6 | 0.572 | 0.211 | 0.644 | 7.3 | 27.7 | ... | 4266.2 | 8.350592 | 1.220588 | False | 0.801205 | 24.0 | | 15.0 | 15.0 | 514.0 |
| 9 | Giannis Antetokounmpo | 2018-19 | Regular Season | MIL | 30.9 | 0.644 | 0.163 | 0.55 | 7.3 | 30.0 | ... | 24523.2 | 6.849705 | 2.404412 | False | 0.900875 | 24.0 | | 72.0 | 72.0 | 2358.0 |


In [28]:
df_combined.to_csv("nba_player_impact.csv", index=False)

---
# **11. Uploading Clutch Statistics**
NBA clutch statistics track player performance in the final five minutes of close games (score within five points), using the same traditional box score categories. They highlight key plays such as game-winners, crucial rebounds, and defensive stops that significantly affect the game's outcome.

---

### **What Makes a Play "Clutch"?**
*   **High Pressure Moments:** Excelling when the game is on the line
*   **Impactful Plays:** Game-tying/winning shots, crucial defensive plays (steals, blocks)
*   **Beyond Scoring:** Rebounding, assists, and even drawing fouls contribute to clutch performance.

---

These clutch statistics were manually collected from NBA.com, entered into an Excel sheet, and saved as a CSV file.

In [30]:
from google.colab import files
files.upload()


Saving NBA_Clutch_Statistics.csv to NBA_Clutch_Statistics.csv


{'NBA_Clutch_Statistics.csv': b'\xef\xbb\xbfplayer,season,season_type,team,age,GP,W,L,MIN,PTS,FGM,FGA,FG%,3PM,3PA,3P%,FTM,FTA,FT%,OREB,DREB,REB,AST,TOV,STL,BLK,PF,FP,DD2,TD3,plus_minus\r\nNikola Jokic,2015-16,Regular Season,DEN,21,25,12,13,2.6,0.9,0.3,0.6,43.8,0,0,100,0.3,0.4,72.7,0.1,0.7,0.8,0.2,0.1,0.1,0,0.4,2.4,8,0,-0.4\r\n... (output truncated)

---
# **12. Cleaning and Preparing Clutch Statistics for Integration**
Clutch statistics include percentage-based columns (e.g., `FG%`, `3P%`); the `%` symbol is replaced with `_pct` to ensure consistency across all data sources.

---

The `player`, `season`, and `season_type` columns uniquely identify each clutch record and are intentionally excluded from prefixing so they can be used as join keys. All other columns receive a `clutch_` prefix.

Examples:
*   `PTS` -> `clutch_PTS`
*   `FG_pct` -> `clutch_FG_pct`

This follows the same prefixing logic used for `adv_` and `pg_` metrics.

---

Lastly, the main dataset uses `Season` as the season identifier, so the `season` column in the clutch dataset is renamed to `Season` to match.


In [31]:
clutch = pd.read_csv("NBA_Clutch_Statistics.csv")

In [32]:
clutch = clutch.rename(
    columns=lambda c: c.replace("%", "_pct")
)

In [33]:
clutch_id_cols = ["player", "season", "season_type"]

clutch = clutch.rename(
    columns={
        c: f"clutch_{c}"
        for c in clutch.columns
        if c not in clutch_id_cols
    }
)

In [34]:
clutch = clutch.rename(columns={"season": "Season"})

---
# **13. Merging Clutch Statistics into the Main Dataset**
The clutch performance dataset is merged into the main season-level dataset, aligning clutch metrics with each player's corresponding season and season type. Redundant clutch columns (`clutch_team`, `clutch_age`) are then removed.



In [35]:
df_combined = df_combined.merge(
    clutch,
    on=["player", "Season", "season_type"],
    how="left"
)
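
Because the clutch file was entered by hand, it can be worth checking which main-dataset rows found no clutch match. The sketch below uses made-up rows and the `indicator=True` option of `merge`, which the notebook itself does not use.

```python
import pandas as pd

keys = ["player", "Season", "season_type"]

main = pd.DataFrame({
    "player": ["Player A", "Player A"],
    "Season": ["2022-23", "2023-24"],
    "season_type": ["Regular Season", "Regular Season"],
    "adv_PER": [20.0, 22.0],
})
# The hand-entered clutch data is missing the 2023-24 season
clutch_toy = pd.DataFrame({
    "player": ["Player A"],
    "Season": ["2022-23"],
    "season_type": ["Regular Season"],
    "clutch_PTS": [3.1],
})

checked = main.merge(clutch_toy, on=keys, how="left", indicator=True)

# Rows marked "left_only" exist in the main data but have no clutch record
unmatched = checked.loc[checked["_merge"] == "left_only", keys]
print(unmatched["Season"].tolist())  # ['2023-24']
```

Unmatched rows usually point to a missing season or a spelling or format mismatch in one of the key columns.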

In [36]:
df_combined = df_combined.drop(
    columns=[c for c in ["clutch_team", "clutch_age"] if c in df_combined.columns]
)

In [37]:
df_combined.head(5)

| | player | Season | season_type | Team | adv_PER | adv_TS_pct | adv_3PAr | adv_FTr | adv_ORB_pct | adv_DRB_pct | ... | clutch_REB | clutch_AST | clutch_TOV | clutch_STL | clutch_BLK | clutch_PF | clutch_FP | clutch_DD2 | clutch_TD3 | clutch_plus_minus |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Giannis Antetokounmpo | 2013-14 | Regular Season | MIL | 10.8 | 0.518 | 0.282 | 0.483 | 4.6 | 16.3 | ... | 0.5 | 0.2 | 0.1 | 0.2 | 0.0 | 0.1 | 1.9 | 0.0 | 0.0 | -2.2 |
| 1 | Giannis Antetokounmpo | 2014-15 | Playoffs | MIL | 10.9 | 0.425 | 0.014 | 0.324 | 5.8 | 17.3 | ... | 1.3 | 0.0 | 0.3 | 0.0 | 0.7 | 0.7 | 3.3 | 1.0 | 0.0 | 0.3 |
| 2 | Giannis Antetokounmpo | 2014-15 | Regular Season | MIL | 14.8 | 0.552 | 0.056 | 0.445 | 4.5 | 19.7 | ... | 1.3 | 0.0 | 0.3 | 0.0 | 0.7 | 0.7 | 3.3 | 1.0 | 0.0 | 0.3 |
| 3 | Giannis Antetokounmpo | 2015-16 | Regular Season | MIL | 18.8 | 0.566 | 0.108 | 0.404 | 4.6 | 20.0 | ... | 0.9 | 0.3 | 0.2 | 0.0 | 0.3 | 0.5 | 3.7 | 12.0 | 3.0 | -0.4 |
| 4 | Giannis Antetokounmpo | 2016-17 | Playoffs | MIL | 21.9 | 0.563 | 0.089 | 0.411 | 4.6 | 25.1 | ... | 0.3 | 0.7 | 0.0 | 0.0 | 0.0 | 1.0 | 4.4 | 1.0 | 0.0 | -0.7 |


In [38]:
df_combined.to_csv("Final_NBA_Player_Impact.csv", index=False)