#### Rolling Averages for Game Statistics

In this section, we define a function that computes rolling averages of game statistics, with `game_window` controlling how many past games effectively contribute. Because the averages are exponentially weighted, recent games count more than older ones, which tracks team performance trends while smoothing out short-term fluctuations.

##### Methodology
1. **Load the Data:** We read game logs from a CSV file into a Pandas DataFrame.
2. **Preprocess Data:**
   - Convert the `date` column to a datetime format.
   - Sort the dataset by `team` and `date` to ensure chronological order.
3. **Feature Engineering:**
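   - Key metrics such as possessions (`poss`), usage percentage (`usg_pct`), turnover-to-possession ratio (`tov_to_poss`), and free throws per possession (`ft_to_poss`) are treated as ordinary stat columns. The function in this section does not derive them, so they must already be present in the input CSV to appear in the output.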
4. **Calculate Rolling Averages:**
   - Apply an exponentially weighted moving average (EWMA) to each team's games separately, with `span` equal to `game_window`.
   - Shift the averaged values by one game so that each row reflects only the games played before it.
   - Keep the first game's original values to avoid NaNs in the output. As a result, a team's first row contains its own statistics rather than an average of earlier games.
5. **Post-processing:**
   - Round all numerical values to two decimal places for readability.
   - Save the processed data to an output CSV file, overwriting any existing file if necessary.
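
For reference, the EWMA in step 4 can be written as a simple recursion. This is the standard form pandas uses when `adjust=False`, the setting in the code cell of this section; the symbols $x_t$ (a team's raw statistic in its $t$-th game) and $y_t$ (the smoothed value) are notation introduced here for illustration.

$$
\alpha = \frac{2}{\text{span} + 1}, \qquad y_0 = x_0, \qquad y_t = (1 - \alpha)\, y_{t-1} + \alpha\, x_t
$$

With `game_window = 25`, this gives $\alpha = 2/26 \approx 0.077$, so each new game contributes roughly 8% of the updated average. After the one-game shift, the row for game $t$ stores $y_{t-1}$, except for the first row, which stores $x_0$.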

Because every row after a team's first game depends only on earlier games, the resulting averages can be used as features for predictive modeling or further analysis without leaking the statistics of the game being described.
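
The shift-and-backfill behavior is easiest to see on a tiny example. The sketch below is illustrative only: the two teams, the single `pts` column, the helper name `lagged_ewma`, and `span = 3` are all assumptions chosen so the arithmetic is easy to follow (with `span = 3`, the smoothing factor is 0.5).

```python
import pandas as pd

# Toy game log: team names, dates, and the "pts" column are made up for illustration.
toy = pd.DataFrame(
    {
        "team": ["A", "A", "A", "B", "B", "B"],
        "date": pd.to_datetime(
            ["2024-01-01", "2024-01-03", "2024-01-05",
             "2024-01-02", "2024-01-04", "2024-01-06"]
        ),
        "pts": [100.0, 110.0, 90.0, 80.0, 100.0, 120.0],
    }
).sort_values(["team", "date"])

span = 3  # hypothetical window; alpha = 2 / (span + 1) = 0.5


def lagged_ewma(series: pd.Series) -> pd.Series:
    """EWMA of the games before the current one; the first game keeps its own value."""
    smoothed = series.ewm(span=span, adjust=False).mean().shift(1)
    smoothed.iloc[0] = series.iloc[0]
    return smoothed


# Apply the helper to each team's games independently.
toy["pts_ewma"] = toy.groupby("team")["pts"].transform(lagged_ewma)
print(toy)
```

For team A, the unshifted EWMA is 100, 105, 97.5; after the shift and backfill, the stored values are 100, 100, 105. The third game's 90 points do not appear in any row yet, because they would only influence a fourth game.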

In [None]:
# Importing libraries
import pandas as pd
import os
from rich.console import Console

# Initialize rich console
console: Console = Console()

# Game window size
game_window: int = 25

# Compute exponentially weighted rolling averages with a span of game_window games
def compute_rolling_averages(game_window: int, gamelogs_file: str, output_file: str) -> None:
    # Load the CSV file
    console.print("[bold green]Loading CSV file...[/bold green]")
    df: pd.DataFrame = pd.read_csv(gamelogs_file)

    # Parse dates, then sort by team and date so each team's games are chronological
    console.print("[bold green]Sorting data by team and date...[/bold green]")
    df["date"] = pd.to_datetime(df["date"])
    df = df.sort_values(by=["team", "date"])

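    # Identify numeric stat columns for rolling averages (excluding 'date' and 'team');
    # non-numeric columns are left untouched because they cannot be averaged
    stat_columns: list[str] = [
        col
        for col in df.select_dtypes(include="number").columns
        if col not in ["date", "team"]
    ]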

    # Compute rolling averages
    console.print("[bold green]Computing rolling averages...[/bold green]")

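    def compute_rolling_avg(group: pd.DataFrame) -> pd.DataFrame:
        # EWMA over one team's games, shifted so each row only sees earlier games
        rolling_avg: pd.DataFrame = (
            group[stat_columns].ewm(span=game_window, adjust=False).mean().shift(1)
        )
        # The first game has no history, so keep its own values instead of NaN
        rolling_avg.iloc[0] = group.iloc[0][stat_columns]
        return rolling_avg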

    df[stat_columns] = df.groupby("team", group_keys=False, observed=True)[
        stat_columns
    ].apply(compute_rolling_avg)

    # Limit decimal places to 2
    console.print("[bold green]Rounding values...[/bold green]")
    df[stat_columns] = df[stat_columns].round(2)

    # Save to CSV; to_csv overwrites an existing file, so a warning is enough
    if os.path.exists(output_file):
        console.print(
            f"[bold yellow]File {output_file} already exists. Overwriting...[/bold yellow]"
        )

    df.to_csv(output_file, index=False)
    console.print(f"[bold cyan]Rolling averages saved to {output_file}[/bold cyan]")

compute_rolling_averages(game_window, "./csv/gamelogs.csv", "./csv/averages.csv")