# **LineupLab**: NBA Matchup Prediction using Transformer Networks

## Project Overview
This project is part of the final requirement for the **Introduction to Deep Learning** course. The objective is to develop a machine learning model that predicts NBA matchup outcomes based on player lineups and team configurations. 

By leveraging the BallDontLie API, we will retrieve, clean, and process NBA data to create a dataset suitable for training and testing. A transformer-based deep learning model will be implemented using PyTorch to analyze player lineups and generate predictions.

## Goals
1. **Data Exploration**: Analyze and preprocess NBA data to ensure compatibility with the model.
2. **Model Creation**: Build a transformer network to learn relationships between players in a lineup and predict game outcomes.
3. **Hyperparameter Tuning**: Experiment with learning rate, optimizer, number of epochs, and other hyperparameters to optimize performance.
4. **Evaluation and Analysis**: Evaluate model performance using metrics such as accuracy, loss, and F1-score. Provide insights into the model's strengths, limitations, and potential improvements.

## Key Features
- **Transformer Networks**: Leveraging multi-head attention to capture player and team relationships.
- **Comprehensive Dataset**: Utilizing player stats, game results, and team information from the BallDontLie API.
- **Visualization and Analysis**: Incorporating visual representations of data distributions, training progress, and performance metrics.
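
For reference, the multi-head attention named above is usually written in the standard scaled dot-product form below. The symbols ($Q$, $K$, $V$, $d_k$, $h$, and the projection matrices $W$) are the usual textbook notation rather than names used elsewhere in this notebook, and treating each player in a lineup as one token is an assumption about how the model will be fed.

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
$$

$$
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\,W^{O},
\qquad
\mathrm{head}_i = \mathrm{Attention}(QW_i^{Q},\, KW_i^{K},\, VW_i^{V})
$$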

This notebook will serve as the main documentation for the project, including all steps from data retrieval to model evaluation.


In [None]:
# Importing necessary libraries
import requests  # For API requests
import pandas as pd  # For data manipulation and analysis
import numpy as np  # For numerical operations
import matplotlib.pyplot as plt  # For data visualization
import torch  # For deep learning model implementation
import torch.nn as nn  # For neural network components
import torch.optim as optim  # For optimization algorithms
from torch.utils.data import Dataset, DataLoader  # For data handling in PyTorch
from nba_api.stats.endpoints import leaguedashlineups
import json
import time
import os
from concurrent.futures import ThreadPoolExecutor
import threading

# Ensure plots are displayed inline
%matplotlib inline

# Display confirmation message
print("Libraries successfully loaded!")


Libraries successfully loaded!


## **I. Data Exploration and Preparation**

### Overview
In this section, we will collect, prepare, and process NBA data using the **BallDontLie API** to build a dataset for training a transformer-based deep learning model.

### Goals and Actions
1. **Data Collection**:
   - Retrieve detailed NBA game data (2003–2023) using game IDs.
   - Collect individual player statistics (e.g., minutes played, offensive and defensive ratings, usage percentages) for each game.
   - Extract team information (e.g., home and away team IDs, scores) and player metadata (e.g., names, positions).

2. **Data Cleaning**:
   - Normalize data formats, particularly for time strings (e.g., parsing minutes played).
   - Handle missing or incomplete data by assigning default or null values where necessary.
   - Rename and standardize column names (e.g., `id` → `game_id`) for consistency.

3. **Parallelized Processing**:
   - Implement a **threaded processing solution** to speed up API calls and data extraction.
   - Use incremental saving to ensure progress is retained during long data processing tasks.

4. **Dataset Preparation**:
   - Combine game-level and player-level statistics into a single dataset.
   - Structure the data for model input, including creating columns for the top 12 players (home and away) with their associated metrics.
   - Include player positions directly from the API to enhance the feature set for modeling.


In [None]:
# Fetching game data for 2003-2023, Home/Away Teams, Date, & Score

# Base URL and API Key
BASE_URL = "https://api.balldontlie.io/v1/games"
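# Read the key from the environment instead of hard-coding it in the notebook
# (the variable name BALLDONTLIE_API_KEY is an assumption; set it before running)
API_KEY = os.environ["BALLDONTLIE_API_KEY"]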

# Headers for the API request
HEADERS = {
    "Authorization": API_KEY
}

# Define the seasons to retrieve (2003 to 2023)
START_YEAR = 2003
END_YEAR = 2023

# List to store game data
all_games = []

# Function to fetch games for a specific season
def fetch_games_for_season(season):
    cursor = None  # Start without a cursor
    while True:
        print(f"Fetching season {season}, cursor: {cursor}")
        
        # Construct the API URL with cursor for pagination
        url = f"{BASE_URL}?seasons[]={season}&per_page=100"
        if cursor:
            url += f"&cursor={cursor}"
        
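        response = requests.get(url, headers=HEADERS, timeout=30)

        # Retry only on rate limiting (429) or server errors; fail fast on anything
        # else (e.g. 401/404) instead of looping forever
        if response.status_code == 429 or response.status_code >= 500:
            print(f"Error fetching data: {response.status_code}. Retrying in 60 seconds...")
            time.sleep(60)
            continue
        response.raise_for_status()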

        data = response.json()
        
        # Ensure the response contains new data
        if not data['data']:
            print(f"No more data found for season {season}, exiting loop.")
            break  # Exit loop if no more games are found

        # Add new games to the list
        all_games.extend(data['data'])
        print(f"Fetched {len(data['data'])} games. Total games collected: {len(all_games)}")
        
        # Update the cursor for the next page
        cursor = data.get('meta', {}).get('next_cursor', None)
        if not cursor:  # No more pages
            print(f"All pages fetched for season {season}.")
            break
        
        # Throttle requests to avoid hitting rate limits
        time.sleep(0.5)

# Fetch data for each season
for season in range(START_YEAR, END_YEAR + 1):
    fetch_games_for_season(season)

# Process the collected data into a DataFrame
print("Processing data into DataFrame...")
games_data = [
    {
        "id": game["id"],
        "date": game["date"],
        "season": game["season"],
        "status": game["status"],
        "home_team_score": game["home_team_score"],
        "visitor_team_score": game["visitor_team_score"],
        "home_team_name": game["home_team"]["full_name"],
        "home_team_id": game["home_team"]["id"],
        "visitor_team_name": game["visitor_team"]["full_name"],
        "visitor_team_id": game["visitor_team"]["id"]
    }
    for game in all_games
]

games_df = pd.DataFrame(games_data)

# Save the data to a CSV file
output_file = "games_2003_2023.csv"
games_df.to_csv(output_file, index=False)
print(f"Data saved to {output_file}.")


Fetching season 2003, cursor: None
Fetched 100 games. Total games collected: 100
Fetching season 2003, cursor: 16582
Fetched 100 games. Total games collected: 200
Fetching season 2003, cursor: 17359
Fetched 100 games. Total games collected: 300
Fetching season 2003, cursor: 12772
Fetched 100 games. Total games collected: 400
Fetching season 2003, cursor: 13238
Fetched 100 games. Total games collected: 500
Fetching season 2003, cursor: 16627
Fetched 100 games. Total games collected: 600
Fetching season 2003, cursor: 15539
Fetched 100 games. Total games collected: 700
Fetching season 2003, cursor: 16428
Fetched 100 games. Total games collected: 800
Fetching season 2003, cursor: 17812
Fetched 100 games. Total games collected: 900
Fetching season 2003, cursor: 16065
Fetched 100 games. Total games collected: 1000
Fetching season 2003, cursor: 13761
Fetched 100 games. Total games collected: 1100
Fetching season 2003, cursor: 13772
Fetched 100 games. Total games collected: 1200
Fetching seaso

#### **Success!** We have gathered the scores for every game from 2003 to 2023.

#### Next, we enrich this DataFrame with the 12 most-used players for each team, including their player IDs, positions, minutes, offensive ratings, defensive ratings, and usage percentages.

In [None]:
# Example of how we expand the data with information from stats and advanced_stats.
# The same steps are later run in a loop for every game we have gathered.
game_id = 15486  # Example game ID

# Base URLs and API Key
BASE_URL_STATS = "https://api.balldontlie.io/v1/stats"
BASE_URL_ADVANCED = "https://api.balldontlie.io/v1/stats/advanced"
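# Read the key from the environment instead of hard-coding it in the notebook
# (the variable name BALLDONTLIE_API_KEY is an assumption; set it before running)
API_KEY = os.environ["BALLDONTLIE_API_KEY"]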
HEADERS = {"Authorization": API_KEY}

# Function to fetch stats
def fetch_stats(url, game_id):
    response = requests.get(f"{url}?game_ids[]={game_id}&per_page=100", headers=HEADERS)
    if response.status_code == 200:
        return response.json()["data"]
    else:
        raise Exception(f"Error fetching stats: {response.status_code}, {response.text}")

# Refined parse_minutes function
def parse_minutes(value):
    try:
        if isinstance(value, str):
            if ":" in value:  # Time string in "MM:SS" format
                parts = value.split(":")
                minutes = int(parts[0])
                seconds = int(parts[1])
                return minutes + seconds / 60  # Convert seconds to fractional minutes
            elif value.isdigit():  # Whole number string like "38"
                return float(value)  # Convert directly to float
        return 0  # Default for invalid or missing values
    except Exception as e:
        print(f"Error parsing minutes value '{value}': {e}")
        return 0

# Fetch data
base_stats = fetch_stats(BASE_URL_STATS, game_id)
advanced_stats = fetch_stats(BASE_URL_ADVANCED, game_id)

# Convert to DataFrames
base_df = pd.DataFrame(base_stats)
adv_df = pd.DataFrame(advanced_stats)

# Parse minutes played
base_df["minutes_played"] = base_df["min"].apply(parse_minutes)

# Merge base and advanced stats on player ID
base_df["player_id"] = base_df["player"].apply(lambda x: x["id"])
adv_df["player_id"] = adv_df["player"].apply(lambda x: x["id"])
adv_df["position"] = adv_df["player"].apply(lambda x: x.get("position", None))  # Extract position
merged_df = pd.merge(
    base_df,
    adv_df[["player_id", "offensive_rating", "defensive_rating", "usage_percentage", "position"]],
    on="player_id",
    how="inner"
)

# Add team and player full name
merged_df["team_id"] = merged_df["team"].apply(lambda x: x["id"])
merged_df["team_name"] = merged_df["team"].apply(lambda x: x["full_name"])
merged_df["full_name"] = merged_df["player"].apply(lambda x: f"{x['first_name']} {x['last_name']}")

# Split into home and away teams and take top 12 players by minutes played
home_team_id = 14  # Los Angeles Lakers
away_team_id = 7   # Dallas Mavericks
home_players = merged_df[merged_df["team_id"] == home_team_id].nlargest(12, "minutes_played")
away_players = merged_df[merged_df["team_id"] == away_team_id].nlargest(12, "minutes_played")

# Print top 12 players for each team
print("\nTop 12 Home Players:")
print(home_players[["full_name", "team_name", "minutes_played", "position", "offensive_rating", "defensive_rating", "usage_percentage"]])

print("\nTop 12 Away Players:")
print(away_players[["full_name", "team_name", "minutes_played", "position", "offensive_rating", "defensive_rating", "usage_percentage"]])

# Combine results into a single row for testing
game_row = {
    "id": game_id,
    "date": "2003-10-28",
    "season": 2003,
    "status": "Final",
    "home_team_score": 109,
    "visitor_team_score": 93,
    "home_team_name": "Los Angeles Lakers",
    "home_team_id": home_team_id,
    "visitor_team_name": "Dallas Mavericks",
    "visitor_team_id": away_team_id,
}

# Map each output column suffix to its source column in the merged stats
fields = {
    "id": "player_id",
    "name": "full_name",
    "minutes": "minutes_played",
    "position": "position",
    "off_rating": "offensive_rating",
    "def_rating": "defensive_rating",
    "usage": "usage_percentage",
}

# Flatten the top 12 players of each side into columns; empty slots stay None
for i in range(1, 13):
    for side, players in [("home", home_players), ("away", away_players)]:
        has_player = i <= len(players)
        for suffix, column in fields.items():
            game_row[f"{side}_player_{i}_{suffix}"] = (
                players.iloc[i - 1][column] if has_player else None
            )

# Create final DataFrame
final_df = pd.DataFrame([game_row])

# Save the data to a CSV file
output_file = "miniexp_games_2003_2023_top12_with_positions.csv"
final_df.to_csv(output_file, index=False)
print(f"Data saved to {output_file}.")



Top 12 Home Players:
           full_name           team_name  minutes_played position  \
15      Derek Fisher  Los Angeles Lakers       37.300000            
16       Gary Payton  Los Angeles Lakers       36.000000            
12     Devean George  Los Angeles Lakers       34.583333            
14  Shaquille O'Neal  Los Angeles Lakers       32.000000            
13       Karl Malone  Los Angeles Lakers       29.000000            
17     Bryon Russell  Los Angeles Lakers       20.783333            
18      Horace Grant  Los Angeles Lakers       20.000000            
21       Kareem Rush  Los Angeles Lakers       14.000000            
20     Jannero Pargo  Los Angeles Lakers        9.000000            
19       Luke Walton  Los Angeles Lakers        7.000000            
22       Kobe Bryant  Los Angeles Lakers        0.000000            
23  Slava Medvedenko  Los Angeles Lakers        0.000000            

    offensive_rating  defensive_rating  usage_percentage  
15             113.0 

In [None]:
# Load the existing games DataFrame, and expand with our new categories ready to be filled in the next step
games_df = pd.read_csv("games_2003_2023.csv")

# Define the player-specific columns for home and away teams, including position
player_columns = []
for i in range(1, 13):
    player_columns.extend([
        f"home_player_{i}_id", f"home_player_{i}_name", f"home_player_{i}_position",
        f"home_player_{i}_minutes", f"home_player_{i}_off_rating",
        f"home_player_{i}_def_rating", f"home_player_{i}_usage",
    ])
for i in range(1, 13):
    player_columns.extend([
        f"away_player_{i}_id", f"away_player_{i}_name", f"away_player_{i}_position",
        f"away_player_{i}_minutes", f"away_player_{i}_off_rating",
        f"away_player_{i}_def_rating", f"away_player_{i}_usage",
    ])

# Add the player columns to games_df; they stay empty (NaN) until filled in the next step
games_df = games_df.reindex(columns=[*games_df.columns, *player_columns])

# Rename the `id` column to `game_id`
if 'id' in games_df.columns:
    games_df.rename(columns={'id': 'game_id'}, inplace=True)
    print("Column 'id' renamed to 'game_id'.")
else:
    print("'id' column not found. Ensure the dataset is correct.")

# Save the updated dataset
expanded_games_file_updated = "expanded_games_2003_2023.csv"
games_df.to_csv(expanded_games_file_updated, index=False)
print(f"Updated dataset saved as {expanded_games_file_updated}.")


Column 'id' renamed to 'game_id'.
Updated dataset saved as expanded_games_2003_2023.csv.


In [None]:
# Fill the expanded games DataFrame with statistics from BallDontLie, using threads for speed.
# This repeats the single-game example above for every game.
expanded_games_file = "expanded_games_2003_2023.csv"
if os.path.exists(expanded_games_file):
    games_df = pd.read_csv(expanded_games_file)
else:
    raise FileNotFoundError(f"{expanded_games_file} not found.")

# Identify unprocessed games
unprocessed_games = games_df[games_df["home_player_1_id"].isna()]["game_id"].tolist()
print(f"Found {len(unprocessed_games)} unprocessed games.")

# Set threading and batch processing parameters
lock = threading.Lock()
processed_games = 0
batch_size = 100

def process_game(game_id):
    try:
        # print(f"Processing game ID {game_id}...")

        # Fetch stats for the game
        base_stats = fetch_stats(BASE_URL_STATS, game_id)
        advanced_stats = fetch_stats(BASE_URL_ADVANCED, game_id)

        if not base_stats or not advanced_stats:
            print(f"No stats found for game ID {game_id}. Skipping...")
            return None, game_id

        # Convert to DataFrames
        base_df = pd.DataFrame(base_stats)
        adv_df = pd.DataFrame(advanced_stats)

        # Parse minutes played
        base_df["minutes_played"] = base_df["min"].apply(parse_minutes)

        # Merge base and advanced stats on player ID
        base_df["player_id"] = base_df["player"].apply(lambda x: x["id"])
        adv_df["player_id"] = adv_df["player"].apply(lambda x: x["id"])
        adv_df["position"] = adv_df["player"].apply(lambda x: x.get("position", None))  # Extract position
        merged_df = pd.merge(
            base_df,
            adv_df[["player_id", "offensive_rating", "defensive_rating", "usage_percentage", "position"]],
            on="player_id",
            how="inner"
        )

        # Add team and player full name
        merged_df["team_id"] = merged_df["team"].apply(lambda x: x["id"])
        merged_df["team_name"] = merged_df["team"].apply(lambda x: x["full_name"])
        merged_df["full_name"] = merged_df["player"].apply(lambda x: f"{x['first_name']} {x['last_name']}")

        # Extract home and away team IDs from the main DataFrame
        home_team_id = games_df.loc[games_df["game_id"] == game_id, "home_team_id"].values[0]
        away_team_id = games_df.loc[games_df["game_id"] == game_id, "visitor_team_id"].values[0]

        # Split into home and away players and select top 12 by minutes played
        home_players = merged_df[merged_df["team_id"] == home_team_id].nlargest(12, "minutes_played")
        away_players = merged_df[merged_df["team_id"] == away_team_id].nlargest(12, "minutes_played")

        # Create a dictionary to store processed player data
        player_data = {}
        for i in range(1, 13):
            for team, players in [("home", home_players), ("away", away_players)]:
                if i <= len(players):
                    player_data[f"{team}_player_{i}_id"] = players.iloc[i - 1]["player_id"]
                    player_data[f"{team}_player_{i}_name"] = players.iloc[i - 1]["full_name"]
                    player_data[f"{team}_player_{i}_minutes"] = players.iloc[i - 1]["minutes_played"]
                    player_data[f"{team}_player_{i}_position"] = players.iloc[i - 1]["position"]
                    player_data[f"{team}_player_{i}_off_rating"] = players.iloc[i - 1]["offensive_rating"]
                    player_data[f"{team}_player_{i}_def_rating"] = players.iloc[i - 1]["defensive_rating"]
                    player_data[f"{team}_player_{i}_usage"] = players.iloc[i - 1]["usage_percentage"]
                else:
                    player_data[f"{team}_player_{i}_id"] = None
                    player_data[f"{team}_player_{i}_name"] = None
                    player_data[f"{team}_player_{i}_minutes"] = None
                    player_data[f"{team}_player_{i}_position"] = None
                    player_data[f"{team}_player_{i}_off_rating"] = None
                    player_data[f"{team}_player_{i}_def_rating"] = None
                    player_data[f"{team}_player_{i}_usage"] = None

        # print(f"Finished processing game ID {game_id}.")
        return player_data, game_id

    except Exception as e:
        print(f"Error processing game {game_id}: {e}")
        return None, game_id

# Function to update the DataFrame with processed game data
def update_games_df(game_data, game_id):
    global games_df
    with lock:
        for key, value in game_data.items():
            games_df.loc[games_df["game_id"] == game_id, key] = value

# Function to save progress to a file
def save_progress():
    global games_df
    with lock:
        games_df.to_csv(expanded_games_file, index=False)
        print(f"Progress saved at {time.strftime('%Y-%m-%d %H:%M:%S')}")

# ThreadPoolExecutor for processing games
with ThreadPoolExecutor(max_workers=10) as executor:
    futures = executor.map(process_game, unprocessed_games)
    for game_data, game_id in futures:
        if game_data:
            update_games_df(game_data, game_id)
            processed_games += 1
            if processed_games % batch_size == 0:
                save_progress()

# Final save
save_progress()
print("Processing complete.")



Found 26743 unprocessed games.
Progress saved at 2024-11-26 17:30:29
Progress saved at 2024-11-26 17:31:01
Progress saved at 2024-11-26 17:31:32
No stats found for game ID 17959. Skipping...
Progress saved at 2024-11-26 17:32:11
Progress saved at 2024-11-26 17:32:36
Progress saved at 2024-11-26 17:33:06
Progress saved at 2024-11-26 17:33:35
Progress saved at 2024-11-26 17:33:59
Progress saved at 2024-11-26 17:34:24
Progress saved at 2024-11-26 17:34:48
Progress saved at 2024-11-26 17:35:12
Progress saved at 2024-11-26 17:35:38
Progress saved at 2024-11-26 17:36:05
Progress saved at 2024-11-26 17:36:29
Progress saved at 2024-11-26 17:36:53
Progress saved at 2024-11-26 17:37:19
Progress saved at 2024-11-26 17:37:42
Progress saved at 2024-11-26 17:38:03
Progress saved at 2024-11-26 17:38:23
Progress saved at 2024-11-26 17:38:44
Progress saved at 2024-11-26 17:39:04
Progress saved at 2024-11-26 17:39:34
Progress saved at 2024-11-26 17:39:54
Progress saved at 2024-11-26 17:40:15
Progress sa

## Summary of Accomplishments

- Successfully retrieved and processed **~27,000 NBA games** across two decades.
- Incorporated detailed player and game statistics into a comprehensive dataset for deep learning.
- Addressed challenges with missing data, API rate limits, and processing time through incremental saving and parallelization.
- Created a well-structured dataset with features like **player usage, minutes played, and team compositions**, enabling future modeling efforts.
- This framework is a good starting point: more detailed player attributes (e.g., height) can be added later with a few additional API calls.

This dataset now provides a solid foundation for training the transformer-based deep learning model in subsequent sections.
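
As a rough illustration of how one wide game row could be unpacked for modeling, the sketch below turns a row into a list of per-player records plus a binary label. The helper names (`row_to_example`, `_is_missing`, `PLAYER_FIELDS`) and the choice of "home team won" as the label are assumptions made for this example; only the column names come from the dataset built above.

```python
import math

# Column suffixes as created in the dataset above (e.g. home_player_1_minutes)
PLAYER_FIELDS = ("id", "position", "minutes", "off_rating", "def_rating", "usage")
MAX_PLAYERS = 12  # the dataset stores up to 12 players per side


def _is_missing(value):
    """True for None or NaN, which is how empty player slots come back from a CSV."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def row_to_example(row):
    """Turn one wide game row (dict-like) into per-player records and a label.

    Hypothetical helper: the label is 1 when the home team outscored the visitor.
    """
    players = []
    for side in ("home", "away"):
        for i in range(1, MAX_PLAYERS + 1):
            prefix = f"{side}_player_{i}_"
            if _is_missing(row.get(prefix + "id")):
                continue  # empty slot: fewer than 12 players recorded
            record = {field: row.get(prefix + field) for field in PLAYER_FIELDS}
            record["side"] = side
            players.append(record)
    home_win = int(row["home_team_score"] > row["visitor_team_score"])
    return players, home_win


# Usage with a sparse, made-up row (player IDs here are placeholders)
example_row = {
    "home_team_score": 109,
    "visitor_team_score": 93,
    "home_player_1_id": 101,
    "home_player_1_minutes": 37.3,
    "away_player_1_id": 202,
    "away_player_1_minutes": 35.0,
}
example_players, example_label = row_to_example(example_row)
```

Because the function only relies on `.get` and key lookup, it should also accept a pandas row (a `Series`) as well as a plain dictionary.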

## **II. Model Creation**

### Overview
This section focuses on building the transformer-based deep learning model to predict NBA game outcomes. The model will analyze player lineups and their relationships to generate predictions.

### Goals
1. **Model Architecture**:
   - Implement a transformer network using **PyTorch**.
   - Utilize multi-head attention mechanisms to analyze player and team relationships.
   - Consider residual connections to keep deeper models stable during training.
2. **Input and Output Design**:
   - Process tokenized player information as inputs.
   - Predict game outcomes (e.g., winners, scores) as outputs.
3. **Model Training**:
   - Define the training loop, including loss functions and optimizers.
   - Split data into training, validation, and test sets for evaluation.
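
One possible way to make the train/validation/test split is by season, so that games from a single season never appear in two splits. This is only a sketch: the function name `split_by_season` and the particular held-out seasons are assumptions, and a random split would be an equally valid reading of the goal above.

```python
def split_by_season(rows, val_seasons, test_seasons):
    """Partition game rows (dict-like, with a 'season' key) into three lists.

    Hypothetical helper: any season not listed for validation or test
    falls into the training set.
    """
    train, val, test = [], [], []
    for row in rows:
        season = row["season"]
        if season in test_seasons:
            test.append(row)
        elif season in val_seasons:
            val.append(row)
        else:
            train.append(row)
    return train, val, test


# Usage with placeholder rows: one fake game per season from 2003 to 2023
rows = [{"game_id": i, "season": season} for i, season in enumerate(range(2003, 2024))]
train_rows, val_rows, test_rows = split_by_season(
    rows, val_seasons={2021}, test_seasons={2022, 2023}
)
```

Holding out the most recent seasons mimics predicting future games from past ones, which is one reason to prefer this over shuffling all games together.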

### Implementation Steps
1. **Define the Transformer Architecture**:
   - Specify input dimensions, number of attention heads, and transformer layers.
2. **Configure the Training Pipeline**:
   - Choose a loss function and optimizer (e.g., CrossEntropyLoss, Adam).
   - Set hyperparameters like learning rate, number of epochs, and batch size.
3. **Initial Testing**:
   - Train the model on a subset of the data to ensure functionality.
   - Evaluate initial performance before moving to hyperparameter tuning.
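
The architecture step could look roughly like the sketch below, which uses PyTorch's built-in `nn.TransformerEncoder`. Everything specific here is an assumption for illustration: the class name `LineupTransformer`, the layer sizes, the idea of summing a player embedding, a home/away embedding, and a projection of four numeric stats, and the requirement that raw player IDs have first been remapped to contiguous indices with 0 reserved for empty slots.

```python
import torch
import torch.nn as nn


class LineupTransformer(nn.Module):
    """Sketch: encode up to 24 player tokens and classify home win vs. loss."""

    def __init__(self, num_players, num_stats=4, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        # Index 0 is reserved for empty roster slots (padding)
        self.player_emb = nn.Embedding(num_players + 1, d_model, padding_idx=0)
        self.side_emb = nn.Embedding(2, d_model)        # 0 = home, 1 = away
        self.stat_proj = nn.Linear(num_stats, d_model)  # e.g. minutes, ratings, usage
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model,
            dropout=0.1, batch_first=True,
        )
        # Disable the nested-tensor path so the output keeps its full sequence length
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=num_layers, enable_nested_tensor=False
        )
        self.head = nn.Linear(d_model, 2)  # two logits, for use with CrossEntropyLoss

    def forward(self, player_ids, sides, stats):
        pad_mask = player_ids.eq(0)  # True where the slot is empty
        x = self.player_emb(player_ids) + self.side_emb(sides) + self.stat_proj(stats)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        # Mean-pool over real players only, ignoring padded slots
        keep = (~pad_mask).unsqueeze(-1).float()
        pooled = (x * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        return self.head(pooled)


# Smoke test on random data: 8 games, 12 home + 12 away slots, 4 stats per player
torch.manual_seed(0)
model = LineupTransformer(num_players=5000)
player_ids = torch.randint(1, 5001, (8, 24))
player_ids[:, 20:] = 0  # pretend the last four away slots are empty
sides = torch.cat([torch.zeros(8, 12), torch.ones(8, 12)], dim=1).long()
stats = torch.randn(8, 24, 4)
logits = model(player_ids, sides, stats)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
```

Running a random batch through the model like this is a cheap way to confirm tensor shapes before training on the real subset.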

This section will document the step-by-step process of creating and implementing the model, including explanations for each architectural choice.


## **III. Hyperparameter Tuning**

### Overview
This section explores the impact of various hyperparameters on the model's performance. By systematically adjusting key parameters, we aim to optimize the transformer network for better predictions.

### Goals
1. **Experimentation**:
   - Test different values for hyperparameters such as:
     - Learning rate.
     - Number of epochs.
     - Optimizer (e.g., Adam, SGD).
     - Batch size.
     - Number of transformer layers and attention heads.
2. **Performance Evaluation**:
   - Assess the impact of each hyperparameter on model accuracy, loss, and F1-score.
   - Document observations to identify the most effective configurations.

### Implementation Steps
1. **Baseline Configuration**:
   - Train the model with default or commonly used hyperparameter values.
   - Record baseline performance metrics.
2. **Iterative Testing**:
   - Adjust one hyperparameter at a time while keeping others constant.
   - Monitor changes in performance and identify trends.
3. **Optimal Configuration**:
   - Combine the best-performing hyperparameters into a final configuration for training the model.
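
The one-at-a-time procedure in the steps above can be sketched as a small generator that produces the configurations to train. The function name, the baseline values, and the search space are placeholders chosen for this example, not settings taken from the project.

```python
def one_at_a_time_configs(baseline, search_space):
    """Yield (name, value, config), varying a single hyperparameter from the baseline."""
    for name, values in search_space.items():
        for value in values:
            if value == baseline[name]:
                continue  # the baseline itself is trained and recorded separately
            config = dict(baseline)  # copy so the baseline is never mutated
            config[name] = value
            yield name, value, config


# Placeholder baseline and search space (illustrative values only)
baseline = {
    "lr": 1e-3, "epochs": 20, "optimizer": "adam",
    "batch_size": 64, "num_layers": 2, "nhead": 4,
}
search_space = {
    "lr": [1e-4, 1e-3, 1e-2],
    "optimizer": ["adam", "sgd"],
    "batch_size": [32, 64, 128],
}
configs = list(one_at_a_time_configs(baseline, search_space))
```

Each yielded configuration differs from the baseline in exactly one entry, so any change in the recorded metrics can be attributed to that hyperparameter.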

This section will detail the experiments conducted and the resulting insights into hyperparameter optimization for the transformer network.


## **IV. Evaluation and Analysis**

### Overview
In this section, we evaluate the performance of the transformer-based model using two primary loss metrics and other relevant performance indicators. The focus will be on understanding the model's strengths, limitations, and areas for improvement.

### Metrics
1. **Score Prediction Accuracy**:
   - Measure the error between predicted and true game scores (e.g., Mean Squared Error or Mean Absolute Error).
   - Assess how well the model captures the scoring trends in games.
2. **Winning Outcome Prediction**:
   - Evaluate the model's ability to correctly predict the winning team (e.g., Accuracy, F1-score).
   - Analyze classification performance using confusion matrices.
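
As a minimal sketch of the two metric families above, the functions below compute mean absolute error for scores and accuracy, F1, and confusion-matrix counts for the winner prediction. The function names are hypothetical, and treating "home team wins" as the positive class (label 1) is an assumption.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), with 1 (home win) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-score for binary winner predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # define F1 as 0 when there are no positives
    return accuracy, f1


# Usage with made-up values
score_mae = mean_absolute_error([109, 93], [105, 97])
accuracy, f1 = accuracy_and_f1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
```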

### Goals
1. **Performance Metrics**:
   - Quantify how accurately the model predicts game outcomes and scores.
   - Identify patterns or biases in the model’s predictions.
2. **Visual Representations**:
   - Plot training and validation loss over epochs.
   - Generate confusion matrices for winning outcome predictions.
   - Visualize score prediction distributions.
3. **Strengths and Limitations**:
   - Discuss areas where the model performs well and where it struggles.
   - Identify real-world scenarios where the model could be applied effectively.
4. **Future Improvements**:
   - Suggest ways to enhance the model, such as adjusting hyperparameters, adding new features, or increasing dataset size.

This section will summarize the model's overall performance, supported by quantitative metrics and visualizations.
