# Final Report: NBA Timeout Effectiveness Analysis

## Research Overview

### Abstract
This research investigates the strategic effectiveness of timeouts in NBA games. Using exploratory visualization and a Random Forest classifier, we analyze how timeouts affect game momentum and team performance, and which game-context features predict an effective timeout.

## 1. Introduction

### Research Objectives
1. Quantify the impact of timeouts on game momentum
2. Identify key factors influencing timeout effectiveness
3. Develop a predictive model for timeout success

In [None]:
# Import essential libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Statistical and machine learning libraries
from scipy import stats
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    classification_report, 
    confusion_matrix, 
    roc_curve, 
    roc_auc_score, 
    precision_recall_curve
)

# Visualization setup
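# The 'seaborn' style name was deprecated in matplotlib 3.6 and later removed;
# sns.set_theme() applies the seaborn defaults instead.
sns.set_theme()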
plt.rcParams['figure.figsize'] = (16, 10)
plt.rcParams['font.size'] = 12

## 2. Data Preparation and Exploratory Analysis

In [None]:
# Load and preprocess data
def load_and_preprocess_data(filepath='dsa project/ml_ready_timeout_data.csv'):
    df = pd.read_csv(filepath)
    df['effective'] = df['effective'].astype(int)
    print("Dataset Overview:")
    print(f"Total Observations: {len(df)}")
    print("Timeout Effectiveness Distribution:")
    print(df['effective'].value_counts(normalize=True))
    return df

# Load dataset
df = load_and_preprocess_data()
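
Before plotting or modeling, it may help to confirm that the expected columns exist and to see where values are missing, since `astype(int)` and the scaler both fail on missing data. The following is a minimal sketch, not part of the original pipeline: the helper `check_timeout_data` and the toy frame are hypothetical and only illustrate the check.

```python
import pandas as pd

def check_timeout_data(df: pd.DataFrame, required=("effective",)) -> pd.Series:
    """Raise if a required column is absent; return missing-value counts per column."""
    missing_cols = [c for c in required if c not in df.columns]
    if missing_cols:
        raise KeyError(f"Missing required columns: {missing_cols}")
    na_counts = df.isna().sum()
    # Keep only the columns that actually contain missing values
    return na_counts[na_counts > 0]

# Hypothetical toy data, only to show the call; not values from the study.
toy = pd.DataFrame({"effective": [1, 0, 1], "score_diff": [2.0, None, -5.0]})
na_report = check_timeout_data(toy)
print(na_report)
```

On the real dataset the call would be `check_timeout_data(df)`, optionally with the feature columns used later passed as `required`.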

## 3. Visualization Analysis

### 3.1 Comprehensive Data Visualization

In [None]:
def generate_comprehensive_visualizations(df):
    os.makedirs('dsa project/outputs/figures', exist_ok=True)
    # Image 1: Timeout Effectiveness Distribution
    plt.figure(figsize=(10, 6))
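    # Sort by label so bar 0 is 'Ineffective' and bar 1 is 'Effective';
    # value_counts() alone orders by frequency and could swap the tick labels.
    df['effective'].value_counts().sort_index().plot(kind='bar', color=['coral', 'teal'])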
    plt.title('Timeout Effectiveness Distribution')
    plt.xlabel('Timeout Effectiveness')
    plt.ylabel('Count')
    plt.xticks([0, 1], ['Ineffective', 'Effective'], rotation=0)
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img1.png')
    plt.close()

    # Image 2: Key Features Boxplot
    plt.figure(figsize=(12, 6))
    features_to_plot = ['pre_timeout_oe', 'timeout_pressure_index']
    df_melted = df.melt(id_vars='effective', value_vars=features_to_plot, 
                        var_name='Feature', value_name='Value')
    sns.boxplot(x='Feature', y='Value', hue='effective', data=df_melted)
    plt.title('Key Features by Timeout Effectiveness')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img2.png')
    plt.close()

    # Image 3: Correlation Heatmap
    plt.figure(figsize=(16, 12))
    # numeric_only=True avoids an error in pandas 2.x if any column is non-numeric
    correlation_matrix = df.corr(numeric_only=True)
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', center=0, 
                linewidths=0.5, fmt='.2f', square=True)
    plt.title('Feature Correlation Heatmap')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img3.png')
    plt.close()

    # Image 4: Score Difference Distribution
    plt.figure(figsize=(10, 6))
    sns.violinplot(x='effective', y='score_diff', data=df)
    plt.title('Score Difference Distribution')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img4.png')
    plt.close()

    # Image 5: Pre-Timeout Metrics Scatter
    plt.figure(figsize=(12, 8))
    scatter = plt.scatter(df['pre_timeout_fg_pct'], df['pre_timeout_ts'], 
                          c=df['effective'], cmap='viridis', alpha=0.7)
    plt.title('Pre-Timeout Field Goal vs True Shooting')
    plt.xlabel('Pre-Timeout Field Goal %')
    plt.ylabel('Pre-Timeout True Shooting %')
    plt.colorbar(scatter, label='Effectiveness')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img5.png')
    plt.close()

    # Image 6: Timeout Pressure Index Distribution
    plt.figure(figsize=(12, 8))
    sns.kdeplot(data=df, x='timeout_pressure_index', hue='effective', 
                fill=True, common_norm=False)
    plt.title('Timeout Pressure Index Distribution')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img6.png')
    plt.close()

# Generate visualizations
generate_comprehensive_visualizations(df)

## 4. Machine Learning Model

### 4.1 Predictive Modeling Approach

**Objective:** Develop a predictive model to determine timeout effectiveness based on multiple game-context features.

**Model Details:**
- **Algorithm:** Random Forest Classifier
- **Purpose:** Predict whether a timeout will be effective in disrupting opponent momentum
- **Input Features:** Multiple game-context variables
- **Target Variable:** Timeout Effectiveness (Binary: 0-Ineffective, 1-Effective)

In [None]:
def train_timeout_effectiveness_model(df):
    # Prepare features and target
    features = [col for col in df.columns if col not in ['effective', 'efficiency_change']]
    X = df[features]
    y = df['effective']
    # Split data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y
    )
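    # Scale features, fitting on the training split only to avoid leakage
    # (tree ensembles do not require scaling; it is harmless here)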
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)
    # Train Random Forest
    rf_classifier = RandomForestClassifier(
        n_estimators=300, max_depth=10, random_state=42
    )
    rf_classifier.fit(X_train_scaled, y_train)
    # Predictions
    y_pred = rf_classifier.predict(X_test_scaled)
    y_proba = rf_classifier.predict_proba(X_test_scaled)[:, 1]
    # Model Evaluation Visualizations
    # Image 7: ROC Curve
    plt.figure(figsize=(10, 8))
    fpr, tpr, _ = roc_curve(y_test, y_proba)
    roc_auc = roc_auc_score(y_test, y_proba)
    plt.plot(fpr, tpr, color='darkorange', lw=2,
             label=f'ROC Curve (AUC = {roc_auc:.2f})')
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic (ROC) Curve')
    plt.legend(loc="lower right")
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img7.png')
    plt.close()
    # Image 8: Feature Importance
    feature_importance = pd.DataFrame({
        'feature': features,
        'importance': rf_classifier.feature_importances_
    }).sort_values('importance', ascending=False)
    plt.figure(figsize=(12, 8))
    plt.barh(feature_importance['feature'][:10], 
             feature_importance['importance'][:10],
             color='steelblue')
    plt.gca().invert_yaxis()  # show the most important feature at the top
    plt.title('Top 10 Features Predicting Timeout Effectiveness')
    plt.xlabel('Feature Importance')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img8.png')
    plt.close()
    # Image 9: Precision-Recall Curve
    plt.figure(figsize=(10, 8))
    precision, recall, _ = precision_recall_curve(y_test, y_proba)
    plt.plot(recall, precision, color='green', lw=2)
    plt.title('Precision-Recall Curve')
    plt.xlabel('Recall')
    plt.ylabel('Precision')
    plt.tight_layout()
    plt.savefig('dsa project/outputs/figures/img9.png')
    plt.close()
    # Print model performance
    print("Model Performance Metrics:")
    print(classification_report(y_test, y_pred))
    return rf_classifier, feature_importance

# Train the model
model, feature_importance = train_timeout_effectiveness_model(df)

## 5. Key Findings and Insights

### 5.1 Timeout Effectiveness Analysis
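- Total timeouts analyzed: reported as `Total Observations` by `load_and_preprocess_data()` in Section 2
- Effective timeout rate: the share of rows with `effective = 1`, printed by the same function and plotted in Figure 1 (`img1.png`)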
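
The two headline numbers can be computed directly from the label column. This is a minimal sketch; the helper name `summarize_effectiveness` and the toy data are hypothetical and do not represent results from the study.

```python
import pandas as pd

def summarize_effectiveness(df: pd.DataFrame, target: str = "effective") -> dict:
    """Return the number of timeouts and the share labeled effective."""
    labels = df[target].astype(int)
    return {
        "total_timeouts": int(len(labels)),
        "effective_rate": float(labels.mean()),
    }

# Hypothetical toy data, only to show the call; not values from the study.
toy = pd.DataFrame({"effective": [1, 0, 0, 1, 0]})
summary = summarize_effectiveness(toy)
print(summary)
```

Calling `summarize_effectiveness(df)` on the loaded dataset would give the figures to report here.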

### 5.2 Critical Predictive Factors
Top features influencing timeout effectiveness, ranked by Random Forest feature importance (Figure 8, `img8.png`):
1. Pre-timeout offensive efficiency
2. Game period progress
3. Timeout pressure index
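
Impurity-based importances from a Random Forest are computed on the training data and can favor features with many distinct values, so a cross-check on held-out data may be worthwhile. The sketch below uses permutation importance from scikit-learn, which is a different technique from the one used in Section 4. The data are synthetic (an assumption for illustration): only the hypothetical `signal` column drives the label.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the label depends on `signal` only.
rng = np.random.default_rng(0)
n = 400
X_demo = pd.DataFrame({
    "signal": rng.normal(size=n),
    "noise": rng.normal(size=n),
})
y_demo = (X_demo["signal"] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)
rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)

# Shuffle each feature on the held-out split and measure the drop in accuracy
result = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=42)
perm_importance = pd.Series(
    result.importances_mean, index=X_demo.columns
).sort_values(ascending=False)
print(perm_importance)
```

If the permutation ranking on the real test split agrees with the ranking above, that would add some confidence in the listed factors; a disagreement would suggest treating the list with caution.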

### 5.3 Model Performance
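- Predictive accuracy: reported in the classification report printed by `train_timeout_effectiveness_model()`, evaluated on the 30% stratified hold-out set
- ROC AUC score: shown in the legend of the ROC curve (Figure 7, `img7.png`); it measures how well the model ranks effective timeouts above ineffective ones, with 0.5 corresponding to chance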
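
Both metrics can be collected into a single summary from the held-out labels and predicted probabilities. This is a minimal sketch: `summarize_performance`, the 0.5 threshold, and the four toy predictions are assumptions for illustration, not outputs of the trained model.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

def summarize_performance(y_true, y_proba, threshold=0.5):
    """Compute accuracy at a probability threshold and threshold-free ROC AUC."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    y_pred = (y_proba >= threshold).astype(int)
    return {
        "accuracy": float(accuracy_score(y_true, y_pred)),
        "roc_auc": float(roc_auc_score(y_true, y_proba)),
    }

# Hypothetical toy labels and probabilities, only to show the call.
y_true_demo = [0, 0, 1, 1]
y_proba_demo = [0.1, 0.6, 0.4, 0.9]
metrics = summarize_performance(y_true_demo, y_proba_demo)
print(metrics)
```

In this toy case two of four thresholded predictions are correct (accuracy 0.5), while three of the four positive-negative pairs are ranked correctly (ROC AUC 0.75), which illustrates why the two numbers can differ.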

## 6. Conclusions and Recommendations

### 6.1 Strategic Implications
- Timeouts are strategic tools beyond mere game interruptions
- Specific game contexts significantly influence timeout effectiveness
- Data-driven approach reveals nuanced timeout strategies

### 6.2 Practical Recommendations
1. **Timing and Context**
   - Prioritize timeouts during critical momentum shifts
   - Consider pre-timeout offensive efficiency
   - Analyze game period and pressure index

2. **Decision-Making Insights**
   - Use predictive model as decision support tool
   - Understand key factors influencing timeout success
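
Used as decision support, a fitted classifier can return an estimated probability that a timeout in a given situation will be effective. The following is a rough sketch under stated assumptions: the model is trained on synthetic data, the helper `timeout_success_probability` is hypothetical, and the situation values are made up. With the model from Section 4, the input row would also need every training feature and the same fitted `StandardScaler`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Column names taken from the report; the values are synthetic (assumption).
feature_names = ["pre_timeout_oe", "timeout_pressure_index", "score_diff"]
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(rng.normal(size=(200, 3)), columns=feature_names)
y_demo = (
    X_demo["timeout_pressure_index"] + rng.normal(scale=0.5, size=200) > 0
).astype(int)
demo_model = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_demo, y_demo)

def timeout_success_probability(model, situation: dict, feature_names: list) -> float:
    """Return the predicted probability of class 1 (effective) for one situation."""
    row = pd.DataFrame([situation], columns=feature_names)
    positive_idx = list(model.classes_).index(1)
    return float(model.predict_proba(row)[0, positive_idx])

# Hypothetical game situation, only to show the call.
situation = {"pre_timeout_oe": 0.2, "timeout_pressure_index": 1.5, "score_diff": -0.3}
p = timeout_success_probability(demo_model, situation, feature_names)
print(f"Estimated probability of an effective timeout: {p:.2f}")
```

Reporting a probability rather than a hard 0/1 prediction leaves the final judgment with the coaching staff, which fits the decision-support role described above.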

### 6.3 Research Limitations
- Dataset represents specific NBA seasons
- Potential unobserved game complexity
- Model based on available features

### 6.4 Future Research Directions
1. Expand dataset across multiple seasons
2. Incorporate more granular player-level data
3. Develop real-time timeout effectiveness prediction tool
4. Explore additional contextual features

## 7. Visual Summary - All Figures

In [None]:
# Visual summary: Display all saved PNG figures in the notebook
from IPython.display import Image, display
for i in range(1, 10):
    fname = f'dsa project/outputs/figures/img{i}.png'
    print(f'Figure {i}: {fname}')
    try:
        display(Image(filename=fname))
    except Exception as e:
        print(f'Could not display {fname}:', e)