# Tennis Ball Trajectory Prediction with LSTM
## Data Preparation and Model Training - Paper Implementation

This notebook turns the preprocessed tennis data into LSTM-ready sequences and runs the full pipeline specified in the research paper: hyperparameter search, training, evaluation, and error analysis.

### Paper Implementation (Sections 7.8-7.14):
- **LSTM Architecture**: Two layers (128→64 units) with TimeDistributed output
- **Input/Output Specification**: 12 frames × 69 features (input) → 5 frames × 2 coordinates (output)
- **Preprocessing**: Savitzky-Golay filtering + Z-score normalization
- **Optimization**: Bayesian hyperparameter search (30 iterations)
- **Evaluation**: RMSE metrics with comprehensive error analysis
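
The Savitzky-Golay step listed above can be illustrated independently of the helper module. The sketch below derives the filter weights from a local least-squares polynomial fit and applies them to a 1-D coordinate series. The function names, the window length of 5, the polynomial order of 2, and the edge padding are assumptions chosen for illustration; the paper's actual filter settings are not stated in this notebook. In practice `scipy.signal.savgol_filter` provides the same kind of filter.

```python
import numpy as np

def savgol_coeffs(window: int, polyorder: int) -> np.ndarray:
    """Smoothing weights for a centered window (hypothetical helper)."""
    if window % 2 == 0 or polyorder >= window:
        raise ValueError("window must be odd and larger than polyorder")
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Design matrix with columns 1, t, t^2, ... for the window offsets t.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted polynomial's value at t = 0.
    return np.linalg.pinv(design)[0]

def savgol_smooth(series, window: int = 5, polyorder: int = 2) -> np.ndarray:
    """Smooth a 1-D series; edges are handled by repeating the end values."""
    coeffs = savgol_coeffs(window, polyorder)
    half = window // 2
    padded = np.pad(np.asarray(series, dtype=float), half, mode="edge")
    # Reverse the weights so the convolution acts as a sliding dot product.
    return np.convolve(padded, coeffs[::-1], mode="valid")
```

Away from the edges, any polynomial of degree up to `polyorder` passes through this filter unchanged, which is why it removes jitter without flattening the curvature of a trajectory.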

### Target Performance (Paper Results):
- **RMSE X-axis**: 1.19 meters (9.8% improvement over GBRT)
- **RMSE Y-axis**: 1.04 meters (5.5% improvement over GBRT)
- **Error Growth**: 26% increase from first to last prediction frame
- **Serve Impact**: 18% higher error during serve/impact phases

In [None]:
import sys
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
# Suppress all warnings to keep cell output readable; disable this line when debugging.
warnings.filterwarnings('ignore')

from tennis_lstm import TennisLSTMAligned, TennisLSTMModel, TennisLSTMDataset
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

import data_analysis_help.lstm_training_helper as lstm_helper

plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 8)

print("Libraries imported successfully!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print("Ready for LSTM tennis ball trajectory prediction")

## 1. Model Initialization and Dataset Check

In [None]:
lstm_model = lstm_helper.initialize_lstm_model()

dataset_file = 'complete_tennis_comprehensive_preprocessed_ml4qs.csv'
dataset_available = lstm_helper.check_dataset_availability(dataset_file)

## 2. Data Loading and Preprocessing

In [None]:
df_raw = lstm_helper.load_and_prepare_lstm_data(lstm_model, dataset_file)

## 3. Feature Engineering (69 Features per Frame)

In [None]:
df_features, feature_cols = lstm_helper.create_lstm_features(lstm_model, df_raw)

## 4. Sequence Preparation for LSTM

In [None]:
X, y = lstm_helper.prepare_lstm_sequences(lstm_model, df_features, feature_cols)
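
As a rough illustration of the windowing this step performs, the sketch below slices a per-frame feature matrix into overlapping 12-frame inputs and the 5 frames of ball coordinates that follow each one. `make_sequences` is a hypothetical stand-in, not the helper's implementation, and it assumes a stride of one frame and a single continuous recording (no rally or point boundaries to respect).

```python
import numpy as np

def make_sequences(features, targets, n_in: int = 12, n_out: int = 5):
    """Build (X, y) sliding windows from per-frame arrays.

    features: shape (T, n_features), one row per frame
    targets:  shape (T, 2), ball x/y per frame
    Returns X of shape (N, n_in, n_features) and y of shape (N, n_out, 2).
    """
    features = np.asarray(features)
    targets = np.asarray(targets)
    n_windows = len(features) - n_in - n_out + 1
    if n_windows <= 0:
        raise ValueError("not enough frames for a single window")
    # Input window i covers frames [i, i + n_in).
    X = np.stack([features[i:i + n_in] for i in range(n_windows)])
    # Target window i covers the n_out frames right after the input window.
    y = np.stack([targets[i + n_in:i + n_in + n_out] for i in range(n_windows)])
    return X, y
```

With 30 frames and the 12/5 split, this yields 30 - 12 - 5 + 1 = 14 windows.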

## 5. Temporal Split and Normalization

In [None]:
X_train_norm, X_test_norm, y_train_norm, y_test_norm = (
    lstm_helper.apply_temporal_split_and_normalization(lstm_model, X, y)
)
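
The idea behind a temporal split with Z-score normalization can be sketched as follows: keep the earliest windows for training, hold out the latest for testing, and compute the mean and standard deviation on the training portion only so that no test statistics leak into training. The function name, the 80/20 split fraction, and the returned `stats` dictionary are assumptions for illustration; the helper's actual behavior may differ.

```python
import numpy as np

def temporal_split_and_normalize(X, y, train_frac: float = 0.8, eps: float = 1e-8):
    """Chronological split followed by Z-score scaling with train-only statistics."""
    n_train = int(len(X) * train_frac)
    X_train, X_test = X[:n_train], X[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]

    # One mean/std per feature, pooled over samples and time steps.
    x_mean, x_std = X_train.mean(axis=(0, 1)), X_train.std(axis=(0, 1))
    y_mean, y_std = y_train.mean(axis=(0, 1)), y_train.std(axis=(0, 1))
    # Guard against constant features, which would otherwise divide by zero.
    x_std = np.where(x_std < eps, 1.0, x_std)
    y_std = np.where(y_std < eps, 1.0, y_std)

    stats = {"x_mean": x_mean, "x_std": x_std, "y_mean": y_mean, "y_std": y_std}
    return (
        (X_train - x_mean) / x_std,
        (X_test - x_mean) / x_std,
        (y_train - y_mean) / y_std,
        (y_test - y_mean) / y_std,
        stats,
    )
```

Keeping `y_mean` and `y_std` around matters later: predictions must be mapped back with `pred * y_std + y_mean` before errors can be reported in meters.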

## 6. Model Architecture Visualization

In [None]:
demo_model, total_params, trainable_params = (
    lstm_helper.create_and_visualize_model_architecture(lstm_model)
)
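
As a sanity check on the reported parameter totals, the count for a stacked LSTM can be worked out by hand. The sketch below assumes the parameterization used by `torch.nn.LSTM` (four gates, each with input weights, recurrent weights, and two bias vectors) and a single linear head applied to every time step, which is the role `TimeDistributed` plays in Keras. The function names are hypothetical, and the real `TennisLSTMModel` may contain additional layers, so its totals may differ.

```python
def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Parameters of one LSTM layer with two bias vectors per gate."""
    per_gate = input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    return 4 * per_gate

def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

def sketch_total_params(n_features: int = 69, hidden=(128, 64), n_outputs: int = 2) -> int:
    """Total for stacked LSTM layers followed by a per-step linear head."""
    total = 0
    in_size = n_features
    for h in hidden:
        total += lstm_layer_params(in_size, h)
        in_size = h  # the next layer consumes this layer's hidden state
    return total + linear_params(in_size, n_outputs)

print(sketch_total_params())  # 101888 + 49664 + 130 = 151682
```

The sequence lengths (12 in, 5 out) do not appear in the count because LSTM and per-step linear weights are shared across time steps.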

## 7. Bayesian Hyperparameter Optimization

In [None]:
run_optimization = True  # Set to False to skip for quick testing

best_params, opt_results, opt_fig = lstm_helper.run_bayesian_optimization(
    lstm_model, X_train_norm, X_test_norm, y_train_norm, y_test_norm, run_optimization
)

if opt_fig is not None:
    plt.show()

## 8. Model Training

In [None]:
final_model, train_losses, val_losses, train_loader, test_loader = lstm_helper.train_final_model(
    lstm_model, best_params, X_train_norm, X_test_norm, y_train_norm, y_test_norm
)

training_fig = lstm_helper.create_training_progress_plots(train_losses, val_losses)
plt.show()

## 9. Model Evaluation and Paper Comparison

In [None]:
results, improvement_x, improvement_y = lstm_helper.evaluate_model_performance(
    lstm_model, final_model, test_loader
)

performance_fig = lstm_helper.create_performance_comparison_plots(
    results, improvement_x, improvement_y
)
plt.show()
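
For reference, per-axis RMSE and the percentage improvement over a baseline can be computed as in the sketch below. Both function names are hypothetical, and the improvement formula (relative reduction with respect to the baseline RMSE) is an assumption about how the paper defines it. To report errors in meters, the inputs should be de-normalized coordinates rather than Z-scored values.

```python
import numpy as np

def rmse_per_axis(y_true, y_pred) -> np.ndarray:
    """RMSE for each coordinate, pooled over samples and prediction frames.

    Both arrays have shape (N, n_out, 2); the result has shape (2,) for x and y.
    """
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return np.sqrt(np.mean(err ** 2, axis=(0, 1)))

def relative_improvement(baseline_rmse: float, model_rmse: float) -> float:
    """Percentage by which model_rmse undercuts baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse
```

For example, a model RMSE of 1.5 against a baseline of 2.0 is a 25% improvement under this definition.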

## 10. Comprehensive Error Analysis (Paper Section 7.14)

In [None]:
error_analysis = lstm_helper.perform_comprehensive_error_analysis(lstm_model, results)

error_fig = lstm_helper.create_error_analysis_plots(error_analysis)
plt.show()
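
One part of such an analysis, the growth of error across the prediction horizon, can be sketched as below. The function names are hypothetical, and using the Euclidean distance over both coordinates per frame is an assumption; the paper may instead track each axis separately.

```python
import numpy as np

def rmse_per_step(y_true, y_pred) -> np.ndarray:
    """RMSE of the Euclidean position error for each predicted frame.

    Inputs have shape (N, n_out, 2); the result has shape (n_out,).
    """
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    sq_dist = np.sum(err ** 2, axis=-1)  # squared distance per sample and frame
    return np.sqrt(np.mean(sq_dist, axis=0))

def error_growth_percent(step_rmse) -> float:
    """Relative increase from the first to the last predicted frame."""
    return 100.0 * (step_rmse[-1] - step_rmse[0]) / step_rmse[0]
```

Applying `error_growth_percent(rmse_per_step(y_true, y_pred))` to de-normalized test predictions gives a figure that can be set against the paper's reported first-to-last-frame increase.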

## 11. Spatial Error Heatmap

In [None]:
spatial_errors = lstm_helper.create_spatial_error_heatmap(lstm_model, results)
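
A spatial error map of this kind boils down to binning errors by position and averaging within each cell. The sketch below does that with `np.histogram2d`; the function name, the default bin counts, and the choice to bin by true ball position are assumptions for illustration, not the helper's actual implementation.

```python
import numpy as np

def spatial_error_grid(positions, errors, bins=(4, 8), extent=None):
    """Mean error per spatial cell.

    positions: shape (N, 2), true x/y for each prediction
    errors:    shape (N,), error magnitude for each prediction
    extent:    optional ((xmin, xmax), (ymin, ymax)) bounds for the grid
    Returns (mean_error, xedges, yedges); empty cells are NaN.
    """
    positions = np.asarray(positions, dtype=float)
    errors = np.asarray(errors, dtype=float)
    x, y = positions[:, 0], positions[:, 1]
    # Sum of errors per cell, then the number of samples per cell.
    err_sum, xedges, yedges = np.histogram2d(x, y, bins=bins, range=extent, weights=errors)
    counts, _, _ = np.histogram2d(x, y, bins=bins, range=extent)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_error = err_sum / counts  # 0/0 becomes NaN for empty cells
    return mean_error, xedges, yedges
```

The resulting grid can be passed to `plt.imshow` or `sns.heatmap`; leaving empty cells as NaN keeps them visually distinct from cells with genuinely low error.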

## 12. Model Deployment and Performance Summary

In [None]:
export_info = lstm_helper.save_model_and_create_summary(
    final_model,
    results,
    error_analysis,
    best_params,
    train_losses,
    trainable_params,
    improvement_x,
    improvement_y,
)