Hackathon project for analyzing motorsport telemetry data from the Toyota Gazoo Racing GR Cup series
This project analyzes race telemetry data from 7 tracks (Barber, COTA, Indianapolis, Road America, Sebring, Sonoma, VIR) with a focus on tire degradation modeling. The dataset includes high-frequency telemetry, lap timing, and race results stored in PostgreSQL with ML-ready preprocessing pipelines.
Status: ✅ Database loaded (3,257 laps) | ✅ ML Model trained (R² = 0.631) | Interactive Dashboard
- Python 3.9+
- PostgreSQL 14+
- 100+ GB disk space
# 1. Clone and navigate to project
cd hack_the_track
# 2. Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Connect to database
psql -h localhost -U postgres -d gr_cup_racing

from src.data_preprocessing import TireDegradationPreprocessor

# Configure database connection
db_config = {
    'host': 'localhost',
    'database': 'gr_cup_racing',
    'user': 'postgres',
    'password': ''
}

# Initialize preprocessor
preprocessor = TireDegradationPreprocessor(db_config)

# Get normalized training data (one line!)
X, y = preprocessor.prepare_training_data(
    normalization_method='standard',  # Z-score normalization
    outlier_threshold=3.0
)

# Train your model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor()
model.fit(X, y)

NEW! Interactive Streamlit dashboard for visualizing tire degradation predictions in real-time.
- Live Track Visualization - Animated racing line with degradation overlay on all 7 tracks
- What-If Analysis - Interactive sliders to test driving style changes
- Driver Comparison - Side-by-side tire management analysis
- ML Predictions - Real-time tire wear forecasting using Random Forest model
# Install dashboard dependencies
pip install -r requirements.txt
# Run the dashboard
streamlit run hackathon_app/app.py
# Open browser to http://localhost:8501

- R² Score: 0.631 (about 63% of the variance in the target explained; R² is not a classification accuracy)
- MAE: 0.375 seconds/lap
- Training Data: 2,036 laps, 23 features
- Features: Weather conditions, driving aggression, stint position
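
To make the two metrics above concrete, here is a small, self-contained sketch of how R² and MAE are computed from actual and predicted degradation values. The sample numbers are made up for illustration and are not taken from the project's dataset; scikit-learn's `r2_score` and `mean_absolute_error` compute the same quantities.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only (seconds of lap-time degradation), not project data.
actual = [0.2, 0.5, 0.9, 1.4, 1.8]
predicted = [0.3, 0.4, 1.0, 1.2, 1.9]

r2 = r2_score(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(f"R^2 = {r2:.3f}, MAE = {mae:.3f} s/lap")
```

An R² of 0.631 therefore means the model removes roughly 63% of the squared error that a constant "predict the mean" baseline would leave, while the MAE reads directly in seconds per lap.
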
- Track Visualization - Watch animated laps with degradation heatmap
- What-If Scenarios - "What if I brake 20% softer?" → See prediction change
- Driver Comparison - Compare tire management efficiency between drivers
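
The what-if idea can be sketched without the dashboard: scale one input feature, re-run the model, and compare the two predictions. The linear `toy_model` and its coefficients below are purely hypothetical stand-ins for the trained Random Forest, and the feature names `pbrake_f_mean` and `accy_can_max` are assumed lap-level aggregates, not confirmed column names.

```python
def toy_model(features):
    """Hypothetical stand-in for the trained model: predicts degradation (s/lap)."""
    return 0.05 + 0.004 * features["pbrake_f_mean"] + 0.10 * features["accy_can_max"]


def what_if(features, name, scale):
    """Return (baseline, adjusted) predictions after scaling one feature."""
    adjusted = dict(features)  # copy so the baseline input stays untouched
    adjusted[name] = features[name] * scale
    return toy_model(features), toy_model(adjusted)


lap = {"pbrake_f_mean": 50.0, "accy_can_max": 1.5}  # made-up values
baseline, softer = what_if(lap, "pbrake_f_mean", 0.8)  # brake 20% softer
print(f"baseline={baseline:.3f} s/lap, 20% softer braking={softer:.3f} s/lap")
```

With a real model, `toy_model` would be replaced by a call to the loaded estimator's `predict` method on a one-row feature array in the training column order.
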
Full Documentation: docs/HACKATHON_DASHBOARD.md
hack_the_track/
├── README.md                      # This file - project overview
├── requirements.txt               # Python dependencies
├── db_config.yaml                 # Database configuration
├── Hackathon 2025.pdf             # Challenge documentation
│
├── hackathon_app/                 # Interactive Dashboard (NEW!)
│   ├── app.py                     # Main Streamlit landing page
│   ├── pages/                     # Dashboard pages
│   │   ├── 1_๐_Track_Visualization.py
│   │   ├── 2_๐ฎ_What_If_Analysis.py
│   │   └── 3_๐ฅ_Driver_Comparison.py
│   ├── utils/                     # Dashboard utilities
│   │   ├── data_loader.py         # Database queries
│   │   ├── model_predictor.py     # ML predictions
│   │   └── track_plotter.py       # Visualizations
│   └── assets/                    # Track images and branding
│
├── docs/                          # Detailed documentation
│   ├── DATABASE.md                # Database schema, ETL, querying
│   ├── PREPROCESSING.md           # ML preprocessing pipeline
│   └── HACKATHON_DASHBOARD.md     # Dashboard documentation (NEW!)
│
├── models/                        # Trained ML models
│   ├── tire_degradation_model_random_forest_with_weather.pkl
│   └── model_metadata_with_weather.json
│
├── src/                           # Source code
│   └── data_preprocessing.py      # TireDegradationPreprocessor class
│
├── sql/                           # SQL scripts
│   ├── schema/
│   │   └── schema.sql             # Database schema definition
│   ├── views/
│   │   └── create_preprocessing_views.sql  # ML views
│   └── queries/
│       └── ml_queries.sql         # Example queries
│
├── ml_data/                       # Processed ML datasets
│   ├── features_normalized.csv
│   ├── target_degradation.csv
│   ├── features_with_weather.csv  (NEW!)
│   └── target_with_weather.csv    (NEW!)
│
├── track_maps/                    # Track circuit maps (PDFs)
│
├── notebooks/                     # Jupyter notebooks
│   └── model_training_exploration.ipynb
│
├── scripts/                       # Training scripts
│   └── train_with_weather.py
│
├── examples/                      # Example usage
│   └── test_preprocessing.py      # Demo preprocessing pipeline
│
└── archive/                       # Historical scripts
    ├── etl_scripts/               # Data migration scripts
    ├── column_data/               # CSV metadata
    └── logs/                      # ETL logs
- Over 10x faster than pure Python (0.5s vs 15s for 10k laps, which works out to roughly 30x on that measurement)
- SQL pre-aggregates telemetry into lap-level features
- Python handles normalization & ML pipelines
- 21 aggression metrics per lap (brake pressure, lateral G's, steering smoothness)
- Automatic outlier filtering & data quality checks
- Target variable: lap time degradation over stint
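
A minimal sketch of what Z-score normalization with an outlier threshold might look like for a single feature column. This illustrates the general technique only; it is not the actual `TireDegradationPreprocessor` implementation, and the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_filter(values, outlier_threshold=3.0):
    """Standardize values, then drop those whose |z| exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    z = [(v - mu) / sigma for v in values]
    return [s for s in z if abs(s) <= outlier_threshold]


# Twenty plausible brake-pressure readings plus one spike (made-up values).
brake_pressure = [30.0 + 0.1 * i for i in range(20)] + [300.0]
kept = zscore_filter(brake_pressure)
print(len(brake_pressure), "->", len(kept))
```

Note that this single-pass version computes the mean and standard deviation with the outlier still included; a production pipeline might re-standardize after filtering or use a robust statistic such as the median.
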
- lap_aggression_metrics: Lap-level telemetry features
- stint_degradation: Tire degradation indicators
- vehicle_aggression_profile: Driving style summaries
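
One simple way to turn a stint into a single degradation indicator is the least-squares slope of lap time against lap number within the stint. The sketch below assumes that definition for illustration; the actual `stint_degradation` view may compute its indicators differently, and the lap times are made up.

```python
def degradation_slope(lap_times):
    """Least-squares slope of lap time vs. stint lap index (seconds per lap)."""
    n = len(lap_times)
    if n < 2:
        raise ValueError("need at least two laps to fit a slope")
    xs = range(1, n + 1)  # stint position: 1, 2, 3, ...
    x_mean = sum(xs) / n
    y_mean = sum(lap_times) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, lap_times))
    var = sum((x - x_mean) ** 2 for x in xs)
    return cov / var


stint = [98.2, 98.5, 98.9, 99.1, 99.6]  # illustrative lap times in seconds
slope = degradation_slope(stint)
print(f"degradation: {slope:+.3f} s/lap")
```

A positive slope means lap times are growing over the stint, which is the pattern expected from tire wear; fuel burn and traffic can push in either direction, so the slope is only a rough proxy.
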
PostgreSQL: gr_cup_racing
- Tables: tracks, races, sessions, laps, telemetry_readings (100M+ rows)
- Views: 3 pre-computed views for fast ML data retrieval
- Indexes: Optimized for vehicle_id, lap_id, meta_time queries
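
The SQL-side aggregation can be illustrated with an in-memory SQLite database standing in for PostgreSQL. The three-column `telemetry_readings` layout and the aggregate column names here are simplified assumptions for the sketch; the real schema is defined in sql/schema/schema.sql and the real views in sql/views/create_preprocessing_views.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
conn.execute(
    "CREATE TABLE telemetry_readings (lap_id INTEGER, pbrake_f REAL, accy_can REAL)"
)
conn.executemany(
    "INSERT INTO telemetry_readings VALUES (?, ?, ?)",
    [
        (1, 40.0, 1.1), (1, 60.0, -1.3), (1, 50.0, 0.9),  # made-up samples
        (2, 30.0, 1.0), (2, 50.0, -1.6),
    ],
)
# Index the grouping column, mirroring the lap_id index mentioned above.
conn.execute("CREATE INDEX idx_telemetry_lap ON telemetry_readings (lap_id)")

# Collapse many telemetry rows into one feature row per lap.
lap_features = conn.execute(
    """
    SELECT lap_id,
           AVG(pbrake_f)      AS avg_brake_pressure,
           MAX(ABS(accy_can)) AS peak_lateral_g,
           COUNT(*)           AS n_samples
    FROM telemetry_readings
    GROUP BY lap_id
    ORDER BY lap_id
    """
).fetchall()
print(lap_features)
```

Doing the `GROUP BY` in the database means only one row per lap crosses into Python, instead of every raw telemetry reading.
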
Aggression Metrics:
- pbrake_f, pbrake_r - Front/rear brake pressure (bar)
- accy_can - Lateral G forces (cornering aggression)
- accx_can - Longitudinal acceleration/braking
- Steering_Angle - Steering wheel angle (smoothness)
- aps, ath - Throttle pedal & blade position

Speed & Engine:
- Speed - Vehicle speed (km/h)
- Gear - Current gear selection
- nmot - Engine RPM

Position:
- VBOX_Long_Minutes, VBOX_Lat_Min - GPS coordinates
- Laptrigger_lapdist_dls - Distance from start/finish (m)
- Lap #32768: Erroneous lap count (filtered)
- ECU timestamps may be inaccurate (we use meta_time)
- Vehicle IDs tracked by chassis number for consistency
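
The first two notes translate into a small cleaning step: drop rows carrying the erroneous lap number and order readings by `meta_time` rather than the ECU timestamp. The dictionary keys `lap` and `ecu_time` and the sample rows are hypothetical; only `meta_time` and the value 32768 come from the notes above.

```python
ERRONEOUS_LAP = 32768  # known bad lap counter value


def clean_readings(rows):
    """Drop rows with the erroneous lap number and sort by meta_time."""
    valid = [r for r in rows if r["lap"] != ERRONEOUS_LAP]
    return sorted(valid, key=lambda r: r["meta_time"])


rows = [
    {"lap": 2, "meta_time": 12.0, "ecu_time": 3.0},
    {"lap": 32768, "meta_time": 11.5, "ecu_time": 2.5},  # filtered out
    {"lap": 1, "meta_time": 10.0, "ecu_time": 99.0},     # ECU clock disagrees
]
cleaned = clean_readings(rows)
print([r["lap"] for r in cleaned])
```
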
See Hackathon 2025.pdf for complete data specifications.
# Run example script
python examples/test_preprocessing.py
# Query database directly
psql -h localhost -U postgres -d gr_cup_racing

# See examples/test_preprocessing.py for complete example
from src.data_preprocessing import TireDegradationPreprocessor
# db_config: the connection dictionary shown elsewhere in this README
preprocessor = TireDegradationPreprocessor(db_config)
X, y = preprocessor.prepare_training_data()
# Your model training code here...

# Views are already created, but to recreate:
psql -h localhost -U postgres -d gr_cup_racing -f sql/views/create_preprocessing_views.sql

- docs/HACKATHON_DASHBOARD.md - Interactive dashboard guide (NEW!)
- docs/DATABASE.md - Database schema, ETL pipeline, SQL queries
- docs/PREPROCESSING.md - ML preprocessing, feature engineering, API reference
- Hackathon 2025.pdf - Official challenge documentation
- Data Processing: pandas, numpy
- Visualization: matplotlib, seaborn, plotly
- Database: sqlalchemy, psycopg2-binary
- Machine Learning: scikit-learn
- Config: PyYAML, tqdm, tabulate
Install all: pip install -r requirements.txt
Server Name: GR Cup Racing
db_config = {
    'host': 'localhost',
    'database': 'gr_cup_racing',
    'user': 'postgres',
    'password': ''  # Update if password-protected
}

Command Line:
psql -h localhost -U postgres -d gr_cup_racing

| Operation | Time | Dataset Size / Notes |
|---|---|---|
| Load lap features (SQL) | ~0.5s | 2,545 laps |
| Normalize features (Python) | ~1s | 21 features |
| Total preprocessing | ~1.5s | 10x faster than pandas |
- ✅ Database loaded - 3,257 laps from 8 races
- ✅ Preprocessing ready - SQL views + Python pipeline
- ⏭️ Train models - RandomForest, XGBoost, Neural Networks
- ⏭️ Optimize - Find optimal aggression level per track
- ⏭️ Visualize - Plot aggression vs degradation curves
- Series: SRO Motorsports
- 2025 Season: Search "TGRNA GR CUP NORTH AMERICA"
- 2024 Season: Search "Toyota GR Cup"
- Official Timing: Available through SRO website
This is hackathon data for analyzing Toyota GR86 Cup racing performance. Common analysis tasks:
- Lap time prediction
- Tire degradation modeling
- Driver style classification
- Optimal racing line analysis
- Telemetry visualization
Good luck with your racing data analysis!
For detailed documentation, see: