
DATASCIENCE HACK BY TOYOTA – Hack The Track

Accelerate Insights, Dominate Race Strategies Instantly


Built with the tools and technologies:

JSON • Markdown • Streamlit • Jupyter • scikit-learn
NumPy • Python • pandas • Google Gemini


✨ Overview

DataScienceHackbyToyota is an advanced simulation and analysis platform built around Toyota GR86 GR Cup data, with a focus on Barber Motorsports Park. It combines interactive visualizations, large-scale telemetry processing, and AI-driven insights (via Google Gemini 2.5 Flash) to help a race engineer answer questions like:

  • What’s our ideal pit window right now?
  • How fast are our tyres degrading by stint and by sector?
  • What should we do if a caution comes out in the next 2–3 laps?
  • If we box now, do we win or lose track position by the flag?

The system links:

  • A Tkinter + Matplotlib race map with a moving car icon, running on the actual digitised Barber layout.
  • A Streamlit “race engineer console” that updates from a live JSON state file.
  • A strategy engine that simulates tyre degradation, pit windows, caution scenarios, and “mini multiverse” Monte Carlo races.
  • Gemini 2.5 Flash for real-time natural-language insights, radio messages, and decision reviews.
  • A predictive lap-time model (Random Forest) trained on GR86 lap features.
  • An experimental computer-vision pipeline using Gemini Vision on sample GR86 images.

💡 Inspiration

In modern motorsport, races are often decided by fractions of a second and one pit call. A mistimed stop or a slow reaction to a Safety Car can cost multiple positions and tens of seconds of race time, even when the driver’s pace is strong.

At the same time, a single car can generate millions of telemetry data points per race (speed, throttle, brake, tyres, weather, timing, gaps, etc.). Race engineers have to digest all of this under pressure, in real time, while talking to the driver and coordinating with the team.

We wanted to build a tool that acts like an AI co-engineer: watching the data continuously, surfacing only what matters (“box now”, “tyre cliff in 3 laps”, “caution window coming”), and turning raw numbers into decisions.


🏎️ What It Does

Our project turns real GR86 GR Cup data into an interactive race-strategy cockpit for Barber Motorsports Park:

  • 🏁 Live track sim – a car icon runs around a digitised Barber circuit map based on lap timing and stint logic.
  • 📊 Real-time strategy console – every second, the app updates:
    • Tyre phase (warm-up / stable / degradation)
    • Net gain/loss if we pit now vs 2 laps earlier/later
    • Caution / Safety-Car “what if” (next 3 laps)
    • Clean air vs traffic risk, gaps ahead/behind
  • 📈 Predictive lap-time model – Random Forest model trained on lap features (aps_mean, pbrake_f_mean, …) for GR86-002-000 @ Barber R2, with:
    • RMSE and R² validation metrics
    • Comparison plots: actual vs predicted lap times, residuals, parity plot
    • Serialized model + JSON metadata for use in the app
  • 🤖 Gemini-powered race engineer radio – Gemini 2.5 Flash reads the current lap metrics plus notebook insights and generates short, actionable radio calls:
    • “Box now under caution – you’ll undercut P5 by ~1.2s”
    • “Stay out, overcut in clean air, target 2 laps more”
    • “Tyres stable – push S2, save S3”
  • 💬 Strategy Chat – a chat assistant that:
    • Sees track & car context, best strategy table, and current lap snapshot
    • Answers engineer-style questions in plain English
    • Stays grounded in the numbers and cites which metrics it used
  • 🧠 Decision Reviewer – an AI “second pair of eyes”:
    • You describe your intended radio call (“Box now for 4 tyres and fuel to the end”)
    • Gemini reviews it with the current race context and returns:
      • Verdict (Go / Borderline / Don’t do it)
      • Rationale and key risks
      • Safer alternative calls
  • 👁️ Vision + Gemini – an experimental computer-vision notebook:
    • Uses Gemini 2.5 Flash (vision) on data/vision/sample_gr86_barber.png
    • Asks the model to describe car position, lane usage, runoff, and risk
    • Aggregates outputs into simple stats and a plot (e.g. lane-centre histogram)
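To make the predictive-model piece concrete, here is a minimal sketch of training and validating a lap-time Random Forest with scikit-learn. The synthetic data, coefficients, and the lap_in_stint / lap_time_s names are illustrative assumptions; only aps_mean and pbrake_f_mean come from the project.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical lap-feature table (the real schema lives in data/processed).
rng = np.random.default_rng(0)
laps = pd.DataFrame({
    "aps_mean": rng.uniform(40, 80, 200),      # mean throttle position, %
    "pbrake_f_mean": rng.uniform(5, 25, 200),  # mean front brake pressure
    "lap_in_stint": rng.integers(1, 25, 200),  # tyre-age proxy
})
# Synthetic target: faster with more throttle, slower as tyres age.
laps["lap_time_s"] = (
    95.0 - 0.05 * laps["aps_mean"] + 0.08 * laps["lap_in_stint"]
    + rng.normal(0, 0.3, 200)
)

X = laps[["aps_mean", "pbrake_f_mean", "lap_in_stint"]]
y = laps["lap_time_s"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
rmse = mean_squared_error(y_te, pred) ** 0.5
print(f"RMSE: {rmse:.3f}s  R2: {r2_score(y_te, pred):.3f}")
```

In the real pipeline the fitted model would then be serialized (joblib) next to a JSON file recording the feature list and validation metrics, as described above.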

🛠️ How We Built It

Data

We used the official TRD hackathon dataset from:
https://trddev.com/hackathon-2025/

Key Barber files (race 1 & 2):

  • R1_barber_telemetry_data.csv – 11,556,519 rows × 13 columns
  • R2_barber_telemetry_data.csv – 11,749,604 rows × 13 columns
  • Plus timing, weather, sector stats and results files — about 20 CSV files in total.

We pre-processed these into lap-level features and strategy summaries (see /data/processed and the analysis notebooks 03–11_barber_*.ipynb).
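The lap-level aggregation can be done without ever holding a full >11M-row race in memory by streaming the CSV in chunks and keeping only partial sums. A sketch with a tiny in-memory CSV standing in for the real file; the column names here are hypothetical, not the TRD schema:

```python
import io
import pandas as pd

# Stand-in for R1_barber_telemetry_data.csv with a hypothetical schema.
csv = io.StringIO(
    "lap,speed,aps,pbrake_f\n"
    "1,120,55,10\n1,180,80,2\n"
    "2,115,50,12\n2,175,78,3\n"
)

# Stream in chunks; a chunk boundary may split a lap, so store sums and
# counts rather than means.
parts = []
for chunk in pd.read_csv(csv, chunksize=3):
    parts.append(chunk.groupby("lap").agg(
        aps_sum=("aps", "sum"),
        aps_n=("aps", "size"),
        pbrake_f_sum=("pbrake_f", "sum"),
    ))

# Re-aggregate the partial sums, then derive the per-lap means.
agg = pd.concat(parts).groupby(level=0).sum()
lap_features = pd.DataFrame({
    "aps_mean": agg["aps_sum"] / agg["aps_n"],
    "pbrake_f_mean": agg["pbrake_f_sum"] / agg["aps_n"],
})
print(lap_features)
```

The same sum-then-divide pattern extends to any per-lap or per-stint feature, which is why only small derived tables ever reach the modeling stage.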

Core stack

  • Python, NumPy, pandas – data processing, lap & stint features, tyre-deg models
  • Matplotlib – live animation of the car icon moving around the digitised Barber map and plotting strategy / model results
  • Streamlit – race-engineer dashboard with tabs:
    • Strategy Brain
    • Driver Insights
    • Predictive Models
    • Strategy Chat
    • Live Race Copilot
  • scikit-learn – Random Forest regression for lap-time prediction
  • Google Gemini 2.5 Flash – chat, radio-style strategy calls, decision review, and experimental vision
  • Tkinter / Matplotlib backends – local animation window that writes a live_state JSON
  • Custom modules:
    • strategy_engine.py – degradation modeling, pit window simulation, Monte Carlo strategy multiverse
    • pit_model.py – stint segmentation and simple deg curves
    • track_meta.py – track metadata, pit-lane loss assumptions
    • live_state.py – atomic JSON sync between animation and Streamlit
    • predictive_models.py – lap-time model training / saving / inference
    • chat_assistant.py – LLM-powered strategy chat context builder
    • decision_reviewer.py – structured AI review of engineer calls
    • vision_gemini.py – helper for running Gemini vision analysis over images
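As an illustration of the “mini multiverse” idea behind strategy_engine.py, a single-stop Monte Carlo pit-window search can be sketched as follows. All constants (base lap time, degradation rate, pit-lane loss) are made-up illustrations, not the engine’s measured Barber values:

```python
import numpy as np

rng = np.random.default_rng(7)

def race_time(pit_lap, laps=25, base=95.0, deg=0.12, pit_loss=42.0, noise=0.15):
    """Total race time for one simulated 'universe' with a single stop.

    Lap time grows linearly with tyre age (deg seconds per lap); the stop
    resets tyre age and adds a fixed pit-lane loss.
    """
    total = pit_loss
    age = 0
    for lap in range(1, laps + 1):
        total += base + deg * age + rng.normal(0, noise)
        age = 0 if lap == pit_lap else age + 1
    return total

# Simulate many races per candidate pit lap and pick the window with the
# lowest mean race time.
candidates = range(8, 18)
mean_times = {p: np.mean([race_time(p) for _ in range(300)])
              for p in candidates}
best = min(mean_times, key=mean_times.get)
print(f"Best pit lap: {best} ({mean_times[best]:.1f}s)")
```

The real engine layers caution probabilities, traffic, and multi-stop plans on top of this basic loop, but the compare-universes-and-rank pattern is the same.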

The desktop animation (barber_lap_anim.py) continuously updates a JSON file (data/live/live_state_barber.json). The Streamlit app (streamlit_app.py) reads that same state up to once per second, recomputes metrics, optionally calls Gemini, and renders the UI.
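The JSON handoff only works if the Streamlit reader never sees a half-written file. A minimal sketch of the atomic-write pattern live_state.py presumably implements (write to a temp file, then os.replace() over the target); the path and field names here are illustrative:

```python
import json
import os
import tempfile

def write_live_state(path, state):
    """Atomically publish live state: write a temp file in the same
    directory, then os.replace() it over the target. Readers see either
    the old file or the new one, never a partial write."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # make sure bytes hit disk before the swap
        os.replace(tmp, path)     # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise

def read_live_state(path, default=None):
    """Reader side, e.g. the Streamlit app polling once per second."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default

write_live_state("live_state_demo.json", {"lap": 12, "tyre_phase": "stable"})
print(read_live_state("live_state_demo.json"))
```

Because the rename is atomic, the animation can write every frame and the dashboard can poll on its own schedule without locks or race conditions.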


🧗 Challenges We Ran Into

  • Handling huge telemetry files – reading >11 million-row CSVs per race meant we had to be careful with memory, summarising to lap/stint-level data before doing heavier modeling.
  • Keeping everything in sync – we needed the Matplotlib animation, JSON writer, and Streamlit dashboard to stay in lockstep without race conditions or crashes.
  • Prompt design for Gemini – making sure the AI produces short, trustworthy, race-engineer-style bullet points instead of essays was an iteration loop of its own.
  • Unifying many tools – predictive model, chat, decision review, and vision all had to coexist cleanly inside a single Streamlit file.

🏆 Accomplishments That We’re Proud Of

  • Turning raw GR Cup telemetry into a live strategy simulator that really feels like a race-engineer console, not just static plots.
  • Building an AI radio feed that reacts to tyre life, pit window, and caution scenarios in language a driver could actually understand mid-race.
  • Shipping a working lap-time Random Forest model with sane metrics and visual diagnostics.
  • Creating a modular pipeline: notebooks → processed features → strategy engine → live animation → Streamlit + Gemini.

📚 What We Learned

  • How quickly motorsport data explodes in size, and why aggregation & feature engineering are critical before doing any fancy modeling.
  • The importance of race-decision framing: engineers don’t want “here’s every metric,” they want “what should we do this lap and why?”.
  • How to combine classical modeling (tyre deg, pit-lane loss, Monte Carlo strategies) with LLM-synthesised insights so the AI is grounded in real numbers.
  • Practical patterns for safely using Gemini 2.5 Flash for both chat and structured decision review.

🔮 What's Next for Racing Hokies

  • ❯❯❯❯ The Future – expand the real-time voice assistant, deepen the AI decision reviewer, enrich the computer-vision system, and add more predictive models.
  • 💠 Multi-car & multi-track support – extend to more GR Cup cars and to VIR using the rest of the TRD dataset.
  • 🌐 Cloud + live feed – adapt the pipeline to live telemetry streams and host the dashboard so a team can connect during a session.
  • 🧠 Richer “what-if” engine – allow the engineer to simulate alternative strategies (extra stop, short-fill, extreme fuel save) and compare projected race time in real time.
  • 👂 Driver-aware coaching – incorporate driver consistency, error patterns, and sector strengths into the radio calls (“strong in S1, losing time in S3 braking zone – adjust bias + lift earlier”).

By combining real-world-scale data, physics-style models, and LLM insights, our goal is to give race engineers a tool that doesn’t just visualise the race — it helps call it.


🧩 Tech Stack

Languages & Libraries

  • Python, NumPy, pandas, SciPy
  • Matplotlib, Seaborn (in notebooks)
  • scikit-learn (Random Forest regression)

Frameworks & Platforms

  • Streamlit – interactive race engineer dashboard
  • Tkinter + Matplotlib – local live animation window
  • Jupyter – data exploration and model development

AI & APIs

  • Google Gemini 2.5 Flash (text & vision) via google-generativeai

Data & Storage

  • CSV-based telemetry & timing data (TRD dataset)
  • JSON live state in data/live/*.json
  • Joblib + JSON for model artifacts in models/

Tooling

  • GitHub for version control
  • VS Code / JupyterLab for development

📊 Data Sources

This project uses official hackathon data and maps from:

  • TRD Dev – DataScienceHackbyToyota 2025
    Dataset page: https://trddev.com/hackathon-2025/

  • Barber Motorsports Park telemetry bundle
    File: barber-motorsports-park.zip
    ~20 CSV files including:

    • R1_barber_telemetry_data.csv – Rows: 11,556,519 • Cols: 13
    • R2_barber_telemetry_data.csv – Rows: 11,749,604 • Cols: 13
      plus lap timing, weather, best-10 laps, results, etc.
  • Barber circuit map (official PDF)
    URL: https://trddev.com/hackathon-2025/Barber_Circuit_Map.pdf

In this repo, the track map image used for digitising and visualisation is:

data/track_maps/IMG_4381.jpg

Embedded preview:

Barber Motorsports Park circuit map

The digitised centerline (from manual clicks on this map) lives in:

data/track_geom/barber_track_xy.csv
data/track_geom/barber_track_xy_s.csv

These are what the car icon animation and strategy tools use.
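Building the arc-length version of the centerline (barber_track_xy_s.csv) from raw (x, y) clicks amounts to computing cumulative polyline distance; a sketch of the idea with made-up points (column names s and s_frac are assumptions about the file layout):

```python
import numpy as np
import pandas as pd

# Hypothetical digitised centerline clicks (the real file has many more).
track = pd.DataFrame({
    "x": [0.0, 3.0, 6.0, 6.0, 3.0, 0.0],
    "y": [0.0, 0.0, 4.0, 8.0, 8.0, 4.0],
})

# Cumulative arc length: distance along the polyline from the first point.
dx = np.diff(track["x"], prepend=track["x"].iloc[0])
dy = np.diff(track["y"], prepend=track["y"].iloc[0])
track["s"] = np.cumsum(np.hypot(dx, dy))
track["s_frac"] = track["s"] / track["s"].iloc[-1]  # 0..1 lap fraction

print(track)
```

With s_frac in hand, the animation can place the car icon by interpolating (x, y) at any lap fraction derived from elapsed lap time.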


🚀 Getting Started

📋 Prerequisites

  • Python: 3.10+ (tested with 3.10 / 3.11 / 3.13)
  • Package Manager: pip
  • (Optional) virtualenv / venv for isolation
  • (Optional but recommended) Google Gemini API key for AI insights (chat, decision review, vision)

⚙️ Installation

  1. Clone the repository
git clone https://github.com/ngstephen1/DataScienceHackbyToyota.git
cd DataScienceHackbyToyota
  2. Create & activate a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate        # on macOS / Linux
# .venv\Scripts\Activate       # on Windows PowerShell
  3. Install dependencies
pip install -r requirements.txt
  4. (Optional) Configure Gemini

Set your Gemini API key and model name (Gemini 2.5 Flash):

export GEMINI_API_KEY="your-key-here"
export GEMINI_MODEL_NAME="gemini-2.5-flash"

On Windows PowerShell:

$env:GEMINI_API_KEY="your-key-here"
$env:GEMINI_MODEL_NAME="gemini-2.5-flash"

You can quickly verify the key with:

python - << 'PY'
import os, google.generativeai as genai
api_key = os.getenv("GEMINI_API_KEY")
model_name = os.getenv("GEMINI_MODEL_NAME", "gemini-2.5-flash")
if not api_key:
    raise SystemExit("GEMINI_API_KEY is not set.")
genai.configure(api_key=api_key)
print(f"Trying model: {model_name}")
model = genai.GenerativeModel(model_name)
resp = model.generate_content("Say exactly: Gemini OK.")
print("Response:", resp.text.strip())
PY

▶️ Usage

There are two main entrypoints: the animated map and the Streamlit race-engineer console. They communicate through data/live/ JSON files.

1. Run the animated Barber map (Tkinter + Matplotlib)

This opens a window showing the Barber map with a car icon moving along the digitised track, plus real-time metrics and Gemini insights:

python src/barber_lap_anim.py

This script:

  • Loads the Barber track map (data/track_maps/IMG_4381.jpg).
  • Loads digitised centerline from data/track_geom/barber_track_xy_s.csv.
  • Reads lap features and strategy summaries from data/processed/barber/....
  • Animates one car icon per lap based on lap times.
  • Writes live state to:
data/live/barber_state.json
data/live/live_state_barber.json

2. Run the Streamlit “Race Engineer Console”

In another terminal (same venv):

streamlit run streamlit_app.py

The Streamlit app provides multiple tabs:

  • Strategy Brain – strategy multiverse, caution simulation, best strategy summary
  • Driver Insights – lap & sector trends, tyre phases, consistency
  • Predictive Models – Random Forest lap-time model summary and diagnostic plots
  • Strategy Chat – Gemini 2.5 Flash chat assistant with full race context
  • Live Race Copilot – live state from animation, decision reviewer, and AI radio

If your Gemini key is configured, you’ll also see Gemini insights and decision review outputs (short bullet-point radio style).

💡 Tip
For a complete “live” demo, run:

  • Terminal 1: python src/barber_lap_anim.py
  • Terminal 2: streamlit run streamlit_app.py

🧪 Testing

There is no full automated test suite yet. For a quick sanity check:

python tests/manual_test.py

This exercises core loading and geometry logic to ensure things run without errors.

(You’re encouraged to add pytest tests for src/ modules if you extend this project.)
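For example, a pytest-style test for the sector-lookup logic described for track_utils.py could look like this. The function name and the sector boundaries are hypothetical; the real src/track_utils.py API may differ, so adjust imports and names accordingly:

```python
# tests/test_sectors.py – hypothetical example, not the project's real API.

def sector_for_distance(lap_dist, bounds=(0.0, 1200.0, 2400.0, 3700.0)):
    """Map a lap distance (m) to one of three sectors.

    `bounds` are illustrative sector boundaries, not measured Barber values.
    """
    for i in range(3):
        if bounds[i] <= lap_dist < bounds[i + 1]:
            return i + 1
    return 3  # clamp anything past the last boundary into S3

def test_sector_boundaries():
    assert sector_for_distance(0.0) == 1
    assert sector_for_distance(1199.9) == 1
    assert sector_for_distance(1200.0) == 2
    assert sector_for_distance(2400.0) == 3
    assert sector_for_distance(5000.0) == 3

test_sector_boundaries()
print("ok")
```

Run with `pytest tests/` once the import points at the real module; boundary-value tests like these are cheap insurance for geometry code.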


📦 Features

Component Details
⚙️ Architecture
  • Modular Jupyter Notebook workflows for data analysis and modeling
  • Separation of data processing (src/), visualization (barber_lap_anim.py, streamlit_app.py), and strategy logic (strategy_engine.py)
🔩 Code Quality
  • Clear function boundaries in modules
  • Notebooks used for exploratory analysis, with logic gradually migrated into reusable functions
📄 Documentation
  • README with overview, data sources, and how to run
  • RUN.md for step-by-step setup and demo instructions
🔌 Integrations
  • requirements.txt for dependency management
  • Integrates numpy, pandas, matplotlib, streamlit, google-generativeai, scikit-learn
🧩 Modularity
  • Separate notebooks for VIR vs Barber, race 1 vs race 2
  • Reusable strategy, track-meta, telemetry loader, predictive modeling, chat, decision review, and vision modules
🧪 Testing
  • Basic manual tests via tests/manual_test.py (no formal unit test suite yet)
⚡️ Performance
  • Handles 10M+ row telemetry CSVs using pandas and columnar workflows
  • Animation and dashboard driven by pre-aggregated lap features
🛡️ Security
  • No external services beyond Gemini API; API key is read from environment variables
📦 Dependencies
  • Managed via requirements.txt
  • Includes jupyter, streamlit, matplotlib, pandas, numpy, ipykernel, google-generativeai, scikit-learn

📁 Project Structure

└── DataScienceHackbyToyota/
    ├── README.md
    ├── RUN.md
    ├── data
    │   ├── raw/            # Original TRD telemetry, results, weather, etc.
    │   ├── processed/      # Derived lap features, sector summaries, strategy outputs
    │   ├── track_geom/     # Digitised track centerline for Barber
    │   ├── track_maps/     # Track images (PDF-derived JPG/PNG) + car icon
    │   └── vision/         # Sample GR86 images for Gemini vision
    ├── models
    │   ├── lap_time_barber_GR86-002-000.joblib
    │   └── lap_time_barber_GR86-002-000.json
    ├── notebooks           # VIR + Barber exploration, lap times, sections, strategy MVP
    │   ├── 01_explore_vir.ipynb
    │   ├── 02_vir_sectors_r1r2.ipynb
    │   ├── 03_barber_telemetry_r1.ipynb
    │   ├── 04_barber_lap_times_r1.ipynb
    │   ├── 05_barber_sections_r1.ipynb
    │   ├── 06_barber_telemetry_r2.ipynb
    │   ├── 07_barber_lap_times_r2.ipynb
    │   ├── 08_barber_sections_r2.ipynb
    │   ├── 09_barber_driver_profile.ipynb
    │   ├── 10_barber_strategy_mvp.ipynb
    │   ├── 11_barber_predictive_model.ipynb
    │   └── 13_vir_telemetry_r1.ipynb
    ├── requirements.txt
    ├── src
    │   ├── __init__.py
    │   ├── barber_build_track_s.py
    │   ├── barber_digitize_track.py
    │   ├── barber_lap_anim.py
    │   ├── chat_assistant.py
    │   ├── decision_reviewer.py
    │   ├── live_state.py
    │   ├── pit_model.py
    │   ├── predictive_models.py
    │   ├── strategy_cli.py
    │   ├── strategy_engine.py
    │   ├── telemetry_loader.py
    │   ├── test.py
    │   ├── track_meta.py
    │   ├── track_utils.py
    │   └── vision_gemini.py
    ├── streamlit_app.py   # Live dashboard / race engineer console
    ├── tests
    │   └── manual_test.py
    └── tools
        └── extract_barber_r1_vehicle.py

📑 Project Index

DataScienceHackbyToyota/
⦿ __root__
File Name Summary
README.md - Serves as the foundational documentation for the DATASCIENCE HACK BY TOYOTA – Hack The Track project
- Its primary purpose is to introduce the project’s goal of leveraging data science techniques to optimize race strategies, enabling users to gain rapid insights and make data-driven decisions in racing scenarios
- The README offers an overview of the project’s objectives, highlights its key features, and provides contextual information to orient users and contributors within the overall architecture
- It acts as a gateway to understanding how this codebase fits into Toyota’s broader initiative to accelerate insights and dominate race strategies through data science.
streamlit_app.py - Serves as the central entry point for the application's user interface, orchestrating various components to deliver an integrated experience
- It leverages multiple modules to facilitate decision review, predictive modeling, conversational assistance, and visual analysis within a cohesive Streamlit-based frontend
- By managing data paths, initializing live state storage, and coordinating interactions, this file enables users to analyze race data, receive insights, and interact with AI-driven tools seamlessly, thereby supporting the broader architecture of a race analysis and decision support platform.
requirements.txt - Defines project dependencies essential for data analysis, visualization, and machine learning tasks
- Ensures all necessary libraries are available for building, training, and deploying models, as well as creating interactive dashboards and conducting exploratory data analysis within the broader architecture
- Supports seamless environment setup to facilitate efficient development and reproducibility across the entire codebase.
RUN.md - Provides step-by-step instructions to set up and launch the web-based visualization for the Barber project
- It guides users through cloning the repository, creating an isolated environment, installing dependencies, and running the Streamlit application
- This facilitates interactive exploration of data insights, integrating seamlessly into the overall data science workflow within the project architecture.
⦿ src
File Name Summary
predictive_models.py - Provides tools for training, saving, loading, and applying machine learning models to predict lap times based on lap summary features
- Facilitates quick deployment of models for real-time or batch predictions, supporting analysis and optimization of driving performance across different tracks and vehicles within the overall racing simulation or telemetry system.
chat_assistant.py - Provides an LLM-powered (Gemini) interface for race strategy and telemetry analysis in GR Cup racing
- It generates contextual summaries, evaluates driver pace, and formulates radio-style responses to engineering questions by integrating static race data and live telemetry snapshots
- This facilitates real-time strategic decision-making and communication, enhancing race management and driver support within the overall system architecture.
pit_model.py - Implements a vehicle lap analysis pipeline that detects pit laps, segments laps into stints, and models lap time degradation over race duration
- Facilitates understanding of performance decline, enabling insights into vehicle wear and strategy optimization within the broader race data architecture.
barber_digitize_track.py - Provides an interactive tool to digitize the Barber Motorsports Park track centerline by allowing users to click along the track map
- If interactivity isn't available, generates a synthetic oval track as a fallback
- Outputs normalized and pixel coordinates of the track centerline to a CSV file, integrating seamlessly into the larger project architecture focused on track geometry analysis and visualization.
telemetry_loader.py - Provides tools to load, process, and summarize vehicle telemetry data from CSV files
- Facilitates lap and sector identification based on distance metrics, enabling detailed per-lap analysis and track segmentation
- Supports flexible data extraction for performance metrics, supporting race analysis, track mapping, and telemetry-driven insights within the overall simulation or racing data architecture.
strategy_engine.py - Provides a comprehensive framework for simulating and optimizing race strategies by modeling lap times, tyre degradation, and caution effects
- Facilitates evaluation of pit stop timing, race outcomes, and probabilistic scenarios through Monte Carlo simulations, supporting real-time analytics and decision-making in race engineering and strategy planning.
vision_gemini.py - Provides tools for analyzing race track images using a generative AI model to extract detailed vehicle and track insights
- Facilitates batch processing of images, generating structured statistics on car detection, positioning, speed, and risk levels
- Includes visualization functions to summarize lane positioning and risk trends across multiple frames, supporting race strategy and safety assessments within the overall system architecture.
track_utils.py - Provides functionality to determine the current sector of a vehicle based on its lap distance within a racing circuit
- It leverages track metadata to map a given distance to one of three predefined sectors, supporting real-time race analysis and telemetry processing within the overall track management system.
test.py - Visualizes the digitized centerline of the Barber track by loading coordinate data and generating a scaled plot
- It supports the broader project by providing a clear graphical representation of track geometry, facilitating analysis and validation within the overall data processing and simulation workflows
- This enhances understanding of track layout essential for modeling and testing purposes.
barber_lap_anim.py - The src/barber_lap_anim.py file is a core component responsible for visualizing and animating lap data within the project
- It orchestrates the creation of dynamic, interactive plots that depict lap progressions, leveraging data processing and visualization libraries
- Additionally, it integrates live state management by exporting real-time updates to JSON, facilitating seamless updates and external integrations
- This module plays a pivotal role in translating raw lap metrics into insightful, animated representations, thereby supporting analysis and presentation of racing performance within the overall architecture.
live_state.py - Manages persistent storage of live state data for individual tracks within the project
- Facilitates saving and retrieving real-time status information by generating standardized file paths and ensuring atomic updates
- Supports the overall architecture by maintaining up-to-date, reliable state information crucial for real-time processing and data consistency across the system.
track_meta.py - Defines metadata for various race tracks, including identifiers, names, pit lane times, lengths, and optional geometric paths
- Serves as a centralized reference for track-specific information, supporting accurate simulation, analysis, and visualization within the broader racing data processing architecture
- Facilitates consistent access to track attributes across the project.
barber_build_track_s.py - Calculates and appends normalized and cumulative arc length data to track geometry, enabling precise spatial analysis within the overall racing simulation architecture
- Facilitates accurate positioning and timing along the track by transforming raw coordinate data into standardized distance metrics, supporting downstream components such as vehicle dynamics, telemetry, and visualization modules.
decision_reviewer.py - Provides an AI-powered review mechanism for race engineer decisions during live races
- It evaluates proposed strategic calls by leveraging a generative model to generate structured, markdown-formatted feedback, including verdicts, rationales, risks, and safer alternatives
- When the AI model is unavailable, it defaults to a rule-based heuristic, ensuring continuous decision support within the race strategy architecture.
strategy_cli.py - Provides a command-line interface for recommending optimal pit strategies in racing scenarios by analyzing lap data, simulating various strategies, and evaluating their performance under different caution conditions
- Integrates track metadata and lap features to generate probabilistic insights, enabling users to select strategies with the highest win probability and best race time across multiple simulated universes.
⦿ models
File Name Summary
lap_time_barber_GR86-002-000.json - JSON metadata for the Barber lap-time model, capturing the selected features and validation metrics for a specific track and vehicle
- Supports the broader architecture by enabling validation of the model and insight into how the selected features influence predicted lap times.
lap_time_barber_GR86-002-000.joblib - Serialized Random Forest lap-time model artifact, loaded by predictive_models.py and the Streamlit app for inference.
⦿ notebooks
File Name Summary
11_barber_predictive_model.ipynb - Builds the predictive lap-time model for Barber Motorsports Park
- Trains and validates a Random Forest regressor on lap-level features (e.g. aps_mean, pbrake_f_mean), reporting RMSE and R² and producing diagnostic plots (actual vs predicted, residuals, parity)
- Serializes the trained model and its JSON metadata into models/ for use by the Streamlit app.
08_barber_sections_r2.ipynb - Segments and analyzes track-section (sector) performance for race 2 at Barber
- Processes and visualizes sector-level timing to identify where lap time is gained or lost, feeding the driver-insight and strategy views
- Complements the race 1 sections notebook (05) and the sector-based tyre-degradation analysis.
07_barber_lap_times_r2.ipynb - Analyzes lap-time data for race 2 at Barber Motorsports Park
- Extracts lap-time trends and consistency metrics, facilitating performance benchmarking within the broader racing-analytics pipeline
- By processing and visualizing lap-time metrics, it supports the project's goal of enhancing race strategy and driver performance analysis.
05_barber_sections_r1.ipynb - Segments and analyzes track-section (sector) performance for race 1 at Barber
- Processes and visualizes sector-level timing data, contributing to the feature extraction used by downstream modeling and strategy work.
06_barber_telemetry_r2.ipynb - Explores the race 2 telemetry for Barber Motorsports Park
- Validates data availability, performs preliminary analysis, and generates visual insights into patterns or anomalies in the telemetry
- Supports data-driven validation of the channels used to build lap-level features.
04_barber_lap_times_r1.ipynb - The notebooks/04_barber_lap_times_r1.ipynb file serves as an analytical exploration within the project, focusing on processing and visualizing lap time data related to barber race events
- Its primary purpose is to analyze race performance metrics, providing insights into lap times and patterns that contribute to understanding race dynamics
- This notebook supports the broader codebase by enabling data-driven assessments of race performance, which can inform model development, performance optimization, or strategic decision-making within the overall architecture.
02_vir_sectors_r1r2.ipynb - The notebooks/02_vir_sectors_r1r2.ipynb notebook is designed to analyze and visualize the distribution of virtual sectors within the broader data processing pipeline
- It serves as a crucial step in understanding sector-specific patterns and relationships, supporting the projects goal of detailed data segmentation and insights extraction
- This notebook contributes to the overall architecture by enabling targeted analysis of virtual sectors, which can inform downstream modeling, reporting, or decision-making processes across the system.
10_barber_strategy_mvp.ipynb - The notebooks/10_barber_strategy_mvp.ipynb file serves as a core component for demonstrating and validating the Barber Strategy within the project
- It functions as a proof-of-concept or minimal viable product (MVP) notebook that showcases how the strategy operates in practice, providing insights into its effectiveness and potential application
- This notebook is integral to the overall architecture by enabling experimentation, testing, and visualization of the strategy's performance, thereby supporting iterative development and refinement of the trading or decision-making approach across the project.
01_explore_vir.ipynb - Summary of notebooks/01_explore_vir.ipynbThis notebook serves as an initial exploratory analysis of the VIR Tele dataset within the broader project architecture
- Its primary purpose is to understand the datas structure, quality, and key characteristics, laying the groundwork for subsequent data processing, modeling, and integration tasks
- By providing insights into the dataset, this notebook helps inform decisions on data cleaning, feature engineering, and overall pipeline design, ensuring that the project leverages high-quality, well-understood data for accurate and reliable outcomes.
09_barber_driver_profile.ipynb - The notebooks/09_barber_driver_profile.ipynb file serves as an analytical notebook focused on profiling driver behaviour and performance at Barber Motorsports Park
- It contributes to the broader data exploration and feature analysis efforts, helping to identify key patterns and insights for each driver on this circuit
- This work supports the overall architecture by informing data-driven decision-making and feature engineering strategies across the project.
13_vir_telemetry_r1.ipynb - Defines and retrieves metadata for Virginia International Raceway, integrating track-specific details into the broader telemetry data processing framework
- Facilitates contextual understanding of track characteristics, enabling accurate analysis and modeling within the overall telemetry and race data architecture
- Supports data organization and consistency across different data sources and analysis workflows.
03_barber_telemetry_r1.ipynb - The notebooks/03_barber_telemetry_r1.ipynb file serves as an analytical exploration focused on processing and visualising Race 1 telemetry from Barber Motorsports Park
- Within the broader codebase, it functions as a data analysis and validation tool, enabling stakeholders to understand patterns, performance metrics, and operational insights derived from the telemetry streams
- Its role is to support data-driven decision-making by transforming raw telemetry into meaningful visualisations, serving the project's overarching goal of monitoring and optimising race performance.
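
Several of the notebooks above slice laps into virtual sectors. A minimal sketch of that idea, assuming a telemetry frame with lap, lap-distance, and timestamp columns (hypothetical names, not the actual TRD schema):

```python
import numpy as np
import pandas as pd

# Synthetic two-lap telemetry stand-in: lap number, distance along the lap
# (m), and elapsed time (s). Column names here are illustrative assumptions.
telemetry = pd.DataFrame({
    "lap": np.repeat([1, 2], 100),
    "lap_distance_m": np.tile(np.linspace(0, 3700, 100), 2),
    "timestamp_s": np.concatenate([np.linspace(0, 95, 100),
                                   np.linspace(95, 192, 100)]),
})

N_SECTORS = 3
track_length = telemetry["lap_distance_m"].max()

# Assign each sample to a virtual sector via equal-distance splits.
telemetry["sector"] = pd.cut(
    telemetry["lap_distance_m"],
    bins=np.linspace(0, track_length, N_SECTORS + 1),
    labels=range(1, N_SECTORS + 1),
    include_lowest=True,
)

# Sector time = elapsed time between first and last sample in the sector.
sector_times = (
    telemetry.groupby(["lap", "sector"], observed=True)["timestamp_s"]
    .agg(lambda s: s.max() - s.min())
    .unstack("sector")
)
print(sector_times)
```

The resulting lap-by-sector table is what the distribution plots in the notebook would be built from.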
tools
⦿ tools
File Name Summary
extract_barber_r1_vehicle.py - Extracts telemetry data for a designated vehicle from a large dataset, enabling focused analysis of that vehicle's performance
- This script supports the overall data pipeline by isolating individual-vehicle data, facilitating detailed investigation or model training on specific vehicle behaviour within the broader telemetry architecture.
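
A minimal sketch of the extraction idea behind extract_barber_r1_vehicle.py, assuming a CSV telemetry dump with a `vehicle_id` column (a hypothetical name; the real TRD files and columns may differ):

```python
import io
import pandas as pd

def extract_vehicle(src, vehicle_id, chunksize=100_000):
    """Stream a large telemetry CSV in chunks and keep one vehicle's rows."""
    parts = [chunk[chunk["vehicle_id"] == vehicle_id]
             for chunk in pd.read_csv(src, chunksize=chunksize)]
    return pd.concat(parts, ignore_index=True)

# Tiny in-memory demo (the real script would read a multi-GB dump from disk).
demo = io.StringIO("vehicle_id,speed_kph\n7,100\n9,110\n7,102\n")
car7 = extract_vehicle(demo, vehicle_id=7, chunksize=2)
print(car7)
```

For truly large dumps, each filtered chunk can be appended to an output CSV as it is read, instead of concatenating everything in memory.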

📈 Roadmap

  • Task 1: Build Barber R2 strategy MVP and live race-engineer demo (animation + Streamlit + Gemini).
  • Task 1.5: Add predictive lap-time model, Strategy Chat, Decision Reviewer, and Gemini vision prototype.
  • Task 2: Generalise live tooling to VIR and additional tracks from the TRD dataset.
  • Task 3: Add richer Monte Carlo strategy simulations and multi-car race scenarios into the Streamlit UI.
  • Task 4: Real-time voice assistant on top of Strategy Chat + Decision Reviewer.
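
The Monte Carlo work in the roadmap can be illustrated with a toy pit-window simulation; every constant below (pit loss, degradation rate, lap noise) is an illustrative assumption, not a fitted GR86 value:

```python
import numpy as np

rng = np.random.default_rng(42)

RACE_LAPS = 25
BASE_LAP = 95.0      # s, fresh-tyre pace (assumed)
DEG_PER_LAP = 0.08   # s of pace lost per lap of tyre age (assumed)
PIT_LOSS = 28.0      # s lost to a pit stop (assumed)
LAP_NOISE = 0.3      # s, per-lap random variation (assumed)
N_SIMS = 2000

def race_time(pit_lap, rng):
    """Total race time for a one-stop race pitting at pit_lap."""
    total, tyre_age = 0.0, 0
    for lap in range(1, RACE_LAPS + 1):
        total += BASE_LAP + DEG_PER_LAP * tyre_age + rng.normal(0, LAP_NOISE)
        tyre_age += 1
        if lap == pit_lap:
            total += PIT_LOSS
            tyre_age = 0
    return total

# Simulate many races per candidate pit lap and pick the lowest mean time.
candidates = range(5, 21)
mean_times = {p: np.mean([race_time(p, rng) for _ in range(N_SIMS)])
              for p in candidates}
best_lap = min(mean_times, key=mean_times.get)
print(f"best pit lap ~ {best_lap}")
```

With linear degradation the optimum sits near mid-race; the real strategy engine layers on caution probabilities, traffic, and multi-car scenarios.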

🤝 Contributing

  • 💬 Join the Discussions – Ideas, feedback, and questions.
  • 🐛 Report Issues – Bugs, edge cases, or feature requests.
  • 💡 Submit Pull Requests – Improvements to strategy models, visualisations, predictive modeling, or Gemini prompts are very welcome.
Contributing Guidelines
  1. Fork the Repository – use the Fork button on GitHub, or the GitHub CLI:

    gh repo fork https://github.com/ngstephen1/DataScienceHackbyToyota
  2. Clone Locally

    git clone https://github.com/<your-username>/DataScienceHackbyToyota
    cd DataScienceHackbyToyota
  3. Create a New Branch

    git checkout -b feature/my-improvement
  4. Make Your Changes – and run the demo / manual tests.

  5. Commit Your Changes

    git commit -m "Add <short description of change>"
  6. Push to GitHub

    git push origin feature/my-improvement
  7. Open a Pull Request – Describe what you changed and why.

Contributor Graph


📜 License

DataScienceHackbyToyota is released under the MIT License.
See the LICENSE file for details.


✨ Acknowledgments

  • Toyota Racing Development (TRD) for providing the GR86 GR Cup data and Barber circuit map.
  • DataScienceHackbyToyota 2025 organisers for framing the “real-time race engineer” challenge.
  • Open-source libraries: numpy, pandas, matplotlib, streamlit, scikit-learn, google-generativeai, and others in requirements.txt.