This project uses machine learning to predict Formula 1 race results, specifically for the 2025 Chinese Grand Prix.
- Race time prediction based on historical data
- Gradient Boosting as the prediction model
- Analysis of data from multiple races
- Visualization of results
- Python 3.9+
- fastf1
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- Clone the repository:
git clone https://github.com/benjamack/DataScience.git
cd DataScience
- Install dependencies:
pip install -r requirements.txt
To run the predictions:
python prediction1.py
The model predicts race times for the 2025 Chinese GP using historical data from the first races of 2024.
- Mean Absolute Error (MAE): 3.47 seconds
- prediction1.py: Main prediction script
- requirements.txt: List of dependencies
- f1_cache/: Cache directory for FastF1 data
Welcome to the F1 Predictions 2025 repository! This project uses machine learning, FastF1 API data, and historical F1 race results to predict race outcomes for the 2025 Formula 1 season.
This repository contains a Gradient Boosting Machine Learning model that predicts race results based on past performance, qualifying times, and other structured F1 data. The model leverages:
- FastF1 API for historical race data
- 2024 race results
- 2025 qualifying session results
- Additional data, added over the course of the season to improve the model
- Feature engineering techniques to improve predictions
- FastF1 API: Fetches lap times, race results, and telemetry data
- 2025 Qualifying Data: Used for prediction
- Historical F1 Results: Processed from FastF1 for training the model
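The data sources above can be pulled with a few FastF1 calls. The following is a minimal sketch, not the repository's actual code: the helper names `ensure_cache_dir` and `load_race_laps` are hypothetical, the cache directory name `f1_cache` is borrowed from the file list in this README, and converting `LapTime` to seconds is an assumption about how lap times are prepared. `fastf1` is imported inside the function so the module itself loads without network access.

```python
import os


def ensure_cache_dir(path="f1_cache"):
    """Create the FastF1 cache directory if it is missing and return its path."""
    os.makedirs(path, exist_ok=True)
    return path


def load_race_laps(year, round_number, cache_dir="f1_cache"):
    """Return one row per lap with the driver code and the lap time in seconds.

    Hypothetical helper: it downloads data on first use, so it needs network access.
    """
    import fastf1  # imported lazily; only needed when data is actually fetched

    # The cache directory must exist before FastF1 can use it.
    fastf1.Cache.enable_cache(ensure_cache_dir(cache_dir))

    session = fastf1.get_session(year, round_number, "R")  # "R" selects the race
    session.load()

    laps = session.laps[["Driver", "LapTime"]].dropna().copy()
    # Assumption: the model works with lap times expressed in seconds.
    laps["LapTime (s)"] = laps["LapTime"].dt.total_seconds()
    return laps
```

Calling `load_race_laps(2024, 1)` would then fetch the laps of the first round of 2024 and reuse the cached files on later runs.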
- Data Collection: The script pulls relevant F1 data using the FastF1 API.
- Preprocessing & Feature Engineering: Converts lap times, normalizes driver names, and structures race data.
- Model Training: A Gradient Boosting Regressor is trained using 2024 race results.
- Prediction: The model predicts race times for 2025 and ranks drivers accordingly.
- Evaluation: Model performance is measured using Mean Absolute Error (MAE).
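The training, prediction, and evaluation steps above can be sketched with scikit-learn. This is an illustration under stated assumptions rather than the project's script: the data is synthetic (not real F1 results), qualifying time is the only feature, the driver labels are placeholders, and the hyperparameters are arbitrary choices.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in data (NOT real F1 results): one row per driver-race pair.
n_samples = 200
qualifying_time = rng.uniform(88.0, 94.0, size=n_samples)  # seconds
race_lap_time = qualifying_time + 3.0 + rng.normal(0.0, 0.5, size=n_samples)

X = qualifying_time.reshape(-1, 1)  # feature matrix: qualifying time only
y = race_lap_time                   # target: race lap time in seconds

# Hold out 20% of the rows to estimate the error on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, random_state=42
)
model.fit(X_train, y_train)

mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"Model Error (MAE): {mae:.2f} seconds")

# Rank a hypothetical grid by predicted time, fastest first.
drivers = ["DRIVER_A", "DRIVER_B", "DRIVER_C"]  # placeholder labels
new_qualifying = np.array([[90.1], [89.4], [91.2]])
predicted = model.predict(new_qualifying)
ranking = [drivers[i] for i in np.argsort(predicted)]
print(ranking)
```

Holding out a test split keeps the reported MAE from being measured on the same rows the model was fitted on.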
fastf1
numpy
pandas
scikit-learn
matplotlib
- Prediction scripts are numbered to match each race's position on the calendar, e.g. prediction1 for Australia, prediction2 for China, and so on.
Run the prediction script:
python3 prediction1.py
Expected output:
🏁 Predicted 2025 Australian GP Winner 🏁
Driver: Charles Leclerc, Predicted Race Time: 82.67s
...
🔍 Model Error (MAE): 3.22 seconds
The Mean Absolute Error (MAE) is used to evaluate how well the model predicts race times. Lower MAE values indicate more accurate predictions.
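As a small worked example of the metric, the sketch below computes MAE by hand in plain Python. The race times are made up for illustration, and the local `mean_absolute_error` function is a stand-in written for this example, not the project's code.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up race times in seconds, for illustration only.
actual = [90.0, 92.0, 95.0]
predicted = [91.0, 90.0, 98.0]

mae = mean_absolute_error(actual, predicted)
print(mae)  # → 2.0, since the absolute errors are 1.0, 2.0 and 3.0
```

An MAE of 2.0 here means the predictions miss the actual time by two seconds on average, regardless of whether they are too fast or too slow.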
- Incorporate weather conditions as a feature
- Add pit stop strategies into the model
- Explore deep learning models for improved accuracy
- @mar_antaya on Instagram and TikTok will post the latest predictions before every race of the 2025 F1 season
This project is licensed under the MIT License.
🏎️ Start predicting F1 races like a data scientist! 🚀