This project analyzes and visualizes Formula 1 race data from 2009 to 2024, focusing on driver performance, team performance, and specific race events.
Dataset:
- Name: Formula 1 Race Data
- Source: Kaggle
- Link: https://www.kaggle.com/datasets/jtrotman/formula-1-race-data
Data Preprocessing:
- Handled missing values
- Converted categorical variables to numerical
- Converted recorded lap times to seconds
- Applied feature scaling
- Merged multiple CSV files to combine related information
- Filtered data to focus on relevant years (2009-2024) and specific drivers or teams of interest.
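The lap-time conversion, merging, and year-filtering steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names (`raceId`, `driverId`, `year`, `fastestLapTime`, `surname`) and the `\N` missing-value marker are assumptions about the Kaggle CSV layout, the tiny DataFrames stand in for the real files, and `lap_time_to_seconds` is a hypothetical helper.

```python
import pandas as pd


def lap_time_to_seconds(value):
    """Convert 'M:SS.mmm' (or 'SS.mmm') to float seconds; NaN if unparseable."""
    if not isinstance(value, str) or not value.strip():
        return float("nan")
    total = 0.0
    try:
        # Each ':' separates a base-60 unit, so fold left to right.
        for part in value.strip().split(":"):
            total = total * 60 + float(part)
    except ValueError:
        return float("nan")
    return total


# Stand-ins for races.csv, results.csv and drivers.csv (made-up rows).
races = pd.DataFrame({
    "raceId": [1, 2, 3],
    "year": [2008, 2023, 2023],
    "name": ["Race X", "Race Y", "Race Z"],
})
results = pd.DataFrame({
    "raceId": [1, 2, 2, 3],
    "driverId": [10, 10, 20, 20],
    "fastestLapTime": ["1:27.452", "1:29.708", "\\N", "1:15.650"],
})
drivers = pd.DataFrame({"driverId": [10, 20], "surname": ["Alpha", "Beta"]})

# Lap times as seconds; the assumed "\N" marker becomes NaN.
results["fastest_lap_s"] = results["fastestLapTime"].map(lap_time_to_seconds)

# Merge the separate tables on their shared keys.
merged = (
    results.merge(races, on="raceId", how="left")
           .merge(drivers, on="driverId", how="left")
)

# Keep only the seasons of interest, then drop rows without a lap time.
in_range = merged[merged["year"].between(2009, 2024)]
clean = in_range.dropna(subset=["fastest_lap_s"])
```

With the real files, the three stand-in DataFrames would be replaced by `pd.read_csv(...)` calls, optionally passing `na_values="\\N"` if that marker is indeed used.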
Analysis Framework:
- Exploratory Data Analysis (EDA)
- Utilized pandas for efficient data manipulation and management
- Data visualization using Matplotlib and Seaborn
Analysis:
- Visualized fastest lap times, highest ranks, and performance trends for drivers and teams.
- Identified Lando Norris's fastest lap times and highest positions each year.
- Analyzed Miami Grand Prix fastest laps from 2022 to 2024 (the race was first held in 2022).
- Evaluated team and driver points performance for the 2023 season.
- Determined drivers with most pole positions and wins without pole positions in 2023.
- Checked consistency of driver performance in 2023 and analyzed the relationship between qualifying and final race positions.
- Examined the distribution of points and wins for specific drivers over the years.
- Driver with the most pole positions and wins in 2023: Max Verstappen.
- Driver with the most wins without pole in 2023: Max Verstappen.
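As a rough sketch of how the pole, win, consistency, and qualifying-versus-race questions above could be computed with pandas, consider the example below. The rows are invented toy data (drivers "A", "B", "C"), not real 2023 results, and the column names `grid` and `positionOrder` are assumptions about the results table; pole is taken to mean a grid slot of 1.

```python
import pandas as pd

# Three toy races, three drivers each (made-up values, not real results).
results_2023 = pd.DataFrame({
    "driver":        ["A", "B", "C", "A", "B", "C", "A", "B", "C"],
    "grid":          [1, 2, 3, 2, 1, 3, 3, 1, 2],
    "positionOrder": [1, 2, 3, 1, 2, 3, 2, 1, 3],
})

started_on_pole = results_2023["grid"] == 1
won_race = results_2023["positionOrder"] == 1

# Count poles, wins, and wins from anywhere other than pole, per driver.
poles = results_2023.loc[started_on_pole, "driver"].value_counts()
wins = results_2023.loc[won_race, "driver"].value_counts()
wins_without_pole = results_2023.loc[won_race & ~started_on_pole, "driver"].value_counts()

# Consistency: a lower standard deviation of finishing position is steadier.
consistency = results_2023.groupby("driver")["positionOrder"].std()

# Rank correlation between starting and finishing position
# (Pearson correlation of the ranks, which equals Spearman's rho).
grid_finish_corr = results_2023["grid"].rank().corr(results_2023["positionOrder"].rank())
```

`poles.idxmax()` and `wins_without_pole.idxmax()` then give the leading driver for each question. On real data, rows with a grid value that does not represent a normal start (for example a pit-lane start) may need to be filtered out before computing the correlation.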
Limitations:
- The dataset may not capture underlying factors such as engine, power, and tire age.
- The accuracy of the analysis could be improved by including such additional factors.
- The analysis focused on specific years and drivers, potentially overlooking broader trends.
Potential Developments:
- Incorporate external factors such as weather conditions, track characteristics, and car specifications into the analysis.
- Develop interactive dashboards for better data exploration and visualization.
- Conduct more detailed comparisons between drivers across different eras and teams.
Visualisation:
- The project uses Matplotlib and Seaborn to create various plots, such as bar plots, line plots, heatmaps, and density plots, to visualize the data and derive insights.
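A line plot of the kind described could look roughly like the sketch below. The numbers are placeholder values chosen only to make the plot render, not results from the dataset, and the output filename is hypothetical. Only Matplotlib is used here; Seaborn could be layered on top for styling.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values, NOT real lap times.
fastest = pd.DataFrame({
    "year": [2022, 2023, 2024],
    "fastest_lap_s": [91.4, 89.7, 90.6],
})

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.plot(fastest["year"], fastest["fastest_lap_s"], marker="o")
ax.set_xticks(fastest["year"])  # one tick per season, no fractional years
ax.set_xlabel("Season")
ax.set_ylabel("Fastest lap (s)")
ax.set_title("Fastest lap by season (placeholder data)")
fig.tight_layout()
fig.savefig("fastest_lap_by_season.png", dpi=150)  # hypothetical filename
```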