# Phase II: Data Curation, Exploratory Analysis and Plotting (5\%)

### Group Members: Lauren Cummings, Riyana Roy, Satvik Repaka, Justin Huang

#### Due (Each Group): October 25

Each **project group** will submit a single **Jupyter notebook** which contains:

1. (1\%) Expresses the central motivation of the project and explains the (at least) two key questions to be explored. Gives a summary of the data processing pipeline so a technical expert can easily follow along.
2. (2\%) Obtains, cleans, and merges all data sources involved in the project.
3. (2\%) Builds at least two visualizations (graphs/plots) from the data which help to understand or answer the questions of interest. These visualizations will be graded based on how much information they can effectively communicate to readers. Please make sure your visualizations are sufficiently distinct from each other.

# Project Details

### **Central Motivation**  
This project examines how tire strategy and weather conditions influence Formula 1 race outcomes. Analyzing these factors together should show how teams optimize performance and adjust their decisions as conditions change.

---

### **Key Questions:**  
1. How does tire strategy impact driver performance and race outcomes under varying weather conditions? 
2. How do changing weather conditions influence pit stop timing and tire choices throughout the race?

---

### **Data Processing Overview:**  

1. **Data Ingestion**: Collect data from race sessions, including event logs, telemetry, and weather data.  

2. **Preprocessing**:  
   - **Laps Data:**  
     - Fill missing pit stop times.  
     - Remove deleted or inaccurate laps.  
     - Map track statuses to meaningful labels (e.g., Safety Car, Yellow Flag).  

   - **Telemetry Data:**    
     - Remove outliers in speed, RPM, and gears.  

   - **Weather Data:**  
     - Align weather timestamps with lap data for consistency.  
     - Filter out unrealistic wind speed values.  
     - Group consecutive rainfall events into a single rain period.  
     - Fill missing temperature and pressure values using forward filling.


3. **Exploratory Data Analysis (EDA)**: Visualize strategy patterns and weather impacts.  
4. **Modeling/Analysis**: Apply statistical methods to find relationships between tires, weather, and strategy.  
5. **Reporting**: Write a report that answers the key questions.  

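The laps preprocessing steps listed above could look roughly like the sketch below. It runs on a small hand-made table rather than real session data. The column names `Deleted`, `IsAccurate`, and `TrackStatus` are assumed to match the FastF1 laps table, and the code-to-label mapping is an assumption that should be checked against the FastF1 documentation. The helper names `label_track_status` and `clean_laps` are hypothetical.

```python
import pandas as pd

# Assumed meaning of the track status codes; verify against the FastF1 docs.
TRACK_STATUS_LABELS = {
    "1": "Track Clear",
    "2": "Yellow Flag",
    "4": "Safety Car",
    "5": "Red Flag",
    "6": "Virtual Safety Car",
    "7": "VSC Ending",
}


def label_track_status(codes) -> str:
    """Translate a string of status codes (e.g. '24') into readable labels."""
    return ", ".join(
        TRACK_STATUS_LABELS.get(code, f"Unknown ({code})") for code in str(codes)
    )


def clean_laps(laps: pd.DataFrame) -> pd.DataFrame:
    """Drop deleted or inaccurate laps and add a readable status column."""
    # eq(True) treats missing flags as False, so such laps are not kept as accurate
    keep = laps["IsAccurate"].eq(True) & ~laps["Deleted"].eq(True)
    cleaned = laps.loc[keep].copy()
    cleaned["TrackStatusLabel"] = cleaned["TrackStatus"].map(label_track_status)
    return cleaned


# Toy data for illustration only (not taken from a real session)
toy_laps = pd.DataFrame({
    "LapNumber": [1, 2, 3, 4],
    "TrackStatus": ["1", "24", "4", "1"],
    "Deleted": [False, False, True, False],
    "IsAccurate": [True, True, True, False],
})
cleaned_laps = clean_laps(toy_laps)
print(cleaned_laps[["LapNumber", "TrackStatusLabel"]])
```

A lap can carry several status codes at once, which is why each character of the code string is mapped separately.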

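The overview does not say how telemetry outliers are detected. One simple option is a plausibility-range filter, sketched here on toy data. The bounds in `TELEMETRY_BOUNDS` are illustrative assumptions, not values taken from this project, and `remove_telemetry_outliers` is a hypothetical helper. The column names `Speed`, `RPM`, and `nGear` are assumed to match the FastF1 telemetry table.

```python
import pandas as pd

# Hypothetical plausibility bounds (inclusive); tune them for the real data.
TELEMETRY_BOUNDS = {
    "Speed": (0, 380),    # km/h
    "RPM": (0, 15000),
    "nGear": (0, 8),
}


def remove_telemetry_outliers(telemetry: pd.DataFrame,
                              bounds: dict = TELEMETRY_BOUNDS) -> pd.DataFrame:
    """Keep only samples whose values fall inside every configured range."""
    mask = pd.Series(True, index=telemetry.index)
    for column, (low, high) in bounds.items():
        # between() is False for missing values, so those rows are dropped too
        mask &= telemetry[column].between(low, high)
    return telemetry.loc[mask].reset_index(drop=True)


# Toy data for illustration only
toy_telemetry = pd.DataFrame({
    "Speed": [120, 310, 999, 250],
    "RPM": [9000, 11800, 11000, 25000],
    "nGear": [3, 8, 7, 6],
})
clean_telemetry = remove_telemetry_outliers(toy_telemetry)
print(clean_telemetry)
```

A statistical rule such as an interquartile-range cutoff would be an alternative if fixed bounds turn out to be too coarse.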

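The weather steps above can be sketched with plain pandas on toy data. This is a minimal illustration under stated assumptions: the 30 m/s wind cutoff is arbitrary, implausible wind readings are handled by dropping the whole row, and `clean_weather` and `attach_weather` are hypothetical helpers. The column names (`Time`, `AirTemp`, `Pressure`, `WindSpeed`, `Rainfall`, `LapStartTime`) are assumed to match the FastF1 tables.

```python
import pandas as pd


def clean_weather(weather: pd.DataFrame, max_wind_speed: float = 30.0) -> pd.DataFrame:
    """Filter wind speed, forward-fill gaps, and number the rain periods."""
    out = weather.sort_values("Time")
    # Drop implausible wind readings (rows with missing wind speed are dropped too)
    out = out[out["WindSpeed"].between(0, max_wind_speed)].reset_index(drop=True)
    # Carry the last valid temperature and pressure forward over gaps
    out[["AirTemp", "Pressure"]] = out[["AirTemp", "Pressure"]].ffill()
    # A rain period starts wherever Rainfall switches from False to True
    raining = out["Rainfall"].astype(bool)
    starts = raining & ~raining.shift(fill_value=False)
    out["RainPeriod"] = starts.cumsum().where(raining).astype("Int64")
    return out


def attach_weather(laps: pd.DataFrame, weather: pd.DataFrame) -> pd.DataFrame:
    """Give each lap the latest weather sample taken at or before its start."""
    return pd.merge_asof(
        laps.sort_values("LapStartTime"),
        weather.rename(columns={"Time": "WeatherTime"}),
        left_on="LapStartTime",
        right_on="WeatherTime",
        direction="backward",
    )


# Toy data for illustration only
toy_weather = pd.DataFrame({
    "Time": pd.to_timedelta([0, 1, 2, 3, 4, 5], unit="min"),
    "AirTemp": [10.0, None, 10.4, 10.5, None, 10.8],
    "Pressure": [1000.0, 1000.1, None, 1000.2, 1000.2, 1000.3],
    "WindSpeed": [1.0, 2.0, 250.0, 1.5, 1.2, 0.8],
    "Rainfall": [True, True, False, False, True, False],
})
toy_lap_starts = pd.DataFrame({
    "LapNumber": [1, 2, 3, 4],
    "LapStartTime": pd.to_timedelta(["00:00:30", "00:01:30", "00:03:10", "00:04:20"]),
})
merged = attach_weather(toy_lap_starts, clean_weather(toy_weather))
print(merged[["LapNumber", "AirTemp", "RainPeriod"]])
```

Matching backward in time means a lap never receives a weather reading recorded after it started.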
# Imported Libraries

In [12]:
import os
import fastf1
import logging

# Data Retrieval

In [27]:
# Create the cache directory if needed, then enable the FastF1 cache
os.makedirs('cache', exist_ok=True)
fastf1.Cache.enable_cache('cache')

# Only show ERROR-level FastF1 log messages (suppresses INFO and WARNING)
logging.getLogger('fastf1').setLevel(logging.ERROR)

# Load session data for the 2021 Imola race ('R' selects the race session)
session = fastf1.get_session(2021, 'Imola', 'R')
session.load()

# Extract data
laps = session.laps
print("Laps Data:")
print(laps.head())

# Telemetry is pulled only for the single fastest lap of the session
fastest_lap = laps.pick_fastest()
telemetry = fastest_lap.get_telemetry()
print("\nTelemetry Data:")
print(telemetry.head())

weather = session.weather_data
print("\nWeather Data:")
print(weather.head())


Laps Data:
                    Time Driver DriverNumber                LapTime  \
0 0 days 00:35:09.853000    GAS           10 0 days 00:01:54.003000   
1 0 days 00:37:32.883000    GAS           10 0 days 00:02:23.030000   
2 0 days 00:39:53.731000    GAS           10 0 days 00:02:20.848000   
3 0 days 00:42:18.428000    GAS           10 0 days 00:02:24.697000   
4 0 days 00:44:42.360000    GAS           10 0 days 00:02:23.932000   

   LapNumber  Stint PitOutTime PitInTime            Sector1Time  \
0        1.0    1.0        NaT       NaT                    NaT   
1        2.0    1.0        NaT       NaT 0 days 00:00:43.289000   
2        3.0    1.0        NaT       NaT 0 days 00:00:42.977000   
3        4.0    1.0        NaT       NaT 0 days 00:00:42.573000   
4        5.0    1.0        NaT       NaT 0 days 00:00:41.394000   

             Sector2Time  ... FreshTyre        Team           LapStartTime  \
0 0 days 00:00:37.569000  ...      True  AlphaTauri 0 days 00:33:15.688000   
1 0