# 01 - Explore Toyota GR Dataset

In this notebook, I will be learning Python and pandas by exploring the raw Toyota GR telemetry data.
The goal is to understand:
- what files exist in data/raw
- how to locate and load them using pandas
- how to inspect the structure of a dataset (columns, rows, dtypes)
- how to practice basic filtering and grouping

This is the starting point of my "learn while building" workflow.

## Section A - Setup & Goals

In this section, I will clarify the purpose of this notebook and what I want to learn. This helps me stay organized and intentional as I write code. 

## Section B - Imports & Paths

Before loading any dataset, I need to import two tools:
- pandas -> helps read and manipulate tables of data
- pathlib -> helps build file system paths clearly

I will import them in the next cell. Make sure the pandas package is installed before importing it.

In [19]:
import pandas as pd
from pathlib import Path
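
If I am not sure whether pandas is installed, a small check like the one below can confirm it without raising an `ImportError`. This is only a sketch: `is_installed` is a hypothetical helper name I made up, and the `pip install pandas` hint assumes pip is the package manager in use.

```python
from importlib.util import find_spec


def is_installed(package: str) -> bool:
    """Return True if `package` can be imported in the current environment."""
    # find_spec returns None when a top-level module cannot be found.
    return find_spec(package) is not None


if not is_installed("pandas"):
    # Assumption: pip manages packages in this environment.
    print("pandas is missing; install it first, e.g. with: pip install pandas")
```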

## Section C - Locate data/raw

- Why do we need to find the current working directory? Relative paths are resolved from the cwd, so I need to know where data/raw sits relative to it before I can reach the dataset.
- Why is navigating with code better than hardcoding "../data/raw"? Building the path once in code and storing it in a variable means that if the folder structure changes, I only have to update it in one place. A hardcoded string has to be updated everywhere it is used.
- Why does exploring files programmatically matter for this project? Later steps, such as calculating lap times, read these files, so I need to know which files exist and where they are.

To find the current working directory and print it:

In [20]:
cwd = Path.cwd()
print(cwd)

c:\Users\krish\OneDrive\Desktop\projects\gr-race-engineer\notebooks


To move from the notebook folder to its parent:

In [21]:
parent_dir = cwd.parent
print(parent_dir)

c:\Users\krish\OneDrive\Desktop\projects\gr-race-engineer


To append "data" and "raw" to the parent path and store it in a variable:

In [26]:
data_path = parent_dir.joinpath("data", "raw")
print(data_path)

c:\Users\krish\OneDrive\Desktop\projects\gr-race-engineer\data\raw
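
The steps above assume the notebook is started from the notebooks folder, one level below the project root. A slightly more defensive sketch walks up from the cwd until it finds a data/raw folder. `find_data_raw` is a hypothetical helper name of my own, not something pathlib provides.

```python
from pathlib import Path


def find_data_raw(start: Path, parts=("data", "raw")) -> Path:
    """Walk upward from `start` until a folder containing data/raw is found."""
    start = start.resolve()
    # Check the starting folder first, then each parent up to the drive root.
    for folder in (start, *start.parents):
        candidate = folder.joinpath(*parts)
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"No {'/'.join(parts)} folder found above {start}")
```

Calling `find_data_raw(Path.cwd())` from the notebooks folder should give the same path as `data_path`, and it would keep working if the notebook were moved into a deeper subfolder.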


To show the files inside `data_path`:

In [27]:
for entry in data_path.iterdir():
    print(entry.name)

COTA


## Section D - Explore files in COTA

Here we will explore the files inside the COTA folder to decide which dataset to load first.

- Why inspect inside the COTA folder? To see which files it contains and decide which one to load first.
- What kind of files are expected? .csv files.
- Why does this matter for loading data? I need to confirm that the folder is not empty and that the file I pick is the right one to load.
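
The "not empty" and "expected file type" checks can be sketched as a small helper. `list_csv_files` is a hypothetical name of my own. The suffix is compared in lowercase in case some files use an uppercase `.CSV` extension.

```python
from pathlib import Path


def list_csv_files(folder: Path) -> list:
    """Return the CSV files directly inside `folder`, sorted by path."""
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a folder")
    # Lowercase the suffix so both ".csv" and ".CSV" match.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )
```

An empty list means the folder has no CSV files directly inside it, which also happens when it only holds subfolders.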

To build the path to the COTA folder and list its contents:

In [28]:
cota_path = data_path.joinpath("COTA")

for entry in cota_path.iterdir():
    print(entry.name)

Race 1
Race 2


## Section D.2 - Inspecting COTA Race Data

I see that the COTA folder contains separate race sessions (“Race 1” and “Race 2”).
I will explore one of them to find the actual data files (laps, telemetry, session metadata, etc.).


In [29]:
race_path = cota_path / "Race 1"

for entry in race_path.iterdir():
    print(entry.name)

00_Results GR Cup Race 1 Official_Anonymized.CSV
03_Provisional Results_Race 1_Anonymized.CSV
05_Provisional Results by Class_Race 1_Anonymized.CSV
23_AnalysisEnduranceWithSections_Race 1_Anonymized.CSV
26_Weather_Race 1_Anonymized.CSV
99_Best 10 Laps By Driver_Race 1_Anonymized.CSV
COTA_lap_end_time_R1.csv
COTA_lap_start_time_R1.csv
COTA_lap_time_R1.csv
R1_cota_telemetry_data.csv
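
With ten files in Race 1, file size is one cheap signal for choosing what to load first. My guess is that the telemetry file is much larger than the results files, but I have not verified that here. The sketch below lists each file with its size, smallest first. `describe_files` is a hypothetical helper name of my own.

```python
from pathlib import Path


def describe_files(folder: Path) -> list:
    """Return (name, size in MB) for each file in `folder`, smallest first."""
    rows = []
    for entry in folder.iterdir():
        if entry.is_file():
            # st_size is in bytes; divide by 1,000,000 to get megabytes.
            size_mb = entry.stat().st_size / 1_000_000
            rows.append((entry.name, size_mb))
    return sorted(rows, key=lambda row: row[1])


# Example usage with the race_path variable from the cell above:
# for name, size_mb in describe_files(race_path):
#     print(f"{size_mb:10.2f} MB  {name}")
```

Starting with the smallest files keeps the first pandas loads fast while I am still learning how the data is laid out.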
