# Notebook 02: Data Consolidation for F1 Tyre Prediction (Colab)

This notebook handles the consolidation of raw F1 race data (extracted by `01_Data_Extraction_Colab.ipynb`) into a single processed dataset. It is designed to be run in a Google Colab environment and utilizes scripts and configurations stored in Google Drive.

**Prerequisites:**
-   Successful execution of `01_Data_Extraction_Colab.ipynb`.
-   Raw `.parquet` files present in the `F1/colab/drive/raw_data/` directory on Google Drive (path configured in `colab_path_config.yaml`).
-   Configuration files (`colab_path_config.yaml`) correctly set up.

**Workflow:**
1.  **Mount Google Drive**: Access project files.
2.  **Set Project Path**: Navigate to the correct project directory on Drive.
3.  **Install Dependencies**: Usually already done when running sequentially after Notebook 01; included so that the notebook can also be run standalone.
4.  **Verify Configuration**: Check that `colab_path_config.yaml` is set up correctly before running the script.
5.  **Run Data Consolidation Script**: Execute `colab_consolidate_data.py`.
6.  **Review Outputs**: Check the logs and confirm that the consolidated `dataset.parquet` has been saved to the `F1/colab/drive/processed_data/` directory on Google Drive and that `colab_consolidation_report.txt` has been written (see Section 6 for its location).

## 1. Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## 2. Set Project Path

Navigate to the `F1/colab/` directory within your Google Drive.

**IMPORTANT:** Update the `%cd` path below so that it points to the `FASTF1/F1/colab/` directory in your Google Drive. This is required if the location differs from the one used in the previous notebook or if you are running this notebook standalone.

In [None]:
# USER ACTION REQUIRED: Update the path on the %cd line below if necessary!
# Example: %cd "/content/drive/My Drive/Colab Notebooks/FASTF1/F1/colab/"
# Keep the %cd line free of trailing comments: a line magic receives the rest of the line as its argument.
%cd "/content/drive/My Drive/path/to/your/FASTF1/F1/colab/"
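
Once the `%cd` has run, it may be worth confirming that the working directory really is the project folder. The following is a minimal sketch, not part of the project scripts: `missing_project_files` is a hypothetical helper, and the two relative paths it checks are simply the ones this notebook uses later (`configs/colab_path_config.yaml` and `src/colab_consolidate_data.py`).

```python
from pathlib import Path

# Files this notebook references relative to F1/colab/ (see Sections 4 and 5).
EXPECTED_FILES = (
    "configs/colab_path_config.yaml",
    "src/colab_consolidate_data.py",
)


def missing_project_files(project_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under project_dir."""
    root = Path(project_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


# Check the current working directory, i.e. wherever %cd ended up.
missing = missing_project_files(Path.cwd())
if missing:
    print("This does not look like F1/colab/. Missing:", ", ".join(missing))
else:
    print("Project directory looks correct:", Path.cwd())
```

An empty result means both files were found; anything else suggests the `%cd` path still needs adjusting.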

## 3. Install Dependencies (if not already installed)

This step is generally only needed if you are running this notebook in a fresh Colab session or if dependencies were not installed by a previous notebook in the same session.

In [None]:
# !pip install -r requirements.txt # Uncomment if needed

## 4. Verify Configuration Files

Before running the consolidation, please ensure `F1/colab/configs/colab_path_config.yaml` is correctly set up in your Google Drive, especially the `base_project_drive_path`.

In [None]:
# Optional: Display the content of config files to verify paths
# Make sure you have navigated to F1/colab/ first using %cd

print("--- Contents of F1/colab/configs/colab_path_config.yaml ---")
!cat "configs/colab_path_config.yaml"
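
Reading the file by eye works, but the key paths can also be checked programmatically. The sketch below rests on assumptions that are not confirmed by the project: that `base_project_drive_path` is a top-level key in the YAML file, and that the raw files sit under the default `drive/raw_data/` location. Both `check_base_path` and `count_parquet_files` are hypothetical helpers.

```python
from pathlib import Path


def check_base_path(config):
    """Return a problem description, or None if base_project_drive_path looks valid.

    Assumption: `base_project_drive_path` is a top-level key of the parsed YAML.
    """
    base = config.get("base_project_drive_path")
    if not base:
        return "base_project_drive_path is missing or empty"
    if not Path(base).is_dir():
        return f"base_project_drive_path is not an existing directory: {base}"
    return None


def count_parquet_files(directory):
    """Count .parquet files under directory, recursively (0 if it does not exist)."""
    folder = Path(directory)
    if not folder.is_dir():
        return 0
    return sum(1 for _ in folder.rglob("*.parquet"))
```

In a Colab cell, assuming PyYAML is available, the parsed configuration could be obtained with `yaml.safe_load(Path("configs/colab_path_config.yaml").read_text())` and passed to `check_base_path`. Calling `count_parquet_files("drive/raw_data")` then gives a quick indication of whether Notebook 01 produced any raw files; adjust the path if your configuration places them elsewhere.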

## 5. Run Data Consolidation Script

This command executes the `colab_consolidate_data.py` script located in `F1/colab/src/`.
Output and logs from the script will be displayed below.

In [None]:
# Ensure you are in the F1/colab/ directory for the script path to be correct
!python src/colab_consolidate_data.py
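
If you would rather have the script's exit status available in Python than scan the output by eye, the script can also be launched with `subprocess`. This is an optional alternative to the `!python` cell, sketched under the assumption that the kernel's interpreter (`sys.executable`) is the one the script should run with; `run_script` is a hypothetical helper.

```python
import subprocess
import sys


def run_script(script_path):
    """Run a Python script with the current interpreter.

    Returns (exit_code, combined_output), where combined_output holds
    stdout and stderr interleaved as text.
    """
    result = subprocess.run(
        [sys.executable, str(script_path)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
    )
    return result.returncode, result.stdout
```

For example, `code, output = run_script("src/colab_consolidate_data.py")` followed by `print(output)` shows the same output as the cell above, and a non-zero `code` indicates that the script exited with an error. Unlike `!python`, the output appears only after the script has finished.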

## 6. Review Outputs and Logs

After the script finishes:
1.  **Check Script Output**: Review the print statements and any error messages from the cell above.
2.  **Check `colab_data_consolidation.log`**: Navigate to the `F1/colab/drive/logs/` directory (as configured) on your Google Drive and review the log file for detailed information.
3.  **Check `colab_consolidation_report.txt`**: This report should be in the `F1/colab/drive/logs/` directory, or in the `artifacts` directory if your configuration places it there. Review the summary statistics.
4.  **Verify `dataset.parquet`**: Check that the consolidated `dataset.parquet` file exists in the `F1/colab/drive/processed_data/` directory on Google Drive and inspect its properties (e.g., size).
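
Some of the checks above can be done from a notebook cell. The sketch below uses only the standard library and assumes the default locations named in this notebook, relative to `F1/colab/`; `looks_like_parquet` and `tail_lines` are hypothetical helpers. The Parquet check only tests for the `PAR1` marker that unencrypted Parquet files carry at the start and end, so it catches missing or truncated files but does not validate the contents.

```python
import os
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # marker at both ends of an unencrypted Parquet file


def looks_like_parquet(path):
    """Cheap sanity check: the file exists and has the magic bytes at both ends."""
    file = Path(path)
    if not file.is_file() or file.stat().st_size < 2 * len(PARQUET_MAGIC):
        return False
    with file.open("rb") as fh:
        head = fh.read(len(PARQUET_MAGIC))
        fh.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = fh.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def tail_lines(path, n=20):
    """Return the last n lines of a text file (empty list if missing or n <= 0)."""
    file = Path(path)
    if n <= 0 or not file.is_file():
        return []
    text = file.read_text(encoding="utf-8", errors="replace")
    return text.splitlines()[-n:]


# Default locations from this notebook; adjust if your configuration differs.
dataset = Path("drive/processed_data/dataset.parquet")
log_file = Path("drive/logs/colab_data_consolidation.log")

if looks_like_parquet(dataset):
    print(f"{dataset}: {dataset.stat().st_size / 1e6:.1f} MB")
else:
    print(f"{dataset} is missing or does not look like a Parquet file")

print("\n".join(tail_lines(log_file)))
```

For a closer look at the data itself, the file could be loaded with `pandas.read_parquet`, assuming pandas and a Parquet engine such as pyarrow are installed in the session.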