# Notebook 03: Feature Engineering for F1 Tyre Prediction (Colab)

This notebook orchestrates the feature engineering process on the consolidated F1 dataset. It uses the `colab_feature_engineering.py` and `colab_create_features_pipeline.py` scripts.

**Prerequisites:**
-   Successful execution of `02_Data_Consolidation_Colab.ipynb`.
-   Consolidated `dataset.parquet` present in the `F1/colab/drive/processed_data/` directory on Google Drive.
-   Configuration file (`colab_path_config.yaml`) correctly set up.

**Workflow:**
1.  **Mount Google Drive**: Access project files.
2.  **Set Project Path**: Navigate to the correct project directory on Drive.
3.  **Install Dependencies**: Usually already done in an earlier notebook; included for completeness.
4.  **Verify Configuration**: Remind the user to check `colab_path_config.yaml`.
5.  **Run Feature Engineering Pipeline Script**: Execute `colab_create_features_pipeline.py`.
6.  **Review Outputs**: Check logs, the feature-engineered dataset (`featured_dataset.parquet`), and saved artifacts (encoders, scalers, imputer values) on Google Drive.

## 1. Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## 2. Set Project Path

Navigate to the `F1/colab/` directory within your Google Drive.
**IMPORTANT:** Update the `%cd` path below if your project lives at a different location on Drive or if you are running this notebook standalone.

In [None]:
# USER ACTION REQUIRED: Update the path on the %cd line if necessary!
# Example: %cd "/content/drive/My Drive/Colab Notebooks/FASTF1/F1/colab/"
# Keep the %cd line free of trailing comments: the magic may treat them as part of the path.
%cd "/content/drive/My Drive/path/to/your/FASTF1/F1/colab/"

## 3. Install Dependencies (if not already installed)

In [None]:
# !pip install -r requirements.txt # Uncomment if needed

## 4. Verify Configuration Files

Ensure `F1/colab/configs/colab_path_config.yaml` is correctly set up, especially `base_project_drive_path`.

In [None]:
print("--- Contents of F1/colab/configs/colab_path_config.yaml ---")
!cat "configs/colab_path_config.yaml"
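
Before running the pipeline, it can help to confirm that the inputs named in the prerequisites are actually in place. The following is a minimal sketch, not part of the project scripts: the helper `find_missing` is hypothetical, and the relative paths assume the working directory is `F1/colab/` and that the `drive/` folder sits directly beneath it, as the paths in this notebook suggest.

```python
from pathlib import Path

# Paths relative to F1/colab/ (the working directory set in step 2).
# ASSUMPTION: this layout is inferred from the paths named in this notebook.
REQUIRED_INPUTS = [
    "configs/colab_path_config.yaml",
    "drive/processed_data/dataset.parquet",
    "src/feature_engineering/colab_create_features_pipeline.py",
]


def find_missing(paths, root="."):
    """Return the entries of `paths` that do not exist as files under `root`."""
    root = Path(root)
    return [p for p in paths if not (root / p).is_file()]


missing = find_missing(REQUIRED_INPUTS)
if missing:
    print("Missing inputs:")
    for p in missing:
        print(f"  - {p}")
else:
    print("All required inputs found.")
```

If anything is reported missing, re-check the `%cd` path from step 2 and the output of `02_Data_Consolidation_Colab.ipynb` before continuing.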

## 5. Run Feature Engineering Pipeline Script

This command executes the `colab_create_features_pipeline.py` script located in `F1/colab/src/feature_engineering/`.

In [None]:
# Ensure you are in the F1/colab/ directory for the script path to be correct
!python src/feature_engineering/colab_create_features_pipeline.py
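
A `!python` cell does not raise a Python exception when the script exits with a non-zero status, so a failed run can be easy to miss when executing all cells. As an optional alternative, the sketch below runs a script through `subprocess` and raises on failure. The helper name `run_script` is hypothetical and not part of the project.

```python
import subprocess
import sys


def run_script(script_path):
    """Run a Python script with the current interpreter.

    Prints the script's stdout and stderr, then raises
    subprocess.CalledProcessError if the exit status is non-zero.
    """
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,  # collect stdout/stderr instead of streaming them
        text=True,            # decode output as str rather than bytes
    )
    if result.stdout:
        print(result.stdout)
    if result.stderr:
        print(result.stderr, file=sys.stderr)
    result.check_returncode()
    return result
```

From the `F1/colab/` directory, this would be called as `run_script("src/feature_engineering/colab_create_features_pipeline.py")` in place of the `!python` cell, not in addition to it. Note that output appears only after the script finishes, unlike the live output of `!python`.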

## 6. Review Outputs and Logs

After the script finishes:
1.  **Check Script Output**: Review print statements and errors from the cell above.
2.  **Check `colab_feature_engineering.log` and `colab_create_features_pipeline.log`**: Review logs in `F1/colab/drive/logs/`.
3.  **Verify `featured_dataset.parquet`**: Check that this file exists in `F1/colab/drive/feature_engineered_data/`.
4.  **Verify Artifacts**: Check for `colab_encoders.pkl`, `colab_scaler.pkl`, and `colab_imputer_values.json` in `F1/colab/drive/artifacts/`.
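
The checks above can also be done from a cell. The sketch below is illustrative only: the helpers `summarize_outputs` and `tail_lines` are hypothetical, and the relative paths assume the working directory is still `F1/colab/`.

```python
from pathlib import Path

# Output files named in the checklist above, relative to F1/colab/.
# ASSUMPTION: the drive/ folder sits directly under the working directory.
EXPECTED_OUTPUTS = [
    "drive/logs/colab_feature_engineering.log",
    "drive/logs/colab_create_features_pipeline.log",
    "drive/feature_engineered_data/featured_dataset.parquet",
    "drive/artifacts/colab_encoders.pkl",
    "drive/artifacts/colab_scaler.pkl",
    "drive/artifacts/colab_imputer_values.json",
]


def summarize_outputs(paths, root="."):
    """Map each relative path to its size in bytes, or None if it is missing."""
    root = Path(root)
    return {
        p: (root / p).stat().st_size if (root / p).is_file() else None
        for p in paths
    }


def tail_lines(path, n=20):
    """Return the last `n` lines of a text file as a list of strings."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return fh.read().splitlines()[-n:]


for path, size in summarize_outputs(EXPECTED_OUTPUTS).items():
    status = "MISSING" if size is None else f"{size:,} bytes"
    print(f"{path}: {status}")
```

To skim the end of a log, print the result of `tail_lines("drive/logs/colab_create_features_pipeline.log")` line by line. A file reported at 0 bytes is worth treating as suspicious even though it exists.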