# 🚀 Execute TransportAnalytics on Vertex AI Notebook
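This notebook walks through the full TransportAnalytics workflow on Vertex AI: cloning the repository, installing dependencies, engineering features, training the models, running the pipeline, writing batch predictions to BigQuery, and exposing them to Looker Studio.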

## 1️⃣ Clone the Repository

In [None]:
!git clone https://github.com/YOUR_USERNAME/TransportAnalytics.git
%cd TransportAnalytics

## 2️⃣ Install Dependencies

In [None]:
!pip install -r requirements.txt

## 3️⃣ Upload Dataset to GCS (Optional, First Run Only)

In [None]:
!gsutil cp preprocessing/merged_feature_data.csv gs://mta-ridership-data/inputs/merged_feature_data.csv
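
Because this step is only needed the first time, a Python equivalent that skips the copy when the object already exists may be handy. The sketch below assumes the `google-cloud-storage` client library is installed and that the notebook's credentials can write to the bucket; `split_gcs_uri` and `upload_if_missing` are hypothetical helper names, not part of the repository.

```python
from pathlib import Path


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {uri}")
    bucket, _, object_name = uri[len("gs://"):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"URI must name both a bucket and an object: {uri}")
    return bucket, object_name


def upload_if_missing(local_path, gcs_uri):
    """Upload local_path to gcs_uri unless the object exists. Returns True if uploaded."""
    # Assumption: google-cloud-storage is available; imported lazily so the
    # URI helper above works without it.
    from google.cloud import storage

    if not Path(local_path).is_file():
        raise FileNotFoundError(local_path)
    bucket_name, object_name = split_gcs_uri(gcs_uri)
    blob = storage.Client().bucket(bucket_name).blob(object_name)
    if blob.exists():
        return False  # already uploaded on an earlier run
    blob.upload_from_filename(str(local_path))
    return True
```

Calling `upload_if_missing("preprocessing/merged_feature_data.csv", "gs://mta-ridership-data/inputs/merged_feature_data.csv")` would then mirror the `gsutil cp` command above while leaving an existing object untouched.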

## 4️⃣ Run Feature Engineering

In [None]:
!python preprocessing/feature_engineering.py

## 5️⃣ Train and Upload Models

In [None]:
!python training/train_ridership_model.py

In [None]:
!python training/train_mode_classifier.py

## 6️⃣ Compile and Run Vertex AI Pipeline

In [None]:
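from google.cloud import aiplatform

# Set the default project and region for all subsequent Vertex AI calls.
aiplatform.init(project="gps-ax-lakehouse", location="us-central1")

# template_path points to an already compiled pipeline spec (JSON).
pipeline_job = aiplatform.PipelineJob(
    display_name="ridership-pipeline",
    template_path="pipeline/vertex_ridership_pipeline.json",
    enable_caching=True,  # reuse cached results of unchanged steps
)

# run() submits the job and, by default, blocks until it finishes.
pipeline_job.run()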
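
The cell above loads a precompiled spec from `pipeline/vertex_ridership_pipeline.json`, so a missing or malformed file only surfaces when the job is created. A small pre-flight check, sketched below with the standard library only, can fail earlier with a clearer message. The function name `load_pipeline_template` is hypothetical and not part of the repository or the Vertex AI SDK.

```python
import json
from pathlib import Path


def load_pipeline_template(template_path):
    """Return the parsed pipeline spec, raising early if it is missing or malformed."""
    path = Path(template_path)
    if not path.is_file():
        raise FileNotFoundError(f"compiled pipeline spec not found: {path}")
    with path.open(encoding="utf-8") as handle:
        spec = json.load(handle)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(spec, dict):
        raise ValueError(f"expected a JSON object at the top level of {path}")
    return spec
```

Running `load_pipeline_template("pipeline/vertex_ridership_pipeline.json")` before constructing the `PipelineJob` confirms that the file exists and parses; it does not validate the spec against the pipeline schema.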

## 7️⃣ Run Batch Prediction to BigQuery

In [None]:
!python deployment/batch_prediction_gcs_to_bq.py

## 8️⃣ Create BigQuery View for Looker Studio

In [None]:
%%bigquery --project gps-ax-lakehouse
CREATE OR REPLACE VIEW `gps-ax-lakehouse.ridership_analytics.ridership_predictions_view` AS
SELECT
  *,
  FORMAT_DATE('%Y-%m-%d', DATE(batch_run_time)) AS prediction_date,
  CASE
    WHEN predicted_ridership > 1000 THEN 'High'
    WHEN predicted_ridership > 500 THEN 'Medium'
    ELSE 'Low'
  END AS ridership_category
FROM
  `gps-ax-lakehouse.ridership_analytics.batch_predictions`;
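
The thresholds in the `CASE` expression above are strict, so a prediction of exactly 1000 is 'Medium' and exactly 500 is 'Low'. The Python mirror below is only an illustration for checking those boundaries locally; `ridership_category` is a hypothetical helper and is not used by the view.

```python
def ridership_category(predicted_ridership):
    """Mirror the CASE expression of ridership_predictions_view."""
    # In SQL, a comparison against NULL is not true, so NULL falls through to ELSE.
    if predicted_ridership is None:
        return "Low"
    if predicted_ridership > 1000:
        return "High"
    if predicted_ridership > 500:
        return "Medium"
    return "Low"


for value in (1001, 1000, 501, 500, None):
    print(value, ridership_category(value))
```

If the boundaries should be inclusive instead, the view would need `>=` in both `WHEN` clauses.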

## 9️⃣ Connect BigQuery View to Looker Studio
In [Looker Studio](https://lookerstudio.google.com/), create a data source with the BigQuery connector and select the view `gps-ax-lakehouse.ridership_analytics.ridership_predictions_view`.