# Gold Layer - Season Standings

## Overview
This notebook generates the **Season Standings** tables (one for drivers, one for teams) in the gold layer.  
It aggregates driver and team performance across all races in a season, producing standings similar to official championship tables.

## Steps
1. **Read Silver Layer Tables**  
   - `session_results_silver` (driver performance per race).  
   - `circuit_info_silver` (to identify events and rounds).  

2. **Aggregate Driver Standings**  
   - Sum points per driver across all races in the season.  
   - Count wins (`ClassifiedPosition` = 1).  
   - Count podium finishes (`ClassifiedPosition` ≤ 3).  
   - Count races participated (one result row per race).  

3. **Aggregate Team Standings**  
   - Sum team points across all drivers.  
   - Count team wins and podiums.  

4. **Ranking & Ordering**  
   - Rank drivers and teams by points.  
   - Use wins and podiums as tie-breakers.  

5. **Write Gold Table**  
   - Store as `driver_season_standings_gold` and `team_season_standings_gold` in the `gold` schema, in Delta format, partitioned by `year`.  
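
The aggregation and ranking steps above can be illustrated without Spark. The following is a minimal plain-Python sketch, not the notebook's implementation: the sample rows, the driver codes, and the `driver_standings` helper are hypothetical, and a position of `None` stands in for a non-classified finish.

```python
from collections import defaultdict

# Hypothetical sample rows: (driver, classified position, points).
# None marks a driver who was not classified in that race.
results = [
    ("AAA", 1, 25.0), ("BBB", 5, 10.0), ("CCC", 2, 18.0),     # round 1
    ("AAA", None, 0.0), ("BBB", 3, 15.0), ("CCC", 1, 25.0),   # round 2
]

def driver_standings(rows):
    """Aggregate per-driver totals, then rank by points, wins, podiums."""
    totals = defaultdict(lambda: {"points": 0.0, "races": 0, "wins": 0, "podiums": 0})
    for driver, position, points in rows:
        stats = totals[driver]
        stats["points"] += points
        stats["races"] += 1
        if position is not None:
            stats["wins"] += int(position == 1)
            stats["podiums"] += int(position <= 3)

    # Sort descending on points, with wins and podiums as tie-breakers.
    ordered = sorted(
        totals.items(),
        key=lambda item: (-item[1]["points"], -item[1]["wins"], -item[1]["podiums"]),
    )
    return [{"rank": rank, "driver": driver, **stats}
            for rank, (driver, stats) in enumerate(ordered, start=1)]

for row in driver_standings(results):
    print(row)
```

In this sample, `AAA` and `BBB` both finish on 25 points, and `AAA` is placed ahead because it has one win while `BBB` has none.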

## Output
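- `gold.driver_season_standings_gold`: Driver-level season summary (total points, races, wins, podiums, `DriverRank`).  
- `gold.team_season_standings_gold`: Team-level season summary (total points, race entries, wins, podiums, `TeamRank`).  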


In [0]:
from pyspark.sql import functions as F
from pyspark.sql import Window

# Load silver layer tables
session_results = spark.table("silver.session_results_silver")
circuit_info = spark.table("silver.circuit_info_silver")

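# Join results with circuit info on (year, round) to attach the event name.
# Only event_name is taken from circuit_info: session_results["*"] already
# carries `year`, and selecting circuit_info["year"] as well would create a
# duplicate column and make the later groupBy("year") ambiguous.
results_with_season = (
    session_results
    .join(circuit_info, 
          (session_results["round"] == circuit_info["round"]) &
          (session_results["year"] == circuit_info["year"]), 
          "inner")
    .select(
        session_results["*"],
        circuit_info["event_name"]
    )
)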

# ---------------- DRIVER STANDINGS ----------------
driver_standings = (
    results_with_season
    .groupBy("year", "DriverId", "FullName", "Abbreviation", "CountryCode", "TeamName")
    .agg(
        F.sum("Points").alias("TotalPoints"),
        F.count("*").alias("RacesParticipated"),
        F.sum(F.when(F.col("ClassifiedPosition") == 1, 1).otherwise(0)).alias("Wins"),
        F.sum(F.when(F.col("ClassifiedPosition") <= 3, 1).otherwise(0)).alias("Podiums")
    )
)

# Rank drivers by total points, then wins, then podiums
driver_window = Window.partitionBy("year").orderBy(
    F.col("TotalPoints").desc(),
    F.col("Wins").desc(),
    F.col("Podiums").desc()
)

driver_standings = driver_standings.withColumn("DriverRank", F.row_number().over(driver_window))

# ---------------- TEAM STANDINGS ----------------
team_standings = (
    results_with_season
    .groupBy("year", "TeamId", "TeamName")
    .agg(
        F.sum("Points").alias("TotalPoints"),
        F.count("*").alias("RaceEntries"),
        F.sum(F.when(F.col("ClassifiedPosition") == 1, 1).otherwise(0)).alias("Wins"),
        F.sum(F.when(F.col("ClassifiedPosition") <= 3, 1).otherwise(0)).alias("Podiums")
    )
)

# Rank teams by points, wins, podiums
team_window = Window.partitionBy("year").orderBy(
    F.col("TotalPoints").desc(),
    F.col("Wins").desc(),
    F.col("Podiums").desc()
)

team_standings = team_standings.withColumn("TeamRank", F.row_number().over(team_window))

# ---------------- WRITE TO GOLD LAYER ----------------
(
    driver_standings
    .write
    .format("delta")
    .mode("overwrite")
    .partitionBy("year")
    .saveAsTable("gold.driver_season_standings_gold")
)

(
    team_standings
    .write
    .format("delta")
    .mode("overwrite")
    .partitionBy("year")
    .saveAsTable("gold.team_season_standings_gold")
)
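
Because both tables sum the same `Points` column over the same rows, the driver and team totals for a season should agree. The following plain-Python sketch checks that on small collected results. The helper names, the tolerance, and the sample rows are hypothetical; in the notebook the rows could be obtained with something like `[r.asDict() for r in driver_standings.collect()]`.

```python
from collections import defaultdict

def points_by_year(rows):
    """Sum TotalPoints per year for a list of dict-like rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["year"]] += row["TotalPoints"] or 0.0  # treat null points as 0
    return dict(totals)

def find_point_mismatches(driver_rows, team_rows, tolerance=1e-6):
    """Return the years where driver and team point totals disagree."""
    driver_totals = points_by_year(driver_rows)
    team_totals = points_by_year(team_rows)
    years = sorted(set(driver_totals) | set(team_totals))
    return [
        year for year in years
        if abs(driver_totals.get(year, 0.0) - team_totals.get(year, 0.0)) > tolerance
    ]

# Hypothetical rows, for illustration only.
driver_rows = [
    {"year": 2023, "TotalPoints": 43.0},
    {"year": 2023, "TotalPoints": 25.0},
    {"year": 2023, "TotalPoints": 25.0},
]
team_rows = [
    {"year": 2023, "TotalPoints": 68.0},
    {"year": 2023, "TotalPoints": 25.0},
]

print(find_point_mismatches(driver_rows, team_rows))
```

An empty list means the totals agree for every year. A non-empty list points at seasons worth inspecting, for example for rows duplicated by the join.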