# Gold Layer - Driver Teams

## Overview
This notebook builds the **Driver Teams** gold table.
It maps each driver to the team or teams they raced for in each season, together with
driver metadata, team metadata, the number of rounds entered, and the first and last round with each team.

## Steps
1. **Read Silver Tables**  
   - `session_results_silver` (driver and team attributes, plus the `year` and `round` join keys).  
   - `circuit_info_silver` (joined on `year` and `round` to attach event details such as `event_name`).  

2. **Build Driver-Team Mapping**  
   - Group on `year`, `DriverId`, `FullName`, `Abbreviation`, `CountryCode`, `TeamId`, `TeamName`, and `DriverNumber`.  
   - A driver who changes teams within a season gets one row per team. Separate stints with the same team are merged into a single row.  

3. **Add Aggregated Metadata**  
   - `RacesWithTeam`: number of distinct rounds driven for the team in the season.  
   - `FirstRound` and `LastRound`: earliest and latest rounds with that team in the season.  

4. **Write Gold Table**  
   - Save as `gold.driver_teams_gold` in Delta format, partitioned by `year`, overwriting any existing table.  
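
Steps 2 and 3 can be illustrated without Spark. The following is a minimal plain-Python sketch, not the notebook's implementation: the rows are made-up toy values, and `summarize_driver_teams` is a hypothetical helper that mirrors the distinct-count, min, and max aggregation on a reduced key (`year`, `DriverId`, `TeamId`).

```python
from collections import defaultdict

# Toy rows (hypothetical values): driver "d1" moves from team "t1" to "t2" after round 2.
toy_results = [
    {"year": 2023, "round": 1, "DriverId": "d1", "TeamId": "t1"},
    {"year": 2023, "round": 2, "DriverId": "d1", "TeamId": "t1"},
    {"year": 2023, "round": 2, "DriverId": "d1", "TeamId": "t1"},  # repeated round, counted once
    {"year": 2023, "round": 3, "DriverId": "d1", "TeamId": "t2"},
    {"year": 2023, "round": 1, "DriverId": "d2", "TeamId": "t2"},
]


def summarize_driver_teams(rows):
    """Collect the distinct rounds per (year, driver, team) and summarize them."""
    rounds_by_key = defaultdict(set)
    for row in rows:
        key = (row["year"], row["DriverId"], row["TeamId"])
        rounds_by_key[key].add(row["round"])

    return [
        {
            "year": year,
            "DriverId": driver_id,
            "TeamId": team_id,
            "RacesWithTeam": len(rounds),  # distinct rounds, like countDistinct
            "FirstRound": min(rounds),
            "LastRound": max(rounds),
        }
        for (year, driver_id, team_id), rounds in sorted(rounds_by_key.items())
    ]


summary = summarize_driver_teams(toy_results)
for row in summary:
    print(row)
```

In this toy data, driver `d1` produces two rows, one per team, which is the grain the gold table is meant to have.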

## Output
- One row per driver-team-season mapping.  
- Includes team and driver details, races participated, and round span.  
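
Because descriptive columns such as `DriverNumber` are part of the grouping key, the one-row-per-mapping grain is worth checking rather than assuming. Below is a small sketch of such a check in plain Python; `find_duplicate_keys` and the sample rows are hypothetical, and the choice of (`year`, `DriverId`, `TeamId`) as the expected key is an assumption about the intended grain.

```python
from collections import Counter


def find_duplicate_keys(rows, key_fields=("year", "DriverId", "TeamId")):
    """Return, in sorted order, every key that appears in more than one row."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical rows: the last two share a key but differ in DriverNumber.
sample_rows = [
    {"year": 2023, "DriverId": "d1", "TeamId": "t1", "DriverNumber": "7"},
    {"year": 2023, "DriverId": "d2", "TeamId": "t2", "DriverNumber": "9"},
    {"year": 2023, "DriverId": "d2", "TeamId": "t2", "DriverNumber": "99"},
]

duplicates = find_duplicate_keys(sample_rows)
print(duplicates)
```

An empty result means the grain holds for the rows checked. Any key returned points at a descriptive column that changed during the season.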


In [0]:
from pyspark.sql import functions as F

# Load silver layer tables
session_results = spark.table("silver.session_results_silver")
circuit_info = spark.table("silver.circuit_info_silver")

# Join results to circuit info on (year, round) to attach event details.
# The inner join drops any result row without a matching event.
results_with_season = (
    session_results
    .join(
        circuit_info,
        (session_results["round"] == circuit_info["round"])
        & (session_results["year"] == circuit_info["year"]),
        "inner",
    )
    .select(
        session_results["DriverId"],
        session_results["FullName"],
        session_results["Abbreviation"],
        session_results["CountryCode"],
        session_results["TeamId"],
        session_results["TeamName"],
        session_results["DriverNumber"],
        circuit_info["year"],
        circuit_info["round"],
        circuit_info["event_name"]
    )
)

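# Build driver-team mapping per season.
# Descriptive columns are part of the grouping key, so a mid-season change in
# any of them (for example DriverNumber) yields an extra row for that driver and team.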
driver_team_mapping = (
    results_with_season
    .groupBy("year", "DriverId", "FullName", "Abbreviation", "CountryCode", 
             "TeamId", "TeamName", "DriverNumber")
    .agg(
        # Distinct rounds, so several sessions in one round count once
        F.countDistinct("round").alias("RacesWithTeam"),
        F.min("round").alias("FirstRound"),
        F.max("round").alias("LastRound")
    )
    # Note: this sort order is not guaranteed to survive the table write and read-back
    .orderBy("year", "DriverId", "FirstRound")
)

# Write to gold layer
(
    driver_team_mapping
    .write
    .format("delta")
    .mode("overwrite")
    .partitionBy("year")
    .saveAsTable("gold.driver_teams_gold")
)