# Complex Calculations

1. Window functions
    - `SELECT SUM(val_col) OVER(PARTITION BY col1 ORDER BY col2)`
    - Other functions: `LAG()`, `LEAD()`, `AVG()`, `ROW_NUMBER()`, `RANK()`, etc.
2. Layered calculations
    - Aggregate an existing aggregation
    - Leverages a subquery
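
As a rough sketch of a layered calculation, the snippet below averages per-team totals with a subquery. It runs the SQL through Python's built-in `sqlite3` module; the columns of `original_table` and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with made-up sample data (assumption, not from the course).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE original_table (team_id TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO original_table VALUES (?, ?)",
    [("A", 10), ("A", 20), ("B", 30), ("B", 50)],
)

# Inner query: total points per team.
# Outer query: aggregate those totals again (average of the team totals).
avg_team_points = conn.execute(
    """
    SELECT AVG(team_points)
    FROM (
        SELECT team_id, SUM(points) AS team_points
        FROM original_table
        GROUP BY team_id
    ) AS team_totals
    """
).fetchone()[0]

print(avg_team_points)
```

The inner query has to finish before the outer `AVG` can run, which is why the second aggregation lives in an outer query rather than in the same `SELECT`.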

# Window function on an aggregation

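- When a query uses `GROUP BY`, the window function runs on the grouped rows, so the column inside it must be aggregated first, much as `HAVING` filters on the aggregates computed in `SELECT`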

```
SELECT team_id,
    SUM(points) AS team_points, -- aggregated column
    SUM(SUM(points)) OVER () AS league_points -- window over the same aggregate of the same column
FROM original_table
GROUP BY team_id;
```
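
To see what the nested `SUM(SUM(points)) OVER ()` returns, here is a small sketch that runs the same query shape through Python's `sqlite3` module. It assumes a SQLite build with window-function support (3.25 or newer); the sample rows are invented.

```python
import sqlite3

# Made-up sample data: two teams, two score rows each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE original_table (team_id TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO original_table VALUES (?, ?)",
    [("A", 10), ("A", 20), ("B", 30), ("B", 50)],
)

# GROUP BY collapses rows to one per team; the window function then
# sums the per-team totals across all grouped rows.
rows = conn.execute(
    """
    SELECT team_id,
        SUM(points) AS team_points,
        SUM(SUM(points)) OVER () AS league_points
    FROM original_table
    GROUP BY team_id
    ORDER BY team_id
    """
).fetchall()

for row in rows:
    print(row)
```

Each team keeps its own `team_points`, while `league_points` repeats the same league-wide total on every row.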

# Comparing groups

- Two types of metrics for comparing different groups

- Volume metrics
    - Scale with size
    - e.g. team X has 20 players and 50 goals; team Y has 5 players and 30 goals. Ranked by total goals, team X wins, which is not a fair comparison because X has four times as many players
    - Measures performance that is biased by size
    - Does not normalize for group size
    - example:
    ```
    SELECT player_id, team_id, SUM(points) AS points,
    SUM(points) / (SUM(SUM(points)) OVER (PARTITION BY team_id)) AS perc_of_team
    FROM basketball_points
    GROUP BY player_id, team_id;
    ```
- Efficiency metrics
    - Does not scale with size
    - Typically a ratio
    - e.g. team X has 20 players and 50 goals (2.5 goals per player); team Y has 5 players and 30 goals (6 goals per player). Team X has the most goals, but goals per player shows that team Y performed far better, which is the fairer comparison
    - Measures performance that is not biased by size
    - Normalizes for group size
    - Compares performance to a benchmark (Benchmark typically an average or median)

    - example:
    ```
    SELECT team_id, points/games AS team_ppg, -- in databases with integer division, cast integer columns first
    SUM(points) OVER () / SUM(games) OVER () AS league_ppg,
    (points/games) / (SUM(points) OVER () / SUM(games) OVER ()) AS perf_index
    FROM basketball_summary;
    FROM basketball_summary;
    ```
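
The team X / team Y example above can be worked through as a sketch. The `team_summary` table and its columns are hypothetical, the numbers come from the example in these notes, and the `1.0 *` factor is there only to avoid integer division in SQLite.

```python
import sqlite3

# Hypothetical table holding the two teams from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team_summary (team_id TEXT, players INTEGER, goals INTEGER)")
conn.executemany(
    "INSERT INTO team_summary VALUES (?, ?, ?)",
    [("X", 20, 50), ("Y", 5, 30)],
)

# goals            -> volume metric (scales with team size)
# goals_per_player -> efficiency metric (a ratio)
# perf_index       -> efficiency relative to the league-wide benchmark
rows = conn.execute(
    """
    SELECT team_id,
        goals,
        1.0 * goals / players AS goals_per_player,
        (1.0 * goals / players)
            / (1.0 * SUM(goals) OVER () / SUM(players) OVER ()) AS perf_index
    FROM team_summary
    ORDER BY team_id
    """
).fetchall()

for row in rows:
    print(row)
```

Team X leads on the volume metric (50 goals vs 30), but its performance index is below 1 while team Y's is well above 1, so Y outperforms the league benchmark and X does not.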

# Comparing dates

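- Use `LAG(value, offset) OVER (ORDER BY ...)`: returns `value` from `offset` rows before the current row (offset defaults to 1), so values shift down and the first `offset` rows are `NULL`
- Use `LEAD(value, offset) OVER (ORDER BY ...)`: returns `value` from `offset` rows after the current row (offset defaults to 1), so values shift up and the last `offset` rows are `NULL`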
- See slides for examples
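
A minimal sketch of `LAG` and `LEAD` for a day-over-day comparison, run through Python's `sqlite3` module. The `daily_points` table, its columns, and the dates are assumptions made for illustration.

```python
import sqlite3

# Hypothetical daily totals, one row per date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_points (game_date TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO daily_points VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-02", 15), ("2024-01-03", 12)],
)

# LAG looks one row back, LEAD one row ahead, both in game_date order.
# Rows with no neighbour in that direction get NULL (None in Python).
rows = conn.execute(
    """
    SELECT game_date,
        points,
        LAG(points, 1) OVER (ORDER BY game_date) AS prev_points,
        LEAD(points, 1) OVER (ORDER BY game_date) AS next_points,
        points - LAG(points, 1) OVER (ORDER BY game_date) AS change
    FROM daily_points
    ORDER BY game_date
    """
).fetchall()

for row in rows:
    print(row)
```

The `change` column is `NULL` on the first date because there is no earlier row to compare against.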

# Rolling calculations

- A custom window frame applied to ordered rows
- The frame clause goes inside `OVER (...)`, after `ORDER BY`
- `ROWS BETWEEN ... AND ...` specifies the range of rows in the frame
- `SELECT SUM(col) OVER (ORDER BY col1 ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling` sums the current row and the 6 rows before it (a 7-row window)
- See slides for examples
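
A small sketch of a rolling sum, again via Python's `sqlite3` module. To keep the sample data short it uses a 3-row frame (`2 PRECEDING`) instead of the 7-row frame above; the `daily_points` table and its values are made up.

```python
import sqlite3

# Hypothetical data: five consecutive days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_points (day INTEGER, points INTEGER)")
conn.executemany(
    "INSERT INTO daily_points VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)],
)

# Frame = the current row plus the two rows before it, in day order.
# Early rows simply have fewer rows in their frame.
rows = conn.execute(
    """
    SELECT day,
        points,
        SUM(points) OVER (
            ORDER BY day
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
        ) AS rolling
    FROM daily_points
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

Swapping `SUM` for `AVG` in the same frame gives a rolling average instead of a rolling total.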