<img src = "https://images2.imgbox.com/60/09/VFwl5LOq_o.jpg" width="400">

# 3. Aggregate window functions and frames
---
In this chapter, you'll learn how to use aggregate functions you're familiar with, like `AVG()` and `SUM()`, as window functions, as well as how to define frames to change a window function's output.

In [1]:
%load_ext sql

In [2]:
%sql sqlite:///data/summer.db

'Connected: @data/summer.db'

## Running totals of athlete medals
---
The running total (or cumulative sum) of a column helps you determine what each row's contribution is to the total sum.

### Instructions

Return the athletes, the number of medals they earned, and the medals running total, ordered by the athletes' names in alphabetical order.

In [4]:
%%sql

WITH Athlete_Medals 
     AS (SELECT athlete,
                COUNT(*) AS Medals
         FROM   summer_medals
         WHERE  country = 'USA'
                AND medal = 'Gold'
                AND year >= 2000
         GROUP  BY athlete)
SELECT athlete,
       Medals,
       SUM(Medals) OVER (ORDER BY athlete ASC) AS Running_Total
FROM   Athlete_Medals 
ORDER  BY athlete ASC
LIMIT  20

 * sqlite:///data/summer.db
Done.


athlete,Medals,Running_Total
ABDUR-RAHIM Shareef,1,1
ABERNATHY Brent,1,2
ADRIAN Nathan,3,5
AHRENS Chris,1,6
AINSWORTH Kurt,1,7
ALLEN Ray,1,8
ALLEN Wyatt,1,9
AMBROSI Christie,1,10
AMICO Leah,1,11
ANAE Tumua,1,12
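
As a rough, self-contained check of the running-total pattern, the sketch below rebuilds it in an in-memory SQLite database through Python's `sqlite3` module. The table `athlete_medals` and its three rows are made-up illustration data, not rows from `summer.db`, and the sketch assumes a SQLite build of 3.25 or newer, where window functions are available.

```python
import sqlite3

# Hypothetical toy data, not taken from summer.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE athlete_medals (athlete TEXT, medals INTEGER)")
conn.executemany(
    "INSERT INTO athlete_medals VALUES (?, ?)",
    [("A", 1), ("B", 3), ("C", 2)],
)

# With ORDER BY and no explicit frame, SUM() accumulates from the first
# row of the window up to the current row.
running = conn.execute(
    """
    SELECT athlete,
           medals,
           SUM(medals) OVER (ORDER BY athlete ASC) AS running_total
    FROM   athlete_medals
    ORDER  BY athlete ASC
    """
).fetchall()

# The last running total should equal the plain aggregate over the table.
total = conn.execute("SELECT SUM(medals) FROM athlete_medals").fetchone()[0]

print(running)
print(total)
```

The last row's running total matches the ordinary `SUM()` over the whole table, which is a quick sanity check for any cumulative sum.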


## Maximum country medals by year
---

Tracking the maximum number of medals a country has earned so far lets you tell whether it has broken its own record: compare the current year's medal count with the maximum so far.

### Instructions

Return the year, country, medals, and the maximum medals earned so far for each country, ordered by country and then by year in ascending order.

In [5]:
%%sql

WITH Country_Medals 
     AS (SELECT year,
                country,
                COUNT(*) AS Medals
         FROM   summer_medals
         WHERE  country IN ( 'CHN', 'KOR', 'JPN' )
                AND medal = 'Gold'
                AND year >= 2000
         GROUP  BY year,
                   country)
SELECT year,
       country,
       Medals,
       MAX(Medals) OVER (PARTITION BY country ORDER BY year ASC) AS Max_Medals
FROM   Country_Medals 
ORDER  BY country ASC, year ASC 

 * sqlite:///data/summer.db
Done.


year,country,Medals,Max_Medals
2000,CHN,39,39
2004,CHN,52,52
2008,CHN,74,74
2012,CHN,56,74
2000,JPN,5,5
2004,JPN,21,21
2008,JPN,23,23
2012,JPN,7,23
2000,KOR,12,12
2004,KOR,14,14
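
The record comparison described in this section can be sketched end to end. The snippet below is only an illustration: it loads the CHN and JPN counts shown in the output above into a hypothetical in-memory table `country_medals`, and the `is_record` column is an added assumption about how one might flag a year that set (or tied) the record. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the CHN and JPN counts from the output above.
conn.execute(
    "CREATE TABLE country_medals (year INTEGER, country TEXT, medals INTEGER)"
)
conn.executemany(
    "INSERT INTO country_medals VALUES (?, ?, ?)",
    [
        (2000, "CHN", 39), (2004, "CHN", 52), (2008, "CHN", 74), (2012, "CHN", 56),
        (2000, "JPN", 5), (2004, "JPN", 21), (2008, "JPN", 23), (2012, "JPN", 7),
    ],
)

# PARTITION BY restarts the running maximum for each country.
# is_record is 1 when the current year's count equals the maximum so far.
records = conn.execute(
    """
    SELECT year,
           country,
           medals,
           MAX(medals) OVER (PARTITION BY country ORDER BY year ASC) AS max_medals,
           medals = MAX(medals) OVER (PARTITION BY country ORDER BY year ASC) AS is_record
    FROM   country_medals
    ORDER  BY country ASC, year ASC
    """
).fetchall()

for row in records:
    print(row)
```

In this data, 2012 is the only year in which neither country matched its earlier best, so `is_record` is 0 only on those two rows.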


## Minimum country medals by year
---

So far, you've seen `MAX` and `SUM`, aggregate functions normally used with `GROUP BY`, being used as window functions. You can also use the other aggregate functions, like `MIN`, as window functions.

### Instructions

Return the year, medals earned, and minimum medals earned so far.

In [6]:
%%sql

WITH France_Medals 
     AS (SELECT year,
                COUNT(*) AS Medals
         FROM   summer_medals
         WHERE  country = 'FRA'
                AND medal = 'Gold'
                AND year >= 2000
         GROUP  BY year)
SELECT year,
       Medals,
       MIN(Medals) OVER (ORDER BY year ASC) AS Min_Medals
FROM   France_Medals
ORDER  BY year ASC

 * sqlite:///data/summer.db
Done.


year,Medals,Min_Medals
2000,22,22
2004,21,21
2008,25,21
2012,30,21


## Moving maximum of Scandinavian athletes' medals
---

Frames let you restrict the rows passed to your window function to a sliding window whose start and finish you define.

Adding a frame to your window function lets you calculate "moving" metrics, whose inputs slide from row to row.

### Instructions

Return the year, medals earned, and the maximum medals earned, comparing only the current year and the next year.

In [7]:
%%sql

WITH Scandinavian_Medals 
     AS (SELECT year,
                COUNT(*) AS Medals
         FROM   summer_medals
         WHERE  country IN ( 'DEN', 'NOR', 'FIN', 'SWE', 'ISL' )
                AND medal = 'Gold'
         GROUP  BY year)
SELECT year,
       Medals,
       MAX(Medals) OVER (ORDER BY year ASC ROWS BETWEEN 
                         CURRENT ROW AND 1 FOLLOWING) AS Max_Medals
FROM   Scandinavian_Medals 
ORDER  BY year ASC 

 * sqlite:///data/summer.db
Done.


year,Medals,Max_Medals
1896,1,1
1900,1,77
1908,77,141
1912,141,159
1920,159,159
1924,48,48
1928,24,24
1932,17,17
1936,15,54
1948,54,54
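
To make the sliding frame concrete, here is a small sketch that replays the first six rows of the output above in an in-memory SQLite database. The table name `scandinavian_medals` is hypothetical, and because the toy table stops at 1924, its last row has no following row to compare against. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the first six rows of the output above.
conn.execute("CREATE TABLE scandinavian_medals (year INTEGER, medals INTEGER)")
conn.executemany(
    "INSERT INTO scandinavian_medals VALUES (?, ?)",
    [(1896, 1), (1900, 1), (1908, 77), (1912, 141), (1920, 159), (1924, 48)],
)

# Each row's frame holds only the current row and the one after it.
moving_max = conn.execute(
    """
    SELECT year,
           medals,
           MAX(medals) OVER (ORDER BY year ASC
                             ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS max_medals
    FROM   scandinavian_medals
    ORDER  BY year ASC
    """
).fetchall()

for row in moving_max:
    print(row)
```

On the final row the frame shrinks to the current row alone, so the moving maximum is simply that row's own value.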


## Moving maximum of Chinese athletes' medals
---

Frames allow you to "peek" forward or backward without first using the relative fetching functions, `LAG` and `LEAD`, to fetch other rows' values into the current row.

### Instructions

Return the athletes, medals earned, and the maximum medals earned, comparing only the current athlete and the two preceding athletes, ordered by athletes' names in alphabetical order.

In [9]:
%%sql

WITH Chinese_Medals 
     AS (SELECT athlete,
                COUNT(*) AS Medals
         FROM   summer_medals
         WHERE  country = 'CHN'
                AND medal = 'Gold'
                AND year >= 2000
         GROUP  BY athlete)
SELECT athlete,
       Medals,
       MAX(Medals) OVER (ORDER BY athlete ASC ROWS BETWEEN 2 
                         PRECEDING AND CURRENT ROW) AS Max_Medals
FROM   Chinese_Medals 
ORDER  BY athlete ASC 
LIMIT  20

 * sqlite:///data/summer.db
Done.


athlete,Medals,Max_Medals
CAI Yalin,1,1
CAI Yun,1,1
CAO Lei,1,1
CAO Yuan,1,1
CHEN Ding,1,1
CHEN Jing,1,1
CHEN Qi,1,1
CHEN Ruolin,4,4
CHEN Xiaomin,1,4
CHEN Xiexia,1,4
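
One way to see what the frame saves you is to compute the same moving maximum twice: once with `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW`, and once by fetching the two previous values with `LAG` and comparing them by hand. The sketch below does this on made-up data (the table `athlete_medals` and athletes `A` to `E` are hypothetical, not rows from `summer.db`), assuming SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data, not taken from summer.db.
conn.execute("CREATE TABLE athlete_medals (athlete TEXT, medals INTEGER)")
conn.executemany(
    "INSERT INTO athlete_medals VALUES (?, ?)",
    [("A", 1), ("B", 4), ("C", 1), ("D", 1), ("E", 2)],
)

comparison = conn.execute(
    """
    SELECT athlete,
           medals,
           -- Frame version: the window function sees three rows at most.
           MAX(medals) OVER (ORDER BY athlete ASC
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS frame_max,
           -- LAG version: fetch the two previous values, fall back to the
           -- current value when no such row exists, then take the scalar max.
           MAX(medals,
               COALESCE(LAG(medals, 1) OVER (ORDER BY athlete ASC), medals),
               COALESCE(LAG(medals, 2) OVER (ORDER BY athlete ASC), medals)) AS lag_max
    FROM   athlete_medals
    ORDER  BY athlete ASC
    """
).fetchall()

for row in comparison:
    print(row)
```

Both columns agree on every row, but the frame version needs a single window function call and scales to wider windows without adding more `LAG` terms.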


## ROWS vs RANGE
---

- `RANGE BETWEEN [START] AND [FINISH]`

    - Functions much the same as `ROWS BETWEEN`
    
    - `RANGE` treats rows with duplicate values in `OVER`'s `ORDER BY` subclause as a single entity
    
- `ROWS BETWEEN` is almost always used over `RANGE BETWEEN`
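
The difference only shows up when the `ORDER BY` column contains duplicates. The following sketch uses a hypothetical table `scores` with a repeated key `k = 2` (made-up data, chosen so the tied rows carry equal values and the result is deterministic) and computes a running sum with both frame types. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: the ordering key k = 2 appears twice.
conn.execute("CREATE TABLE scores (k INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 10), (2, 20), (2, 20), (3, 40)],
)

frames = conn.execute(
    """
    SELECT k,
           val,
           -- ROWS counts physical rows: the two k = 2 rows get different sums.
           SUM(val) OVER (ORDER BY k ASC
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
           -- RANGE treats the k = 2 rows as one unit: both get the same sum.
           SUM(val) OVER (ORDER BY k ASC
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
    FROM   scores
    ORDER  BY k ASC, rows_sum ASC
    """
).fetchall()

for row in frames:
    print(row)
```

With `ROWS`, the tied rows receive 30 and 50; with `RANGE`, both receive 50 because each tied row's frame extends to the end of its group of duplicates.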

## Moving average of Russian medals
---

Using frames with aggregate window functions allows you to calculate many common metrics, including moving averages and totals. These metrics track the change in performance over time.

### Instructions

Calculate the 3-year moving average of medals earned (the current Games and the two preceding ones).

In [10]:
%%sql

WITH Russian_Medals 
     AS (SELECT year,
                COUNT(*) AS Medals
         FROM   summer_medals
         WHERE  country = 'RUS'
                AND medal = 'Gold'
                AND year >= 1980
         GROUP  BY year)
SELECT year,
       Medals,
       AVG(Medals) OVER (ORDER BY year ASC ROWS BETWEEN 2 
                         PRECEDING AND CURRENT ROW) AS Medals_MA
FROM   Russian_Medals
ORDER  BY year ASC 

 * sqlite:///data/summer.db
Done.


year,Medals,Medals_MA
1996,36,36.0
2000,66,51.0
2004,47,49.66666666666666
2008,43,52.0
2012,47,45.66666666666666
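
The arithmetic behind these averages can be checked with a short sketch. It loads the five medal counts from the output above into a hypothetical in-memory table `russian_medals` and reruns the same frame, assuming SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the medal counts from the output above.
conn.execute("CREATE TABLE russian_medals (year INTEGER, medals INTEGER)")
conn.executemany(
    "INSERT INTO russian_medals VALUES (?, ?)",
    [(1996, 36), (2000, 66), (2004, 47), (2008, 43), (2012, 47)],
)

moving_avg = conn.execute(
    """
    SELECT year,
           medals,
           AVG(medals) OVER (ORDER BY year ASC
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS medals_ma
    FROM   russian_medals
    ORDER  BY year ASC
    """
).fetchall()

# The same averages computed by hand: the first two frames hold fewer
# than three rows, so they average over one and two values respectively.
by_hand = [
    36 / 1,
    (36 + 66) / 2,
    (36 + 66 + 47) / 3,
    (66 + 47 + 43) / 3,
    (47 + 43 + 47) / 3,
]

for (year, medals, ma), expected in zip(moving_avg, by_hand):
    print(year, medals, round(ma, 2), round(expected, 2))
```

Note that the first two rows are not true three-value averages: a frame never reaches past the start of the window, so it averages whatever rows exist.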


## Moving total of countries' medals
---

What if your data is split into multiple groups spread over one or more columns in the table? Even with a defined frame, if you don't separate the groups' data, one group's values will leak into the moving average or total of another group.

### Instructions

Calculate the 3-year moving sum of medals earned per country.

In [12]:
%%sql

WITH Country_Medals 
     AS (SELECT year,
                country,
                COUNT(*) AS Medals
         FROM   summer_medals
         GROUP  BY year,
                   country)
SELECT year,
       country,
       Medals,
       SUM(Medals) OVER (PARTITION BY country ORDER BY year ASC 
                         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS Medals_MA
FROM   Country_Medals
ORDER  BY country ASC,
          year ASC 
LIMIT  20

 * sqlite:///data/summer.db
Done.


year,country,Medals,Medals_MA
2012,,4,4
2008,AFG,1,1
2012,AFG,1,2
1988,AHO,1,1
1984,ALG,2,2
1992,ALG,2,4
1996,ALG,3,7
2000,ALG,5,10
2008,ALG,2,10
2012,ALG,1,8
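
To illustrate why the partition matters, the sketch below computes the same 3-row moving sum with and without `PARTITION BY`. It uses the AFG and ALG counts from the output above in a hypothetical in-memory table `country_medals`; the unpartitioned column is added here purely for contrast. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the AFG and ALG counts from the output above.
conn.execute(
    "CREATE TABLE country_medals (year INTEGER, country TEXT, medals INTEGER)"
)
conn.executemany(
    "INSERT INTO country_medals VALUES (?, ?, ?)",
    [
        (2008, "AFG", 1), (2012, "AFG", 1),
        (1984, "ALG", 2), (1992, "ALG", 2), (1996, "ALG", 3),
        (2000, "ALG", 5), (2008, "ALG", 2), (2012, "ALG", 1),
    ],
)

sums = conn.execute(
    """
    SELECT year,
           country,
           medals,
           -- Frame restarts for each country.
           SUM(medals) OVER (PARTITION BY country ORDER BY year ASC
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS partitioned,
           -- No partition: the frame slides across the country boundary.
           SUM(medals) OVER (ORDER BY country ASC, year ASC
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS unpartitioned
    FROM   country_medals
    ORDER  BY country ASC, year ASC
    """
).fetchall()

for row in sums:
    print(row)
```

Without the partition, the first two ALG rows pick up AFG's medals (4 and 5 instead of 2 and 4); from the third ALG row onward the frame contains only ALG rows and the two columns agree again.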
