<img src = "https://images2.imgbox.com/60/09/VFwl5LOq_o.jpg" width="400">

# 1. Introduction to window functions
---
In this chapter, you'll learn what window functions are and how to use the two basic window function subclauses, `ORDER BY` and `PARTITION BY`.

In [1]:
%load_ext sql

In [6]:
%sql sqlite:///data/summer.db

'Connected: @data/summer.db'

## Numbering rows
---
The simplest application for window functions is numbering rows. Numbering rows allows you to easily fetch the `n`th row. For example, it would be very difficult to get the 35th row in any given table if you didn't have a column with each row's number.
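
As a rough illustration of fetching the `n`th row, the sketch below numbers a tiny in-memory table and filters on that number. The `medals` table, its four rows, and the `nth_row` helper are hypothetical and are not part of `summer.db`. The numbering sits in a subquery because a window function's alias cannot be referenced in the `WHERE` clause of the same query level.

```python
import sqlite3

# Hypothetical toy table; the notebook itself queries summer_medals in data/summer.db.
# Window functions need SQLite 3.25 or newer (assumed to be bundled with your Python).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (athlete TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO medals VALUES (?, ?)",
    [("A", 1896), ("B", 1900), ("C", 1904), ("D", 1908)],
)


def nth_row(n):
    """Return the n-th row (1-based) when rows are numbered by year, or None."""
    query = """
        SELECT athlete, year
        FROM  (SELECT athlete,
                      year,
                      ROW_NUMBER() OVER (ORDER BY year ASC) AS row_n
               FROM   medals) AS numbered
        WHERE  row_n = ?
    """
    return conn.execute(query, (n,)).fetchone()


third = nth_row(3)
print(third)
```

Asking for a row number beyond the end of the table, such as `nth_row(5)` here, simply returns no row.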

### Instructions

Number each row in the dataset.

In [10]:
%%sql

SELECT *,
       ROW_NUMBER() OVER() AS Row_N
FROM   summer_medals
ORDER  BY row_n ASC 
LIMIT  20

 * sqlite:///data/summer.db
Done.


Year,City,Sport,Discipline,Athlete,Country,Gender,Event,Medal,Row_N
1896,Athens,Aquatics,Swimming,HAJOS Alfred,HUN,Men,100M Freestyle,Gold,1
1896,Athens,Aquatics,Swimming,HERSCHMANN Otto,AUT,Men,100M Freestyle,Silver,2
1896,Athens,Aquatics,Swimming,DRIVAS Dimitrios,GRE,Men,100M Freestyle For Sailors,Bronze,3
1896,Athens,Aquatics,Swimming,MALOKINIS Ioannis,GRE,Men,100M Freestyle For Sailors,Gold,4
1896,Athens,Aquatics,Swimming,CHASAPIS Spiridon,GRE,Men,100M Freestyle For Sailors,Silver,5
1896,Athens,Aquatics,Swimming,CHOROPHAS Efstathios,GRE,Men,1200M Freestyle,Bronze,6
1896,Athens,Aquatics,Swimming,HAJOS Alfred,HUN,Men,1200M Freestyle,Gold,7
1896,Athens,Aquatics,Swimming,ANDREOU Joannis,GRE,Men,1200M Freestyle,Silver,8
1896,Athens,Aquatics,Swimming,CHOROPHAS Efstathios,GRE,Men,400M Freestyle,Bronze,9
1896,Athens,Aquatics,Swimming,NEUMANN Paul,AUT,Men,400M Freestyle,Gold,10


## Numbering Olympic games in ascending order
---

The Summer Olympics dataset contains the results of the games between 1896 and 2012. The first Summer Olympics were held in 1896, the second in 1900, and so on. What if you want to easily query the table to see in which year the 13th Summer Olympics were held? You'd need to number the rows for that.
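
The idea can be sketched on a small in-memory table with hypothetical rows. Deduplicating the years first means each edition of the games gets exactly one number. Unlike the exercise below, this sketch puts an explicit `ORDER BY` inside `OVER` so that the numbering does not depend on the order in which the subquery happens to return rows.

```python
import sqlite3

# Hypothetical toy data: several medal rows per year, three distinct years.
# Window functions need SQLite 3.25 or newer (assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (year INTEGER, athlete TEXT)")
conn.executemany(
    "INSERT INTO medals VALUES (?, ?)",
    [(1896, "A"), (1896, "B"), (1900, "C"), (1904, "D"), (1904, "E")],
)

# Number each distinct year once, oldest first.
games = conn.execute("""
    SELECT year,
           ROW_NUMBER() OVER (ORDER BY year ASC) AS row_n
    FROM   (SELECT DISTINCT year FROM medals) AS years
    ORDER  BY year ASC
""").fetchall()

# Map "which edition" to "which year" for easy lookup.
year_of_games = {row_n: year for year, row_n in games}
print(year_of_games[2])
```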

### Instructions

Assign a number to each year in which Summer Olympic games were held.

In [11]:
%%sql

SELECT year,
       ROW_NUMBER() OVER() AS Row_N
FROM   (SELECT DISTINCT year
        FROM   summer_medals
        ORDER  BY year ASC) AS Years
ORDER  BY year ASC 

 * sqlite:///data/summer.db
Done.


year,Row_N
1896,1
1900,2
1904,3
1908,4
1912,5
1920,6
1924,7
1928,8
1932,9
1936,10


## Numbering Olympic games in descending order
---
You've already numbered the rows in the Summer Medals dataset. What if you need to reverse the row numbers so that the most recent Olympic games' rows have a lower number?
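
A minimal sketch of reversed numbering, using a hypothetical three-year table: the `ORDER BY` inside `OVER` decides which row gets which number, while the query's final `ORDER BY` only decides the display order. The two are independent, so the output can still be listed oldest first.

```python
import sqlite3

# Hypothetical toy data; window functions need SQLite 3.25 or newer (assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (year INTEGER, athlete TEXT)")
conn.executemany(
    "INSERT INTO medals VALUES (?, ?)",
    [(1896, "A"), (1900, "B"), (1900, "C"), (1904, "D")],
)

rows = conn.execute("""
    SELECT year,
           -- Numbers are assigned from the most recent year downwards.
           ROW_NUMBER() OVER (ORDER BY year DESC) AS row_n
    FROM   (SELECT DISTINCT year FROM medals) AS years
    -- Display order is still ascending by year.
    ORDER  BY year ASC
""").fetchall()

for year, row_n in rows:
    print(year, row_n)
```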

### Instructions

Assign a number to each year in which Summer Olympic games were held so that rows with the most recent years have lower row numbers.

In [12]:
%%sql

SELECT year,
       ROW_NUMBER() OVER (ORDER BY year DESC) AS Row_N
FROM   (SELECT DISTINCT year
        FROM   summer_medals) AS Years
ORDER  BY year 

 * sqlite:///data/summer.db
Done.


year,Row_N
1896,27
1900,26
1904,25
1908,24
1912,23
1920,22
1924,21
1928,20
1932,19
1936,18


## Numbering Olympic athletes by medals earned
---

Row numbering can also be used for ranking. For example, numbering rows and ordering by the count of medals each athlete earned in the OVER clause will assign 1 to the highest-earning medalist, 2 to the second highest-earning medalist, and so on.
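
The following sketch shows the pattern on hypothetical data (athletes `P`, `L`, and `N` are made up). It adds `athlete ASC` as a tie-breaker, which the exercise below does not do, so that athletes with the same medal count are numbered in a predictable order. Note that `ROW_NUMBER()` still gives tied athletes different numbers.

```python
import sqlite3

# Hypothetical toy data: P has 3 medals, L and N have 2 each.
# Window functions need SQLite 3.25 or newer (assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (athlete TEXT, medal TEXT)")
conn.executemany(
    "INSERT INTO medals VALUES (?, ?)",
    [
        ("P", "Gold"), ("P", "Gold"), ("P", "Silver"),
        ("L", "Gold"), ("L", "Bronze"),
        ("N", "Silver"), ("N", "Bronze"),
    ],
)

ranking = conn.execute("""
    WITH athlete_medals AS (
        SELECT athlete,
               COUNT(*) AS medals
        FROM   medals
        GROUP  BY athlete)
    SELECT athlete,
           medals,
           ROW_NUMBER() OVER (ORDER BY medals DESC, athlete ASC) AS row_n
    FROM   athlete_medals
    ORDER  BY row_n ASC
""").fetchall()

for athlete, medals, row_n in ranking:
    print(row_n, athlete, medals)
```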

### Instructions

For each athlete, count the number of medals they have earned.

In [14]:
%%sql

SELECT athlete,
       COUNT(*) AS Medals
FROM   summer_medals
GROUP  BY athlete
ORDER  BY medals DESC
LIMIT  20

 * sqlite:///data/summer.db
Done.


Athlete,Medals
PHELPS Michael,22
LATYNINA Larisa,18
ANDRIANOV Nikolay,15
SHAKHLIN Boris,13
ONO Takashi,13
MANGIAROTTI Edoardo,13
TORRES Dara,12
THOMPSON Jenny,12
NURMI Paavo,12
NEMOV Alexei,12


Having wrapped the previous query in the `Athlete_Medals` CTE, rank each athlete by the number of medals they've earned.

In [16]:
%%sql

WITH Athlete_Medals 
     AS (SELECT athlete,
                COUNT(*) AS Medals
         FROM   summer_medals
         GROUP  BY athlete)
SELECT athlete,
       ROW_NUMBER() OVER (ORDER BY medals DESC) AS Row_N
FROM   Athlete_Medals
ORDER  BY medals DESC
LIMIT  20

 * sqlite:///data/summer.db
Done.


athlete,Row_N
PHELPS Michael,1
LATYNINA Larisa,2
ANDRIANOV Nikolay,3
MANGIAROTTI Edoardo,4
ONO Takashi,5
SHAKHLIN Boris,6
COUGHLIN Natalie,7
FISCHER Birgit,8
KATO Sawao,9
NEMOV Alexei,10


## Reigning weightlifting champions
---
A reigning champion is a champion who's won both the previous and current years' competitions. To determine if a champion is reigning, the previous and current years' results need to be in the same row, in two different columns.
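
One way to sketch this is with `LAG`, which pulls a value from the preceding row into the current row. The in-memory `weightlifting_gold` table below is a stand-in whose four rows mirror the Men's 69KG results shown in this section's output. The final list comprehension is an illustrative extra step, not part of the exercise.

```python
import sqlite3

# Stand-in table; rows mirror the Men's 69KG gold medalists shown in this section.
# Window functions need SQLite 3.25 or newer (assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weightlifting_gold (year INTEGER, champion TEXT)")
conn.executemany(
    "INSERT INTO weightlifting_gold VALUES (?, ?)",
    [(2000, "BUL"), (2004, "CHN"), (2008, "CHN"), (2012, "CHN")],
)

rows = conn.execute("""
    SELECT year,
           champion,
           LAG(champion) OVER (ORDER BY year ASC) AS last_champion
    FROM   weightlifting_gold
    ORDER  BY year ASC
""").fetchall()

# A champion is reigning when the current and previous champions match.
# The first row has no predecessor, so its last_champion is NULL (None in Python).
reigning = [(year, champion) for year, champion, last in rows if champion == last]
print(reigning)
```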

### Instructions

Return each year's gold medalists in the Men's 69KG weightlifting competition.

In [17]:
%%sql

SELECT year,
       country AS champion
FROM   summer_medals
WHERE  discipline = 'Weightlifting'
       AND event = '69KG'
       AND gender = 'Men'
       AND medal = 'Gold' 

 * sqlite:///data/summer.db
Done.


Year,champion
2000,BUL
2004,CHN
2008,CHN
2012,CHN


Having wrapped the previous query in the `Weightlifting_Gold` CTE, get the previous year's champion for each year.

In [18]:
%%sql

WITH Weightlifting_Gold
     AS (SELECT year,
                country AS champion
         FROM   summer_medals
         WHERE  discipline = 'Weightlifting'
                AND event = '69KG'
                AND gender = 'Men'
                AND medal = 'Gold')
SELECT year,
       champion,
       LAG(champion) OVER (ORDER BY year ASC) AS Last_Champion
FROM   Weightlifting_Gold
ORDER  BY year ASC 

 * sqlite:///data/summer.db
Done.


year,champion,Last_Champion
2000,BUL,
2004,CHN,BUL
2008,CHN,CHN
2012,CHN,CHN


## Reigning champions by gender
---
You've already fetched the previous year's champion for one event. However, if you have multiple events, genders, or other metrics as columns, you'll need to split your table into partitions to avoid having a champion from one event or gender appear as the previous champion of another event or gender.
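
To see why partitioning matters, the sketch below runs `LAG` twice over a small stand-in table: once over a single window ordered by gender and year, and once with `PARTITION BY gender`. The `javelin_gold` table is hypothetical; its four rows are a subset of the javelin results shown in this section's output.

```python
import sqlite3

# Stand-in table with a subset of the javelin results shown in this section.
# Window functions need SQLite 3.25 or newer (assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE javelin_gold (gender TEXT, year INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO javelin_gold VALUES (?, ?, ?)",
    [("Men", 2000, "CZE"), ("Men", 2004, "NOR"),
     ("Women", 2000, "NOR"), ("Women", 2004, "CUB")],
)

# One window over all rows: the last Men's champion leaks into the first Women's row.
unpartitioned = conn.execute("""
    SELECT gender, year, country,
           LAG(country) OVER (ORDER BY gender ASC, year ASC) AS last_champion
    FROM   javelin_gold
    ORDER  BY gender ASC, year ASC
""").fetchall()

# One window per gender: LAG restarts at the first row of each partition.
partitioned = conn.execute("""
    SELECT gender, year, country,
           LAG(country) OVER (PARTITION BY gender ORDER BY year ASC) AS last_champion
    FROM   javelin_gold
    ORDER  BY gender ASC, year ASC
""").fetchall()

print(unpartitioned[2])  # first Women's row without partitioning
print(partitioned[2])    # first Women's row with partitioning
```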

### Instructions

For each year, return the event's previous champion, separately for each gender.

In [20]:
%%sql

WITH Javelin_Gold AS (
  SELECT DISTINCT gender,
                  year,
                  country
  FROM Summer_Medals
  WHERE
    year >= 2000 AND
    event = 'Javelin Throw' AND
    medal = 'Gold')

-- Order by year within each gender so LAG returns the previous games' champion
SELECT  gender, year,
        country AS Champion,
        LAG(country) OVER (PARTITION BY gender
            ORDER BY year ASC) AS Last_Champion
FROM    Javelin_Gold
ORDER   BY gender ASC, year ASC

 * sqlite:///data/summer.db
Done.


gender,year,Champion,Last_Champion
Men,2000,CZE,
Men,2004,NOR,CZE
Men,2008,NOR,NOR
Men,2012,TTO,NOR
Women,2000,NOR,
Women,2004,CUB,NOR
Women,2008,CZE,CUB
Women,2012,CZE,CZE


## Reigning champions by gender and event
---

In the previous exercise, you partitioned by gender to ensure that data about one gender doesn't get mixed into data about the other gender. If you have multiple columns, however, partitioning by only one of them will still mix the results of the other columns.
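
The effect can be sketched by comparing a window partitioned by gender alone with one partitioned by gender and event. The `athletics_gold` table is a hypothetical stand-in whose rows are a subset of the results shown in this section's output. The gender-only window also orders by `event` purely to make the row order within a year deterministic for this demonstration.

```python
import sqlite3

# Stand-in table with a subset of the athletics results shown in this section.
# Window functions need SQLite 3.25 or newer (assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE athletics_gold (gender TEXT, year INTEGER, event TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO athletics_gold VALUES (?, ?, ?, ?)",
    [("Men", 2000, "10000M", "ETH"), ("Men", 2004, "10000M", "ETH"),
     ("Men", 2000, "100M", "USA"), ("Men", 2004, "100M", "USA"),
     ("Women", 2000, "10000M", "ETH"), ("Women", 2004, "10000M", "CHN")],
)


def last_champions(window):
    """Run LAG over the given window spec; key results by (gender, event, year)."""
    query = f"""
        SELECT gender, event, year,
               LAG(country) OVER ({window}) AS last_champion
        FROM   athletics_gold
    """
    return {(g, e, y): last for g, e, y, last in conn.execute(query)}


# Partitioning by gender only mixes the 10000M and 100M rows together.
gender_only = last_champions("PARTITION BY gender ORDER BY year ASC, event ASC")
# Partitioning by gender and event keeps each competition separate.
gender_event = last_champions("PARTITION BY gender, event ORDER BY year ASC")

print(gender_only[("Men", "100M", 2000)])   # champion from a different event
print(gender_event[("Men", "100M", 2000)])  # no previous champion
```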

### Instructions

Return the previous champions of each year's events by gender and event.

In [21]:
%%sql

WITH Athletics_Gold AS (
  SELECT DISTINCT
    gender, year, event, country
  FROM Summer_Medals
  WHERE
    year >= 2000 AND
    discipline = 'Athletics' AND
    event IN ('100M', '10000M') AND
    medal = 'Gold')

SELECT gender,
       year,
       event,
       country AS Champion,
       LAG(country) OVER (PARTITION BY gender, event
           ORDER BY year ASC) AS Last_Champion
FROM   Athletics_Gold
ORDER  BY event ASC,
          gender ASC,
          year ASC 

 * sqlite:///data/summer.db
Done.


gender,year,event,Champion,Last_Champion
Men,2000,10000M,ETH,
Men,2004,10000M,ETH,ETH
Men,2008,10000M,ETH,ETH
Men,2012,10000M,GBR,ETH
Women,2000,10000M,ETH,
Women,2004,10000M,CHN,ETH
Women,2008,10000M,ETH,CHN
Women,2012,10000M,ETH,ETH
Men,2000,100M,USA,
Men,2004,100M,USA,USA
