# PostgreSQL Summary Stats and Window Functions
Here you can access the `summer_medals` table used in the course. To access the table, you will need to specify the `medals` schema in your queries (e.g., `medals.summer_medals`).

## Take Notes

Add notes about the concepts you've learned and SQL cells with queries you want to keep.

_Add your notes here_

In [1]:
-- Add your own queries here
SELECT *
FROM medals.summer_medals
LIMIT 5

Unnamed: 0,year,city,sport,discipline,athlete,country,gender,event,medal
0,1896,Athens,Aquatics,Swimming,HAJOS Alfred,HUN,Men,100M Freestyle,Gold
1,1896,Athens,Aquatics,Swimming,HERSCHMANN Otto,AUT,Men,100M Freestyle,Silver
2,1896,Athens,Aquatics,Swimming,DRIVAS Dimitrios,GRE,Men,100M Freestyle For Sailors,Bronze
3,1896,Athens,Aquatics,Swimming,MALOKINIS Ioannis,GRE,Men,100M Freestyle For Sailors,Gold
4,1896,Athens,Aquatics,Swimming,CHASAPIS Spiridon,GRE,Men,100M Freestyle For Sailors,Silver


## Explore Datasets
Use the `summer_medals` table to explore the data and practice your skills!
- Select the `athlete`, `event`, and `year` from the `summer_medals` table.
    - Add another column, `previous_winner`, which contains the previous winner of the same event.
    - Filter your results for gold medalists.
- Return the `year`, total number of medalists per year, and running total number of medalists in the history of the Summer Olympics.
    - Order your results by year in ascending order.
- Return the `country`, `year`, and the number of gold medals earned.
   - Limit your results to the years 2004, 2008, and 2012.
   - Each country should have a subtotal of all gold medals earned across the three years.
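
As one possible starting point for the second exercise, the sketch below counts rows per year and accumulates them with a window `SUM`. It assumes that each row of `medals.summer_medals` represents one medalist; the aliases `medalists` and `running_total` are illustrative names, not taken from the course.

```sql
-- Sketch for the running-total exercise (assumption: one row = one medalist).
SELECT
    year,
    COUNT(*) AS medalists,                                   -- medalists in this year
    SUM(COUNT(*)) OVER (ORDER BY year ASC) AS running_total  -- cumulative count up to and including this year
FROM medals.summer_medals
GROUP BY year
ORDER BY year ASC;
```

The window function is evaluated after `GROUP BY`, so it sums the per-year counts rather than the raw rows.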

## Window functions

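Window functions compute a value over a set of related rows, much like `GROUP BY`, but every row remains in the final result instead of being collapsed into one row per group. This makes it easier to work with per-row values that depend on other rows.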
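
To make the contrast with `GROUP BY` concrete, here is a minimal sketch that runs without the course table. The inline rows are copied from the first five rows of `summer_medals` shown earlier; the CTE name `sample_rows` and the alias `medals_for_country` are illustrative assumptions.

```sql
-- Five rows copied from the sample output at the top of these notes.
WITH sample_rows (athlete, country, medal) AS (
    VALUES
        ('HAJOS Alfred',      'HUN', 'Gold'),
        ('HERSCHMANN Otto',   'AUT', 'Silver'),
        ('DRIVAS Dimitrios',  'GRE', 'Bronze'),
        ('MALOKINIS Ioannis', 'GRE', 'Gold'),
        ('CHASAPIS Spiridon', 'GRE', 'Silver')
)
SELECT
    athlete,
    country,
    -- Counts the rows that share this row's country, without collapsing them.
    COUNT(*) OVER (PARTITION BY country) AS medals_for_country
FROM sample_rows
ORDER BY country, athlete;
```

The query returns all five rows, and each of the three GRE rows carries the value 3. A `GROUP BY country` with `COUNT(*)` over the same rows would return only three rows, one per country.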

### _Introduction to window functions_

In [4]:
-- How to number the rows with ROW_NUMBER()
SELECT
    year, event, country,
    ROW_NUMBER() OVER() AS row_n
FROM medals.summer_medals
WHERE medal='Gold'
LIMIT 5;

Unnamed: 0,year,event,country,row_n
0,1896,100M Freestyle,HUN,1
1,1896,100M Freestyle For Sailors,GRE,2
2,1896,1200M Freestyle,HUN,3
3,1896,400M Freestyle,AUT,4
4,1896,100M,USA,5


In [9]:
/* The ORDER BY subclause in OVER orders the numbering by the specified column */
SELECT
    year,event,country,
    ROW_NUMBER() OVER(ORDER BY year DESC) AS row_n
FROM medals.summer_medals
WHERE medal='Gold'
LIMIT 5;

/* It is possible to order by several columns at once, each one ASC or DESC */
SELECT
    year,event,country,
    ROW_NUMBER() OVER(ORDER BY year DESC,event ASC) AS row_n
FROM medals.summer_medals
WHERE medal='Gold'
ORDER BY country ASC, year DESC -- the outer ORDER BY does not affect row_n
LIMIT 5;

Unnamed: 0,year,event,country,row_n
0,2012,1500M,,26
1,2012,63KG,,159
2,2012,1500M,ALG,27
3,2000,1500M,ALG,1988
4,1996,1500M,ALG,2653
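
The independence of the two `ORDER BY` clauses is easier to see on a handful of rows. The sketch below is self-contained: the inline `(year, city)` pairs are the five host cities that appear in the `LEAD` output elsewhere in these notes, and the CTE name `cidades` mirrors the one used there.

```sql
WITH cidades (year, city) AS (
    VALUES
        (1996, 'Atlanta'),
        (2000, 'Sydney'),
        (2004, 'Athens'),
        (2008, 'Beijing'),
        (2012, 'London')
)
SELECT
    year,
    city,
    ROW_NUMBER() OVER (ORDER BY year DESC) AS row_n  -- numbering runs from the newest year to the oldest
FROM cidades
ORDER BY year ASC;                                   -- display order only; row_n is unchanged
```

Rows are displayed from 1996 to 2012, while `row_n` runs from 5 down to 1. In the queries above, many gold-medal rows share the same `year` and `event`, so the numbering among those tied rows is not guaranteed to be stable between runs.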


In [17]:
/* LAG(column, n) returns the value from n rows before the current row */
WITH Discus_Gold AS(
SELECT
    year,country AS campeao
FROM medals.summer_medals
WHERE
    year IN ('1996','2000','2004','2008','2012')
    AND gender='Men' AND medal='Gold'
    AND event='Discus Throw') -- CTE to be reused by the query below
    
SELECT
    year,campeao,
    LAG(campeao,1) OVER(ORDER BY year ASC) AS ultimo_campeao
FROM Discus_Gold;

-- In the output, ultimo_campeao is NULL on the first row because there is no previous row to fetch

Unnamed: 0,year,campeao,ultimo_campeao
0,1996,GER,
1,2000,LTU,GER
2,2004,LTU,LTU
3,2008,EST,LTU
4,2012,GER,EST
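
`LAG` also accepts an optional third argument that replaces the NULL when no previous row exists. The sketch below is self-contained: the inline rows are copied from the output above, while the placeholder `'n/a'` and the alias `kept_title` are illustrative assumptions.

```sql
-- Rows copied from the Discus Throw output above.
WITH discus_gold (year, campeao) AS (
    VALUES
        (1996, 'GER'),
        (2000, 'LTU'),
        (2004, 'LTU'),
        (2008, 'EST'),
        (2012, 'GER')
)
SELECT
    year,
    campeao,
    -- Third argument: value returned when there is no row n positions back.
    LAG(campeao, 1, 'n/a') OVER (ORDER BY year ASC) AS ultimo_campeao,
    -- True when the same country also won the previous edition; NULL on the first row.
    campeao = LAG(campeao, 1) OVER (ORDER BY year ASC) AS kept_title
FROM discus_gold
ORDER BY year ASC;
```

With these rows, `ultimo_campeao` is `n/a` for 1996, and `kept_title` is true only for 2004, when LTU won for the second time in a row.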


In [1]:
/* The PARTITION BY subclause in OVER splits the rows by the specified column(s); the window function restarts for each partition */

WITH Discus_Gold AS(
SELECT
    year,country AS campeao,event
FROM medals.summer_medals
WHERE
    year IN ('1996','2000','2004','2008','2012')
    AND gender='Men' AND medal='Gold'
    AND event IN ('Discus Throw','Triple Jump')) -- IN applies the year, gender and medal filters to both events; with a bare OR, AND binds first and every Triple Jump row passes

SELECT
    year,event,campeao,
    LAG(campeao) OVER(PARTITION BY event -- it is possible to partition by more than one column
                     ORDER BY year ASC)
FROM Discus_Gold
ORDER BY event ASC, year ASC;

Unnamed: 0,year,event,campeao,lag
0,1996,Discus Throw,GER,
1,2000,Discus Throw,LTU,GER
2,2004,Discus Throw,LTU,LTU
3,2008,Discus Throw,EST,LTU
4,2012,Discus Throw,GER,EST
...,...,...,...,...
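
To see the per-partition reset in isolation, here is a small self-contained sketch. The inline rows are the Discus Throw rows from the output above; the alias `title_n` is an illustrative assumption.

```sql
-- Discus Throw rows copied from the output above.
WITH discus_gold (year, campeao) AS (
    VALUES
        (1996, 'GER'),
        (2000, 'LTU'),
        (2004, 'LTU'),
        (2008, 'EST'),
        (2012, 'GER')
)
SELECT
    year,
    campeao,
    -- Numbering restarts at 1 for every distinct campeao.
    ROW_NUMBER() OVER (PARTITION BY campeao ORDER BY year ASC) AS title_n
FROM discus_gold
ORDER BY year ASC;
```

Here `title_n` is 1 for the first win of each country (1996 GER, 2000 LTU, 2008 EST) and 2 for the second (2004 LTU, 2012 GER), because each country forms its own partition.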


### _Fetching, ranking and paging_

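- _**Fetching:**_
    - **Relative (position depends on the current row):**
        1. `LAG(column, n)`: value from n rows before the current row
        2. `LEAD(column, n)`: value from n rows after the current row
    - **Absolute (position does not depend on the current row):**
        1. `FIRST_VALUE(column)`
        2. `LAST_VALUE(column)`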

In [3]:
/* Using LEAD */
WITH cidades AS(
    SELECT DISTINCT year,city
    FROM medals.summer_medals
    WHERE year IN ('1996','2000','2004','2008','2012')
)

SELECT
    year,city,
    LEAD(city,1) OVER (ORDER BY year ASC) AS proxima_cidade,
    LEAD(city,2) OVER (ORDER BY year ASC) AS depois_proxima
FROM cidades;

Unnamed: 0,year,city,proxima_cidade,depois_proxima
0,1996,Atlanta,Sydney,Athens
1,2000,Sydney,Athens,Beijing
2,2004,Athens,Beijing,London
3,2008,Beijing,London,
4,2012,London,,


In [7]:
/* Using FIRST_VALUE and LAST_VALUE */
WITH cidades AS(
    SELECT DISTINCT year,city
    FROM medals.summer_medals
    WHERE year IN ('1996','2000','2004','2008','2012')
)

SELECT
    year,city,
    FIRST_VALUE(city) OVER(ORDER BY year ASC) AS primeira,
    LAST_VALUE(city) OVER(ORDER BY year ASC 
                          RANGE BETWEEN UNBOUNDED PRECEDING AND
                                        UNBOUNDED FOLLOWING)
    AS ultima
FROM cidades;

-- The frame must extend to UNBOUNDED FOLLOWING; otherwise the default frame ends at the current row and LAST_VALUE returns the current row's own city

Unnamed: 0,year,city,primeira,ultima
0,1996,Atlanta,Atlanta,London
1,2000,Sydney,Atlanta,London
2,2004,Athens,Atlanta,London
3,2008,Beijing,Atlanta,London
4,2012,London,Atlanta,London
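
The effect of the frame clause shows up clearly when both variants sit side by side. The sketch below is self-contained: the inline rows are copied from the output above, and the alias `ultima_default` is an illustrative assumption.

```sql
-- Rows copied from the output above.
WITH cidades (year, city) AS (
    VALUES
        (1996, 'Atlanta'),
        (2000, 'Sydney'),
        (2004, 'Athens'),
        (2008, 'Beijing'),
        (2012, 'London')
)
SELECT
    year,
    city,
    -- Default frame with ORDER BY: from the first row up to the current row.
    LAST_VALUE(city) OVER (ORDER BY year ASC) AS ultima_default,
    -- Explicit frame covering the whole partition.
    LAST_VALUE(city) OVER (ORDER BY year ASC
                           RANGE BETWEEN UNBOUNDED PRECEDING AND
                                         UNBOUNDED FOLLOWING) AS ultima
FROM cidades
ORDER BY year ASC;
```

With these rows, `ultima_default` simply repeats each row's own `city`, while `ultima` is London on every row. `FIRST_VALUE` does not need the explicit frame, since the default frame always starts at the first row.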
