### Load ipython-sql extension

In [1]:
# The two lines below suppress the warnings that would otherwise be displayed when we run %load_ext sql
import warnings
warnings.filterwarnings('ignore')

%load_ext sql
%config SqlMagic.feedback = False

### Connect to the database

In [2]:
%%sql

postgresql://localhost/dvdrental

'Connected: None@dvdrental'

### Sample from `rental` table

In [3]:
%%sql

SELECT
    r.rental_id
    , customer_id
    , r.return_date
FROM
    rental r
LIMIT
    5

rental_id,customer_id,return_date
2,459,2005-05-28 19:40:33
3,408,2005-06-01 22:12:39
4,333,2005-06-03 01:43:41
5,222,2005-06-02 04:33:21
6,549,2005-05-27 01:32:07


### Ranking our results
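The `OVER` clause turns a function into a window function: it is evaluated over a set of rows related to the current row, without collapsing those rows the way `GROUP BY` does. In this example, `PARTITION BY` splits the rentals by customer and `ORDER BY` ranks each customer's rentals by return date, most recent first.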

In [16]:
%%sql

SELECT
    r.rental_id
    , customer_id
    , r.return_date
    , rank() OVER(PARTITION BY r.customer_id ORDER BY r.return_date DESC) as rk
FROM
    rental r
-- Show only the first 10 results, or the output is way too big!
LIMIT
    10

rental_id,customer_id,return_date,rk
15315,1,2005-08-30 01:51:46,1
15298,1,2005-08-28 22:49:37,2
14825,1,2005-08-27 07:01:57,3
13176,1,2005-08-23 08:50:54,4
14762,1,2005-08-23 01:30:57,5
12250,1,2005-08-22 23:05:29,6
13068,1,2005-08-20 14:44:16,7
11824,1,2005-08-19 10:11:54,8
11299,1,2005-08-10 16:40:52,9
10437,1,2005-08-10 12:12:04,10
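
Because of the `LIMIT`, the output above only shows customer 1, so it is hard to see the rank restarting for each partition. The sketch below reproduces the same query shape against a tiny in-memory SQLite table rather than the `dvdrental` database. The rows are made up for illustration, and it assumes the SQLite bundled with your Python is version 3.25 or newer (the first release with window functions).

```python
import sqlite3

# Hypothetical miniature of the rental table (made-up rows, not dvdrental data).
rows = [
    (1, 1, "2005-08-01 10:00:00"),
    (2, 1, "2005-08-03 09:30:00"),
    (3, 2, "2005-07-15 12:00:00"),
    (4, 2, "2005-07-20 18:45:00"),
    (5, 2, "2005-07-18 08:15:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER, customer_id INTEGER, return_date TEXT)"
)
conn.executemany("INSERT INTO rental VALUES (?, ?, ?)", rows)

# Same window definition as the notebook query: one partition per customer,
# ordered by return date with the most recent first.
ranked = conn.execute(
    """
    SELECT
        r.rental_id
        , r.customer_id
        , rank() OVER (PARTITION BY r.customer_id ORDER BY r.return_date DESC) AS rk
    FROM
        rental r
    ORDER BY
        r.customer_id, rk
    """
).fetchall()
conn.close()

for rental_id, customer_id, rk in ranked:
    print(rental_id, customer_id, rk)
```

Each customer's most recent return gets `rk = 1`, and the numbering starts again at 1 when `customer_id` changes.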


### Working with our rankings
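A window function cannot be referenced in the `WHERE` clause of the query that computes it, so to filter on the rank we wrap the query above in a Common Table Expression (CTE) and filter in the outer query. For example, let's find the two most recently returned films for each customer.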

In [17]:
%%sql

WITH rental_ranked AS (
    SELECT
        r.rental_id
        , customer_id
        , r.return_date
        , rank() OVER(PARTITION BY r.customer_id ORDER BY r.return_date DESC) as rk
    FROM
        rental r
)

SELECT
    *
FROM
    rental_ranked rr
WHERE
    rr.rk < 3
LIMIT
    10

rental_id,customer_id,return_date,rk
15315,1,2005-08-30 01:51:46,1
15298,1,2005-08-28 22:49:37,2
15145,2,2005-08-31 15:51:04,1
14743,2,2005-08-29 00:18:56,2
14699,3,2005-08-29 18:08:48,1
13403,3,2005-08-27 19:23:07,2
15147,4,2005-08-28 14:33:23,1
13807,4,2005-08-28 09:06:40,2
13209,5,,1
14053,5,2005-08-26 20:50:59,2


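Note that customer 5's top-ranked rental has no `return_date`. PostgreSQL sorts `NULL` values (displayed as `None` in Python) first in descending order, so a missing return date is ranked above every actual one. To avoid this, add a `WHERE` clause to the CTE specifying `return_date IS NOT NULL`.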
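
As a rough illustration of that fix, the sketch below runs the CTE with the extra `WHERE` clause against a small in-memory SQLite table. The rows are hypothetical, and SQLite 3.25 or newer is assumed. SQLite's default `NULL` ordering differs from PostgreSQL's, so only the filtered query is shown; in both databases the `WHERE` clause is applied before the window function is evaluated.

```python
import sqlite3

# Hypothetical rows for one customer; None stands for a missing return_date.
rows = [
    (10, 5, None),
    (11, 5, "2005-08-26 20:00:00"),
    (12, 5, "2005-08-20 09:00:00"),
    (13, 5, "2005-08-01 14:30:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER, customer_id INTEGER, return_date TEXT)"
)
conn.executemany("INSERT INTO rental VALUES (?, ?, ?)", rows)

top_two = conn.execute(
    """
    WITH rental_ranked AS (
        SELECT
            r.rental_id
            , r.customer_id
            , r.return_date
            , rank() OVER (PARTITION BY r.customer_id ORDER BY r.return_date DESC) AS rk
        FROM
            rental r
        -- Rows without a return date are dropped before ranking
        WHERE
            r.return_date IS NOT NULL
    )

    SELECT
        rr.rental_id
        , rr.rk
    FROM
        rental_ranked rr
    WHERE
        rr.rk < 3
    ORDER BY
        rr.rk
    """
).fetchall()
conn.close()

print(top_two)
```

Rental 10 never enters the ranking, so ranks 1 and 2 go to the two most recent rentals that do have a return date.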

There are lots of other window functions that can be used in place of `rank`; [check the Postgres docs](https://www.postgresql.org/docs/9.6/static/functions-window.html) for full details.
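
One difference worth knowing when swapping functions is how ties are handled. The sketch below compares `rank`, `dense_rank`, and `row_number` on hypothetical rows where two rentals share the same return date, again using an in-memory SQLite table (3.25 or newer assumed) in place of PostgreSQL.

```python
import sqlite3

# Hypothetical rows: rentals 1 and 2 were returned at exactly the same time.
rows = [
    (1, 7, "2005-08-10 12:00:00"),
    (2, 7, "2005-08-10 12:00:00"),
    (3, 7, "2005-08-05 12:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER, customer_id INTEGER, return_date TEXT)"
)
conn.executemany("INSERT INTO rental VALUES (?, ?, ?)", rows)

compared = conn.execute(
    """
    SELECT
        r.rental_id
        , rank() OVER (PARTITION BY r.customer_id ORDER BY r.return_date DESC) AS rk
        , dense_rank() OVER (PARTITION BY r.customer_id ORDER BY r.return_date DESC) AS dense_rk
        , row_number() OVER (PARTITION BY r.customer_id ORDER BY r.return_date DESC) AS row_num
    FROM
        rental r
    ORDER BY
        r.rental_id
    """
).fetchall()
conn.close()

for rental_id, rk, dense_rk, row_num in compared:
    print(rental_id, rk, dense_rk, row_num)
```

The tied rentals both get `rank` 1 and the next rental gets 3, while `dense_rank` continues with 2. `row_number` always yields 1, 2, 3, but which tied row gets 1 is not guaranteed. So a filter like `rk < 3` can return more than two rows per customer with `rank`, and at most two with `row_number`.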