# Time difference between latest actions

Source: https://towardsdatascience.com/twenty-five-sql-practice-exercises-5fc791e24082

Given a table of user actions, write a query that returns, for each user, the time elapsed between their last action and their second-to-last action, sorted by user ID in ascending order.

In [2]:
%run Question.ipynb

 * postgresql://fknight:***@localhost/postgres
Done.
Done.
8 rows affected.
8 rows affected.


# Part A

Write a query that ranks each user's actions by date, most recent first.

In [6]:
%%sql

SELECT
    *,
    row_number()
        OVER (PARTITION BY user_id ORDER BY date DESC)
        AS date_rank
FROM users

 * postgresql://fknight:***@localhost/postgres
8 rows affected.


user_id,action,date,date_rank
1,publish,2020-02-19,1
1,cancel,2020-02-13,2
1,start,2020-02-12,3
2,publish,2020-02-14,1
2,start,2020-02-11,2
3,start,2020-02-15,1
3,cancel,2020-02-15,2
4,start,2020-02-18,1
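
User 3 has two actions dated 2020-02-15, so `row_number()` assigns them ranks 1 and 2 in an order the database is free to choose. The sketch below contrasts `ROW_NUMBER()` with `RANK()` on those tied rows. It is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the PostgreSQL connection (window functions need SQLite 3.25 or newer), and it recreates the `users` table from the rows shown in the output above.

```python
import sqlite3

# Rows copied from the Part A output; SQLite stands in for PostgreSQL here.
ROWS = [
    (1, "start", "2020-02-12"),
    (1, "cancel", "2020-02-13"),
    (1, "publish", "2020-02-19"),
    (2, "start", "2020-02-11"),
    (2, "publish", "2020-02-14"),
    (3, "start", "2020-02-15"),
    (3, "cancel", "2020-02-15"),
    (4, "start", "2020-02-18"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, action TEXT, date TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)

# Compare the two ranking functions on user 3, whose two actions share a date.
ranked = conn.execute(
    """
    SELECT user_id, action, date,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date DESC) AS row_num,
           RANK()       OVER (PARTITION BY user_id ORDER BY date DESC) AS rnk
    FROM users
    WHERE user_id = 3
    ORDER BY row_num
    """
).fetchall()

for row in ranked:
    print(row)
```

`ROW_NUMBER()` always yields distinct ranks, which is what the later parts rely on when they filter on `date_rank = 1` and `date_rank = 2`. `RANK()` would give both tied rows rank 1, leaving no row with rank 2 for this user.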


# Part B

Using the subquery from Part A, determine the most recent action for each user.

In [8]:
%%sql

WITH date_ranks AS (
    SELECT
        *,
        row_number()
            OVER (PARTITION BY user_id ORDER BY date DESC)
            AS date_rank
    FROM users
)

SELECT *
FROM date_ranks
WHERE date_rank = 1

 * postgresql://fknight:***@localhost/postgres
4 rows affected.


user_id,action,date,date_rank
1,publish,2020-02-19,1
2,publish,2020-02-14,1
3,start,2020-02-15,1
4,start,2020-02-18,1


# Part C

Using the subquery from Part A, determine the second most recent action for each user.

In [9]:
%%sql

WITH date_ranks AS (
    SELECT
        *,
        row_number()
            OVER (PARTITION BY user_id ORDER BY date DESC)
            AS date_rank
    FROM users
)

SELECT *
FROM date_ranks
WHERE date_rank = 2

 * postgresql://fknight:***@localhost/postgres
3 rows affected.


user_id,action,date,date_rank
1,cancel,2020-02-13,2
2,start,2020-02-11,2
3,cancel,2020-02-15,2


# Part D

Using the subqueries from Parts A, B, & C, solve the original problem.

In [10]:
%%sql

WITH date_ranks AS (
    SELECT 
        *, 
        row_number() 
            OVER (PARTITION BY user_id ORDER BY date DESC)
            AS date_rank
    FROM users 
),

latest AS (
    SELECT *
    FROM date_ranks
    WHERE date_rank = 1
),

next_latest AS ( 
    SELECT *
    FROM date_ranks
    WHERE date_rank = 2
)

SELECT
    l1.user_id,
    l1.date - l2.date AS days_elapsed
FROM latest l1
LEFT JOIN next_latest l2
    ON l1.user_id = l2.user_id
ORDER BY 1;

 * postgresql://fknight:***@localhost/postgres
4 rows affected.


user_id,days_elapsed
1,6.0
2,3.0
3,0.0
4,
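
User 4 has only one action, so there is no rank-2 row to join against. The following sketch shows what the choice of join does to that user. It is an illustration under assumptions rather than the notebook's own code: it runs on Python's built-in `sqlite3` module instead of PostgreSQL (SQLite 3.25 or newer for window functions), rebuilds `users` from the rows shown in Part A, and uses `julianday()` because SQLite cannot subtract date strings directly.

```python
import sqlite3

# Rows copied from the Part A output; SQLite stands in for PostgreSQL here.
ROWS = [
    (1, "start", "2020-02-12"),
    (1, "cancel", "2020-02-13"),
    (1, "publish", "2020-02-19"),
    (2, "start", "2020-02-11"),
    (2, "publish", "2020-02-14"),
    (3, "start", "2020-02-15"),
    (3, "cancel", "2020-02-15"),
    (4, "start", "2020-02-18"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, action TEXT, date TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)

# Same CTE structure as Part D; only the join type is swapped in.
QUERY = """
WITH date_ranks AS (
    SELECT user_id, date,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date DESC) AS date_rank
    FROM users
),
latest AS (SELECT * FROM date_ranks WHERE date_rank = 1),
next_latest AS (SELECT * FROM date_ranks WHERE date_rank = 2)
SELECT l1.user_id,
       julianday(l1.date) - julianday(l2.date) AS days_elapsed
FROM latest l1
{join_type} JOIN next_latest l2 ON l1.user_id = l2.user_id
ORDER BY 1
"""

left_rows = conn.execute(QUERY.format(join_type="LEFT")).fetchall()
inner_rows = conn.execute(QUERY.format(join_type="INNER")).fetchall()

print("LEFT JOIN: ", left_rows)
print("INNER JOIN:", inner_rows)
```

The left join keeps user 4 with a NULL `days_elapsed` (Python's `None`), while the inner join drops that user from the result entirely.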


## The solution is given below

In [3]:
%%sql

WITH date_ranks AS (
    SELECT 
        *, 
        row_number() 
            OVER (PARTITION BY user_id ORDER BY date DESC)
            AS date_rank
    FROM users 
),

latest AS (
    SELECT *
    FROM date_ranks
    WHERE date_rank = 1
),

next_latest AS ( 
    SELECT *
    FROM date_ranks
    WHERE date_rank = 2
)

-- Left join the two CTEs: every user has a latest action,
-- but not every user has a second-latest one.
-- Subtract the second-latest date from the latest date
-- to get the time elapsed.

SELECT
    l1.user_id,
    l1.date - l2.date AS days_elapsed
FROM latest l1
LEFT JOIN next_latest l2
    ON l1.user_id = l2.user_id
ORDER BY 1;

 * postgresql://fknight:***@localhost/postgres
4 rows affected.


user_id,days_elapsed
1,6.0
2,3.0
3,0.0
4,
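
As a possible alternative to the two filtered CTEs and the self-join, the same result can be sketched with a single pass using `LEAD()`, which reads the next row in the window. This is not the source's solution, only a variant under assumptions: it runs on Python's built-in `sqlite3` module rather than PostgreSQL (SQLite 3.25 or newer), rebuilds `users` from the rows shown in Part A, and uses `julianday()` for the date difference. In PostgreSQL the subtraction could stay as `date - LEAD(date) OVER (...)`.

```python
import sqlite3

# Rows copied from the Part A output; SQLite stands in for PostgreSQL here.
ROWS = [
    (1, "start", "2020-02-12"),
    (1, "cancel", "2020-02-13"),
    (1, "publish", "2020-02-19"),
    (2, "start", "2020-02-11"),
    (2, "publish", "2020-02-14"),
    (3, "start", "2020-02-15"),
    (3, "cancel", "2020-02-15"),
    (4, "start", "2020-02-18"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, action TEXT, date TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)

# With rows ordered newest first, LEAD(date) on the rank-1 row is the
# second-latest date, or NULL when the user has only one action.
# Both functions share the named window w so they see the same row order.
result = conn.execute(
    """
    SELECT user_id, days_elapsed
    FROM (
        SELECT user_id,
               julianday(date) - julianday(LEAD(date) OVER w) AS days_elapsed,
               ROW_NUMBER() OVER w AS date_rank
        FROM users
        WINDOW w AS (PARTITION BY user_id ORDER BY date DESC)
    )
    WHERE date_rank = 1
    ORDER BY user_id
    """
).fetchall()

for user_id, days_elapsed in result:
    print(user_id, days_elapsed)
```

Sharing one window definition between `LEAD()` and `ROW_NUMBER()` matters for user 3: with tied dates, two separately ordered windows could disagree about which row comes first.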
