In [1]:
%run helper/setup_notebook.ipynb

Successfully connected to sql_lab database.


#### Using the `MAX()` Analytical Function To Group Data Based On a Partition Column

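We can use `MAX()` as an analytical (window) function in MySQL 8.0 or later to compute the latest registration date within each partition, here each state, without collapsing the rows. Filtering on that value then gives the list of registrations that have the latest registration date in each group.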

This pattern is useful for tasks such as:
- finding the latest order date for each customer, and
- finding the latest activity date for each user.

In [4]:
%%sql
-- The sample table holds user registration records.
-- Notice that some states have several users with different registration dates.
SELECT *
FROM registrations;


ID,FIRST_NAME,LAST_NAME,ADDRESS_STATE,REGISTER_DATE
1,Emma,Gonzales,Florida,2022-01-01 06:06:00
2,Avery,Jordan,West Virginia,2021-09-23 05:04:00
3,Liam,Murphy,California,2020-11-16 22:34:00
4,Olivia,Moore,Florida,2020-04-15 15:26:00
5,Noah,Bishop,New Jersey,2023-04-04 15:22:00
6,Sophia,White,Michigan,
7,Jackson,Black,Ohio,2020-07-02 06:34:00
8,Isabella,Crawford,California,2020-03-15 14:07:00
9,Lucas,Simmons,Wisconsin,2021-08-18 06:10:00
10,Mia,Kelly,Georgia,2020-01-21 03:54:00


In [5]:
%%sql 

-- Step 1. Use the MAX() analytical function to find the latest registration date for each state.

SELECT 
    first_name,
    last_name,
    address_state,
    register_date,
    MAX(register_date) OVER (PARTITION BY address_state) AS latest_register_date
FROM registrations

first_name,last_name,address_state,register_date,latest_register_date
Ava,Torres,Alabama,2021-12-04 08:20:00,2021-12-04 08:20:00
Ethan,Hunt,Arizona,2022-05-16 03:57:00,2022-05-16 03:57:00
Liam,Murphy,California,2020-11-16 22:34:00,2023-05-23 05:24:00
Isabella,Crawford,California,2020-03-15 14:07:00,2023-05-23 05:24:00
William,Scott,California,2022-03-03 02:23:00,2023-05-23 05:24:00
Grace,Ferguson,California,2023-05-23 05:24:00,2023-05-23 05:24:00
Henry,Rose,District of Columbia,2020-10-20 14:11:00,2020-10-20 14:11:00
Emma,Gonzales,Florida,2022-01-01 06:06:00,2022-01-01 06:06:00
Olivia,Moore,Florida,2020-04-15 15:26:00,2022-01-01 06:06:00
Mia,Kelly,Georgia,2020-01-21 03:54:00,2021-08-14 13:36:00
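
For contrast, here is a minimal sketch of the aggregate form of `MAX()` with `GROUP BY`, run against the same `registrations` table. It returns one row per state and drops the per-user columns, whereas the window form above keeps every row and repeats the state's latest date on each one.

```sql
-- Aggregate form: collapses each state to a single row.
-- first_name and last_name cannot be selected here without further work,
-- which is why the window form is used when the full row is needed.
SELECT
    address_state,
    MAX(register_date) AS latest_register_date
FROM registrations
GROUP BY address_state;
```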


In [7]:
%%sql 

-- Step 2. Keep only the rows whose register_date equals the latest date for their state.

SELECT 
    first_name,
    last_name,
    address_state,
    register_date
FROM (
    SELECT 
        first_name,
        last_name,
        address_state,
        register_date,
        MAX(register_date) OVER (PARTITION BY address_state) AS latest_register_date
    FROM registrations
) b -- MySQL requires an alias for every derived table (subquery in FROM)
WHERE register_date = b.latest_register_date


first_name,last_name,address_state,register_date
Ava,Torres,Alabama,2021-12-04 08:20:00
Ethan,Hunt,Arizona,2022-05-16 03:57:00
Grace,Ferguson,California,2023-05-23 05:24:00
Henry,Rose,District of Columbia,2020-10-20 14:11:00
Emma,Gonzales,Florida,2022-01-01 06:06:00
Harper,Castillo,Georgia,2021-08-14 13:36:00
Elijah,Freeman,Illinois,2023-01-21 23:40:00
Luna,Stevens,Nevada,2022-12-26 02:48:00
Noah,Bishop,New Jersey,2023-04-04 15:22:00
Jackson,Black,Ohio,2020-07-02 06:34:00
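
One caveat with the equality filter: a row whose `register_date` is NULL, such as ID 6 in the sample listing, can never satisfy `register_date = latest_register_date`, because a comparison with NULL does not evaluate to true. A state whose rows all have a NULL date therefore disappears from the result. The following is a minimal sketch for listing such states, assuming the same `registrations` table:

```sql
-- MAX() ignores NULLs, so it returns NULL only when every
-- register_date in the state is NULL.
SELECT address_state
FROM registrations
GROUP BY address_state
HAVING MAX(register_date) IS NULL;
```

Whether any state shows up depends on the full table, of which only the first rows are displayed above.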


The `MAX()` function is used with the `OVER()` clause to find the maximum (latest) registration date for each state.

The `PARTITION BY address_state` expression inside `OVER()` splits the rows into one partition per state, and `MAX()` is evaluated over each partition. Every row keeps its own columns and gains the maximum value of the `register_date` column for its partition.

Comparing each row's `register_date` with that value keeps only the rows that have the latest registration date for their state. If several rows in a state share the latest date, all of them are returned.
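
If exactly one row per state is needed even when two users share the same latest date, a different window function, `ROW_NUMBER()`, can be swapped in. The sketch below is an alternative technique rather than the method used in this notebook. Using the `id` column as the tie-breaker is an assumption made for illustration, and so are the alias names `ranked` and `rn`.

```sql
-- Number the rows of each state from newest to oldest, then keep row 1.
SELECT
    first_name,
    last_name,
    address_state,
    register_date
FROM (
    SELECT
        first_name,
        last_name,
        address_state,
        register_date,
        ROW_NUMBER() OVER (
            PARTITION BY address_state
            ORDER BY register_date DESC, id DESC  -- id breaks ties (assumed choice)
        ) AS rn
    FROM registrations
) ranked
WHERE rn = 1;
```

In MySQL, NULL values sort last under `ORDER BY ... DESC`, so a state whose only rows have a NULL `register_date` still returns one row here, with a NULL date. Adding `WHERE register_date IS NOT NULL` to the inner query would exclude those rows.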