# The Patient Puzzle

Source: https://msbiskills.com/2015/03/23/473/

In this puzzle we have to group rows by each patient's admission and discharge dates. If one stay's discharge date + 1 day equals the same patient's next admission date, we have to merge both rows into one row and sum their costs. Please check the sample input and expected output for details.
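The merge rule can be checked by hand before writing any SQL. The snippet below is only an illustrative sketch: `is_continuation` is a hypothetical helper name, not part of the original notebook, and the dates are two pairs of consecutive stays for patient 1009 taken from the query output further down.

```python
from datetime import date, timedelta


def is_continuation(prev_discharge: date, admission: date) -> bool:
    """True when a stay begins the day after the previous stay ended."""
    return prev_discharge + timedelta(days=1) == admission


# Discharged 2014-07-31, readmitted 2014-08-01: the two rows should be merged.
print(is_continuation(date(2014, 7, 31), date(2014, 8, 1)))   # True

# Discharged 2014-08-23, readmitted 2014-08-31: a gap, so a new group starts.
print(is_continuation(date(2014, 8, 23), date(2014, 8, 31)))  # False
```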

In [2]:
%run Question.ipynb

 * postgresql://fknight:***@localhost/postgres
Done.
Done.
8 rows affected.
8 rows affected.


# Part A

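In Part A, we add a binary column, start_of_seq, that marks the start of every date group: it is 1 when a row starts a new group, and 0 when the admission falls the day after the same patient's previous discharge.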

In [9]:
%%sql

SELECT 
    *, 
    CASE
    WHEN (AdmissionDate - interval '1 day') = LAG(DischargeDate, 1)
        OVER (PARTITION BY PatientID ORDER BY AdmissionDate) 
    THEN 0
    ELSE 1
    END as start_of_seq
FROM
    patients;

 * postgresql://fknight:***@localhost/postgres
8 rows affected.


patientid,admissiondate,dischargedate,cost,start_of_seq
1009,2014-07-27,2014-07-31,"$1,050.00",1
1009,2014-08-01,2014-08-23,"$1,070.00",0
1009,2014-08-31,2014-08-31,"$1,900.00",1
1009,2014-09-01,2014-09-14,"$1,260.00",0
1009,2014-12-01,2014-12-31,"$2,090.00",1
1024,2014-06-07,2014-06-28,"$1,900.00",1
1024,2014-06-29,2014-07-31,"$2,900.00",0
1024,2014-08-01,2014-08-02,"$1,800.00",0
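For readers who find the window function easier to follow as a loop, here is a rough Python equivalent of the flag logic. It is a sketch under assumptions, not the notebook's method: `start_flags` is a hypothetical name, the rows are patient 1009's stays copied from the output above, and the `prev` variable stands in for `LAG(DischargeDate, 1)`.

```python
from datetime import date, timedelta

# (patient_id, admission, discharge) for patient 1009, copied from the output above
stays = [
    (1009, date(2014, 7, 27), date(2014, 7, 31)),
    (1009, date(2014, 8, 1), date(2014, 8, 23)),
    (1009, date(2014, 8, 31), date(2014, 8, 31)),
    (1009, date(2014, 9, 1), date(2014, 9, 14)),
    (1009, date(2014, 12, 1), date(2014, 12, 31)),
]


def start_flags(rows):
    """Return 1 for rows that open a new group, 0 for rows that continue one."""
    flags = []
    prev = None  # previous row, playing the role of LAG(...)
    for row in sorted(rows, key=lambda r: (r[0], r[1])):
        patient, admission, _ = row
        continues = (
            prev is not None
            and prev[0] == patient  # never chain across patients
            and prev[2] + timedelta(days=1) == admission
        )
        flags.append(0 if continues else 1)
        prev = row
    return flags


print(start_flags(stays))  # [1, 0, 1, 0, 1]
```

The printed flags match the start_of_seq values shown for patient 1009.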


# Part B

In Part B, we build on the previous query: a running sum of start_of_seq, taken in PatientID, AdmissionDate order, gives every row in the same date group the same number, date_rank.

In [16]:
%%sql

WITH seq AS (
    SELECT 
        *, 
        CASE
        WHEN (AdmissionDate - interval '1 day') = LAG(DischargeDate, 1)
            OVER (ORDER BY PatientID, AdmissionDate) 
        THEN 0
        ELSE 1
        END as start_of_seq
    FROM
        patients
)

SELECT 
    *,
    SUM(start_of_seq) 
        OVER (ORDER BY PatientID, AdmissionDate)  
        AS date_rank
FROM seq

 * postgresql://fknight:***@localhost/postgres
8 rows affected.


patientid,admissiondate,dischargedate,cost,start_of_seq,date_rank
1009,2014-07-27,2014-07-31,"$1,050.00",1,1
1009,2014-08-01,2014-08-23,"$1,070.00",0,1
1009,2014-08-31,2014-08-31,"$1,900.00",1,2
1009,2014-09-01,2014-09-14,"$1,260.00",0,2
1009,2014-12-01,2014-12-31,"$2,090.00",1,3
1024,2014-06-07,2014-06-28,"$1,900.00",1,4
1024,2014-06-29,2014-07-31,"$2,900.00",0,4
1024,2014-08-01,2014-08-02,"$1,800.00",0,4
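The running sum is what turns the 0/1 flags into group numbers. As a small illustration (a sketch, with the flag values copied from the start_of_seq column above), `itertools.accumulate` does the same job as `SUM(...) OVER (ORDER BY ...)`:

```python
from itertools import accumulate

# start_of_seq column from the output above, in (PatientID, AdmissionDate) order
start_of_seq = [1, 0, 1, 0, 1, 1, 0, 0]

# Every 1 bumps the counter; every 0 keeps the row in the current group.
date_rank = list(accumulate(start_of_seq))
print(date_rank)  # [1, 1, 2, 2, 3, 4, 4, 4]
```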


# Part C

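In Part C, we build on the previous two subqueries and solve the original problem: grouping by PatientID and date_rank, we take the earliest admission date, the latest discharge date, and the total cost of each group.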

In [17]:
%%sql

WITH seq AS (
    SELECT 
        *, 
        CASE
        WHEN (AdmissionDate - interval '1 day') = LAG(DischargeDate, 1)
            OVER (ORDER BY PatientID, AdmissionDate) 
        THEN 0
        ELSE 1
        END as start_of_seq
    FROM
        patients
),

ranks AS (
    SELECT 
        *,
        SUM(start_of_seq) 
            OVER (ORDER BY PatientID, AdmissionDate)  
            AS date_rank
    FROM seq
)

SELECT 
    patientid,
    min(admissiondate) as admissiondate,
    max(dischargedate) as dischargedate,
    sum(cost) as cost
FROM ranks 
GROUP BY PatientID, date_rank 
ORDER BY PatientID, AdmissionDate;

 * postgresql://fknight:***@localhost/postgres
4 rows affected.


patientid,admissiondate,dischargedate,cost
1009,2014-07-27,2014-08-23,"$2,120.00"
1009,2014-08-31,2014-09-14,"$3,160.00"
1009,2014-12-01,2014-12-31,"$2,090.00"
1024,2014-06-07,2014-08-02,"$6,600.00"
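The final aggregation can also be sketched in plain Python. This is an illustration only: `merge_groups` is a hypothetical name, the rows are the last four rows of the Part B output, and the costs are written as `Decimal` values without the currency formatting.

```python
from datetime import date
from decimal import Decimal
from itertools import groupby

# (patientid, admissiondate, dischargedate, cost, date_rank) from the Part B output
rows = [
    (1009, date(2014, 12, 1), date(2014, 12, 31), Decimal("2090.00"), 3),
    (1024, date(2014, 6, 7), date(2014, 6, 28), Decimal("1900.00"), 4),
    (1024, date(2014, 6, 29), date(2014, 7, 31), Decimal("2900.00"), 4),
    (1024, date(2014, 8, 1), date(2014, 8, 2), Decimal("1800.00"), 4),
]


def merge_groups(rows):
    """Collapse each (patientid, date_rank) group into a single row."""
    def key(r):
        return (r[0], r[4])

    merged = []
    # groupby only groups adjacent items, so sort by the same key first
    for (patient, _), group in groupby(sorted(rows, key=key), key=key):
        group = list(group)
        merged.append((
            patient,
            min(r[1] for r in group),  # earliest admission
            max(r[2] for r in group),  # latest discharge
            sum(r[3] for r in group),  # total cost
        ))
    return merged


for row in merge_groups(rows):
    print(row)
```

The three rows for patient 1024 collapse into one stay from 2014-06-07 to 2014-08-02 with a total cost of 6600.00, which agrees with the last row of the SQL result.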


