# The Candidate Joining Puzzle

Source: https://msbiskills.com/2015/03/23/t-sql-query-the-candidate-joining-puzzle/

In this puzzle we have to find the valid joining date for each candidate. For example, candidate CJ10101 has a joining date of 10-01-2015, which the company’s holiday table lists as a holiday, so the joining date has to move one day earlier. Because 09-01-2015 is also a holiday, the valid joining date for CJ10101 is 08-01-2015. See the sample input and expected output for details.
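
To make the rule concrete, here is a minimal plain-Python sketch of it, stated directly: keep stepping back one day while the date is a holiday. This is not the SQL approach used in this notebook, only a reference for what the answer should be. The holiday dates are taken from the sample data shown in the outputs further down; the names `HOLIDAYS` and `valid_joining_date` are hypothetical.

```python
from datetime import date, timedelta

# Holiday dates copied from the sample Holidays table in this notebook.
HOLIDAYS = {
    date(2015, 1, 9),
    date(2015, 1, 10),
    date(2015, 2, 19),
    date(2015, 3, 11),
    date(2015, 4, 11),
}


def valid_joining_date(joining, holidays=HOLIDAYS):
    """Step back one day at a time until the date is not a holiday."""
    while joining in holidays:
        joining -= timedelta(days=1)
    return joining


print(valid_joining_date(date(2015, 1, 10)))  # 2015-01-08
print(valid_joining_date(date(2015, 2, 18)))  # 2015-02-18
```

The set-based SQL solution has to reach the same answers without a loop, which is why it first groups consecutive holidays into runs.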

In [1]:
%run Question.ipynb

 * postgresql://fknight:***@localhost/postgres
Done.
Done.
5 rows affected.
5 rows affected.


# Part A

In Part A, the goal is to flag each holiday so that, within a run of consecutive dates, the first date gets a "1" and every later date gets a "0". A holiday with no neighbouring holiday is a run of length one, so it also gets a "1".

We call this column `start_of_seq` below. See the result of the query below for an example of how this looks with real data.

In [2]:
%%sql

SELECT 
    ID,
    HolidayDate,
    CASE
    WHEN HolidayDate = LAG(HolidayDate, 1) 
        OVER (ORDER BY HolidayDate) + 1
    THEN 0
    ELSE 1
    END AS start_of_seq
FROM Holidays

 * postgresql://fknight:***@localhost/postgres
5 rows affected.


id,holidaydate,start_of_seq
102,2015-01-09,1
101,2015-01-10,0
103,2015-02-19,1
104,2015-03-11,1
105,2015-04-11,1
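
The same flagging logic can be sketched in plain Python, as a way to read the `CASE` and `LAG` expression step by step. This is an illustration under the assumption that the dates are unique and sorted, not a replacement for the query; `mark_starts` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Holiday dates from the query output above, sorted like ORDER BY HolidayDate.
holidays = sorted([
    date(2015, 1, 10),
    date(2015, 1, 9),
    date(2015, 2, 19),
    date(2015, 3, 11),
    date(2015, 4, 11),
])


def mark_starts(dates):
    """Return 1 where a date does not directly follow the previous one."""
    flags = []
    previous = None  # stands in for LAG(...) being NULL on the first row
    for current in dates:
        continues_run = (
            previous is not None
            and current == previous + timedelta(days=1)
        )
        flags.append(0 if continues_run else 1)
        previous = current
    return flags


start_of_seq = mark_starts(holidays)
print(start_of_seq)  # [1, 0, 1, 1, 1]
```

On the first row `LAG` returns NULL, the comparison is not true, and the `ELSE 1` branch applies; the `previous is not None` check plays that role here.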


# Part B

In Part B, we build on the binary sequence: a running sum of the "1"s gives every run of consecutive holidays its own rank. In the example below, you'll see that the holidays "2015-01-09" and "2015-01-10" both have a rank of "1".

In [3]:
%%sql

WITH seq AS (
    SELECT
        ID,
        HolidayDate,
        CASE
        WHEN HolidayDate = LAG(HolidayDate, 1) 
            OVER (ORDER BY HolidayDate) + 1
        THEN 0
        ELSE 1
        END AS start_of_seq
    FROM Holidays
)

SELECT 
    ID,
    HolidayDate,
    start_of_seq,
    SUM(start_of_seq) OVER (ORDER BY HolidayDate) AS holiday_rank
FROM seq

 * postgresql://fknight:***@localhost/postgres
5 rows affected.


id,holidaydate,start_of_seq,holiday_rank
102,2015-01-09,1,1
101,2015-01-10,0,1
103,2015-02-19,1,2
104,2015-03-11,1,3
105,2015-04-11,1,4
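
As a small illustration of why the running sum works as a group label, the sketch below applies a cumulative sum to the flags from the output above. It assumes the rows are already in `HolidayDate` order and that no date appears twice.

```python
from itertools import accumulate

# start_of_seq values from the query output above, in HolidayDate order.
start_of_seq = [1, 0, 1, 1, 1]

# A running total only increases when a new run starts, so every date in
# the same run ends up with the same value.
holiday_rank = list(accumulate(start_of_seq))
print(holiday_rank)  # [1, 1, 2, 3, 4]
```

The total stays flat across the "0" rows, which is what puts "2015-01-09" and "2015-01-10" in the same group.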


# Part C

In Part C, we group the holidays by the `holiday_rank` column and take the minimum and maximum date of each group. The query has no `ORDER BY`, so the groups come back in no particular order.



In [4]:
%%sql

WITH seq AS (
    SELECT
        ID,
        HolidayDate,
        CASE
        WHEN HolidayDate = LAG(HolidayDate, 1) 
            OVER (ORDER BY HolidayDate) + 1
        THEN 0
        ELSE 1
        END AS start_of_seq
    FROM Holidays
),

ranks AS (
    SELECT 
        ID,
        HolidayDate,
        start_of_seq,
        SUM(start_of_seq) OVER (ORDER BY HolidayDate) AS holiday_rank
    FROM seq
)

SELECT
    MIN(HolidayDate) AS min_date,
    MAX(HolidayDate) AS max_date
FROM ranks
GROUP BY holiday_rank

 * postgresql://fknight:***@localhost/postgres
4 rows affected.


min_date,max_date
2015-03-11,2015-03-11
2015-04-11,2015-04-11
2015-02-19,2015-02-19
2015-01-09,2015-01-10
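
The grouping step can be sketched in Python as follows. It is a rough equivalent of `GROUP BY holiday_rank` with `MIN` and `MAX`, assuming the rows arrive sorted by date so that equal ranks are adjacent; the variable names are chosen for illustration.

```python
from datetime import date
from itertools import groupby

# (HolidayDate, holiday_rank) pairs from the Part B output, in date order.
ranked = [
    (date(2015, 1, 9), 1),
    (date(2015, 1, 10), 1),
    (date(2015, 2, 19), 2),
    (date(2015, 3, 11), 3),
    (date(2015, 4, 11), 4),
]

min_max = []
# groupby only merges adjacent rows, which is enough here because the
# rank never decreases as the dates increase.
for rank, rows in groupby(ranked, key=lambda row: row[1]):
    dates = [holiday for holiday, _ in rows]
    min_max.append((min(dates), max(dates)))

for min_date, max_date in min_max:
    print(min_date, max_date)
```

Unlike the SQL result, this list is in date order because the input was sorted first.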


# Part D

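In the final part, Part D, we left join the `candidate_joining` table to the min and max dates using a `BETWEEN` condition to get the "valid" joining dates. A joining date that falls inside a run of holidays moves to the day before the run starts; any other joining date finds no match and is kept as is.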

## The full solution is given below

In [5]:
%%sql

WITH seq AS (
    SELECT
        ID,
        HolidayDate,
        CASE
        WHEN HolidayDate = LAG(HolidayDate, 1) 
            OVER (ORDER BY HolidayDate) + 1
        THEN 0
        ELSE 1
        END AS start_of_seq
    FROM Holidays
),

ranks AS (
    SELECT 
        ID,
        HolidayDate,
        start_of_seq,
        SUM(start_of_seq) OVER (ORDER BY HolidayDate) AS holiday_rank
    FROM seq
),

min_max AS (
    SELECT
        MIN(HolidayDate) AS min_date,
        MAX(HolidayDate) AS max_date
    FROM ranks
    GROUP BY holiday_rank
)

SELECT 
    CId, 
    CJoiningDate,
    CASE 
    WHEN Min_Date IS NULL 
    THEN CJoiningDate ELSE Min_Date - 1 
    END as Valid_Join_Date 
FROM candidate_joining j
LEFT JOIN min_max c 
ON j.CJoiningDate 
    BETWEEN c.Min_Date AND c.Max_Date

 * postgresql://fknight:***@localhost/postgres
5 rows affected.


cid,cjoiningdate,valid_join_date
CJ10101,2015-01-10,2015-01-08
CJ10104,2015-01-10,2015-01-08
CJ10105,2015-02-18,2015-02-18
CJ10121,2015-03-11,2015-03-10
CJ10198,2015-04-11,2015-04-10
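
As a cross-check, the final join can be sketched in plain Python using the holiday runs and candidate rows shown in the outputs above. This is a minimal sketch, not the notebook's method; `valid_join_date` is a hypothetical helper that mirrors the `LEFT JOIN ... BETWEEN` plus `CASE` logic.

```python
from datetime import date, timedelta

# (min_date, max_date) per run of consecutive holidays, from Part C.
min_max = [
    (date(2015, 1, 9), date(2015, 1, 10)),
    (date(2015, 2, 19), date(2015, 2, 19)),
    (date(2015, 3, 11), date(2015, 3, 11)),
    (date(2015, 4, 11), date(2015, 4, 11)),
]

# (CId, CJoiningDate) rows as shown in the final output.
candidates = [
    ("CJ10101", date(2015, 1, 10)),
    ("CJ10104", date(2015, 1, 10)),
    ("CJ10105", date(2015, 2, 18)),
    ("CJ10121", date(2015, 3, 11)),
    ("CJ10198", date(2015, 4, 11)),
]


def valid_join_date(joining, runs):
    """Return the day before the run containing `joining`, else `joining`."""
    for min_date, max_date in runs:
        if min_date <= joining <= max_date:  # the BETWEEN condition
            return min_date - timedelta(days=1)
    return joining  # no matching run: the LEFT JOIN produced NULLs


result = {cid: valid_join_date(joined, min_max) for cid, joined in candidates}
for cid, valid in result.items():
    print(cid, valid)
```

Because consecutive holidays were merged into a single run in Part C, the day before `min_date` cannot itself be a holiday, so one subtraction is enough.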
