# Calculating Employment and Unemployment Spells

In [1]:
# This cell loads the needed packages and connects to the database.
# load sqlalchemy package
import sqlalchemy

# Define connection string (TDI is the ODBC DSN)
connection_string = "mssql+pyodbc://@TDI"

# Create the engine connecting to the database server
engine = sqlalchemy.create_engine(connection_string)

# Load sql magics
%load_ext sql

# Connect to the database server
%sql $connection_string

## Purpose:

During the quarters following the client's program start, we want to figure out:

1. The length of the longest employment spell: for how many continuous quarters did the client have reported earnings during the 3 years AFTER the program start quarter?

2. The length of the longest unemployment spell: for how many continuous quarters did the client have no reported earnings during the 3 years AFTER the program start quarter?

3. Did the client have any employment spells lasting 4 consecutive quarters or more?

#### Starting point 

As mentioned above, we do not want to include the quarter in which the client started the program (relative quarter 0) in these measures, so we start with relative quarter 1. We want to follow the client for 3 years, so we use 12 quarters of wage data. A visual check of the base data displayed for the first client suggests that their longest employment spell was 2 quarters and their longest unemployment spell was 5 quarters.

Notice:
1. We kept only relative quarters 1-12, the quarters after the program start quarter.
2. We already have a 0/1 (no/yes) column, QTR_Employed, that tells us whether the client was employed in the quarter.
3. To judge spells of continuous employment, the rows must be ordered sequentially by relative quarter (RelativeQTR) within each SSN.
4. To judge how long an employment or unemployment spell lasted, we have to know what happened on the records before and after it.
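
The visual check described above can be reproduced with a few lines of Python. This is only a sketch: the `flags` list is typed in by hand from the ten rows displayed for SSN 100000000 in the query output below, and `longest_run` is a hypothetical helper, not part of the notebook's SQL.

```python
from itertools import groupby


def longest_run(flags, value):
    """Return the length of the longest unbroken run of `value` in `flags`."""
    run_lengths = (
        sum(1 for _ in run)            # number of items in this run
        for key, run in groupby(flags)  # groupby splits the list into consecutive runs
        if key == value
    )
    return max(run_lengths, default=0)


# QTR_Employed for the rows displayed for SSN 100000000 (relative quarters 1-10)
flags = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0]

longest_employment = longest_run(flags, 1)
longest_unemployment = longest_run(flags, 0)
print(longest_employment, longest_unemployment)
```

`groupby` only merges neighbouring equal values, which is why the rows must already be in quarter order before the runs are measured.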

In [2]:
%%sql
SELECT TOP 24 
SSN, ProgStart, YR_QTR, RelativeQTR, QTR_Earnings, QTR_Employed
FROM dbo.UIQuarterlyMeasuresv
WHERE RelativeQTR BETWEEN 1 and 12 /* we only want to consider the quarters 1-12 following program start. */
ORDER BY SSN , RelativeQTR;

 * mssql+pyodbc://@TDI
Done.


SSN,ProgStart,YR_QTR,RelativeQTR,QTR_Earnings,QTR_Employed
100000000,2017-06-01,2017Q3,1,12550,1
100000000,2017-06-01,2017Q4,2,0,0
100000000,2017-06-01,2018Q1,3,0,0
100000000,2017-06-01,2018Q2,4,0,0
100000000,2017-06-01,2018Q3,5,0,0
100000000,2017-06-01,2018Q4,6,0,0
100000000,2017-06-01,2019Q1,7,3804,1
100000000,2017-06-01,2019Q2,8,17082,1
100000000,2017-06-01,2019Q3,9,0,0
100000000,2017-06-01,2019Q4,10,0,0


### Calculating length of longest employment and unemployment spells

The code below calculates the length of each client's longest unemployment spell and longest employment spell (MaxSpell). Each spell's length is first computed as SpellSize; QTR_Employed is 1 for an employment spell and 0 for an unemployment spell. We will step through each component of the code in the next section.

We print the first 10 rows of the final data, which cover the first 5 SSNs (one row per employment status). Notice that, as expected, the first client has a maximum unemployment spell of 5 quarters (where QTR_Employed=0) and a maximum employment spell of 2 quarters (where QTR_Employed=1).
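
For readers without access to the database, the same lag, running-sum, count, and max pattern can be tried on a toy table. The sketch below is an assumption-laden stand-in, not the notebook's query: it uses Python's built-in `sqlite3` module (window functions need SQLite 3.25 or later), a made-up `wages` table with made-up column names, and two invented clients "A" and "B". It also partitions the running sum by client, whereas the query that follows uses one running sum across all clients.

```python
import sqlite3

# Toy data: (ssn, rel_qtr, employed). Both clients are invented for illustration.
rows = [("A", q, e) for q, e in enumerate([1, 0, 0, 0, 0, 0, 1, 1, 0, 0], start=1)]
rows += [("B", q, e) for q, e in enumerate([1, 1, 1, 1, 0], start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wages (ssn TEXT, rel_qtr INTEGER, employed INTEGER)")
conn.executemany("INSERT INTO wages VALUES (?, ?, ?)", rows)

query = """
WITH prev AS (
    -- copy the previous quarter's status onto each row
    SELECT ssn, rel_qtr, employed,
           LAG(employed) OVER (PARTITION BY ssn ORDER BY rel_qtr) AS prev_employed
    FROM wages
),
grp AS (
    -- running count of status changes = spell ID within each client
    SELECT ssn, employed,
           SUM(CASE WHEN employed = prev_employed THEN 0 ELSE 1 END)
               OVER (PARTITION BY ssn ORDER BY rel_qtr) AS spell_id
    FROM prev
),
spell AS (
    -- number of quarters in each spell
    SELECT ssn, employed, spell_id, COUNT(*) AS spell_size
    FROM grp
    GROUP BY ssn, employed, spell_id
)
SELECT ssn, employed, MAX(spell_size) AS max_spell
FROM spell
GROUP BY ssn, employed
ORDER BY ssn, employed
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
conn.close()
```

Partitioning the running sum by client restarts the spell IDs at 1 for each client. The longest-spell results are the same either way, because a NULL previous value on each client's first row always starts a new spell.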

In [3]:
%%sql
WITH PREV AS (
    -- Step 1: use the LAG function to copy each record's QTR_Employed value
    -- (employment status) down to the next record for the same SSN.
    SELECT SSN, YR_QTR, QTR_Employed,
           LAG(QTR_EMPLOYED) OVER (PARTITION BY SSN ORDER BY SSN, YR_QTR) AS PREV
    FROM UIQuarterlyMeasuresV
    WHERE RelativeQTR BETWEEN 1 AND 12 /* we only want to consider the quarters 1-12 following program start. */
),
GROUPS AS (
    -- Step 2: assign a group number to every spell. The number increases whenever
    -- employment status differs from the previous record. PREV is NULL on each
    -- SSN's first record, so every SSN also starts a new group.
    SELECT SSN, YR_QTR, QTR_EMPLOYED,
           SUM(CASE WHEN QTR_EMPLOYED = PREV THEN 0 ELSE 1 END) OVER (ORDER BY SSN, YR_QTR) AS GROUPS
    FROM PREV
),
SPELLSIZE AS (
    -- Step 3: group by the group number created above and count the quarters in each group.
    SELECT SSN, QTR_EMPLOYED, GROUPS, COUNT(*) AS SpellSize
    FROM GROUPS
    GROUP BY SSN, QTR_EMPLOYED, GROUPS
)
-- Step 4: select the longest spell for each SSN and employment status.
SELECT TOP 10
    SSN, QTR_EMPLOYED, MAX(SpellSize) AS MaxSpell
FROM SPELLSIZE
GROUP BY SSN, QTR_EMPLOYED
ORDER BY SSN, QTR_EMPLOYED
;

 * mssql+pyodbc://@TDI
Done.


SSN,QTR_EMPLOYED,MaxSpell
100000000,0,5
100000000,1,2
100900056,0,5
100900056,1,4
101800112,0,2
101800112,1,4
102700168,0,3
102700168,1,2
103600224,0,4
103600224,1,2


In [4]:
%%sql
DROP TABLE IF EXISTS PREV
/* THIS CODE IS NEEDED IF THE PROGRAM HAS TO BE RERUN */
;

 * mssql+pyodbc://@TDI
Done.


[]

### Say What?: Let's Step Through the Code Above.

#### Using the LAG function to copy the employment status down to the next record

1. To determine whether a spell is continuing, we have to know whether the current record has the same value as the record preceding it.
2. To determine this, the records have to be in sequential order for each SSN.
3. A spell ends, and a new one begins, on any row where QTR_Employed and PREV are not equal (for example, on row 2 and row 7 below).
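
What LAG does can be imitated in plain Python, which may help when reading the SQL in the next cell. The sketch below is illustrative only: `add_prev` is a hypothetical helper, the first SSN's rows are copied from the displayed data, and SSN 999999999 is invented to show that the previous value resets for each client.

```python
from itertools import groupby
from operator import itemgetter

# (SSN, YR_QTR, QTR_Employed) rows; SSN 999999999 is invented for illustration.
rows = [
    (100000000, "2017Q4", 0),
    (100000000, "2017Q3", 1),
    (100000000, "2018Q1", 0),
    (999999999, "2017Q3", 0),
    (999999999, "2017Q4", 1),
]


def add_prev(rows):
    """Mimic LAG(QTR_Employed) OVER (PARTITION BY SSN ORDER BY YR_QTR)."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # sort by SSN, then quarter
    for ssn, partition in groupby(ordered, key=itemgetter(0)):
        prev = None  # the first row of each SSN has no predecessor (SQL NULL)
        for _, yr_qtr, employed in partition:
            out.append((ssn, yr_qtr, employed, prev))
            prev = employed
    return out


with_prev = add_prev(rows)
for row in with_prev:
    print(row)
```

Sorting "2017Q3"-style labels as text works here because the year comes first and the quarter is a single digit.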

In [5]:
%%sql
-- this step uses the LAG function to copy down the value of QTR_Employed (employment status) to the next record

SELECT 
    SSN, YR_QTR,QTR_Employed,
    LAG(QTR_EMPLOYED) OVER(PARTITION BY SSN ORDER BY SSN,YR_QTR) AS PREV
    
into PREV
              
FROM UIQuarterlyMeasuresV
where RelativeQTR  BETWEEN 1 and  12 /* we only want to consider the quarters 1-12 following program start. */
;

SELECT TOP 17 * 
FROM PREV
ORDER BY SSN, YR_QTR;


 * mssql+pyodbc://@TDI
9883 rows affected.
Done.


SSN,YR_QTR,QTR_Employed,PREV
100000000,2017Q3,1,
100000000,2017Q4,0,1.0
100000000,2018Q1,0,0.0
100000000,2018Q2,0,0.0
100000000,2018Q3,0,0.0
100000000,2018Q4,0,0.0
100000000,2019Q1,1,0.0
100000000,2019Q2,1,1.0
100000000,2019Q3,0,1.0
100000000,2019Q4,0,0.0


In [6]:
%%sql
DROP TABLE IF EXISTS GROUPS
/* THIS CODE IS NEEDED IF THE PROGRAM HAS TO BE RERUN */
;

 * mssql+pyodbc://@TDI
Done.


[]

#### Next, Assign an ID Number to each spell

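Notice that the GROUPS column assigns an ID number to each spell of employment or unemployment. The ID is a running count of status changes over the whole table ordered by SSN and quarter, so it keeps increasing from one client to the next rather than restarting at 1.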
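
The running-sum idea can be checked by hand in Python. In this sketch the `(QTR_Employed, PREV)` pairs are copied from the PREV output shown above for SSN 100000000, with `None` standing in for the SQL NULL; the variable names are illustrative only.

```python
from itertools import accumulate

# (QTR_Employed, PREV) pairs for SSN 100000000; None represents NULL on the first row.
pairs = [(1, None), (0, 1), (0, 0), (0, 0), (0, 0), (0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]

# 1 marks the start of a new spell, 0 a continuation. A NULL PREV never equals
# anything, so the first row always starts a spell (the ELSE 1 branch in SQL).
change_flags = [
    0 if prev is not None and employed == prev else 1
    for employed, prev in pairs
]

# A running total of the flags gives every row in the same spell the same ID.
group_ids = list(accumulate(change_flags))
print(group_ids)
```

The resulting IDs match the GROUPS column displayed for this client in the output of the next cell.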

In [7]:
%%sql
-- this step assigns a group number to every spell, changing the group number when employment status changes from previous

SELECT 
    SSN,YR_QTR,QTR_EMPLOYED, PREV,
    SUM(CASE WHEN QTR_EMPLOYED=PREV THEN 0 ELSE 1 END) OVER(ORDER BY SSN,YR_QTR) AS GROUPS -- ADD 1 WHEN RECORDS DON'T EQUAL
INTO GROUPS
FROM PREV
    ;
SELECT TOP 17 * 
FROM GROUPS 
ORDER BY SSN,YR_QTR;

 * mssql+pyodbc://@TDI
9883 rows affected.
Done.


SSN,YR_QTR,QTR_EMPLOYED,PREV,GROUPS
100000000,2017Q3,1,,1
100000000,2017Q4,0,1.0,2
100000000,2018Q1,0,0.0,2
100000000,2018Q2,0,0.0,2
100000000,2018Q3,0,0.0,2
100000000,2018Q4,0,0.0,2
100000000,2019Q1,1,0.0,3
100000000,2019Q2,1,1.0,3
100000000,2019Q3,0,1.0,4
100000000,2019Q4,0,0.0,4


In [8]:
%%sql
DROP TABLE IF EXISTS SPELLSIZE
/* THIS CODE IS NEEDED IF THE PROGRAM HAS TO BE RERUN */
;

 * mssql+pyodbc://@TDI
Done.


[]

#### Count how many records have the same spell Group ID

In [9]:
%%sql

-- this step groups by the group number created above and counts how many quarters are in each group

SELECT SSN, QTR_EMPLOYED,GROUPS , COUNT(*) AS SpellSize

INTO SPELLSIZE

FROM GROUPS

GROUP BY SSN, QTR_EMPLOYED, GROUPS
;

SELECT TOP 10 * FROM SPELLSIZE ORDER BY SSN, GROUPS;



 * mssql+pyodbc://@TDI
4901 rows affected.
Done.


SSN,QTR_EMPLOYED,GROUPS,SpellSize
100000000,1,1,1
100000000,0,2,5
100000000,1,3,2
100000000,0,4,3
100000000,1,5,1
100900056,1,6,2
100900056,0,7,5
100900056,1,8,4
100900056,0,9,1
101800112,0,10,1


#### Finally, keep only the maximum employment and unemployment spell

In [10]:
%%sql
-- this step selects the maximum spell for each SSN and employment status.
SELECT
SSN, QTR_EMPLOYED, MAX(SpellSize) AS MaxSpell

INTO EmploymentRetention

FROM SPELLSIZE

GROUP BY  SSN, QTR_EMPLOYED 
     
ORDER BY  SSN, QTR_EMPLOYED 
;

-- There is no ORDER BY on this SELECT, so the 10 rows shown come back in arbitrary order.
SELECT TOP 10 * FROM EmploymentRetention;

 * mssql+pyodbc://@TDI
1880 rows affected.
Done.


SSN,QTR_EMPLOYED,MaxSpell
380817472,0,1
563528840,0,2
220607504,1,2
764241328,0,1
435720888,0,1
461822512,0,2
190005600,0,1
122501400,1,1
861447376,0,1
657134664,0,3


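### Flatten the result set

After obtaining the maximum unemployment and employment spell for each SSN, we flatten the table to have one record per SSN, with the two spell lengths in separate columns and a 0/1 flag for any employment spell of 4 or more consecutive quarters.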
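
The flattening step is a small pivot from two rows per SSN to one. The Python sketch below mirrors that logic under stated assumptions: `long_rows` is copied from the first four rows of the MaxSpell output shown earlier, and the dictionary layout is illustrative rather than anything the notebook produces.

```python
# (SSN, QTR_EMPLOYED, MaxSpell) rows copied from the MaxSpell output shown earlier
long_rows = [
    (100000000, 0, 5), (100000000, 1, 2),
    (100900056, 0, 5), (100900056, 1, 4),
]

flat = {}
for ssn, employed, max_spell in long_rows:
    # One record per SSN. None plays the role of SQL NULL for a client who has
    # no spell of a given type (for example, someone never employed).
    rec = flat.setdefault(ssn, {
        "LongestUnemploymentSpell3Y": None,
        "LongestEmploymentSpell3Y": None,
        "Employ4QtrConsec": 0,
    })
    if employed == 0:
        rec["LongestUnemploymentSpell3Y"] = max_spell
    else:
        rec["LongestEmploymentSpell3Y"] = max_spell
        # Flag an employment spell of 4 or more consecutive quarters.
        rec["Employ4QtrConsec"] = int(max_spell >= 4)

for ssn, rec in flat.items():
    print(ssn, rec)
```

The condition `max_spell >= 4` is equivalent to the `MAXSPELL>3` test in the SQL, since spell lengths are whole numbers.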

In [11]:
%%sql
SELECT  TOP 10
SSN,
MAX(CASE QTR_EMPLOYED WHEN 0 THEN MAXSPELL END) AS LongestUnemploymentSpell3Y,
MAX(CASE QTR_EMPLOYED WHEN 1 THEN MAXSPELL END) AS LongestEmploymentSpell3Y,
MAX(CASE WHEN QTR_EMPLOYED=1 AND MAXSPELL>3 THEN 1 ELSE 0 END) AS Employ4QtrConsec
FROM EmploymentRetention
GROUP BY SSN;


 * mssql+pyodbc://@TDI
Done.


SSN,LongestUnemploymentSpell3Y,LongestEmploymentSpell3Y,Employ4QtrConsec
100000000,5,2,0
100900056,5,4,1
101800112,2,4,1
102700168,3,2,0
103600224,4,2,0
104500280,3,4,1
105400336,4,3,0
106300392,1,4,1
107200448,1,3,0
108100504,2,4,1
