# Calculating Employment and Unemployment Spells

Note to users: please read the instructions file in this folder (00_instructions) before using this Jupyter Notebook file.

## SQL Database Connection

This section loads the needed packages and connects the IPython Jupyter Notebook to the SQL database. If you are running this code in your own environment, remember to modify the SQL connection string so that the notebook points to your own SQL server and database (see the 00_instructions file in this folder for more information). Our code uses the SQLAlchemy Python package to interface between Python and SQL, and uses Jupyter SQL ‘magic’ functions to make the code more concise.


In [25]:
# load sqlalchemy package to interface between Python and SQL databases
import sqlalchemy

# Replace the SQL connection string below (in quotation marks) with your own SQL connection information to run the program
connection_string = "mssql+pyodbc://@TDI"

# Create the engine connecting to the database server (kept in a variable so it can be reused)
engine = sqlalchemy.create_engine(connection_string)

# Load the ipython-sql library to use Jupyter 'magic' functions, which make your code more concise
%load_ext sql

# Connect to the database server
%sql $connection_string

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


## Purpose:

In the previous notebooks, we created quarterly and annual employment and earnings measures. The purpose of this notebook is to calculate how long individuals spent employed (with any employer) or not employed. Consecutive quarters of employment or non-employment are known as “spells”. Because the data are quarterly, and an individual can be employed, unemployed, and then reemployed within the same quarter or across two consecutive quarters, these are somewhat rough estimates of continuous employment. Looking at the quarters following the client's program start, we want to determine:

1.	The length of the longest employment spell with any employer: for how many continuous quarters did the client have reported earnings during the 3 years AFTER program enrollment? We consider someone to be continuously employed even if they change employers within the quarter or between quarters. For example, someone who was employed through January 10 (Q1) with one employer, was then unemployed for a few months, and then started another job with a different employer on April 28 (Q2) would be considered continuously employed, because they have reported earnings in both quarters.
2.	The length of the longest spell of unemployment: for how many continuous quarters did the client have no reported earnings during the 3 years AFTER the program start quarter?
3.	Whether each person has any employment spells lasting 4 or more consecutive quarters.

#### Starting point 

Since we are interested in the follow-up period after program enrollment, we do not include the quarter of program enrollment (relative quarter 0) in these measures, so we start with relative quarter 1. We want to follow the client for 3 years, so we will use 12 quarters of wage data. Below, we show what this looks like for the first two people in our input file (see first 24 rows of results). Notice:

1.	We kept only the quarters 1-12 after program start date.
2.	We already have a binary (0/1) column (QTR_Employed) that tells us whether the client was employed in the quarter.
3.	To calculate spells of continuous employment, it is important to order the rows sequentially by relative quarter (RelativeQTR).
4.	To calculate how long employment or unemployment spells lasted, we need to know the values of the records before and after any particular quarterly record.
5.	Looking at the values of QTR_Employed, we see that the first person’s (SSN 100000000) longest employment spell was 2 quarters (relative quarters 7-8), and longest unemployment spell was 5 quarters (relative quarters 2-6). The second person’s (SSN 100900056) longest employment spell was 4 quarters (relative quarters 8-11) and longest unemployment spell was 5 quarters (relative quarters 3-7).

In [26]:
%%sql
SELECT TOP 24 
SSN, ProgStart, YR_QTR, RelativeQTR, QTR_Earnings, QTR_Employed
FROM dbo.UIQuarterlyMeasuresv
WHERE RelativeQTR BETWEEN 1 and 12 /* we only want to consider the quarters 1-12 following program start. */
ORDER BY SSN , RelativeQTR;

 * mssql+pyodbc://@TDI
Done.


SSN,ProgStart,YR_QTR,RelativeQTR,QTR_Earnings,QTR_Employed
100000000,2017-06-01,2017Q3,1,12550,1
100000000,2017-06-01,2017Q4,2,0,0
100000000,2017-06-01,2018Q1,3,0,0
100000000,2017-06-01,2018Q2,4,0,0
100000000,2017-06-01,2018Q3,5,0,0
100000000,2017-06-01,2018Q4,6,0,0
100000000,2017-06-01,2019Q1,7,3804,1
100000000,2017-06-01,2019Q2,8,17082,1
100000000,2017-06-01,2019Q3,9,0,0
100000000,2017-06-01,2019Q4,10,0,0
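
To make the idea of a spell concrete, here is a minimal Python sketch (not part of the notebook's SQL workflow) that finds the longest run of each employment status in a sequence of 0/1 flags. The function name `longest_spells` is hypothetical, and the input list is simply the QTR_Employed column for the ten rows printed above for SSN 100000000.

```python
from itertools import groupby


def longest_spells(flags):
    """Return {status: longest run length} for a sequence of 0/1 quarterly flags."""
    longest = {0: 0, 1: 0}
    # groupby yields one group per run of identical consecutive values
    for status, run in groupby(flags):
        longest[status] = max(longest[status], len(list(run)))
    return longest


# QTR_Employed for SSN 100000000, relative quarters 1-10 as printed above
flags_first_person = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0]

spells = longest_spells(flags_first_person)
print(spells)  # {0: 5, 1: 2}
```

For these ten quarters the result agrees with the values read off the table by eye: a longest unemployment spell of 5 quarters and a longest employment spell of 2 quarters.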


### Calculating length of longest employment and unemployment spells

The code below calculates the length of a client's longest unemployment spell and longest employment spell (SpellSize). For this, we use the values in the variable QTR_Employed, which is equal to 1 if the individual was employed in that quarter, and 0 if the person was not employed in that quarter. 

Following the annotated code to create these measures, we show a printout of the final data for the first 10 rows (the first 5 SSNs) in the data file.

In [27]:
%%sql

/*
Using the Lag statement to retain the employment status on the current record for the next record
1.	To determine if a spell is continuing, we have to know if the current record has the same employment status as the record preceding it, or if the person’s status changed.
2.	To determine this, the records must be in sequential order – in this case, by date -- for each SSN.
3.	To compare the employment status on a record to the status on the prior record, we copy the status from the prior record to the current record (to a column named PREV) using the LAG function.
4.	The spells stop when QTR_Employed (employment status in current quarter) and PREV (employment record in prior quarter) are not equal.
*/


WITH PREV AS(
-- this step uses the LAG statement to retain the value of QTR_Employed(employment status) to the next record 
select SSN, YR_QTR,QTR_Employed,
LAG(QTR_EMPLOYED) OVER(PARTITION BY SSN ORDER BY SSN,YR_QTR) AS PREV
              
      from UIQuarterlyMeasuresV
     where RelativeQTR  BETWEEN 1 and 12 /* we only want to consider the quarters 1-12 following program start. */
      ),
/*
Next, assign an ID number to each spell
This ID number, called GROUPS, will be used to calculate each spell size in the following step. Notice the GROUPS column assigns an ID number to each spell of employment/unemployment. Also notice that GROUPS does not reset to 1 for each distinct individual (SSN) on the file. This is not necessary: GROUPS always starts a new spell when the records switch from one individual to the next, because the PREV value for an individual’s initial record is always NULL (displayed as None in the notebook), so the comparison QTR_EMPLOYED=PREV is never true for that record.
*/

GROUPS AS(
-- this step assigns a group number to every spell, changing the group number when employment status on the current record is different from that on the previous
SELECT SSN,YR_QTR,QTR_EMPLOYED, 
SUM(CASE WHEN QTR_EMPLOYED=PREV THEN 0 ELSE 1 END) OVER(ORDER BY SSN,YR_QTR) AS GROUPS
FROM PREV
       ),

/*
Count how many records have the same spell Group ID to calculate spell size
*/
       
SPELLSIZE AS(
-- this step groups by the group number created above and sums how many quarters are in each group 
SELECT SSN, QTR_EMPLOYED,GROUPS , COUNT(*) AS SpellSize
FROM GROUPS

GROUP BY SSN, QTR_EMPLOYED,GROUPS
)

/*
Finally, keep only the maximum employment and unemployment spell
*/

-- this step selects the maximum spell for each employment status.
SELECT TOP 10 
SSN, QTR_EMPLOYED, MAX(SpellSize) AS MaxSpell

FROM SPELLSIZE

GROUP BY  SSN, QTR_EMPLOYED 
     
ORDER BY  SSN, QTR_EMPLOYED 
;

 * mssql+pyodbc://@TDI
Done.


SSN,QTR_EMPLOYED,MaxSpell
100000000,0,5
100000000,1,2
100900056,0,5
100900056,1,4
101800112,0,2
101800112,1,4
102700168,0,3
102700168,1,2
103600224,0,4
103600224,1,2
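
The four SQL steps above (LAG, running sum, count, maximum) can be mirrored in plain Python to see what each intermediate column contains. This is only an illustrative sketch: the toy rows and the SSN labels "A" and "B" are made up, and the variable names are hypothetical. Person A ends unemployed and person B starts unemployed, which shows why the group number does not need to reset per SSN.

```python
from collections import Counter

# Hypothetical toy rows (SSN, YR_QTR, QTR_EMPLOYED), already sorted by SSN, YR_QTR
rows = [
    ("A", "2017Q3", 1),
    ("A", "2017Q4", 1),
    ("A", "2018Q1", 0),
    ("B", "2017Q3", 0),
    ("B", "2017Q4", 0),
    ("B", "2018Q1", 1),
]

# Step 1 (PREV): copy the previous status within each SSN; first row per SSN gets None
prev_rows = []
last_seen = {}
for ssn, yr_qtr, employed in rows:
    prev_rows.append((ssn, yr_qtr, employed, last_seen.get(ssn)))
    last_seen[ssn] = employed

# Step 2 (GROUPS): running count of status changes over the whole file
group_rows = []
group_id = 0
for ssn, yr_qtr, employed, prev in prev_rows:
    if employed != prev:  # None never equals 0 or 1, so a new SSN starts a new group
        group_id += 1
    group_rows.append((ssn, employed, group_id))

# Step 3 (SPELLSIZE): number of quarters in each (SSN, status, group)
spell_size = Counter(group_rows)

# Step 4: longest spell per (SSN, status)
max_spell = {}
for (ssn, employed, _group), size in spell_size.items():
    key = (ssn, employed)
    max_spell[key] = max(max_spell.get(key, 0), size)

print([g for _, _, g in group_rows])  # [1, 1, 2, 3, 3, 4]
print(max_spell)
```

Although A's last quarter and B's first quarter are both unemployed, they land in different groups (2 and 3) because B's first record has no previous value.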


In [28]:
%%sql
DROP TABLE IF EXISTS PREV
/* THIS CODE IS NEEDED IF THE PROGRAM HAS TO BE RERUN */
;

 * mssql+pyodbc://@TDI
Done.


[]

### Flatten the resulting set

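After obtaining the maximum length of the unemployment and employment spells for each person, we flatten the file to have one record per person. We also add a 0/1 indicator, Employ4QtrConsec, for whether the person was ever employed for 4 or more consecutive quarters. The query reads from EmploymentRetention, which we assume holds the per-person, per-status maximum spells (SSN, QTR_EMPLOYED, MaxSpell) produced in the previous step; the code that saves that table is not shown in this section.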

In [29]:
%%sql
SELECT  TOP 10
SSN,
MAX(CASE QTR_EMPLOYED WHEN 0 THEN MAXSPELL END) AS LongestUnemploymentSpell3Y,
MAX(CASE QTR_EMPLOYED WHEN 1 THEN MAXSPELL END) AS LongestEmploymentSpell3Y,
MAX(CASE WHEN QTR_EMPLOYED=1 AND MAXSPELL>3 THEN 1 ELSE 0 END) AS Employ4QtrConsec
FROM EmploymentRetention
GROUP BY SSN;


 * mssql+pyodbc://@TDI
Done.


SSN,LongestUnemploymentSpell3Y,LongestEmploymentSpell3Y,Employ4QtrConsec
100000000,5,2,0
100900056,5,4,1
101800112,2,4,1
102700168,3,2,0
103600224,4,2,0
104500280,3,4,1
105400336,4,3,0
106300392,1,4,1
107200448,1,3,0
108100504,2,4,1
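
The flattening step can also be sketched in Python. The example below is illustrative only: it takes the (SSN, QTR_EMPLOYED, MaxSpell) rows printed earlier for the first five SSNs and pivots them into one record per person. The variable names `max_spells` and `flat` are hypothetical.

```python
# (SSN, QTR_EMPLOYED, MaxSpell) rows copied from the earlier output
max_spells = [
    (100000000, 0, 5), (100000000, 1, 2),
    (100900056, 0, 5), (100900056, 1, 4),
    (101800112, 0, 2), (101800112, 1, 4),
    (102700168, 0, 3), (102700168, 1, 2),
    (103600224, 0, 4), (103600224, 1, 2),
]

flat = {}
for ssn, employed, max_spell in max_spells:
    # One record per SSN; None plays the role of SQL NULL when a status never occurs
    record = flat.setdefault(ssn, {
        "LongestUnemploymentSpell3Y": None,
        "LongestEmploymentSpell3Y": None,
        "Employ4QtrConsec": 0,
    })
    if employed == 0:
        record["LongestUnemploymentSpell3Y"] = max_spell
    else:
        record["LongestEmploymentSpell3Y"] = max_spell
        # Same condition as MAXSPELL>3 in the SQL: 4 or more consecutive quarters
        if max_spell > 3:
            record["Employ4QtrConsec"] = 1

for ssn, record in flat.items():
    print(ssn, record)
```

One detail worth keeping in mind: because the SQL CASE expressions have no ELSE branch, a person with no rows for a given status (for example, someone employed in all 12 quarters) would get NULL rather than 0 for that spell length. The sketch reproduces this with None.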
