# Relative Quarterly Measures: What happened after program enrollment?

In [1]:
# This cell loads the needed packages and connects to the database.
# Load the sqlalchemy package
import sqlalchemy

# Define the connection string (TDI is the ODBC data source name, or DSN)
connection_string = "mssql+pyodbc://@TDI"

# Create the engine connecting to the database server
engine = sqlalchemy.create_engine(connection_string)  # keep a reference so the engine can be reused

# Load the SQL magic extension
%load_ext sql

# Connect to the database server
%sql $connection_string

## Purpose:

Analysis questions often focus on what happened after the client enrolled in the program. This code creates measures that begin in the quarter the client enrolled and continue through the quarters that follow. These are sometimes called "relative" measures because they are relative to the enrollment quarter. Some example measures include:

1. Was the client unemployed in the quarter of enrollment and in the quarters that follow?
2. How much did the client earn in the quarter of enrollment and in the quarters after enrollment?

Relative measures also allow us to line up outcomes for people who enter and exit TANF at different points in time.

### Starting Point

On the source file, each client already has a record for every possible quarter in our data follow-up period. Each record has information about the earnings reported, a yes/no (0/1) indicator of employment, and the number of employers who reported earnings for the person during the quarter. Printed below are 6 quarters of data: the quarter just before program enrollment, the quarter of program enrollment, and 4 quarters after program enrollment. The source file already has a column (RelativeQTR), created in the 02_restructure_person_quarter file, that indicates how close each quarter is to the client's program enrollment date.

Let's take a look at the transaction file that we will use as input. For the first client below, notice that:
1. The client enrolled in the program in quarter 2 of 2017 (06/01/2017)
2. Earnings reported in quarter 1 of 2017 (1 quarter prior to enrollment) are assigned the RelativeQTR value of -1
3. Earnings reported in quarter 2 of 2017 (the quarter of enrollment) are assigned the RelativeQTR value of 0

For the second client below, notice that: 
1. The client enrolled in the program in quarter 1 of 2017
2. Quarter 1 of 2017 is the first quarter of UI wage data we received from our data provider
3. Therefore, we have no data for the quarter before this client started the program (i.e. we don't know if the client had reported earnings or not for the quarter before program enrollment)
4. This means there is no record with RelativeQTR value -1 for the second client
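
The RelativeQTR column itself is built in the 02_restructure_person_quarter file, which is not shown here. As a rough sketch of the idea only, the Python below counts quarters on one running scale and subtracts the enrollment quarter from the earnings quarter. The function names `quarter_index` and `relative_qtr` are hypothetical, and the logic is an assumption that happens to reproduce the RelativeQTR values printed in this notebook.

```python
from datetime import date


def quarter_index(d: date) -> int:
    """Place a date on a running quarter scale (year * 4 + quarter offset 0..3)."""
    return d.year * 4 + (d.month - 1) // 3


def relative_qtr(earn_qtr_start: date, prog_start: date) -> int:
    """Number of quarters from the enrollment quarter to the earnings quarter."""
    return quarter_index(earn_qtr_start) - quarter_index(prog_start)


# First client: enrolled 2017-06-01 (quarter 2 of 2017)
print(relative_qtr(date(2017, 1, 1), date(2017, 6, 1)))  # -1
print(relative_qtr(date(2017, 4, 1), date(2017, 6, 1)))  # 0

# Second client: enrolled 2017-01-01, the first quarter of data coverage
print(relative_qtr(date(2017, 1, 1), date(2017, 1, 1)))  # 0
```

Because the second client's enrollment quarter is also the first quarter of data, no earnings quarter can produce a value of -1 for that client.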

In [2]:
%%sql
SELECT TOP 12 *
FROM UIQuarterlyMeasuresV
WHERE RelativeQTR IN (-1, 0, 1, 2, 3, 4) -- select just 6 quarters to show what happened just before and just after enrollment
ORDER BY SSN, YR_QTR;

 * mssql+pyodbc://@TDI
Done.


SSN,ProgStart,ProgEnd,YR_QTR,EarnQTR,RelativeQTR,QTR_Earnings,QTR_EMPLOYED,QTR_NUMEMPLOYERS
100000000,2017-06-01,2017-11-28,2017Q1,2017-01-01,-1,9214,1,1
100000000,2017-06-01,2017-11-28,2017Q2,2017-04-01,0,8561,1,1
100000000,2017-06-01,2017-11-28,2017Q3,2017-07-01,1,12550,1,1
100000000,2017-06-01,2017-11-28,2017Q4,2017-10-01,2,0,0,0
100000000,2017-06-01,2017-11-28,2018Q1,2018-01-01,3,0,0,0
100000000,2017-06-01,2017-11-28,2018Q2,2018-04-01,4,0,0,0
100900056,2017-01-01,2017-06-30,2017Q1,2017-01-01,0,5624,1,1
100900056,2017-01-01,2017-06-30,2017Q2,2017-04-01,1,4371,1,1
100900056,2017-01-01,2017-06-30,2017Q3,2017-07-01,2,10992,1,1
100900056,2017-01-01,2017-06-30,2017Q4,2017-10-01,3,0,0,0


### FLATTEN FILE WITH GROUP BY and CASE: Relative Quarterly Measures

1. We will reduce the multiple records for an SSN (displayed above) to 1 record for the SSN by using the GROUP BY statement.

2. The CASE statement determines if the record is for the quarter of interest and if so adds the earnings for that quarter into a new summary column. If the record is not for the quarter of interest the record is skipped.

3. Instead of pivoting on calendar quarters (YR_QTR), we will use the relative quarter indicator (RelativeQTR) to select the rows and name the columns.
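
To try the three steps above without access to the SQL Server database, here is a minimal sketch that uses Python's built-in `sqlite3` module. The table name `person_quarter` is hypothetical, and the rows are a small hand-copied subset of the records printed earlier (SSN, RelativeQTR, and QTR_Earnings only). SQLite is used purely for illustration; the notebook's real queries run on SQL Server.

```python
import sqlite3

# A few rows copied from the printed transaction file: (SSN, RelativeQTR, QTR_Earnings)
rows = [
    (100000000, -1, 9214), (100000000, 0, 8561), (100000000, 1, 12550),
    (100000000, 2, 0),
    (100900056, 0, 5624), (100900056, 1, 4371), (100900056, 2, 10992),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE person_quarter (SSN INTEGER, RelativeQTR INTEGER, QTR_Earnings INTEGER)"
)
con.executemany("INSERT INTO person_quarter VALUES (?, ?, ?)", rows)

# GROUP BY reduces each SSN to one row; each CASE picks out one relative quarter.
flat = con.execute("""
    SELECT SSN,
           SUM(CASE RelativeQTR WHEN 0 THEN QTR_Earnings END) AS EARN0,
           SUM(CASE RelativeQTR WHEN 1 THEN QTR_Earnings END) AS EARN1,
           SUM(CASE RelativeQTR WHEN 2 THEN QTR_Earnings END) AS EARN2
    FROM person_quarter
    GROUP BY SSN
    ORDER BY SSN
""").fetchall()

for row in flat:
    print(row)
# (100000000, 8561, 12550, 0)
# (100900056, 5624, 4371, 10992)
```

The row for relative quarter -1 is ignored because no CASE expression tests for it.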

In [3]:
%%sql
CREATE VIEW FlatUIv as
SELECT SSN,

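/* 1. GROUP BY combined with CASE is a standard way to flatten a file in SQL */
/* 2. every selected column that is not listed in the GROUP BY must be wrapped in a summary function (e.g. SUM below) */
/* 3. CREATE VIEW fails if a view named FlatUIv already exists; drop it first, or use CREATE OR ALTER VIEW (SQL Server 2016 SP1 and later) */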

/* relative quarter measures */
/*earnings*/
SUM(CASE RelativeQTR WHEN 0 THEN QTR_Earnings END) AS EARN0, -- quarter of program enrollment
SUM(CASE RelativeQTR WHEN 1 THEN QTR_Earnings END) AS EARN1, -- when RelativeQTR = 1, the value goes into the EARN1 column
SUM(CASE RelativeQTR WHEN 2 THEN QTR_Earnings END) AS EARN2, -- when RelativeQTR = 2, the value goes into the EARN2 column
SUM(CASE RelativeQTR WHEN 3 THEN QTR_Earnings END) AS EARN3,
SUM(CASE RelativeQTR WHEN 4 THEN QTR_Earnings END) AS EARN4,

/*employment*/
SUM(CASE RelativeQTR WHEN 0 THEN QTR_Employed END) AS EMP0,
SUM(CASE RelativeQTR WHEN 1 THEN QTR_Employed END) AS EMP1,
SUM(CASE RelativeQTR WHEN 2 THEN QTR_Employed END) AS EMP2,
SUM(CASE RelativeQTR WHEN 3 THEN QTR_Employed END) AS EMP3,
SUM(CASE RelativeQTR WHEN 4 THEN QTR_Employed END) AS EMP4


FROM UIQuarterlyMeasuresV /* using quarter table view */
GROUP BY SSN /* GROUP BY is SQL's way of reducing multiple records to 1 record per person */
;


 * mssql+pyodbc://@TDI
(pyodbc.ProgrammingError) ('42S01', "[42S01] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]There is already an object named 'FlatUIv' in the database. (2714) (SQLExecDirectW)")

#### Print a Few Cases

In [4]:
%%sql
SELECT TOP 20 * FROM FlatUIv ORDER BY SSN;

 * mssql+pyodbc://@TDI
Done.


SSN,EMPy1,KEMPy1,PQEMPy1,EARNy1,KQGT3500y1,PQGT3500y1,EMPy2,KEMPy2,PQEMPy2,EARNy2,KQGT3500y2,PQGT3500y2,LastQTR
100000000,1,1,0.25,12550,1,0.5,1,2,0.5,20886,2,0.5,15
100900056,1,2,0.5,15363,2,0.25,1,1,0.25,7582,1,0.25,16
101800112,1,3,0.75,46458,3,0.25,1,2,0.5,11535,1,0.25,14
102700168,1,1,0.25,1324,0,0.25,1,2,0.5,12074,1,0.25,15
103600224,1,2,0.5,14720,2,0.25,1,2,0.5,5117,1,0.25,15
104500280,1,1,0.25,9720,1,0.25,1,3,0.75,15078,1,0.25,15
105400336,1,1,0.25,8254,1,0.5,1,2,0.5,19027,2,0.5,15
106300392,1,4,1.0,27910,3,0.75,1,3,0.75,24574,3,0.75,15
107200448,1,2,0.5,17414,2,0.25,1,2,0.5,8565,1,0.25,16
108100504,1,3,0.75,21789,2,0.25,1,3,0.75,12107,1,0.25,14


#### Check to make sure there are not duplicate SSNs

In [5]:
%%sql
SELECT COUNT(*) as NumRecs, COUNT(distinct SSN) as NumSSNs
FROM FlatUIv;

 * mssql+pyodbc://@TDI
Done.


NumRecs,NumSSNs
1012,1012
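
If the two counts ever differ, the next step is usually to find which SSNs are repeated. The sketch below shows one way to do that with `GROUP BY ... HAVING`, again using Python's built-in `sqlite3` module. The table `flat_demo` and its rows are made up for illustration, with one duplicate planted deliberately.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE flat_demo (SSN INTEGER, EARN0 INTEGER)")
con.executemany(
    "INSERT INTO flat_demo VALUES (?, ?)",
    [(100000000, 8561), (100900056, 5624), (100900056, 5624)],  # last row is a planted duplicate
)

# Same check as in the notebook: the two counts match only when every SSN is unique.
num_recs, num_ssns = con.execute(
    "SELECT COUNT(*), COUNT(DISTINCT SSN) FROM flat_demo"
).fetchone()
print(num_recs, num_ssns)  # 3 2

# List the SSNs that appear more than once.
dupes = con.execute(
    "SELECT SSN, COUNT(*) AS n FROM flat_demo GROUP BY SSN HAVING COUNT(*) > 1"
).fetchall()
print(dupes)  # [(100900056, 2)]
```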


### A word about data coverage and missing data for clients who start the program late

Remember that the last quarter of UI Wage data that we received was Q1 2021.  We do not have data for Q2 2021, or after, for any client.  So clients who start in the program close to the end of data coverage will have a diminishing amount of "relative data".

Let's take a look at the data for the last client to enroll in our program. Notice:

1. This client started in our program in December 2020.
2. We only have information about wages during the quarter this client started in our program (quarter 0) and 1 quarter after program start.
3. Any measures we create for quarters after relative quarter 1 will be missing for this client.
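
The arithmetic behind these points can be sketched in a few lines of Python. The helper names `quarter_index` and `last_relative_qtr` are hypothetical; the only inputs taken from this notebook are the end of data coverage (Q1 2021) and the start dates shown in the printed records.

```python
from datetime import date

LAST_DATA_QTR = date(2021, 1, 1)  # first day of Q1 2021, the last quarter of UI wage data


def quarter_index(d: date) -> int:
    """Place a date on a running quarter scale (year * 4 + quarter offset 0..3)."""
    return d.year * 4 + (d.month - 1) // 3


def last_relative_qtr(prog_start: date, last_data_qtr: date = LAST_DATA_QTR) -> int:
    """Highest relative quarter with data coverage for a client starting on prog_start."""
    return quarter_index(last_data_qtr) - quarter_index(prog_start)


print(last_relative_qtr(date(2020, 12, 26)))  # 1  (the last client to enroll)
print(last_relative_qtr(date(2017, 6, 1)))    # 15
print(last_relative_qtr(date(2017, 1, 1)))    # 16 (a client who enrolled in the first quarter of data)
```

A client who enrolled in the first quarter of coverage has 16 follow-up quarters, while the last client to enroll has only 1.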



In [6]:
%%sql
SELECT *
FROM UIQuarterlyMeasuresV
where ProgStart= -- selecting the last client with a start date equal to the last start date on the file
    (select MAX(ProgStart)
     FROM UIQuarterlyMeasuresV) -- finding the last start date on the file
;

 * mssql+pyodbc://@TDI
Done.


SSN,ProgStart,ProgEnd,YR_QTR,EarnQTR,RelativeQTR,QTR_Earnings,QTR_EMPLOYED,QTR_NUMEMPLOYERS
919951016,2020-12-26,,2017Q1,2017-01-01,-15,0,0,0
919951016,2020-12-26,,2017Q2,2017-04-01,-14,841,1,1
919951016,2020-12-26,,2017Q3,2017-07-01,-13,9851,1,1
919951016,2020-12-26,,2017Q4,2017-10-01,-12,0,0,0
919951016,2020-12-26,,2018Q1,2018-01-01,-11,6340,1,1
919951016,2020-12-26,,2018Q2,2018-04-01,-10,5243,1,1
919951016,2020-12-26,,2018Q3,2018-07-01,-9,0,0,0
919951016,2020-12-26,,2018Q4,2018-10-01,-8,10271,1,1
919951016,2020-12-26,,2019Q1,2019-01-01,-7,0,0,0
919951016,2020-12-26,,2019Q2,2019-04-01,-6,223,1,1


### Let's have a look at the quarterly measures we created for the client who started the program last

Notice that the last 3 quarterly measures are NULL (or "None").  We have no information about those quarters for this client so we want those quarters to be null for this client.
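
The NULLs come from leaving the ELSE branch out of the CASE expression: when no record matches the quarter, SUM has nothing to add and returns NULL. The sketch below illustrates the difference using Python's built-in `sqlite3` module. The table name, SSN, and earnings values are made up for illustration; the client has records for relative quarters 0 and 1 only.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE person_quarter (SSN INTEGER, RelativeQTR INTEGER, QTR_Earnings INTEGER)"
)
# Hypothetical late enrollee: data exists only for relative quarters 0 and 1.
con.executemany(
    "INSERT INTO person_quarter VALUES (?, ?, ?)",
    [(1, 0, 1200), (1, 1, 0)],
)

result = con.execute("""
    SELECT SUM(CASE RelativeQTR WHEN 1 THEN QTR_Earnings END)        AS EARN1,
           SUM(CASE RelativeQTR WHEN 2 THEN QTR_Earnings END)        AS EARN2,
           SUM(CASE RelativeQTR WHEN 2 THEN QTR_Earnings ELSE 0 END) AS EARN2_ELSE0
    FROM person_quarter
    GROUP BY SSN
""").fetchone()

print(result)  # (0, None, 0)
```

EARN1 is a true zero (a covered quarter with no earnings), EARN2 is NULL (no data), and the ELSE 0 version would wrongly report the uncovered quarter as zero earnings.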

In [7]:
%%sql
SELECT TOP 20
*
FROM FlatUIv

WHERE SSN IN -- IN (rather than =) still works if several clients share the last start date
  (SELECT DISTINCT SSN
     FROM UIQuarterlyMeasuresV
     WHERE ProgStart = (SELECT MAX(ProgStart) FROM UIQuarterlyMeasuresV))
ORDER BY SSN
;


 * mssql+pyodbc://@TDI
Done.


SSN,EMPy1,KEMPy1,PQEMPy1,EARNy1,KQGT3500y1,PQGT3500y1,EMPy2,KEMPy2,PQEMPy2,EARNy2,KQGT3500y2,PQGT3500y2,LastQTR
919951016,,,,,,,,,,,,,1


## Automating the code above

So far all of our code has been written the old-fashioned way: manually. However, the measures we have created are quarterly measures. Creating one quarterly measure for a 3-year follow-up period would mean typing 12 nearly identical lines of CASE and SUM code, one per quarter. Now we will demonstrate how to generate and run the SQL query above in an automated way.

This type of SQL coding is known as **Dynamic SQL**.  The result of our coding will be a query that can be run (or executed). So, we are not looking to generate SQL result sets below, rather we are generating SQL query code. 

##### Note about Dynamic SQL: 
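*The `%%sql` magic used in this Jupyter Notebook does not run the dynamic SQL below as a single batch. It appears to send each statement to the server separately, so the table variable `@QTR` declared in the first statement is no longer in scope when the next statement runs. The code below therefore produces an error when it is executed here. To use this code, copy and paste it into your SQL Server client software and run it as one batch.*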

Review the CREATE VIEW code above that created 5 quarters of earnings and employment measures. Suppose you wanted to create measures for all the quarters we have data for. Every line would follow the same pattern as the lines above, with only two things changing:
1. The value that the WHEN clause tests changes for every relative quarter: 0, 1, 2, 3, 4, ...
2. The name of the measure created changes for every relative quarter: EARN0, EMP0, EARN1, EMP1, ...

Below, the distinct relative quarter values are stored in a table variable that drives the creation of our SQL query. Because the query is built by iterating over this variable, the only thing that changes in each line is the value of the relative quarter. That value is tested in the WHEN clause and is appended to the end of the column names.
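
The same generation idea can be sketched in Python, the language of this notebook's kernel. This is an alternative illustration, not the T-SQL approach used in the next cell. The function name `build_flat_query` is hypothetical, and it assumes the relative quarters are non-negative integers so that the generated column names are valid.

```python
def build_flat_query(quarters, source="dbo.UIQuarterlyMeasuresV"):
    """Return the text of the flattening query for the given relative quarters."""
    columns = []
    # int() ensures only whole numbers are pasted into the SQL text.
    for q in sorted(int(q) for q in quarters):
        columns.append(f"SUM(CASE RelativeQTR WHEN {q} THEN QTR_Earnings END) AS EARN{q}")
        columns.append(f"SUM(CASE RelativeQTR WHEN {q} THEN QTR_Employed END) AS EMP{q}")
    # Joining with commas avoids having to trim a trailing comma afterwards.
    return "SELECT SSN,\n" + ",\n".join(columns) + f"\nFROM {source}\nGROUP BY SSN;"


query = build_flat_query(range(0, 5))  # relative quarters 0 through 4
print(query)
```

The printed text is an ordinary SELECT statement with one EARN column and one EMP column per quarter, so it could be run like any other query in this notebook.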

In [8]:
%%sql
DECLARE @QTR TABLE (QTR varchar(3), QTRN INT); -- declare a table variable holding character and numeric versions of each relative quarter

INSERT INTO @QTR (QTR,QTRN)  -- Store the quarter values in the table variables.
SELECT DISTINCT RelativeQTR ,  RelativeQTR FROM dbo.UIQuarterlyMeasuresV
where RelativeQTR >= 0;  -- WHERE selects all quarters on or after program start.

DECLARE @CMD NVARCHAR(MAX); -- create a temporary variable to store our automated sql code

SELECT @CMD = 
'SELECT SSN,'; -- store the start of the query code
SELECT @CMD=@CMD + ' 
SUM(CASE RelativeQTR WHEN ' + QTR + ' THEN QTR_Earnings END) AS EARN'+QTR+ ' ,'+'
SUM(CASE RelativeQTR WHEN ' + QTR + ' THEN QTR_Employed END) AS EMP'+QTR+ ' ,'
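FROM @QTR ORDER BY QTRN; -- append each quarter's code to the query code, ordered by the NUMERIC version
-- Caution: SQL Server does not guarantee the result of concatenating into a variable this way;
-- on SQL Server 2017 and later, STRING_AGG with WITHIN GROUP (ORDER BY ...) is the documented alternative.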

SELECT @CMD=SUBSTRING(@CMD,1,LEN(@CMD)-1); -- remove the , at the end of the last quarter line

SELECT @CMD=@CMD+'
FROM dbo.UIQuarterlyMeasuresV
GROUP BY SSN;'; -- append the end of the query code

PRINT @CMD; -- print the query code we generated
EXEC sp_executesql @CMD; -- execute the code we generated


 * mssql+pyodbc://@TDI
Done.
(pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Must declare the table variable "@QTR". (1087) (SQLExecDirectW)')
[SQL: INSERT INTO @QTR (QTR,QTRN)  -- Store the quarter values in the table variables.
SELECT DISTINCT RelativeQTR ,  RelativeQTR FROM dbo.UIQuarterlyMeasuresV
where RelativeQTR >= 0;  -- WHERE selects all quarters on or after program start.]
(Background on this error at: http://sqlalche.me/e/14/f405)


#### The dynamic SQL coding above generated and executed the SQL Query code below:

In [None]:

(17 rows affected)
SELECT SSN, 
SUM(CASE RelativeQTR WHEN 0 THEN QTR_Earnings END) AS EARN0 ,
SUM(CASE RelativeQTR WHEN 0 THEN QTR_Employed END) AS EMP0 , 
SUM(CASE RelativeQTR WHEN 1 THEN QTR_Earnings END) AS EARN1 ,
SUM(CASE RelativeQTR WHEN 1 THEN QTR_Employed END) AS EMP1 , 
SUM(CASE RelativeQTR WHEN 2 THEN QTR_Earnings END) AS EARN2 ,
SUM(CASE RelativeQTR WHEN 2 THEN QTR_Employed END) AS EMP2 , 
SUM(CASE RelativeQTR WHEN 3 THEN QTR_Earnings END) AS EARN3 ,
SUM(CASE RelativeQTR WHEN 3 THEN QTR_Employed END) AS EMP3 , 
SUM(CASE RelativeQTR WHEN 4 THEN QTR_Earnings END) AS EARN4 ,
SUM(CASE RelativeQTR WHEN 4 THEN QTR_Employed END) AS EMP4 , 
SUM(CASE RelativeQTR WHEN 5 THEN QTR_Earnings END) AS EARN5 ,
SUM(CASE RelativeQTR WHEN 5 THEN QTR_Employed END) AS EMP5 , 
SUM(CASE RelativeQTR WHEN 6 THEN QTR_Earnings END) AS EARN6 ,
SUM(CASE RelativeQTR WHEN 6 THEN QTR_Employed END) AS EMP6 , 
SUM(CASE RelativeQTR WHEN 7 THEN QTR_Earnings END) AS EARN7 ,
SUM(CASE RelativeQTR WHEN 7 THEN QTR_Employed END) AS EMP7 , 
SUM(CASE RelativeQTR WHEN 8 THEN QTR_Earnings END) AS EARN8 ,
SUM(CASE RelativeQTR WHEN 8 THEN QTR_Employed END) AS EMP8 , 
SUM(CASE RelativeQTR WHEN 9 THEN QTR_Earnings END) AS EARN9 ,
SUM(CASE RelativeQTR WHEN 9 THEN QTR_Employed END) AS EMP9 , 
SUM(CASE RelativeQTR WHEN 10 THEN QTR_Earnings END) AS EARN10 ,
SUM(CASE RelativeQTR WHEN 10 THEN QTR_Employed END) AS EMP10 , 
SUM(CASE RelativeQTR WHEN 11 THEN QTR_Earnings END) AS EARN11 ,
SUM(CASE RelativeQTR WHEN 11 THEN QTR_Employed END) AS EMP11 , 
SUM(CASE RelativeQTR WHEN 12 THEN QTR_Earnings END) AS EARN12 ,
SUM(CASE RelativeQTR WHEN 12 THEN QTR_Employed END) AS EMP12 , 
SUM(CASE RelativeQTR WHEN 13 THEN QTR_Earnings END) AS EARN13 ,
SUM(CASE RelativeQTR WHEN 13 THEN QTR_Employed END) AS EMP13 , 
SUM(CASE RelativeQTR WHEN 14 THEN QTR_Earnings END) AS EARN14 ,
SUM(CASE RelativeQTR WHEN 14 THEN QTR_Employed END) AS EMP14 , 
SUM(CASE RelativeQTR WHEN 15 THEN QTR_Earnings END) AS EARN15 ,
SUM(CASE RelativeQTR WHEN 15 THEN QTR_Employed END) AS EMP15 , 
SUM(CASE RelativeQTR WHEN 16 THEN QTR_Earnings END) AS EARN16 ,
SUM(CASE RelativeQTR WHEN 16 THEN QTR_Employed END) AS EMP16 
FROM dbo.UIQuarterlyMeasuresV
GROUP BY SSN;