# Summarizing a Person-Quarter Level File to a Person-Level File: Annual Calendar Measures

In [1]:
# This code loads the needed packages and connects to the database.
# load sqlalchemy package
import sqlalchemy

# Define the connection string (TDI is the ODBC data source name, or DSN)
connection_string = "mssql+pyodbc://@TDI"

# Create the engine connecting to the database server
engine = sqlalchemy.create_engine(connection_string)

# Load the sql magic extension
%load_ext sql

# Connect to the database server
%sql $connection_string


## Purpose

This notebook demonstrates how to convert a person-quarter transaction-level file to a person-level file and add a series of annual employment and earnings measures.


#### Starting Point 

In the previous notebook (02_restructure_person_quarter), we created a quarterly file with one record per person per quarter (UIQuarterlyMeasuresV). Here, we summarize that file to one record per person (one per distinct SSN), so that all the information about each person’s history of employment and earnings sits on a single record.

Our source file for the pivot has a record for every possible quarter in which a person could be employed. Each record has the earnings reported, a 0/1 (no/yes) indicator of employment, and the number of employers who reported earnings for the person during the quarter. The code presented here is limited to the 4 calendar quarters in 2017 but can be expanded and adapted for any time frame.

Let's take a look at the transaction file that we will use as input.

In [2]:
%%sql
SELECT TOP 12 *
FROM UIQuarterlyMeasuresV
WHERE LEFT(YR_QTR,4) = '2017' -- selecting just the 2017 quarters
ORDER BY SSN, YR_QTR;

 * mssql+pyodbc://@TDI
Done.


SSN,ProgStart,ProgEnd,YR_QTR,EarnQTR,RelativeQTR,QTR_Earnings,QTR_EMPLOYED,QTR_NUMEMPLOYERS
100000000,2017-06-01,2017-11-28,2017Q1,2017-01-01,-1,9214,1,1
100000000,2017-06-01,2017-11-28,2017Q2,2017-04-01,0,8561,1,1
100000000,2017-06-01,2017-11-28,2017Q3,2017-07-01,1,12550,1,1
100000000,2017-06-01,2017-11-28,2017Q4,2017-10-01,2,0,0,0
100900056,2017-01-01,2017-06-30,2017Q1,2017-01-01,0,5624,1,1
100900056,2017-01-01,2017-06-30,2017Q2,2017-04-01,1,4371,1,1
100900056,2017-01-01,2017-06-30,2017Q3,2017-07-01,2,10992,1,1
100900056,2017-01-01,2017-06-30,2017Q4,2017-10-01,3,0,0,0
101800112,2017-08-01,2018-01-28,2017Q1,2017-01-01,-2,0,0,0
101800112,2017-08-01,2018-01-28,2017Q2,2017-04-01,-1,0,0,0


### Summarizing and Creating Annual Employment and Earnings Measures

MDRC has developed field-tested annual employment and earnings outcome measures to estimate employment stability, retention, and advancement. This section presents code to demonstrate how to create these measures, focusing on the year 2017. We focus on the following outcomes:

1. Employed in 2017
2. Number of quarters employed in 2017
3. Average quarterly employment in 2017
4. Total earnings in 2017
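
To make the four definitions concrete, the sketch below computes them in plain Python for the first person in the preview above (SSN 100000000), using that person's four 2017 rows. The variable names are illustrative only and are not part of the SQL that follows.

```python
# Quarterly rows for SSN 100000000 in 2017, copied from the preview above:
# (YR_QTR, QTR_Earnings, QTR_EMPLOYED)
quarters = [
    ("2017Q1", 9214, 1),
    ("2017Q2", 8561, 1),
    ("2017Q3", 12550, 1),
    ("2017Q4", 0, 0),
]

employed_flags = [employed for _, _, employed in quarters]
earnings = [earned for _, earned, _ in quarters]

ever_employed = max(employed_flags)       # 1 if employed in any quarter
quarters_employed = sum(employed_flags)   # number of quarters with employment
avg_quarterly_employment = quarters_employed / len(quarters)  # share of quarters employed
total_earnings = sum(earnings)            # earnings summed over the year

print(ever_employed, quarters_employed, avg_quarterly_employment, total_earnings)
# prints: 1 3 0.75 30325
```

Because the employment indicator is 0/1, its maximum answers "ever employed", its sum counts employed quarters, and its mean is the share of quarters employed.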


#### Using GROUP BY and CASE to Focus on Annual Measures

1. We want only one record per SSN, so we GROUP BY SSN.
2. We want annual measures for 2017, so we use a CASE expression to check whether each record belongs to 2017.
3. We use summary (aggregate) functions (MAX, AVG, SUM) to create the measures of interest.
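
Step 2 relies on a detail worth spelling out: a CASE expression with no ELSE branch returns NULL for rows that do not match, and aggregate functions skip NULLs. The sketch below illustrates this with an in-memory SQLite table rather than the SQL Server view, so it is an approximation under stated assumptions: the table name `demo` is hypothetical, the 2016Q4 row is made up for illustration, and SQLite's `substr` stands in for T-SQL's `LEFT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (SSN TEXT, YR_QTR TEXT, QTR_EMPLOYED INTEGER)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?)",
    [
        ("100000000", "2016Q4", 0),  # hypothetical out-of-year row, for illustration only
        ("100000000", "2017Q1", 1),
        ("100000000", "2017Q2", 1),
        ("100000000", "2017Q3", 1),
        ("100000000", "2017Q4", 0),
    ],
)

row = conn.execute(
    """
    SELECT
        SSN,
        -- No ELSE: the 2016 row becomes NULL and is left out of the average
        AVG(CASE substr(YR_QTR, 1, 4) WHEN '2017' THEN QTR_EMPLOYED * 1.0 END)
            AS avg_no_else,
        -- ELSE 0: the 2016 row is counted as a zero and dilutes the average
        AVG(CASE substr(YR_QTR, 1, 4) WHEN '2017' THEN QTR_EMPLOYED * 1.0 ELSE 0 END)
            AS avg_else_zero
    FROM demo
    GROUP BY SSN
    """
).fetchone()
conn.close()

print(row)
```

Without ELSE, the average is taken over the four 2017 quarters only (3 of 4 employed). With `ELSE 0`, the out-of-year row enters the denominator (3 of 5), which is why the CASE expressions for annual measures omit the ELSE branch.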

In [3]:
%%sql
SELECT TOP 10
SSN,

/* 1. In SQL, collapsing many rows to one row per person is done with GROUP BY. */
/* 2. With GROUP BY, every selected column other than the grouping column must be wrapped in a summary function (MAX, AVG, SUM below). */

/* Annual outcomes */
MAX(CASE LEFT(YR_QTR,4) WHEN '2017' THEN QTR_Employed END) AS EVEMP2017,  -- Ever employed in 2017
AVG(CASE LEFT(YR_QTR,4) WHEN '2017' THEN CAST(QTR_Employed AS DECIMAL (4,2)) END)  
    AS AVGEMP2017, -- Average quarterly employment in 2017
SUM(CASE LEFT(YR_QTR,4) WHEN '2017' THEN QTR_Employed END) AS KEMP2017, -- Number of quarters employed in 2017
SUM(CASE LEFT(YR_QTR,4) WHEN '2017' THEN QTR_Earnings END) AS EARN2017 -- Total earnings in 2017

FROM UIQuarterlyMeasuresV /* using the person-quarter view */
GROUP BY SSN /* GROUP BY is SQL's way of reducing multiple records to one record per person */
ORDER BY SSN
;

 * mssql+pyodbc://@TDI
Done.


SSN,EVEMP2017,AVGEMP2017,KEMP2017,EARN2017
100000000,1,0.75,3,30325
100900056,1,0.75,3,20987
101800112,0,0.0,0,0
102700168,1,0.25,1,1324
103600224,1,0.75,3,22021
104500280,1,0.5,2,16393
105400336,1,0.5,2,13376
106300392,1,0.5,2,18047
107200448,1,0.5,2,33817
108100504,1,1.0,4,34583
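
The query above is limited to 2017, but the same four expressions can be repeated for other years. One possible way to avoid writing them by hand is to generate the SQL text in Python, as in the sketch below. The helper names `annual_measure_columns` and `build_annual_query` are hypothetical, and the sketch assumes the view holds quarters for every requested year; a person with no records in a given year would get NULL for that year's measures.

```python
def annual_measure_columns(year):
    """Return the four SELECT expressions for one calendar year (hypothetical helper)."""
    y = int(year)  # reject anything that is not a plain year before building SQL text
    in_year = f"CASE LEFT(YR_QTR,4) WHEN '{y}' THEN"
    return [
        f"MAX({in_year} QTR_EMPLOYED END) AS EVEMP{y}",
        f"AVG({in_year} CAST(QTR_EMPLOYED AS DECIMAL(4,2)) END) AS AVGEMP{y}",
        f"SUM({in_year} QTR_EMPLOYED END) AS KEMP{y}",
        f"SUM({in_year} QTR_Earnings END) AS EARN{y}",
    ]


def build_annual_query(years, source="UIQuarterlyMeasuresV"):
    """Assemble one GROUP BY query covering every requested year."""
    columns = ["SSN"]
    for year in years:
        columns.extend(annual_measure_columns(year))
    select_list = ",\n    ".join(columns)
    return f"SELECT\n    {select_list}\nFROM {source}\nGROUP BY SSN\nORDER BY SSN;"


query = build_annual_query([2017, 2018])
print(query)
```

The generated text can be pasted into a `%%sql` cell. The result still has one record per SSN, with one set of four columns per year.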
