# Domestic Violence Timeliness 

## Contents
#### Setup
1. [import_packages](#import_packages) 
2. [define_key_variables](#define_key_variables) 

3. [dv_case_data_temp](#DV_case_data_temp) - filters by year > 2010 and by first application date

4. [dv_applicant_info](#DV_applicant_info) - joins the roles, parties and address tables for applicants to get information on the applicants
5. [dv_respondent_info](#DV_respondent_info) - joins the roles, parties and address tables for respondents to get information on the respondents

6. [dv_applicants_temp](#DV_applicants_temp) - takes the dv_applicant_info table and reformats it, decoding gender and representative role values into strings
7. [dv_respondents_temp](#DV_respondents_temp) - takes the dv_respondent_info table and reformats it, decoding gender and representative role values into strings

8. [dv_app_rep](#DV_app_rep) - joins the case data with the applicant representation data
9. [dv_resp_rep](#DV_resp_rep) - joins the case data with the respondent representation data

10. [dv_hearing_events](#DV_Hearing_Events) - takes the domestic violence hearing events from the hearings table and joins it with the events table data 
11. [dv_hearing_cases](#DV_Hearing_Cases) - takes the hearing events, filters by domestic violence event codes and adds a flag indicating whether the hearing is the first in the case

12. [hearing_dv_applicants](#Hearing_DV_Applicants) - joins the cases from applicant representation table to the first hearing for the case in the DV_Hearing_Cases table
13. [hearing_dv_respondents](#Hearing_DV_Respondents) - joins the cases from respondent representation table to the first hearing for the case in the DV_Hearing_Cases table

14. [dv_app](#DV_App) - groups the hearing_dv_applicants table and produces a count per group
15. [dv_resp](#DV_Resp) - groups the hearing_dv_respondents table and produces a count per group
16. [dv_case](#DV_Case) - groups and formats dv_case_data_v3 table and gives a count for each group
17. [dv_case_hearings](#DV_Case_Hearings) - creates a count of all the cases with a hearing per quarter

18. [domestic violence](#Domestic_Violence) - joins the applicant/respondent representation count tables, case count table, and case hearing count tables

19. [dv_applications_data_sorted](#DV_applications_data_sorted) - orders the DV_APPS_FINAL table by case_number, receipt date and description
20. [dv_applications_temp](#DV_applications_temp) - takes the dv_applications_data sorted table and filters it so it only has the first application record per case number

21. [dv_orders_data_sorted](#DV_orders_data_sorted) - sorts the disposals table by case_number and receipt date, removing contact orders, placement revoke or vary orders and other type orders
22. [dv_orders_temp](#DV_orders_temp) - takes the dv_orders_data sorted table and filters it so it only has the first order record per case number

23. [dv_apps_and_orders_match](#DV_apps_and_orders_match) - calculates the timeliness based on the difference between app_date and disp_date


## 1. Import packages and set options 
<a name="import_packages"></a>

In [1]:
import pandas as pd  # a module which provides the data structures and functions to store and manipulate tables in dataframes
import pydbtools as pydb  # A module which allows SQL queries to be run on the Analytical Platform from Python, see https://github.com/moj-analytical-services/pydbtools
import boto3  # allows you to directly create, update, and delete AWS resources from Python scripts

# set display options so that wide dataframes are easier to view
pd.set_option("display.max_columns", 100)
pd.set_option("display.width", 900)
pd.set_option("display.max_colwidth", 200)

## 2. Define key variables to be used throughout the notebook 
<a name="define_key_variables"></a>

In [2]:
#this is the database we will be extracting from
database = "familyman_dev_v3" 

#this extracts the August snapshot from athena
snapshot_date = '2022-08-04'

#this is the athena database we will be storing our tables in
fcsq_database = "fcsq"

#this is the s3 bucket we will be saving data to
s3 = boto3.resource("s3")
bucket = s3.Bucket("alpha-family-data")

#change these to the current quarter and year not the quarter being published
latest_quarter = 3
latest_year = 2022
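
The end date hard-coded in the `dv_case_data_v3` filter in section 3 (`2022-06-30`) is the last day of the quarter before `latest_quarter`/`latest_year`. A minimal sketch of deriving that date rather than typing it by hand follows; `previous_quarter_end` is a hypothetical helper that is not part of this notebook, and it assumes the cut-off should always be the end of the preceding quarter.

```python
import datetime


def previous_quarter_end(year: int, quarter: int) -> datetime.date:
    """Return the last day of the quarter before (year, quarter)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    # First month of the given quarter: Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10
    first_month = 3 * (quarter - 1) + 1
    # The day before this quarter starts is the last day of the previous one
    return datetime.date(year, first_month, 1) - datetime.timedelta(days=1)


# Same values as latest_year and latest_quarter above
cutoff = previous_quarter_end(2022, 3)
print(cutoff.isoformat())
```

The resulting string could then be interpolated into the SQL f-string in place of the literal date, if that matches how the publication period is defined.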

## 3. DV_case_data temporary tables - filters by receipt_date > 2010 and by first application date
<a name="DV_case_data_temp"></a>

In [3]:
create_dv_case_data_v1 = f"""
SELECT *,
ROW_NUMBER() OVER(
PARTITION BY CASE_NUMBER
ORDER BY CASE_NUMBER, RECEIPT_DATE, EVENT_COURT
) CASE_NUMBER_ID
FROM fcsq.DV_APPS_FINAL ;
"""

pydb.create_temp_table(create_dv_case_data_v1,'dv_case_data_v1')

create_dv_case_data_v2 = f"""
SELECT *
FROM __temp__.dv_case_data_v1
WHERE CASE_NUMBER_ID = 1;
"""

pydb.create_temp_table(create_dv_case_data_v2,'dv_case_data_v2')

create_dv_case_data_v3 = f"""
SELECT *
FROM __temp__.dv_case_data_v2

WHERE RECEIPT_DATE >= cast('2011-01-01' as timestamp)
AND RECEIPT_DATE <= cast('2022-06-30' as timestamp)

ORDER BY Year,
Quarter;
"""

pydb.create_temp_table(create_dv_case_data_v3,'dv_case_data_v3')
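
The first two temporary tables implement a "first application per case" pattern: number the rows within each case, then keep row 1. The sketch below reproduces that pattern on made-up data, using an in-memory SQLite database as a stand-in for Athena. The table name `dv_apps` and the case numbers are hypothetical, and SQLite 3.25 or later is assumed for window-function support.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE dv_apps (case_number TEXT, receipt_date TEXT, event_court TEXT)"
)
con.executemany(
    "INSERT INTO dv_apps VALUES (?, ?, ?)",
    [
        ("AB11F00001", "2011-03-01", "101"),  # later application, same case
        ("AB11F00001", "2011-02-01", "101"),  # earliest application for this case
        ("CD12F00002", "2012-05-10", "202"),
    ],
)

# Number rows within each case by receipt date, then keep only the first one
first_apps = con.execute(
    """
    SELECT case_number, receipt_date
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY case_number
                   ORDER BY receipt_date, event_court
               ) AS case_number_id
        FROM dv_apps
    )
    WHERE case_number_id = 1
    ORDER BY case_number
    """
).fetchall()
print(first_apps)
```

Each case appears once in the result, carrying its earliest receipt date.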

In [4]:
#DV_CASE_DATA_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.dv_case_data_v3;")
#DV_CASE_DATA_count

## 4. Applicant_Info table - joins the roles, parties and address tables for applicants to get information on the applicants
<a name="DV_applicant_info"></a>

### Drop the Applicant_Info table if it already exists and remove its data from the S3 bucket

In [5]:
drop_Applicants_Table = "DROP TABLE IF EXISTS fcsq.Applicants"
pydb.start_query_execution_and_wait(drop_Applicants_Table)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Applicants").delete();
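
The drop-then-delete pair above is repeated for every permanent table in this notebook. One possible way to reduce the repetition is a small wrapper, sketched below. `drop_table_and_data` is a hypothetical name, and the query runner and bucket are passed in as arguments so the function does not depend on any particular session.

```python
def drop_table_and_data(table_name, s3_prefix, run_query, bucket):
    """Drop an Athena table and delete the S3 objects stored under its prefix.

    run_query: a callable that takes a SQL string, for example
               pydb.start_query_execution_and_wait
    bucket:    a boto3 Bucket resource, such as the `bucket` defined in section 2
    """
    run_query(f"DROP TABLE IF EXISTS {table_name}")
    # objects.filter(Prefix=...) matches every key that starts with the prefix
    bucket.objects.filter(Prefix=s3_prefix).delete()


# Intended usage inside this notebook (not executed here):
# drop_table_and_data(
#     "fcsq.Applicants",
#     "fcsq_processing/Domestic_Violence/Applicants",
#     pydb.start_query_execution_and_wait,
#     bucket,
# )
```

Because S3 prefixes are plain string prefixes, a prefix such as `.../DV_App` also matches any other key that begins with the same characters, so it is worth checking that no unrelated folder shares the start of a table's folder name.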

### Create the Applicant_Info table in Athena

In [6]:
create_Applicants_Table = f"""
CREATE TABLE IF NOT EXISTS fcsq.Applicants
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Applicants') AS
 SELECT DISTINCT
   {database}.roles.ROLE, 
   {database}.roles.REPRESENTATIVE_ROLE, 
   {database}.roles.ROLE_MODEL, 
   {database}.roles.PARTY, 
   {database}.roles.CASE_NUMBER, 
   {database}.parties.PERSON_GIVEN_FIRST_NAME, 
   {database}.parties.PERSON_FAMILY_NAME, 
   {database}.parties.COMPANY, 
   {database}.addresses.POSTCODE, 
   {database}.parties.GENDER, 
   {database}.roles.DELETE_FLAG
FROM 
  ({database}.roles INNER JOIN {database}.parties ON {database}.roles.PARTY = {database}.parties.PARTY) 
  INNER JOIN {database}.addresses ON {database}.roles.ADDRESS = {database}.addresses.ADDRESS
WHERE {database}.roles.ROLE_MODEL IN ('APLC', 'APLZ', 'APLA')
    AND {database}.roles.DELETE_FLAG = 'N'
    AND {database}.roles.mojap_snapshot_date = date '{snapshot_date}'
    AND {database}.parties.mojap_snapshot_date = date '{snapshot_date}'
    AND {database}.addresses.mojap_snapshot_date = date '{snapshot_date}';
"""

pydb.start_query_execution_and_wait(create_Applicants_Table);


In [7]:
#pydb.read_sql_query("SELECT * from fcsq.Applicants where case_number='NE19Z02909'")

#### Applicants validation

In [8]:
#Applicants_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.Applicants")
#Applicants_count

In [9]:
#pydb.read_sql_query("select * from fcsq.Applicants where case_number='HB12Z00228'")

## 5. Respondent_Info table - joins the roles, parties and address tables for respondents to get information on the respondents
<a name="DV_respondent_info"></a>

### Drop the Respondent_Info table if it already exists and remove its data from the S3 bucket

In [10]:
drop_Respondents_Table = "DROP TABLE IF EXISTS fcsq.Respondents"
pydb.start_query_execution_and_wait(drop_Respondents_Table)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Respondents").delete();

### Create the Respondent_Info table in Athena

In [11]:
create_Respondents_Table = f"""
CREATE TABLE IF NOT EXISTS fcsq.Respondents
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Respondents') AS
SELECT DISTINCT
  {database}.roles.ROLE, 
  {database}.roles.REPRESENTATIVE_ROLE, 
  {database}.roles.ROLE_MODEL, 
  {database}.roles.PARTY, 
  {database}.roles.CASE_NUMBER, 
  {database}.parties.GENDER, 
  {database}.addresses.POSTCODE, 
  {database}.roles.DELETE_FLAG
FROM 
  ({database}.roles INNER JOIN {database}.parties ON {database}.roles.PARTY = {database}.parties.PARTY) 
  INNER JOIN {database}.addresses ON {database}.roles.ADDRESS = {database}.addresses.ADDRESS
WHERE 
    {database}.roles.ROLE_MODEL IN ('RSPA', 'RSPZ', 'RSPC')
    AND {database}.roles.DELETE_FLAG = 'N'
    AND {database}.roles.mojap_snapshot_date = date '{snapshot_date}'
    AND {database}.parties.mojap_snapshot_date = date '{snapshot_date}'
    AND {database}.addresses.mojap_snapshot_date = date '{snapshot_date}';
"""

pydb.start_query_execution_and_wait(create_Respondents_Table);

#### Respondents validation

In [12]:
#Respondents_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.Respondents")
#Respondents_count

In [13]:
#pydb.read_sql_query("select * from fcsq.Respondents where case_number='HB12Z00228'")

## 6. Applicants temporary tables - takes the Applicants_Info table and reformats it, decoding gender and representative role values into strings
<a name="DV_applicants_temp"></a>

In [14]:
create_applicants_v1 = f"""
SELECT T1.role,
    T1.representative_role,
    T1.role_model,
    T1.party,
    T1.case_number,
    T1.gender,
    case when gender = 1 then 'Male'
    when gender = 2 then 'Female'
    else 'Unknown' end as Gender_Decode

FROM fcsq.Applicants AS t1
ORDER BY t1.Case_Number;
"""

pydb.create_temp_table(create_applicants_v1,'applicants_v1')

create_applicants_v2 = f"""
SELECT DISTINCT T1.case_number,
    T1.party,
    max(T1.representative_role) as Rep_Role,
    max(T1.gender_decode) as Gender_Max

FROM __temp__.applicants_v1 AS t1
GROUP BY Case_number, party;
"""

pydb.create_temp_table(create_applicants_v2,'applicants_v2')

create_applicants_v3 = f"""
SELECT t1.case_number,
    t1.party as App_Party_ID,
    t1.Rep_Role,
    t1.Gender_Max,
    case when t1.Rep_Role IS NULL then 'N'
    when t1.Rep_Role IS NOT NULL then 'Y'
    End as REPRESENTATION,
    case when Rep_Role IS NULL AND Gender_Max = 'Female' then 'Unrep_Female'
    when Rep_Role IS NULL AND Gender_Max = 'Male' then 'Unrep_Male'
    when Rep_Role IS NULL AND Gender_Max = 'Unknown' then 'Unrep_Unknown'
    when Rep_Role IS NOT NULL AND Gender_Max = 'Female' then 'Rep_Female'
    when Rep_Role IS NOT NULL AND Gender_Max = 'Male' then 'Rep_Male'
    when Rep_Role IS NOT NULL AND Gender_Max = 'Unknown' then 'Rep_Unknown'
    else '' end as App_Rep_Cat
    
    from __temp__.applicants_v2 AS t1;
"""

pydb.create_temp_table(create_applicants_v3,'applicants_v3')
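
The decoding logic in these three queries can be summarised in a few lines of plain Python. This is only an illustrative sketch: `decode_gender` and `rep_category` are hypothetical helpers, and `"SOL"` is a made-up representative role value.

```python
def decode_gender(code):
    """Mirror the SQL CASE: 1 -> Male, 2 -> Female, anything else -> Unknown."""
    return {1: "Male", 2: "Female"}.get(code, "Unknown")


def rep_category(rep_role, gender):
    """A missing representative role means the party is unrepresented."""
    prefix = "Unrep" if rep_role is None else "Rep"
    return f"{prefix}_{gender}"


# (representative_role, gender code) pairs; the values are illustrative only
parties = [(None, 2), ("SOL", 1), (None, None)]
categories = [rep_category(role, decode_gender(code)) for role, code in parties]
print(categories)

# max() over the decoded strings is alphabetical, as MAX() is in the v2 query,
# so 'Unknown' sorts above 'Male', which sorts above 'Female'
print(max(["Male", "Unknown"]))
```

The last line matters when a party has several role rows with different gender values: the alphabetical maximum means an `Unknown` row takes precedence over a known gender for that party.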

## 7. Respondents temporary tables - takes the Respondents_Info table and reformats it, decoding gender and representative role values into strings
<a name="DV_respondents_temp"></a>

In [15]:
create_respondents_v1 = f"""
SELECT T1.role,
    T1.representative_role,
    T1.role_model,
    T1.party,
    T1.case_number,
    T1.gender,
    case when gender = 1 then 'Male'
    when gender = 2 then 'Female'
    else 'Unknown' end as Gender_Decode

FROM fcsq.Respondents AS t1
ORDER BY t1.Case_Number;
"""

pydb.create_temp_table(create_respondents_v1,'respondents_v1')

create_respondents_v2 = f"""
SELECT DISTINCT T1.case_number,
    T1.party,
    max(T1.representative_role) as Rep_Role,
    max(T1.gender_decode) as Gender_Max

FROM __temp__.respondents_v1 AS t1
GROUP BY Case_number, party;
"""

pydb.create_temp_table(create_respondents_v2,'respondents_v2')

create_respondents_v3 = f"""
SELECT t1.case_number,
    t1.party as Resp_Party_ID,
    t1.Rep_Role,
    t1.Gender_Max,
    case when t1.Rep_Role IS NULL then 'N'
    when t1.Rep_Role IS NOT NULL then 'Y'
    End as REPRESENTATION,
    case when Rep_Role IS NULL AND Gender_Max = 'Female' then 'Unrep_Female'
    when Rep_Role IS NULL AND Gender_Max = 'Male' then 'Unrep_Male'
    when Rep_Role IS NULL AND Gender_Max = 'Unknown' then 'Unrep_Unknown'
    when Rep_Role IS NOT NULL AND Gender_Max = 'Female' then 'Rep_Female'
    when Rep_Role IS NOT NULL AND Gender_Max = 'Male' then 'Rep_Male'
    when Rep_Role IS NOT NULL AND Gender_Max = 'Unknown' then 'Rep_Unknown'
    else '' end as Resp_Rep_Cat
    
    from __temp__.respondents_v2 AS t1;
"""

pydb.create_temp_table(create_respondents_v3,'respondents_v3')

## 8. Create dv_app_rep table - joins the case data with the applicant representation data 
<a name="DV_app_rep"></a>

In [16]:
dv_app_rep_final = f"""
SELECT t1.YEAR, 
    t1.QUARTER,
    t1.CASE_NUMBER, 
    t1.EVENT_COURT AS Court,
    t2.App_Party_ID,
    t2.Representation,
    t2.Gender_Max as App_Gender,
    t2.App_Rep_Cat          
FROM __temp__.dv_case_data_v3 t1
    LEFT JOIN __temp__.applicants_v3 t2 ON (t1.CASE_NUMBER = t2.CASE_NUMBER);

"""

pydb.create_temp_table(dv_app_rep_final,'dv_app_rep_final')

In [17]:
#dv_app_rep_final_check = "SELECT COUNT(*) as Count from __temp__.dv_app_rep_final"
#pydb.read_sql_query(dv_app_rep_final_check)

## 9. Create dv_resp_rep table - joins the case data with the respondent representation data
<a name="DV_resp_rep"></a>

In [18]:
dv_resp_rep_final = f"""
   SELECT t1.YEAR, 
        t1.QUARTER,
        t1.CASE_NUMBER, 
        t1.EVENT_COURT AS Court,
          t2.Resp_Party_ID,
          t2.Representation,
          t2.Gender_Max as Resp_Gender,
          t2.Resp_Rep_Cat
          
      FROM __temp__.dv_case_data_v3 t1
           LEFT JOIN __temp__.respondents_v3 t2 ON (t1.CASE_NUMBER = t2.CASE_NUMBER);
"""

pydb.create_temp_table(dv_resp_rep_final,'dv_resp_rep_final')

In [19]:
#dv_resp_rep_final_check = "SELECT COUNT(*) as Count from __temp__.dv_resp_rep_final"
#pydb.read_sql_query(dv_resp_rep_final_check)

## 10. Create DV_Hearing_Events table - takes the domestic violence hearing events from the hearings table and joins it with the events table data 
<a name="DV_Hearing_Events"></a>

### Drop the DV_Hearing_Events table if it already exists and remove its data from the S3 bucket

In [20]:
drop_DV_Hearing_Events = "DROP TABLE IF EXISTS fcsq.DV_Hearing_Events"
pydb.start_query_execution_and_wait(drop_DV_Hearing_Events)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_Hearing_Events").delete();

### Create the DV_Hearing_Events table in Athena

In [21]:
create_DV_Hearing_Events = f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_Hearing_Events
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_Hearing_Events') AS
SELECT {database}.hearings.EVENT,
  {database}.hearings.VACATED_FLAG,
  {database}.hearings.HEARING_TYPE,
  {database}.hearings.HEARING_DATE,
  {database}.events.RECEIPT_DATE,
  {database}.events.ERROR,
  {database}.events.CASE_NUMBER,
  {database}.events.EVENT_MODEL
FROM {database}.hearings
INNER JOIN {database}.events
ON {database}.hearings.EVENT            = {database}.events.EVENT
WHERE {database}.hearings.VACATED_FLAG IS NULL
AND {database}.events.ERROR             = 'N'
AND HEARING_DATE > date_parse('31-12-2009 00:00:00', '%d-%m-%Y %H:%i:%s')
AND substring(case_number,5,1) = 'F'
AND {database}.hearings.mojap_snapshot_date = date '{snapshot_date}' and {database}.events.mojap_snapshot_date = date '{snapshot_date}';
"""

pydb.start_query_execution_and_wait(create_DV_Hearing_Events);
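
The filter `substring(case_number,5,1) = 'F'` keeps only cases whose fifth character is `F`. SQL `substring` is 1-based, so the Python equivalent uses index 4. In the sketch below, the first two case numbers are the ones used in the commented-out checks earlier in this notebook; the third is made up, and `case_type` is a hypothetical helper.

```python
def case_type(case_number: str) -> str:
    """Return the fifth character of a case number (the case-type letter)."""
    # SQL substring(x, 5, 1) is 1-based; Python indexing is 0-based
    return case_number[4]


examples = ["NE19Z02909", "HB12Z00228", "AB11F00001"]
letters = [case_type(c) for c in examples]
print(letters)

# Only the case numbers that would pass the 'F' filter
kept = [c for c in examples if case_type(c) == "F"]
print(kept)
```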

#### DV_Hearing_Events validation

In [22]:
#DV_Hearing_Events_count = pydb.read_sql_query("select count(*) as count from fcsq.DV_Hearing_Events")
#DV_Hearing_Events_count

## 11. DV_Hearing_Cases table - takes the hearing events, filters by domestic violence event codes and adds a flag indicating whether the hearing is the first in the case
<a name="DV_Hearing_Cases"></a> 

### Drop the DV_Hearing_Cases table if it already exists and remove its data from the S3 bucket

In [23]:
drop_DV_Hearings_Cases = "DROP TABLE IF EXISTS fcsq.DV_Hearings_Cases"
pydb.start_query_execution_and_wait(drop_DV_Hearings_Cases)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_Hearings_Cases").delete();

### Create the DV_Hearing_Cases table in Athena

In [24]:
create_DV_Hearings_Cases = f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_Hearings_Cases
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_Hearings_Cases') AS
select t1.case_number,
    t1.error,
    t1.event,
    t1.event_model,
    t1.hearing_date,
    t1.hearing_type,
    t1.receipt_date,
    t1.vacated_flag,
    substring(Case_Number,5,1) AS Case_Type
    from fcsq.DV_Hearing_Events AS t1
    where t1.event_model in ('C6', 'G61', 'FL402', 'FL405')
    order by t1.case_number, t1.receipt_date;
"""

pydb.start_query_execution_and_wait(create_DV_Hearings_Cases)

create_dv_hearings_cases_v2 = f"""
SELECT *,
(case when row_number() over (partition by Case_Number order by receipt_date) = 1 then 1 else 0 end) as Case_Number_ID
FROM fcsq.DV_Hearings_Cases;
"""

pydb.create_temp_table(create_dv_hearings_cases_v2,'dv_hearings_cases_v2')

#### DV_Hearing_Cases validation

In [25]:
#DV_Hearings_Cases_count = pydb.read_sql_query("select count(*) as count from __temp__.dv_hearings_cases_v2")
#DV_Hearings_Cases_count

## 12. Hearing_DV_Applicants table - joins the cases from applicant representation table to the first hearing for the case in the DV_Hearing_Cases table 
<a name="Hearing_DV_Applicants"></a>

### Drop the Hearing_DV_Applicant table if it already exists and remove its data from the S3 bucket

In [26]:
drop_Hearing_DV_Applicant = "DROP TABLE IF EXISTS fcsq.Hearing_DV_Applicant"
pydb.start_query_execution_and_wait(drop_Hearing_DV_Applicant)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Hearing_DV_Applicant").delete();

### Create the Hearing_DV_Applicant table in Athena

In [27]:
create_Hearing_DV_Applicant = f"""
CREATE TABLE IF NOT EXISTS fcsq.Hearing_DV_Applicant
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Hearing_DV_Applicants') AS
SELECT t1.*,
t2.Case_Number_ID

FROM __temp__.dv_app_rep_final t1
LEFT JOIN __temp__.dv_hearings_cases_v2 t2 ON (t1.CASE_NUMBER = t2.CASE_NUMBER)
where Case_Number_ID > 0;
"""

pydb.start_query_execution_and_wait(create_Hearing_DV_Applicant);


#### Hearing_DV_Applicant validation

In [28]:
#Hearing_DV_Applicant_count = pydb.read_sql_query("select count(*) as count from fcsq.Hearing_DV_Applicant")
#Hearing_DV_Applicant_count
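
The join in this section combines two ideas: `Case_Number_ID` is 1 only for the first hearing in a case, and filtering on it after a `LEFT JOIN` removes cases with no hearing at all, because a missing value never satisfies `> 0`. The sketch below demonstrates both effects on made-up data, using in-memory SQLite as a stand-in for Athena. The table names and case numbers are hypothetical, and SQLite 3.25 or later is assumed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE cases (case_number TEXT);
    CREATE TABLE hearings (case_number TEXT, receipt_date TEXT);
    INSERT INTO cases VALUES ('A'), ('B');
    -- Case A has two hearings; case B has none
    INSERT INTO hearings VALUES ('A', '2011-03-01'), ('A', '2011-02-01');
    """
)

rows = con.execute(
    """
    WITH hearings_flagged AS (
        SELECT case_number,
               receipt_date,
               CASE WHEN ROW_NUMBER() OVER (
                        PARTITION BY case_number ORDER BY receipt_date
                    ) = 1 THEN 1 ELSE 0 END AS case_number_id
        FROM hearings
    )
    SELECT c.case_number, h.receipt_date, h.case_number_id
    FROM cases c
    LEFT JOIN hearings_flagged h ON c.case_number = h.case_number
    WHERE h.case_number_id > 0
    """
).fetchall()
print(rows)
```

Only case A survives, with its earliest hearing. The later hearing is dropped because its flag is 0, and case B is dropped because its flag is NULL, so the `LEFT JOIN` behaves like an inner join here.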

## 13. Hearing_DV_Respondent table - joins the cases from respondent representation table to the first hearing for the case in the DV_Hearing_Cases table 
<a name="Hearing_DV_Respondents"></a>

### Drop the Hearing_DV_Respondent table if it already exists and remove its data from the S3 bucket

In [29]:
drop_Hearing_DV_Respondent = "DROP TABLE IF EXISTS fcsq.Hearing_DV_Respondent"
pydb.start_query_execution_and_wait(drop_Hearing_DV_Respondent)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Hearing_DV_Respondent").delete();

### Create the Hearing_DV_Respondent table in Athena

In [30]:
create_Hearing_DV_Respondent = f"""
CREATE TABLE IF NOT EXISTS fcsq.Hearing_DV_Respondent
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Hearing_DV_Respondents') AS
SELECT t1.*,
t2.Case_Number_ID

FROM __temp__.dv_resp_rep_final t1
LEFT JOIN __temp__.dv_hearings_cases_v2 t2 ON (t1.CASE_NUMBER = t2.CASE_NUMBER)
where Case_Number_ID > 0;
"""

pydb.start_query_execution_and_wait(create_Hearing_DV_Respondent);


#### Hearing_DV_Respondent validation

In [31]:
#Hearing_DV_Respondent_count = pydb.read_sql_query("select count(*) as count from fcsq.Hearing_DV_Respondent")
#Hearing_DV_Respondent_count

## 14. DV_App table - Groups the hearing_dv_applicants table and produces a count per group
<a name="DV_App"></a>

### Drop the DV_App table if it already exists and remove its data from the S3 bucket

In [32]:
drop_DV_App = "DROP TABLE IF EXISTS fcsq.DV_App"
pydb.start_query_execution_and_wait(drop_DV_App)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_App").delete();

### Create the DV_App table in Athena

In [33]:
create_DV_App = f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_App
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_App') AS
SELECT
  'Domestic Violence' AS CASE_TYPE,
  Year,
  Quarter,
  'Party' AS Category,
  'Applicant' AS PARTY,
   App_Gender AS Gender,
  Representation,
  Count (*) AS Count
FROM
  fcsq.Hearing_DV_Applicant
WHERE 
  Representation <> '' /* A very small number of cases from 2011/12 have no representation value (gender is also blank); look into whether these should be recoded as N */
GROUP BY
  'Domestic Violence',
  Year,
  Quarter,
  'Party',
  'Applicant',
  App_Gender,
  Representation;
"""

pydb.start_query_execution_and_wait(create_DV_App);
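
The `Representation <> ''` filter excludes rows where `Representation` is NULL as well as rows where it is an empty string, because a comparison with NULL is never true. The sketch below shows this on made-up data, again using in-memory SQLite as a stand-in for Athena; the table name `hearing_applicants` and the column `n` are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE hearing_applicants "
    "(year INTEGER, quarter INTEGER, gender TEXT, representation TEXT)"
)
con.executemany(
    "INSERT INTO hearing_applicants VALUES (?, ?, ?, ?)",
    [
        (2011, 1, "Female", "Y"),
        (2011, 1, "Female", "Y"),
        (2011, 1, "Male", "N"),
        (2011, 1, None, None),  # no matching party row: gender and representation missing
    ],
)

counts = con.execute(
    """
    SELECT year, quarter, gender, representation, COUNT(*) AS n
    FROM hearing_applicants
    WHERE representation <> ''
    GROUP BY year, quarter, gender, representation
    ORDER BY gender
    """
).fetchall()
print(counts)
```

The row with missing values does not appear in any group, which is the behaviour the comment in the query is flagging for review.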

#### DV_App validation

In [34]:
#DV_App_count = pydb.read_sql_query("select count(*) as count from fcsq.DV_App")
#DV_App_count

## 15. DV_Resp table - Groups the hearing_dv_respondents table and produces a count per group
<a name="DV_Resp"></a>

### Drop the DV_Resp table if it already exists and remove its data from the S3 bucket

In [35]:
drop_DV_Resp = "DROP TABLE IF EXISTS fcsq.DV_Resp"
pydb.start_query_execution_and_wait(drop_DV_Resp)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_Resp").delete();

### Create the DV_Resp table in Athena

In [36]:
create_DV_Resp = f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_resp
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_Resp') AS
SELECT
  'Domestic Violence' AS CASE_TYPE,
  Year,
  Quarter,
  'Party' AS Category,
  'Respondent' AS PARTY,
   Resp_Gender AS Gender,
  Representation,
  Count (*) AS Count
FROM
  fcsq.Hearing_DV_Respondent
WHERE 
  Representation <> '' /* A very small number of cases from 2011/12 have no representation value (gender is also blank); look into whether these should be recoded as N */
GROUP BY
  'Domestic Violence',
  Year,
  Quarter,
  'Party',
  'Respondent',
  Resp_Gender,
  Representation;
"""

pydb.start_query_execution_and_wait(create_DV_Resp);

#### DV_Resp validation

In [37]:
#DV_Resp_count = pydb.read_sql_query("select count(*) as count from fcsq.DV_Resp")
#DV_Resp_count

## 16. dv_case table - groups and formats dv_case_data_v3 table and gives a count for each group
<a name="DV_Case"></a>

### Drop the dv_case table if it already exists and remove its data from the S3 bucket

In [38]:
drop_DV_case = "DROP TABLE IF EXISTS fcsq.DV_case"
pydb.start_query_execution_and_wait(drop_DV_case)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_case").delete();

### Create the dv_case table in Athena

In [39]:
create_DV_case = f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_case
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_case') AS
SELECT *, Count(*) as Count FROM
    (SELECT
      'Domestic Violence' AS CASE_TYPE,
      Year,
      Quarter,
      'Cases' AS Category,
      ' ' AS PARTY,
      ' ' AS Gender,
      ' ' AS Representation
    FROM
      __temp__.dv_case_data_v3)

GROUP BY
  CASE_TYPE,
  Year,
  Quarter,
  Category,
  PARTY,
  Gender,
  Representation

ORDER BY 
    Year,
    Quarter;
"""

pydb.start_query_execution_and_wait(create_DV_case);

#### DV_case validation

In [40]:
#DV_case_count = pydb.read_sql_query("select count(*) as count from fcsq.DV_case")
#DV_case_count

## 17. DV_Case_Hearings table - creates a count of all the cases with a hearing per quarter
<a name="DV_Case_Hearings"></a>

### Drop the DV_Case_Hearings table if it already exists and remove its data from the S3 bucket

In [41]:
drop_DV_Case_Hearings = "DROP TABLE IF EXISTS fcsq.DV_Case_Hearings"
pydb.start_query_execution_and_wait(drop_DV_Case_Hearings)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_Case_Hearings").delete();

### Create the DV_Case_Hearings table in Athena

In [42]:
create_hearing_dv_case = f"""
SELECT DISTINCT Year, Quarter, Case_Number
FROM fcsq.Hearing_DV_Applicant;
"""

pydb.create_temp_table(create_hearing_dv_case,'hearing_dv_case')



create_DV_Case_Hearings = f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_Case_Hearings
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_Case_Hearings') AS
SELECT *, Count(*) as Count FROM
    (SELECT
      'Domestic Violence' AS CASE_TYPE,
      Year,
      Quarter,
      'Cases with a hearing' AS Category,
      ' ' AS PARTY,
      ' ' AS Gender,
      ' ' AS Representation
    FROM
      __temp__.hearing_dv_case)
GROUP BY
  CASE_TYPE,
  Year,
  Quarter,
  Category,
  PARTY,
  Gender,
  Representation
ORDER BY 
  Year,
  Quarter;
"""

pydb.start_query_execution_and_wait(create_DV_Case_Hearings);

In [43]:
#DV_Case_Hearings = pydb.read_sql_query("select count(*) as count from fcsq.DV_Case_Hearings")
#DV_Case_Hearings

## 18. Domestic Violence table - Joins the applicant/respondent representation count tables, case count table, and case hearing count tables 
<a name="Domestic_Violence"></a>

### Drop the Domestic Violence table if it already exists and remove its data from the S3 bucket

In [44]:
drop_Domestic_Violence = "DROP TABLE IF EXISTS fcsq.Domestic_Violence"
pydb.start_query_execution_and_wait(drop_Domestic_Violence)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Domestic_Violence").delete();

### Create the Domestic Violence table in Athena

In [45]:
create_Domestic_Violence = f"""
CREATE TABLE IF NOT EXISTS fcsq.Domestic_Violence
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Domestic_Violence') AS
SELECT
  *
FROM
 fcsq.DV_APP
UNION ALL
SELECT
  *
FROM
  fcsq.DV_RESP
UNION ALL
SELECT
  *
FROM
  fcsq.DV_CASE
UNION ALL
SELECT
  *
FROM
  fcsq.DV_CASE_HEARINGS;
"""

pydb.start_query_execution_and_wait(create_Domestic_Violence);
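
`UNION ALL` with `SELECT *` lines columns up by position, not by name, so the four input tables must share the same column order (they do here, since each is created with CASE_TYPE, Year, Quarter, Category, PARTY, Gender, Representation, Count). The sketch below shows what could go wrong if one table were ever reordered, using in-memory SQLite and made-up tables `t_app` and `t_resp`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE t_app  (party TEXT, gender TEXT, n INTEGER);
    -- Same columns, but declared in a different order
    CREATE TABLE t_resp (gender TEXT, party TEXT, n INTEGER);
    INSERT INTO t_app  VALUES ('Applicant', 'Female', 10);
    INSERT INTO t_resp VALUES ('Female', 'Respondent', 7);
    """
)

# SELECT * matches by position: the second row has gender in the party column
by_position = con.execute(
    "SELECT * FROM t_app UNION ALL SELECT * FROM t_resp ORDER BY n DESC"
).fetchall()
print(by_position)

# Naming the columns explicitly keeps each value in the intended column
by_name = con.execute(
    "SELECT party, gender, n FROM t_app "
    "UNION ALL SELECT party, gender, n FROM t_resp ORDER BY n DESC"
).fetchall()
print(by_name)
```

Listing the columns explicitly in each branch of the union is one way to guard against this if the upstream table definitions change.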

#### Domestic Violence validation

In [46]:
#Domestic_Violence_count = pydb.read_sql_query("select count(*) as count from fcsq.Domestic_Violence")
#Domestic_Violence_count

In [47]:
#pydb.read_sql_query("select * from fcsq.Domestic_Violence ")

## 19. DV_applications_data_sorted table - Orders the DV_APPS_FINAL table by case_number, receipt date and description
<a name="DV_applications_data_sorted"></a>

### Drop the dv_applications_data_sorted table if it already exists and remove its data from the S3 bucket

In [48]:
drop_dv_applications_data_sorted = "DROP TABLE IF EXISTS fcsq.dv_applications_data_sorted"
pydb.start_query_execution_and_wait(drop_dv_applications_data_sorted)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/dv_applications_data_sorted").delete();

### Create the dv_applications_data_sorted table in Athena

In [49]:
create_dv_applications_data_sorted = f"""
CREATE TABLE IF NOT EXISTS fcsq.dv_applications_data_sorted
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/dv_applications_data_sorted') AS
SELECT t1.*
      FROM fcsq.DV_APPS_FINAL t1
      ORDER BY t1.CASE_NUMBER,
               t1.RECEIPT_DATE,
               t1.Description;
"""

pydb.start_query_execution_and_wait(create_dv_applications_data_sorted);

In [50]:
#pydb.read_sql_query("select * from fcsq.dv_applications_data_sorted ")

#### DV_applications_data_sorted validation

In [51]:
#dv_applications_data_sorted_count = pydb.read_sql_query("select count(*) as count from fcsq.dv_applications_data_sorted")
#dv_applications_data_sorted_count

## 20. Create dv_applications temporary tables - takes the dv_applications_data sorted table and filters it so it only has the first application record per case number
<a name="DV_applications_temp"></a>

In [52]:
create_dv_applications_1 = f"""
SELECT t1.*, row_number() over (order by CASE_NUMBER, RECEIPT_DATE) as SEQ_NUM
      FROM fcsq.dv_applications_data_sorted t1;
"""

pydb.create_temp_table(create_dv_applications_1,'dv_applications_1')

create_dv_applications_2 = f"""
SELECT DISTINCT CASE_NUMBER, Description, EVENT, (MIN(Seq_Num)) AS MIN_of_Seq_Num
FROM __temp__.dv_applications_1
GROUP BY CASE_NUMBER, Description, EVENT;
"""

pydb.create_temp_table(create_dv_applications_2,'dv_applications_2')

create_dv_applications_3 = f"""
SELECT t1.CASE_NUMBER, 
    t2.RECEIPT_DATE, 
    t2.YEAR, 
    t2.QUARTER, 
    t2.EVENT, 
    t2.EVENT_COURT, 
    t2.Description
FROM __temp__.dv_applications_2 t1
LEFT JOIN __temp__.dv_applications_1 t2 ON (t1.MIN_of_Seq_NUM = t2.Seq_NUM) AND (t1.Description = t2.Description);
"""

pydb.create_temp_table(create_dv_applications_3,'dv_applications_3')

## 21. DV_orders_data_sorted table - Sorts the disposals table by case_number and receipt date, removing contact orders, placement revoke or vary orders and other type orders
<a name="DV_orders_data_sorted"></a>

### Drop the dv_orders_data_sorted table if it already exists and remove its data from the S3 bucket

In [53]:
drop_dv_orders_data_sorted = "DROP TABLE IF EXISTS fcsq.dv_orders_data_sorted"
pydb.start_query_execution_and_wait(drop_dv_orders_data_sorted)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/dv_orders_data_sorted").delete();

### Create the dv_orders_data_sorted table in Athena

In [54]:
create_dv_orders_data_sorted = f"""
CREATE TABLE IF NOT EXISTS fcsq.dv_orders_data_sorted
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/dv_orders_data_sorted') AS
SELECT t1.RECEIPT_DATE, 
    t1.CASE_NUMBER, 
    t1.EVENT, 
    t1.CREATING_COURT,
    CAST(EVENT AS VARCHAR(3)) AS EVENT_CODE,
    t1.FIELD_MODEL, 
    t1.VALUE
FROM fcsq.DV_Ords1 t1
ORDER BY t1.CASE_NUMBER,
    t1.RECEIPT_DATE;
"""

pydb.start_query_execution_and_wait(create_dv_orders_data_sorted);

In [55]:
#pydb.read_sql_query("select * from fcsq.dv_orders_data_sorted")

#### DV_orders_data_sorted validation

In [56]:
#dv_orders_data_sorted_count = pydb.read_sql_query("select count(*) as count from fcsq.dv_orders_data_sorted")
#dv_orders_data_sorted_count

## 22. Create dv_orders temporary tables - takes the dv_orders_data_sorted table and filters it so it only has the first order record per case number
<a name="DV_orders_temp"></a>

In [57]:
create_dv_orders_0 = f"""
SELECT t1.CASE_NUMBER, 
          t1.EVENT, 
          t1.CREATING_COURT, 
          t2.RECEIPT_DATE AS APP_DATE, 
          t1.RECEIPT_DATE AS DISP_DATE, 
          t2.Description AS APP_Order_Type, 
          t1.FIELD_MODEL, 
          t1.EVENT_CODE, 
          t1.VALUE, 
          DAY(t1.RECEIPT_DATE - t2.RECEIPT_DATE) AS DATE_DIFF
      FROM fcsq.dv_orders_data_sorted t1
           INNER JOIN __temp__.dv_applications_3 t2 ON (t1.CASE_NUMBER = t2.CASE_NUMBER)
      WHERE DAY(t1.RECEIPT_DATE - t2.RECEIPT_DATE) >= 0;
"""
pydb.create_temp_table(create_dv_orders_0,'dv_orders_0')
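
As a rough illustration of what `dv_orders_0` keeps, the sketch below pairs each disposal with the application for the same case, computes the gap in days, and discards disposals dated before the application. The sample data and the `match_orders` helper are hypothetical, and the sketch assumes a single application per case, whereas `dv_applications_3` can hold several.

```python
from datetime import date

# Hypothetical first application date per case (one per case for simplicity).
applications = {"A1": date(2021, 1, 5), "B2": date(2020, 7, 9)}

# Hypothetical disposals as (CASE_NUMBER, RECEIPT_DATE) pairs.
disposals = [
    ("A1", date(2021, 1, 4)),  # before the application: filtered out
    ("A1", date(2021, 1, 5)),  # same day: kept with a gap of 0 days
    ("A1", date(2021, 2, 2)),  # kept with a gap of 28 days
    ("C3", date(2021, 5, 1)),  # no application for this case: dropped by the inner join
]


def match_orders(applications, disposals):
    """Pair disposals with applications and keep non-negative day gaps."""
    matched = []
    for case_number, disp_date in disposals:
        app_date = applications.get(case_number)
        if app_date is None:
            continue  # inner join: unmatched disposals are not carried forward
        days = (disp_date - app_date).days
        if days >= 0:  # mirrors the WHERE clause on the date difference
            matched.append((case_number, app_date, disp_date, days))
    return matched


matched = match_orders(applications, disposals)
```

Case `B2` has an application but no disposal, so it does not appear in the result either, which is the expected behaviour of an inner join.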

In [58]:
#dv_orders_data_temp_count = pydb.read_sql_query("select count(*) as count from __temp__.dv_orders_0")
#dv_orders_data_temp_count

In [59]:
create_dv_orders_1 = f"""
SELECT DISTINCT t1.CASE_NUMBER, 
          t1.EVENT, 
          t1.CREATING_COURT, 
          t1.EVENT_CODE, 
          t1.APP_Order_Type, 
          t1.APP_DATE, 
          t1.DISP_DATE, 
          t1.FIELD_MODEL, 
          t1.VALUE,
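          -- Order the window by application and disposal date so that MIN(SEQ_NUM) selects the earliest disposal
          row_number() over (ORDER BY CASE_NUMBER, APP_Order_Type, APP_DATE, DISP_DATE, EVENT) as SEQ_NUM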
      FROM __temp__.dv_orders_0 t1
      ORDER BY t1.CASE_NUMBER,
               t1.APP_Order_Type,
               t1.APP_DATE,
               t1.DISP_DATE,
               t1.EVENT;
"""
pydb.create_temp_table(create_dv_orders_1,'dv_orders_1')

In [60]:
#dv_orders_data_temp_count = pydb.read_sql_query("select count(*) as count from __temp__.dv_orders_1")
#dv_orders_data_temp_count

In [61]:
#pydb.read_sql_query("select * from __temp__.dv_orders_1")

In [62]:
create_dv_orders_2 = f"""
SELECT DISTINCT t1.CASE_NUMBER, 
          t1.APP_Order_Type, 
          t1.APP_DATE, 
          (MIN(t1.SEQ_NUM)) AS MIN_SEQ_NUM
      FROM __temp__.dv_orders_1 t1
      GROUP BY t1.CASE_NUMBER,
               t1.APP_Order_Type,
               t1.APP_DATE;
"""
pydb.create_temp_table(create_dv_orders_2,'dv_orders_2')

In [63]:
#dv_orders_data_temp_count = pydb.read_sql_query("select count(*) as count from __temp__.dv_orders_2")
#dv_orders_data_temp_count

In [64]:
#pydb.read_sql_query("select * from __temp__.dv_orders_2")

In [65]:
create_dv_orders_3 = f"""
SELECT DISTINCT t1.CASE_NUMBER, 
          t1.APP_Order_Type, 
          t1.APP_DATE, 
          t2.DISP_DATE, 
          t2.EVENT, 
          t2.CREATING_COURT, 
          t2.EVENT_CODE, 
          t2.FIELD_MODEL, 
          t2.VALUE
      FROM __temp__.dv_orders_2 t1
           LEFT JOIN __temp__.dv_orders_1 t2 ON (t1.MIN_SEQ_NUM = t2.SEQ_NUM);
"""
pydb.create_temp_table(create_dv_orders_3,'dv_orders_3')

In [66]:
#dv_orders_data_temp_count = pydb.read_sql_query("select count(*) as count from __temp__.dv_orders_3")
#dv_orders_data_temp_count

In [67]:
#pydb.read_sql_query("select * from __temp__.dv_orders_3")

## 23. dv_apps_and_orders_match table - calculates the timeliness based on the difference between app_date and disp_date
<a name="DV_apps_and_orders_match"></a>

### Drop the dv_apps_and_orders_match table if it already exists and remove its data from the S3 bucket

In [68]:
drop_dv_apps_and_orders_match = "DROP TABLE IF EXISTS fcsq.dv_apps_and_orders_match"
pydb.start_query_execution_and_wait(drop_dv_apps_and_orders_match)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/dv_apps_and_orders_match").delete();

### Create the dv_apps_and_orders_match table in Athena

In [69]:
create_dv_apps_and_orders_match = f"""
CREATE TABLE IF NOT EXISTS fcsq.dv_apps_and_orders_match
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/dv_apps_and_orders_match') AS
SELECT t2.CASE_NUMBER, 
    t2.APP_Order_Type, 
    t2.APP_DATE, 
    t2.DISP_DATE, 
    t2.FIELD_MODEL,
    t2.EVENT_CODE AS DSP_COURT,
    DAY(t2.DISP_DATE - t2.APP_DATE)/7 AS Wait_weeks,
    YEAR(t2.DISP_DATE) AS Year,
    Month(t2.DISP_DATE) AS Month,
    t2.VALUE AS Orders
FROM __temp__.dv_orders_3 t2;
"""

pydb.start_query_execution_and_wait(create_dv_apps_and_orders_match);
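
The `Wait_weeks` column divides a whole number of days by 7. The sketch below shows the resulting values under the assumption that Athena applies integer division here, as Presto-style engines do for integer operands, so partial weeks are dropped. The `wait_weeks` helper and the sample dates are hypothetical.

```python
from datetime import date


def wait_weeks(app_date, disp_date):
    """Whole weeks between application and disposal, dropping leftover days.

    Assumes disp_date is on or after app_date, which the earlier
    filter on the date difference is meant to guarantee.
    """
    return (disp_date - app_date).days // 7


app = date(2021, 1, 5)
examples = {
    "same_day": wait_weeks(app, date(2021, 1, 5)),     # 0 days
    "six_days": wait_weeks(app, date(2021, 1, 11)),    # 6 days
    "seven_days": wait_weeks(app, date(2021, 1, 12)),  # 7 days
    "thirty_days": wait_weeks(app, date(2021, 2, 4)),  # 30 days
}
```

Under this assumption a disposal 6 days after the application counts as 0 weeks and one 30 days after counts as 4 weeks. If fractional weeks were wanted instead, dividing by `7.0` would be one way to get them.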


In [70]:
#dv_apps_and_orders_match_count = pydb.read_sql_query("select count(*) as count from fcsq.dv_apps_and_orders_match")
#dv_apps_and_orders_match_count