# Domestic Violence Extractions

## Contents
#### Setup
1. [import_packages](#import_packages) 
2. [define_key_variables](#define_key_variables) 

#### [Familyman data extractions](#familyman_extracts)
3. [DV_APPS1](#DV_APPS1) - Extracts the domestic violence application details from the events and event_fields tables 
4. [DV_Ords1](#DV_Ords1) - Extracts the domestic violence order details from the events and event_fields tables 
5. [RES_ATTENDANCE_INFO](#RES_ATTENDANCE_INFO) - Extra orders info on Respondent attendance 
6. [DV_POA_CASES](#DV_POA_CASES) - This query extracts records for the event model “FL406” and field models “FL404B_8” and “FL404_67”, and tells us whether a power of arrest was attached to an order.

#### Stage 1 - [Applications](#applications)
7. [DV_Applications_1](#DV_Applications_1) - adds a comma to the start and the end of the value variable for later queries 
8. [DV_Applications_2](#DV_Applications_2) - Limits the data to only Domestic Violence applications 
9. [DV_APPLICATION_EVENTS](#DV_APPLICATION_EVENTS) - Calculates year and quarter of receipt date, deletes duplicates and calculates whether each case is a domestic violence or a childrens act case 
10. [APP_LOOKUP1](#APP_LOOKUP1) - groups the adjusted values so that there's one record per value 
11. [APP_LOOKUP2](#APP_LOOKUP2) - This query looks at the adjusted value to see whether it can find any of the four order types in the value string. If so, it records the order type description. 
12. [APP_LOOKUP3](#APP_LOOKUP3) - This query breaks up the results from the previous query into separate data sets for the four order types, then puts the data sets for the four order types together, so that there's one record per order type. 
13. [DV_APPS_FINAL](#DV_APPS_FINAL) - This query joins the look-up and the applications data to create a data set with one record per order type. 
14. [DV_APP_CASES](#DV_APP_CASES) - This query groups the final data by case so that you can produce the case counts 
15. [DV_APP_CASES_FINAL](#DV_APP_CASES_FINAL) - This query adds the year and quarter of the receipt date back in so that you can do the case count summaries 
16. [DV_APP_GENDER](#DV_APP_GENDER) - Below code is to add the gender of the applicant to the final application table. However, these don't match FCSQ final application figures. This is because there are a number of instances where there is both a male and a female applicant listed under the same application event and case number. 
17. [APP_COURT_AGG1](#APP_COURT_AGG1) - This query aggregates the final data so that you can produce the court level applications summary. 

#### Stage 2 - [Orders](#orders)
18. [DV_Orders_2](#DV_Orders_2) - This query limits the data to domestic violence orders (NM and OCC) and deletes the general orders (GEN) and undertakings (UND). 
19. [Order_Lookup_1](#Order_Lookup_1) - This query aggregates the orders data to one record per value (order type string). 
20. [Order_Lookup_2](#Order_Lookup_2) - This query looks at the value string to see whether each order type is included. If so, it records the order type. 
21. [Order_LOOKUP3](#Order_LOOKUP3) - This query breaks up the results from the previous query into separate data sets for the two order types, then puts the data sets for the two order types together, so that there's one record per order type. 
22. [Orders3](#Orders3) - This query joins the look up table to the data on orders and brings back the order types. 
23. [Orders4](#Orders4) - This query calculates the year and quarter so you can do the later summaries 
24. [POA_CASE_LIST](#POA_CASE_LIST) - This query aggregates the power of arrest (POA) data to one record per event. 
25. [ORD_WITH_POA](#ORD_WITH_POA) - This query joins the information on POA and respondent attendance to the data on orders. Also adding a Case_Type at this stage 
26. [DV_ORDS_FINAL](#DV_ORDS_FINAL) - This query creates the final data set on domestic violence orders, refining the POA and respondent attendance information to find whether power of arrest was attached and whether the case was exparte or on notice 
27. [DV_ORD_CASES](#DV_ORD_CASES) - This query aggregates the final orders data, taking the last order date. 
28. [DV_ORD_CASES_FINAL](#DV_ORD_CASES_FINAL) - This query calculates the year and quarter of the last order date 

#### Stage 3 - [Preparing the final output](#Preparing_the_final_output)
29. [DV_APPS](#DV_APPS) - Prepares the applications data for CSV output 
30. [DV_APP_COUNT](#DV_APP_COUNT) - Adds a count of de-duplicated applications 
31. [DV_ORDERS](#DV_ORDERS) - Prepares the orders data for CSV output 
32. [DV_CASE_STARTS](#DV_CASE_STARTS) - Prepares case start data for final CSV output 
33. [DV_CASES_Closed](#DV_CASES_Closed) - Prepares case closed data for final CSV output 
34. [DV_all_data](#DV_all_data) - Joins all data together 
35. [DV_CSV](#DV_CSV) - Final CSV output to copy into data tab of Domestic Violence workbook 

## 1. Import packages and set options 
<a name=import_packages></a>

In [None]:
import pandas as pd  # a module which provides the data structures and functions to store and manipulate tables in dataframes
import pydbtools as pydb  # A module which allows SQL queries to be run on the Analytical Platform from Python, see https://github.com/moj-analytical-services/pydbtools
import boto3  # allows you to directly create, update, and delete AWS resources from Python scripts

# widen the pandas display limits so that dataframes are easier to read
pd.set_option("display.max_columns", 100)
pd.set_option("display.width", 900)
pd.set_option("display.max_colwidth", 200)


## 2. Define key variables to be used throughout the notebook 
<a name=define_key_variables></a>

In [None]:
#this is the database we will be extracting from
database = "familyman_live_v4"

#this extracts the latest snapshot from athena
get_snapshot_date = f"SELECT mojap_snapshot_date from {database}.events order by mojap_snapshot_date desc limit 1"
snapshot_date = str(pydb.read_sql_query(get_snapshot_date)['mojap_snapshot_date'].values[0])

#this extracts the February snapshot from athena
#snapshot_date = '2023-02-09'

#this is the athena database we will be storing our tables in
fcsq_database = "fcsq"

#this is the s3 bucket we will be saving data to
s3 = boto3.resource("s3")
bucket = s3.Bucket("alpha-family-data")

#change these to the current quarter and year, not the quarter being published
latest_quarter = 1
latest_year = 2024

#change these to the current quarter and year being published
pub_quarter = 4
pub_year = 2023
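
The values set above sit one quarter apart. As a minimal sketch, assuming the publication quarter always trails the current quarter by exactly one (which holds for the values above but is not stated anywhere in the notebook), that relationship could be checked with a small helper. `previous_quarter` is a hypothetical name introduced here for illustration.

```python
def previous_quarter(quarter, year):
    """Return (quarter, year) for the quarter immediately before the one given."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    if quarter == 1:
        # Quarter 1 rolls back to quarter 4 of the previous year
        return 4, year - 1
    return quarter - 1, year


# Values copied from the cell above
latest_quarter, latest_year = 1, 2024
pub_quarter, pub_year = 4, 2023

# Assumption: the publication quarter is always one behind the current quarter
assert previous_quarter(latest_quarter, latest_year) == (pub_quarter, pub_year)
```

A check like this would fail early if only one of the two pairs were updated before a run.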

## Familyman data extractions
<a name="familyman_extracts"></a>

## 3. DV_APPS1 table - extracts the domestic violence application details from the events and event_fields tables <a name=DV_APPS1></a>

### Drop the DV_APPS1 table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_APPS1 = "DROP TABLE IF EXISTS fcsq.DV_APPS1"
pydb.start_query_execution_and_wait(drop_DV_APPS1)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_APPS1").delete();
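
The same drop-then-clear pair is repeated for every table this notebook persists in the `fcsq` database. The sketch below shows one way it might be wrapped in a helper. `reset_table` is a hypothetical name, and the query runner and bucket are passed in as arguments so that the function can be exercised without AWS access. The trailing slash on the prefix is an addition in this sketch, not something the notebook does: S3 prefixes are plain string matches, so a prefix without it would also match any sibling folder whose name begins with the table name.

```python
def reset_table(table_name, run_query, bucket,
                database="fcsq",
                prefix="fcsq_processing/Domestic_Violence"):
    """Drop an Athena table and delete the S3 objects stored under its folder."""
    run_query(f"DROP TABLE IF EXISTS {database}.{table_name}")
    # Trailing slash (an assumption of this sketch) limits the delete to the
    # table's own folder rather than every key that starts with its name
    bucket.objects.filter(Prefix=f"{prefix}/{table_name}/").delete()


# Intended use inside this notebook (needs AWS access, so left commented out):
# reset_table("DV_APPS1", pydb.start_query_execution_and_wait, bucket)
```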

### Create the DV_APPS1 table

In [None]:
create_DV_APPS1_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_APPS1
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_APPS1') AS
SELECT /*csv*/ 
  TTE.RECEIPT_DATE, 
  TTE.CASE_NUMBER, 
  TTE.EVENT, 
  TTE.CREATING_COURT, 
  TTF.FIELD_MODEL, 
  TTF.VALUE, 
  TTE.Error
FROM 
  {database}.events TTE
  INNER JOIN {database}.event_fields TTF
     ON TTE.EVENT = TTF.EVENT
WHERE 
   TTE.Error= 'N' 
     AND TTF.FIELD_MODEL In ('U22_AT','G50_AT')
     AND (TTE.mojap_snapshot_date = date'{snapshot_date}' AND TTF.mojap_snapshot_date= date'{snapshot_date}');
"""
pydb.start_query_execution_and_wait(create_DV_APPS1_table);

#### DV_APPS1 validation

In [None]:
#DV_APPS1_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_APPS1")
#DV_APPS1_count

## 4. DV_Ords1 table - extracts the domestic violence order details from the events and event_fields tables <a name=DV_Ords1></a>

### Drop the DV_Ords1 table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_Ords1 = "DROP TABLE IF EXISTS fcsq.DV_Ords1"
pydb.start_query_execution_and_wait(drop_DV_Ords1)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_Ords1").delete();

### Create the DV_Ords1 table

In [None]:
create_DV_Ords1_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_Ords1
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_Ords1') AS
SELECT /*csv*/
  TTE.RECEIPT_DATE, 
  TTE.CASE_NUMBER, 
  TTE.EVENT, 
  TTE.CREATING_COURT, 
  TTF.FIELD_MODEL, 
  TTF.VALUE, 
  TTE.Error
FROM 
  {database}.events TTE
  INNER JOIN {database}.event_fields  TTF
    ON TTE.EVENT = TTF.EVENT
WHERE 
  TTE.Error= 'N' 
   AND TTF.FIELD_MODEL In ('FL404B_7','FL404_79')
    AND (TTE.mojap_snapshot_date = date'{snapshot_date}' AND TTF.mojap_snapshot_date= date'{snapshot_date}');
"""
pydb.start_query_execution_and_wait(create_DV_Ords1_table);

#### DV_Ords1 validation

In [None]:
#DV_Ords1_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_Ords1")
#DV_Ords1_count

## 5. RES_ATTENDANCE_INFO table - Extra orders info on Respondent attendance <a name=RES_ATTENDANCE_INFO></a>

### Drop the RES_ATTENDANCE_INFO table if it already exists and remove its data from the S3 bucket

In [None]:
drop_RES_ATTENDANCE_INFO = "DROP TABLE IF EXISTS fcsq.RES_ATTENDANCE_INFO"
pydb.start_query_execution_and_wait(drop_RES_ATTENDANCE_INFO)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/RES_ATTENDANCE_INFO").delete();

### Create the RES_ATTENDANCE_INFO table

In [None]:
create_RES_ATTENDANCE_INFO_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.RES_ATTENDANCE_INFO
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/RES_ATTENDANCE_INFO') AS
SELECT /*csv*/
  TTE.EVENT, 
  TTE.RECEIPT_DATE, 
  TTE.ENTRY_DATE, 
  TTE.Error, 
  TTE.CASE_NUMBER, 
  TTE.EVENT_MODEL, 
  TTF.FIELD_MODEL, 
  TTF.VALUE        
FROM 
  {database}.events TTE
  INNER JOIN {database}.event_fields TTF
    ON TTE.EVENT = TTF.EVENT
WHERE 
   TTE.Error='N' 
   AND TTF.FIELD_MODEL In ('FL404_5','FL404B_5')
   AND (TTE.mojap_snapshot_date = date'{snapshot_date}' AND TTF.mojap_snapshot_date= date'{snapshot_date}'); 
"""
pydb.start_query_execution_and_wait(create_RES_ATTENDANCE_INFO_table);

#### RES_ATTENDANCE_INFO validation

In [None]:
#RES_ATTENDANCE_INFO_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.RES_ATTENDANCE_INFO")
#RES_ATTENDANCE_INFO_count

## 6. DV_POA_CASES table - This query extracts records for the event model “FL406” and field models “FL404B_8” and “FL404_67”, and tells us whether a power of arrest was attached to an order. <a name=DV_POA_CASES></a>

### Drop the DV_POA_CASES table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_POA_CASES = "DROP TABLE IF EXISTS fcsq.DV_POA_CASES"
pydb.start_query_execution_and_wait(drop_DV_POA_CASES)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_POA_CASES").delete();

### Create the DV_POA_CASES table

In [None]:
create_DV_POA_CASES_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_POA_CASES
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_POA_CASES') AS
SELECT /*csv*/ 
  TTE.RECEIPT_DATE, 
  TTE.CASE_NUMBER, 
  TTE.EVENT, 
  TTE.EVENT_MODEL, 
  TTF.FIELD_MODEL, 
  TTF.VALUE, 
  TTE.Error      
FROM 
  {database}.events TTE
  INNER JOIN {database}.event_fields TTF
    ON TTE.EVENT = TTF.EVENT
WHERE
    (TTE.mojap_snapshot_date = date'{snapshot_date}' 
    AND 
    TTF.mojap_snapshot_date = date'{snapshot_date}')
    AND
  ((TTE.EVENT_MODEL = 'FL406' 
    AND TTE.Error= 'N') 
  OR (TTF.FIELD_MODEL = 'FL404B_8' 
    AND TTF.VALUE= 'Y' 
    AND TTE.Error = 'N') 
  OR (TTF.FIELD_MODEL = 'FL404_67'
    AND TTE.Error= 'N')); 
"""
pydb.start_query_execution_and_wait(create_DV_POA_CASES_table);

#### DV_POA_CASES validation

In [None]:
#DV_POA_CASES_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_POA_CASES")
#DV_POA_CASES_count

# Stage 1 - Applications
<a name="applications"></a>

## 7. DV_Applications_1 table - adds a comma to the start and the end of the value variable for later queries <a name=DV_Applications_1></a>

### Create the DV_Applications_1 table

In [None]:
create_DV_Applications_1_table =f"""
SELECT
receipt_date,
case_number,
event,
creating_court,
field_model,
', '|| value || ',' as Adjusted_Value,
error
from FCSQ.DV_Apps1
"""
pydb.create_temp_table(create_DV_Applications_1_table,'DV_Applications_1')

#### DV_Applications_1 validation

In [None]:
#DV_Applications_1_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_Applications_1")
#DV_Applications_1_count

## 8. DV_Applications_2 table - Limits the data to only Domestic Violence applications <a name=DV_Applications_2></a>
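
The padding added in step 7 is what lets this filter treat each code as a whole token. The pure-Python sketch below mirrors steps 7 and 8 on a few made-up values. It assumes the raw value is a list of codes separated by a comma and a space; the function names and the sample values are hypothetical and do not come from the Familyman data.

```python
# Markers searched for in step 8; each relies on the ", " that step 7 puts
# in front of the value and the "," it puts after it
DV_MARKERS = (", ENM", ", ONM", ", EO,", ", EO ,", ", ONO")


def adjust_value(value):
    """Mirror step 7: wrap the raw value as ', <value>,'."""
    return ", " + value + ","


def is_dv_application(value):
    """Mirror step 8: True when any DV marker appears in the adjusted value."""
    adjusted = adjust_value(value)
    return any(marker in adjusted for marker in DV_MARKERS)


# Hypothetical raw values, for illustration only
print(is_dv_application("EO"))        # True: ', EO,' is found thanks to the padding
print(is_dv_application("XYZ, ENM"))  # True
print(is_dv_application("XYZ"))       # False
```

Without the padding, a value consisting only of `EO` would contain neither `, EO,` nor `, EO ,` and would be dropped.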

### Create the DV_Applications_2 table

In [None]:
create_DV_Applications_2_table =f"""
SELECT 
receipt_date,
case_number,
event,
creating_court,
field_model,
Adjusted_Value,
error
FROM __temp__.DV_Applications_1
Where strpos(Adjusted_Value, ', ENM') <> 0
Or strpos(Adjusted_Value,', ONM') <> 0
Or strpos(Adjusted_Value,', EO,') <> 0
Or strpos(Adjusted_Value,', EO ,') <> 0
Or strpos(Adjusted_Value,', ONO')<> 0;
"""

pydb.create_temp_table(create_DV_Applications_2_table,'DV_Applications_2')

#### DV_Applications_2 validation

In [None]:
#DV_Applications_2_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_Applications_2")
#DV_Applications_2_count

## 9. DV_APPLICATION_EVENTS table - Calculates year and quarter of receipt date, deletes duplicates and calculates whether each case is a domestic violence or a childrens act case <a name=DV_APPLICATION_EVENTS></a>
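
As a rough illustration of the two derived columns in this step, the sketch below reproduces the quarter calculation and the `EVENT_COURT` calculation in plain Python. It assumes that `EVENT` is an integer, so that dividing by 100000000 simply drops the last eight digits; the function names, the date and the event number are hypothetical.

```python
from datetime import date


def quarter_of(d):
    """Calendar quarter (1-4) of a date, matching the CASE expression in this step."""
    return (d.month - 1) // 3 + 1


def event_court(event):
    """Leading digits of the event number once the last eight digits are dropped."""
    # Assumption: event is an integer, so floor division truncates
    return event // 100_000_000


receipt = date(2023, 11, 6)   # hypothetical receipt date
event = 12_300_000_456        # hypothetical event number

print(receipt.year, quarter_of(receipt))  # 2023 4
print(event_court(event))                 # 123
```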

### Drop the DV_APPLICATION_EVENTS table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_APPLICATION_EVENTS = "DROP TABLE IF EXISTS fcsq.DV_APPLICATION_EVENTS"
pydb.start_query_execution_and_wait(drop_DV_APPLICATION_EVENTS)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_APPLICATION_EVENTS").delete();

### Create the DV_APPLICATION_EVENTS table

In [None]:
create_DV_APPLICATION_EVENTS_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_APPLICATION_EVENTS
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_APPLICATION_EVENTS') AS
SELECT DISTINCT /*YEAR*/
                EXTRACT(YEAR FROM (t1.receipt_date)) AS year,
                /*QUARTER*/
        CASE WHEN EXTRACT(Month FROM (t1.receipt_date)) <4 THEN 1
              WHEN EXTRACT(Month FROM (t1.receipt_date)) <7 THEN 2
              WHEN EXTRACT(Month FROM (t1.receipt_date))<10 THEN 3
              ELSE 4
              END AS quarter,
                t1.RECEIPT_DATE,
                t1.CASE_NUMBER,
                t1.EVENT,
                /*EVENT_COURT*/
                cast((t1.EVENT / 100000000) as int)  AS EVENT_COURT,
                t1.FIELD_model,
                t1.ADJUSTED_VALUE,
                CASE WHEN (substr(t1.CASE_NUMBER,5,1)) = 'F'
                THEN 'Domestic Violence'
                WHEN (Substr(t1.CASE_NUMBER,5,1)) IN ('C', 'P')
                THEN 'Childrens Act'
                WHEN (Substr(t1.CASE_NUMBER,5,1)) IN ('A', 'Z')
                THEN 'Adoption'
                ELSE 'Other' END AS CASE_TYPE
FROM __temp__.DV_APPLICATIONS_2 AS t1;
"""
pydb.start_query_execution_and_wait(create_DV_APPLICATION_EVENTS_table);

#### DV_APPLICATION_EVENTS validation

In [None]:
#DV_APPLICATION_EVENTS_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_APPLICATION_EVENTS")
#DV_APPLICATION_EVENTS_count

## 10. APP_LOOKUP1 table - groups the adjusted values so that there's one record per value <a name=APP_LOOKUP1></a>

### Create the APP_LOOKUP1 table

In [None]:
create_APP_LOOKUP1_table =f"""
SELECT DISTINCT t1.ADJUSTED_VALUE
FROM fcsq.DV_APPLICATION_EVENTS AS t1;
"""
pydb.create_temp_table(create_APP_LOOKUP1_table,'App_lookup1')

#### APP_LOOKUP1 validation

In [None]:
#APP_LOOKUP1_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.APP_LOOKUP1")
#APP_LOOKUP1_count

## 11. APP_LOOKUP2 table - This query looks at the adjusted value to see whether it can find any of the four order types in the value string. If so, it puts the order type description. <a name=APP_LOOKUP2></a>

### Create the APP_LOOKUP2 table

In [None]:
create_APP_LOOKUP2_table =f"""
SELECT t1.ADJUSTED_VALUE,
       /*ENM*/
       case when strpos(Adjusted_Value,'ENM') <> 0 then 'Exparte Non-Molestation' else '' end as ENM,
       /*ONM*/
       case when strpos(Adjusted_Value,'ONM') <> 0 then 'On Notice Non-Molestation' else '' end as ONM,
       /*EO*/
       case when ((strpos(Adjusted_Value,', EO,') <> 0) OR (strpos(Adjusted_Value,', EO ,') <> 0))
       then 'Exparte Occupation' else '' end as EO,
       /*ONO*/
       case when strpos(Adjusted_Value,'ONO') <> 0 then 'On Notice Occupation' else '' end as ONO
FROM __temp__.APP_LOOKUP1 AS t1;
"""
pydb.create_temp_table(create_APP_LOOKUP2_table,'APP_LOOKUP2')

#### APP_LOOKUP2 validation

In [None]:
#APP_LOOKUP2_count = pydb.read_sql_query("SELECT * from __temp__.APP_LOOKUP2")
#APP_LOOKUP2_count

## 12. APP_LOOKUP3 table - This query breaks up the results from the previous query into separate data sets for the four order types, then puts the data sets for the four order types together, so that there's one record per order type. <a name=APP_LOOKUP3></a>
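
The net effect of steps 10 to 12 is a look-up with one row per distinct adjusted value and order type. The sketch below shows that reshaping in plain Python. It is illustrative only: the function names and sample values are hypothetical, and `EO` is matched together with its closing comma (as in the step 8 filter), which is an assumption of this sketch.

```python
def order_descriptions(adjusted_value):
    """Return one description per order type found in an adjusted value."""
    found = []
    if "ENM" in adjusted_value:
        found.append("Exparte Non-Molestation")
    if "ONM" in adjusted_value:
        found.append("On Notice Non-Molestation")
    # Assumption: EO is matched with its closing comma, as in the step 8 filter
    if ", EO," in adjusted_value or ", EO ," in adjusted_value:
        found.append("Exparte Occupation")
    if "ONO" in adjusted_value:
        found.append("On Notice Occupation")
    return found


def build_lookup(adjusted_values):
    """One (adjusted_value, description) row per order type, like the UNION ALL."""
    distinct_values = sorted(set(adjusted_values))
    return [(v, d) for v in distinct_values for d in order_descriptions(v)]


# Hypothetical adjusted values; the first one appears twice on purpose
rows = build_lookup([", ENM, EO,", ", ONO,", ", ENM, EO,"])
for row in rows:
    print(row)
```

The duplicated value contributes only once, and the value holding two order types contributes two rows, giving three look-up rows in total.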

### Splits previous lookup

In [None]:
create_ENM_table =f"""
SELECT
Adjusted_Value,
ENM
FROM __temp__.APP_LOOKUP2
WHERE ENM = 'Exparte Non-Molestation'
"""
pydb.create_temp_table(create_ENM_table,'ENM')
print ("ENM done ")

create_ONM_table =f"""
SELECT
Adjusted_Value,
ONM
FROM __temp__.APP_LOOKUP2
WHERE ONM = 'On Notice Non-Molestation'
"""
pydb.create_temp_table(create_ONM_table,'ONM')
print ("ONM done ")

create_EO_table =f"""
SELECT
Adjusted_Value,
EO
FROM __temp__.APP_LOOKUP2
WHERE EO = 'Exparte Occupation'
"""

pydb.create_temp_table(create_EO_table,'EO')
print ("EO done ")

create_ONO_table =f"""
SELECT
Adjusted_Value,
ONO
FROM __temp__.APP_LOOKUP2
WHERE ONO = 'On Notice Occupation'
"""
pydb.create_temp_table(create_ONO_table,'ONO')
print ("ONO done ")

### Creates new lookup

In [None]:
#Equivalent to dvint.APP_LOOKUP4 in SAS code

create_App_Lookup_3_table = f"""
SELECT
adjusted_value, ENM as description
FROM __temp__.ENM
UNION ALL
SELECT
adjusted_value, ONM as description
FROM __temp__.ONM
UNION ALL
SELECT
adjusted_value, EO as description
FROM __temp__.EO
UNION ALL
SELECT
adjusted_value, ONO as description
FROM __temp__.ONO;
"""
pydb.create_temp_table(create_App_Lookup_3_table,'App_Lookup_3')

#### App_Lookup_3 validation

In [None]:
#App_Lookup_3_count = pydb.read_sql_query("SELECT * from __temp__.App_Lookup_3")
#App_Lookup_3_count

## 13. DV_APPS_FINAL table - This query joins the look-up and the applications data to create a data set with one record per order type. <a name=DV_APPS_FINAL></a>

### Drop the DV_APPS_FINAL table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_APPS_FINAL = "DROP TABLE IF EXISTS fcsq.DV_APPS_FINAL"
pydb.start_query_execution_and_wait(drop_DV_APPS_FINAL)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_APPS_FINAL").delete();

### Create the DV_APPS_FINAL table

In [None]:
create_DV_APPS_FINAL_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_APPS_FINAL
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_APPS_FINAL') AS
SELECT t1.YEAR,
       t1.QUARTER,
       t1.RECEIPT_DATE,
       t1.CASE_NUMBER,
       t1.EVENT,
       t1.EVENT_COURT,
       t1.FIELD_model,
       t1.ADJUSTED_VALUE,
       t1.CASE_TYPE,
       t2.DESCRIPTION
FROM fcsq.DV_APPLICATION_EVENTS AS t1 LEFT JOIN __temp__.APP_LOOKUP_3 AS t2 ON (t1.ADJUSTED_VALUE=t2.ADJUSTED_VALUE);
"""
pydb.start_query_execution_and_wait(create_DV_APPS_FINAL_table);

#### DV_APPS_FINAL validation

In [None]:
#DV_APPS_FINAL_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_APPS_FINAL")
#DV_APPS_FINAL_count

## 14. DV_APP_CASES table - This query groups the final data by case so that you can produce the case counts <a name=DV_APP_CASES></a>
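
The grouping in this step keeps the earliest receipt date for each case. A minimal Python sketch of the same idea, using hypothetical case identifiers and dates, might look like this:

```python
from datetime import date


def earliest_receipt_per_case(rows):
    """Group (case_number, case_type, receipt_date) rows and keep the earliest date."""
    earliest = {}
    for case_number, case_type, receipt_date in rows:
        key = (case_number, case_type)
        if key not in earliest or receipt_date < earliest[key]:
            earliest[key] = receipt_date
    return earliest


# Hypothetical rows: CASE_A has applications on two different dates
rows = [
    ("CASE_A", "Domestic Violence", date(2023, 10, 2)),
    ("CASE_A", "Domestic Violence", date(2023, 12, 15)),
    ("CASE_B", "Childrens Act", date(2023, 11, 20)),
]
cases = earliest_receipt_per_case(rows)
print(len(cases))  # 2
```

Each case is therefore counted once, in the quarter of its first application.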

### Create the DV_APP_CASES table

In [None]:
create_DV_APP_CASES_table =f"""
SELECT t1.CASE_NUMBER,
       /*MIN_RECEIPT*/
       MIN(t1.RECEIPT_DATE) AS MIN_RECEIPT,
       t1.CASE_TYPE
FROM fcsq.DV_APPS_FINAL AS t1
GROUP BY CASE_NUMBER, CASE_TYPE;
"""
pydb.create_temp_table(create_DV_APP_CASES_table,'DV_APP_CASES')

#### DV_APP_CASES validation

In [None]:
#DV_APP_CASES_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_APP_CASES")
#DV_APP_CASES_count

## 15. DV_APP_CASES_FINAL table - This query adds the year and quarter of the receipt date back in so that you can do the case count summaries <a name=DV_APP_CASES_FINAL></a>

### Drop the DV_APP_CASES_FINAL table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_APP_CASES_FINAL = "DROP TABLE IF EXISTS fcsq.DV_APP_CASES_FINAL"
pydb.start_query_execution_and_wait(drop_DV_APP_CASES_FINAL)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_APP_CASES_FINAL").delete();

### Create the DV_APP_CASES_FINAL table

In [None]:
create_DV_APP_CASES_FINAL_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_APP_CASES_FINAL
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_APP_CASES_FINAL') AS
SELECT t1.CASE_NUMBER,
       t1.MIN_RECEIPT,
       /*YEAR*/
       EXTRACT(YEAR FROM t1.MIN_RECEIPT) AS YEAR,
       /*QUARTER*/
       CASE WHEN EXTRACT(MONTH FROM t1.MIN_RECEIPT) between 1 and 3 THEN 1
              WHEN EXTRACT(MONTH from t1.MIN_RECEIPT) between 4 and 6 THEN 2
              WHEN EXTRACT(month from t1.MIN_RECEIPT) between 7 and 9 THEN 3
              WHEN EXTRACT(month from t1.MIN_RECEIPT) between 10 and 12 THEN 4
              END AS quarter,
       t1.CASE_TYPE
FROM __temp__.DV_APP_CASES AS t1;
"""
pydb.start_query_execution_and_wait(create_DV_APP_CASES_FINAL_table);

#### DV_APP_CASES_FINAL validation

In [None]:
#DV_APP_CASES_FINAL_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_APP_CASES_FINAL")
#DV_APP_CASES_FINAL_count

In [None]:
drop_Applicants = "DROP TABLE IF EXISTS fcsq.Applicants"
pydb.start_query_execution_and_wait(drop_Applicants)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Applicants").delete();

In [None]:
create_Applicants_Table = f"""
CREATE TABLE IF NOT EXISTS fcsq.Applicants
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Applicants') AS
 SELECT DISTINCT
   {database}.roles.ROLE, 
   {database}.roles.REPRESENTATIVE_ROLE, 
   {database}.roles.ROLE_MODEL, 
   {database}.roles.PARTY, 
   {database}.roles.CASE_NUMBER, 
   {database}.parties.PERSON_GIVEN_FIRST_NAME, 
   {database}.parties.PERSON_FAMILY_NAME, 
   {database}.parties.COMPANY, 
   {database}.addresses.POSTCODE, 
   {database}.parties.GENDER, 
   {database}.roles.DELETE_FLAG
FROM 
  ({database}.roles INNER JOIN {database}.parties ON {database}.roles.PARTY = {database}.parties.PARTY) 
  INNER JOIN {database}.addresses ON {database}.roles.ADDRESS = {database}.addresses.ADDRESS
WHERE {database}.roles.ROLE_MODEL IN ('APLC', 'APLZ', 'APLA')
    AND {database}.roles.DELETE_FLAG = 'N'
    AND {database}.roles.mojap_snapshot_date = date '{snapshot_date}'
    AND {database}.parties.mojap_snapshot_date = date '{snapshot_date}'
    AND {database}.addresses.mojap_snapshot_date = date '{snapshot_date}';
"""

pydb.start_query_execution_and_wait(create_Applicants_Table);

In [None]:
#Applicants_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.Applicants")
#Applicants_count

## 16. DV_APP_GENDER table - Below code is to add the gender of the applicant to the final application table. However, these don't match FCSQ final application figures. This is because there are a number of instances where there is both a male and a female applicant listed under the same application event and case number. <a name=DV_APP_GENDER></a>
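
The mismatch described in this heading comes from the join fanning out: an application row is repeated once for each distinct applicant gender on its case. The sketch below reproduces that effect with hypothetical data, using the gender codes from this step (1 for male, 2 for female). `join_gender` is an illustrative name, not part of the notebook.

```python
def join_gender(applications, applicants):
    """Left join applications to applicant genders on case number, keeping distinct rows."""
    joined = set()
    for case_number, description in applications:
        genders = [g for c, g in applicants if c == case_number]
        # A case with no applicant record still yields one row, with gender None
        for gender in genders or [None]:
            joined.add((case_number, description, gender))
    return joined


# Hypothetical data: one application on a case listing a male and a female applicant
applications = [("CASE_A", "Exparte Non-Molestation")]
applicants = [("CASE_A", 1), ("CASE_A", 2)]

rows = join_gender(applications, applicants)
print(len(applications), len(rows))  # 1 2
```

One application becomes two rows, so counts taken from the gender table can exceed the application counts taken from DV_APPS_FINAL.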

### Drop the DV_APP_GENDER table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_APP_GENDER = "DROP TABLE IF EXISTS fcsq.DV_APP_GENDER"
pydb.start_query_execution_and_wait(drop_DV_APP_GENDER)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_APP_GENDER").delete();

### Create the DV_APP_GENDER table

In [None]:
create_DV_APP_GENDER_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_APP_GENDER
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_APP_GENDER') AS
SELECT DISTINCT t1.YEAR,
       t1.QUARTER,
       t1.RECEIPT_DATE,
       t1.CASE_NUMBER,
       t1.EVENT,
       t1.EVENT_COURT,
       t1.FIELD_model,
       t1.ADJUSTED_VALUE,
       t1.DESCRIPTION,
       t2.Gender,
       Case when t2.Gender = 1
              Then 'Male'
                When t2.Gender = 2
                Then 'Female'
            Else 'Unknown'
            End as Gender2
FROM fcsq.DV_APPS_FINAL AS t1 LEFT JOIN fcsq.Applicants AS t2 ON (t1.CASE_NUMBER=t2.CASE_NUMBER);

"""
pydb.start_query_execution_and_wait(create_DV_APP_GENDER_table);

#### DV_APP_GENDER validation

In [None]:
#DV_APP_GENDER_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_APP_GENDER")
#DV_APP_GENDER_count

## 17. APP_COURT_AGG1 table - This query aggregates the final data so that you can produce the court level applications summary. <a name=APP_COURT_AGG1></a>

### Create the APP_COURT_AGG1 table

In [None]:
create_APP_COURT_AGG1_table =f"""
SELECT t1.YEAR,
       t1.QUARTER,
       t1.EVENT_COURT,
       t1.CASE_NUMBER
       
FROM fcsq.DV_APPS_FINAL AS t1
GROUP BY t1.YEAR,
       t1.QUARTER,
       t1.EVENT_COURT,
       t1.CASE_NUMBER;

"""
pydb.create_temp_table(create_APP_COURT_AGG1_table,'APP_COURT_AGG1')

#### APP_COURT_AGG1 validation

In [None]:
#APP_COURT_AGG1_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.APP_COURT_AGG1")
#APP_COURT_AGG1_count

# Stage 2 - Orders
<a name="orders"></a>

## 18. DV_Orders_2 table - This query limits the data to domestic violence orders (NM and OCC) and deletes the general orders (GEN) and undertakings (UND). <a name=DV_Orders_2></a>

### Create the DV_Orders_2 table

In [None]:
create_DV_Orders_2_table =f"""
SELECT 
* 
FROM
fcsq.DV_Ords1
Where strpos(value, 'NM') <> 0
Or strpos(value,'OCC') <> 0;
"""
pydb.create_temp_table(create_DV_Orders_2_table,'DV_Orders_2')

#### DV_Orders_2 validation

In [None]:
#DV_Orders_2_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_Orders_2")
#DV_Orders_2_count

## 19. Order_Lookup_1 table - This query aggregates the orders data to one record per value (order type string). <a name=Order_Lookup_1></a>

### Create the Order_Lookup_1 table

In [None]:
create_Order_Lookup_1_table =f"""
SELECT DISTINCT t1.VALUE
FROM __temp__.DV_Orders_2 AS t1;
"""
pydb.create_temp_table(create_Order_Lookup_1_table,'Order_Lookup_1')

#### Order_Lookup_1 validation

In [None]:
#Order_Lookup_1_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.Order_Lookup_1")
#Order_Lookup_1_count

## 20. Order_Lookup_2 table - This query looks at the value string to see whether each order type is included. If so, it records the order type. <a name=Order_Lookup_2></a>

### Create the Order_Lookup_2 table

In [None]:
create_Order_Lookup_2_table =f"""
SELECT t1.VALUE,
       /*NM*/
       case when strpos(value, 'NM') <> 0 then 'Non-Molestation' else '' END AS NM,
       /*OCC*/
       case when strpos(value, 'OCC') <> 0 then 'Occupation' else '' END AS OCC
FROM __temp__.Order_Lookup_1 AS t1;
"""
pydb.create_temp_table(create_Order_Lookup_2_table,'Order_Lookup_2')

#### Order_Lookup_2 validation

In [None]:
#Order_Lookup_2_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.Order_Lookup_2")
#Order_Lookup_2_count

## 21. Order_LOOKUP3 table - This query breaks up the results from the previous query into separate data sets for the two order types, then puts the data sets for the two order types together, so that there's one record per order type. <a name=Order_LOOKUP3></a>
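
Once this look-up is joined back to the orders in step 22, an event whose value names both order types produces two records. The following sketch illustrates that with made-up events. The exact layout of the value string is not shown in this notebook, so the sample values and function names are hypothetical.

```python
def order_types(value):
    """Order types named in a value string, using the substring tests from step 20."""
    types = []
    if "NM" in value:
        types.append("Non-Molestation")
    if "OCC" in value:
        types.append("Occupation")
    return types


def explode_orders(orders):
    """One (event, description) record per order type found in each event's value."""
    return [(event, t) for event, value in orders for t in order_types(value)]


# Hypothetical orders: event 2 names both order types
orders = [(1, "NM"), (2, "NM, OCC")]
records = explode_orders(orders)
print(records)
```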

### Splits previous lookup

In [None]:
create_NM_table =f"""
SELECT
Value,
NM
FROM __temp__.ORDER_LOOKUP_2
WHERE NM = 'Non-Molestation'
"""
pydb.create_temp_table(create_NM_table,'NM')
print ("NM done ")

create_OCC_table =f"""
SELECT
Value,
OCC
FROM __temp__.ORDER_LOOKUP_2
WHERE OCC = 'Occupation'
"""
pydb.create_temp_table(create_OCC_table,'OCC')
print ("OCC done ")

### Creates new lookup

In [None]:
#Equivalent to dvint.APP_LOOKUP4 in SAS code

create_Order_Lookup_3_table = f"""
SELECT
value, NM as description
FROM __temp__.NM
UNION ALL
SELECT
Value, OCC as description
FROM __temp__.OCC;
"""
pydb.create_temp_table(create_Order_Lookup_3_table,'Order_Lookup_3')

#### Order_Lookup_3 validation

In [None]:
#Order_Lookup_3_count = pydb.read_sql_query("SELECT * from __temp__.Order_Lookup_3")
#Order_Lookup_3_count

## 22. Orders3 table - This query joins the look up table to the data on orders and brings back the order types. <a name=Orders3></a>

### Create the Orders3 table

In [None]:
create_Orders3_table =f"""
SELECT t1.Receipt_Date,
       t1.Case_Number,
       t1.Event,
       t1.Field_Model,
       t2.Description
FROM __temp__.DV_Orders_2 AS t1 LEFT JOIN __temp__.Order_LookUp_3 As t2 ON (t1.VALUE=t2.VALUE);
"""
pydb.create_temp_table(create_Orders3_table,'Orders3')

#### Orders3 validation

In [None]:
#Orders3_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.Orders3")
#Orders3_count

## 23. Orders4 table - This query calculates the year and quarter so you can do the later summaries <a name=Orders4></a>

### Create the Orders4 table

In [None]:
create_Orders4_table =f"""
SELECT DISTINCT/*YEAR*/
               EXTRACT(year from t1.RECEIPT_DATE) AS YEAR,
              /*QUARTER*/
              CASE WHEN EXTRACT(Month FROM (t1.receipt_date)) <4 THEN 1
              WHEN EXTRACT(Month FROM (t1.receipt_date)) <7 THEN 2
              WHEN EXTRACT(Month FROM (t1.receipt_date))<10 THEN 3
              ELSE 4
              END AS quarter,
              t1.Receipt_Date,
              t1.Case_Number,
              t1.Event,
              /*Event_Court*/
              cast((t1.EVENT / 100000000) as int)  AS EVENT_COURT,
              t1.Field_Model,
              t1.Description
FROM __temp__.Orders3 AS t1;
"""
pydb.create_temp_table(create_Orders4_table,'Orders4')

#### Orders4 validation

In [None]:
#Orders4_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.Orders4")
#Orders4_count

## 24. POA_CASE_LIST table - This query aggregates the power of arrest (POA) data to one record per event. <a name=POA_CASE_LIST></a>

### Create the POA_CASE_LIST table

In [None]:
create_POA_CASE_LIST_table =f"""
SELECT DISTINCT t1.event
FROM fcsq.DV_POA_CASES AS t1;
"""
pydb.create_temp_table(create_POA_CASE_LIST_table,'POA_CASE_LIST')

#### POA_CASE_LIST validation

In [None]:
#POA_CASE_LIST_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.POA_CASE_LIST")
#POA_CASE_LIST_count

## 25. ORD_WITH_POA table - This query joins the information on POA and respondent attendance to the data on orders. Also adding a Case_Type at this stage <a name=ORD_WITH_POA></a>
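
The case type added in this step is read from the fifth character of the case number. A small Python equivalent of that CASE expression is sketched below. The case numbers are hypothetical and `case_type` is an illustrative name.

```python
def case_type(case_number):
    """Classify a case from the fifth character of its case number."""
    # SQL substr(x, 5, 1) is 1-indexed, so it corresponds to the slice [4:5]
    letter = case_number[4:5]
    if letter == "F":
        return "Domestic Violence"
    if letter in ("C", "P"):
        return "Childrens Act"
    if letter in ("A", "Z"):
        return "Adoption"
    return "Other"


# Hypothetical case numbers, for illustration only
print(case_type("AB23F00001"))  # Domestic Violence
print(case_type("AB23P00002"))  # Childrens Act
print(case_type("AB23D00003"))  # Other
```

Using a slice rather than an index means a case number shorter than five characters falls through to 'Other' instead of raising an error.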

### Create the ORD_WITH_POA table

In [None]:
create_ORD_WITH_POA_table =f"""
SELECT        t1.YEAR,
              t1.QUARTER,
              t1.Receipt_Date,
              t1.Case_Number,
              t1.Event,
              t1.Event_Court,
              t1.Field_Model,
              t1.Description,
              CASE WHEN (Substr(t1.CASE_NUMBER,5,1)) = 'F'
                THEN 'Domestic Violence'
                WHEN (Substr(t1.CASE_NUMBER,5,1)) IN ('C', 'P')
                THEN 'Childrens Act'
                WHEN (Substr(t1.CASE_NUMBER,5,1)) IN ('A', 'Z')
                THEN 'Adoption'
                ELSE 'Other' END AS CASE_TYPE,
              t2.event AS POA_IND,
              t3.VALUE AS Res_Attend
FROM __temp__.Orders4 AS t1 LEFT JOIN __temp__.POA_CASE_LIST AS t2 
ON (t1.event=t2.event) 
LEFT JOIN fcsq.RES_ATTENDANCE_INFO AS t3 ON (t1.EVENT=t3.EVENT);
"""
pydb.create_temp_table(create_ORD_WITH_POA_table,'ORD_WITH_POA')
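
The Case_Type rule above depends only on the fifth character of the case number. A minimal Python sketch of the same rule is shown here for reference. The function name `case_type` and the example case numbers are hypothetical.

```python
def case_type(case_number: str) -> str:
    """Classify a case by the fifth character of its case number (SQL Substr(x, 5, 1))."""
    # SQL string positions are 1-based, Python indices are 0-based
    code = case_number[4:5]
    if code == "F":
        return "Domestic Violence"
    if code in ("C", "P"):
        return "Childrens Act"
    if code in ("A", "Z"):
        return "Adoption"
    return "Other"


# Hypothetical case numbers, made up for illustration only
for example in ("XX21F00001", "XX21P00002", "XX21Z00003", "XX2"):
    print(example, "->", case_type(example))
```

A case number shorter than five characters yields an empty code and so falls into 'Other', which matches the ELSE branch of the SQL.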

#### ORD_WITH_POA validation

In [None]:
#ORD_WITH_POA_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.ORD_WITH_POA")
#ORD_WITH_POA_count

## 26. DV_ORDS_FINAL table - This query creates the final data set on domestic violence orders, converting the POA and respondent attendance information into flags for whether a power of arrest was attached and whether the order was made ex parte or on notice <a name=DV_ORDS_FINAL></a>

### Drop the DV_ORDS_FINAL table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_ORDS_FINAL = "DROP TABLE IF EXISTS fcsq.DV_ORDS_FINAL"
pydb.start_query_execution_and_wait(drop_DV_ORDS_FINAL)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_ORDS_FINAL").delete();

### Create the DV_ORDS_FINAL table

In [None]:
create_DV_ORDS_FINAL_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_ORDS_FINAL
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_ORDS_FINAL') AS
SELECT 
Year,
Quarter,
Receipt_Date,
Case_Number,
Event,
Event_Court,
Field_Model,
Description,
Case_Type,
Res_Attend,
CASE WHEN POA_IND IS NOT NULL then 'POA' ELSE 'No POA' end as POA,
CASE WHEN Res_Attend='F' then 'Exparte' else 'On Notice' end as TYPE
FROM __temp__.ORD_WITH_POA;
"""
pydb.start_query_execution_and_wait(create_DV_ORDS_FINAL_table);
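
The two CASE expressions above can be sketched in Python to make the handling of missing values explicit. Because the upstream joins are LEFT JOINs, POA_IND and Res_Attend can both be NULL. The function names `poa_flag` and `notice_type` are illustrative only.

```python
from typing import Optional


def poa_flag(poa_ind: Optional[int]) -> str:
    """'POA' when the order's event matched a row in POA_CASE_LIST, else 'No POA'."""
    return "POA" if poa_ind is not None else "No POA"


def notice_type(res_attend: Optional[str]) -> str:
    """'Exparte' only when Res_Attend is exactly 'F'; anything else is 'On Notice'."""
    # In SQL, NULL = 'F' is not true, so a missing value falls through to ELSE
    return "Exparte" if res_attend == "F" else "On Notice"


# Hypothetical inputs, made up for illustration only
print(poa_flag(None), "|", notice_type(None))
```

One consequence worth keeping in mind: an order with no matching respondent attendance record is labelled 'On Notice', not left blank.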

#### DV_ORDS_FINAL validation

In [None]:
#DV_ORDS_FINAL_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_ORDS_FINAL")
#DV_ORDS_FINAL_count

## 27. DV_ORD_CASES table - This query aggregates the final orders data to one record per case and case type, taking the last order date. <a name=DV_ORD_CASES></a>

### Create the DV_ORD_CASES table

In [None]:
create_DV_ORD_CASES_table =f"""
SELECT t1.CASE_NUMBER,
       /*MAX_RECEIPT*/
       MAX(t1.RECEIPT_DATE) AS MAX_RECEIPT,
       t1.CASE_TYPE
FROM fcsq.DV_Ords_Final AS t1
GROUP BY CASE_NUMBER, CASE_TYPE;

"""
pydb.create_temp_table(create_DV_ORD_CASES_table,'DV_ORD_CASES')
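
As a rough illustration of what the GROUP BY with MAX does, the sketch below keeps the latest receipt date for each (case number, case type) pair. The function name `last_receipt_per_case` and the sample rows are hypothetical.

```python
from datetime import date


def last_receipt_per_case(rows):
    """Return {(case_number, case_type): latest receipt_date} for the given rows."""
    latest = {}
    for case_number, case_type, receipt_date in rows:
        key = (case_number, case_type)
        if key not in latest or receipt_date > latest[key]:
            latest[key] = receipt_date
    return latest


# Hypothetical rows, made up for illustration only
rows = [
    ("CASE-1", "Domestic Violence", date(2020, 1, 15)),
    ("CASE-1", "Domestic Violence", date(2020, 6, 2)),
    ("CASE-2", "Childrens Act", date(2019, 11, 30)),
]
result = last_receipt_per_case(rows)
for key, value in result.items():
    print(key, value)
```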

#### DV_ORD_CASES validation

In [None]:
#DV_ORD_CASES_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_ORD_CASES")
#DV_ORD_CASES_count

## 28. DV_ORD_CASES_FINAL table - This query calculates the year and quarter of the last order date <a name=DV_ORD_CASES_FINAL></a>

### Drop the DV_ORD_CASES_FINAL table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_ORD_CASES_FINAL = "DROP TABLE IF EXISTS fcsq.DV_ORD_CASES_FINAL"
pydb.start_query_execution_and_wait(drop_DV_ORD_CASES_FINAL)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_ORD_CASES_FINAL").delete();

### Create the DV_ORD_CASES_FINAL table

In [None]:
create_DV_ORD_CASES_FINAL_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_ORD_CASES_FINAL
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_ORD_CASES_FINAL') AS
SELECT t1.CASE_NUMBER,
       t1.MAX_RECEIPT,
       /*YEAR*/
       YEAR(t1.MAX_RECEIPT) AS YEAR,
       /*QUARTER*/
      CASE WHEN EXTRACT(Month FROM (t1.MAX_RECEIPT)) <4 THEN 1
              WHEN EXTRACT(Month FROM (t1.MAX_RECEIPT)) <7 THEN 2
              WHEN EXTRACT(Month FROM (t1.MAX_RECEIPT)) <10 THEN 3
              ELSE 4
              END AS quarter,
       t1.CASE_TYPE
FROM __temp__.DV_ORD_CASES AS t1;
"""
pydb.start_query_execution_and_wait(create_DV_ORD_CASES_FINAL_table);

#### DV_ORD_CASES_FINAL validation

In [None]:
#DV_ORD_CASES_FINAL_count = pydb.read_sql_query("SELECT count(*) as count from fcsq.DV_ORD_CASES_FINAL")
#DV_ORD_CASES_FINAL_count

# Stage 3 - Preparing the final output
<a name="Preparing_the_final_output"></a>

## 29. DV_APPS table - Prepares the applications data for CSV output <a name=DV_APPS></a>

### Create the DV_APPS table

In [None]:
create_DV_APPS_table =f"""
SELECT *, count(*) as Total FROM(
SELECT
  Year,
  Quarter,
  Case_Type,
  'Orders applied for' AS Type,
  CASE WHEN DESCRIPTION IN ('Exparte Non-Molestation', 'On Notice Non-Molestation')
    THEN 'Non-Molestation'
       WHEN DESCRIPTION IN ('Exparte Occupation', 'On Notice Occupation')
    THEN 'Occupation'
      ELSE 'Check'
   END AS Order_type,
  CASE WHEN DESCRIPTION IN ('Exparte Non-Molestation','Exparte Occupation')
     THEN 'Exparte'
       WHEN DESCRIPTION IN ('On Notice Non-Molestation','On Notice Occupation')
     THEN 'On Notice' /*Should this just be 'on'? as this is in the current csv value, capitalised for consistency*/
      ELSE 'Check'
   END AS  Exparte_or_On_Notice,
   'n/a' as Power_of_arrest
 FROM
   fcsq.DV_APPS_FINAL
 WHERE 
  YEAR > 2010)
 GROUP BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type
  ORDER BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type;
"""
pydb.create_temp_table(create_DV_APPS_table,'DV_APPS')
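
The two CASE expressions in this query split each application description into an order type and an ex parte or on notice flag, and the outer query then counts rows per combination. The sketch below reproduces that logic in plain Python. The names `split_description` and `count_applications` and the sample rows are hypothetical.

```python
from collections import Counter

# The four descriptions recognised by the query, mapped to (Order_type, Exparte_or_On_Notice)
DESCRIPTION_MAP = {
    "Exparte Non-Molestation": ("Non-Molestation", "Exparte"),
    "On Notice Non-Molestation": ("Non-Molestation", "On Notice"),
    "Exparte Occupation": ("Occupation", "Exparte"),
    "On Notice Occupation": ("Occupation", "On Notice"),
}


def split_description(description):
    """Return (order_type, notice); unrecognised descriptions become 'Check'."""
    return DESCRIPTION_MAP.get(description, ("Check", "Check"))


def count_applications(rows):
    """Count rows per (year, quarter, case_type, order_type, notice) where year > 2010."""
    totals = Counter()
    for year, quarter, case_type, description in rows:
        if year > 2010:
            totals[(year, quarter, case_type) + split_description(description)] += 1
    return totals


# Hypothetical rows, made up for illustration only
rows = [
    (2021, 1, "Domestic Violence", "Exparte Non-Molestation"),
    (2021, 1, "Domestic Violence", "Exparte Non-Molestation"),
    (2021, 1, "Domestic Violence", "On Notice Occupation"),
    (2009, 3, "Domestic Violence", "Exparte Occupation"),
]
totals = count_applications(rows)
for key, total in sorted(totals.items()):
    print(key, total)
```

Any 'Check' rows in the real output would indicate a description outside the four expected values and are worth investigating before publication.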

#### DV_APPS validation

In [None]:
#DV_APPS_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_APPS")
#DV_APPS_count

## 30. DV_APP_COUNT table - Adds a count of de-duplicated application events <a name=DV_APP_COUNT></a>

### Create the DV_APP_COUNT table

In [None]:
create_DV_APP_COUNT_table =f"""
SELECT *,COUNT(*) AS Total FROM
(SELECT  
Year, 
Quarter,
Case_Type,
'Application events' AS Type, 
'n/a' AS Order_type, 
'n/a' AS Exparte_or_On_Notice, 
'n/a' AS Power_of_Arrest
FROM fcsq.DV_APPLICATION_EVENTS
WHERE YEAR > 2010)
GROUP BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type
ORDER BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type;
"""
pydb.create_temp_table(create_DV_APP_COUNT_table,'DV_APP_COUNT')

#### DV_APP_COUNT validation

In [None]:
#DV_APP_COUNT_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_APP_COUNT")
#DV_APP_COUNT_count

## 31. DV_ORDERS table - Prepares the orders data for CSV output <a name=DV_ORDERS></a>

### Create the DV_ORDERS table

In [None]:
create_DV_ORDERS_table =f"""
SELECT *,count(*) as Total FROM(
SELECT
  Year,
  Quarter,
  Case_Type,
  'Orders made' AS Type,
  DESCRIPTION AS Order_type,
  Type AS Exparte_or_On_Notice,
  POA as Power_of_arrest  
 FROM
   fcsq.DV_Ords_Final
 WHERE 
  YEAR > 2010)
 GROUP BY 
  Year,
  Quarter,
  case_type,
  Type,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest
ORDER BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type;
  """
pydb.create_temp_table(create_DV_ORDERS_table,'DV_ORDERS')

#### DV_ORDERS validation

In [None]:
#DV_ORDERS_count = pydb.read_sql_query("SELECT * from __temp__.DV_ORDERS")
#DV_ORDERS_count

## 32. DV_CASE_STARTS table - Prepares case start data for final CSV output <a name=DV_CASE_STARTS></a>

### Create the DV_CASE_STARTS table

In [None]:
create_DV_CASE_STARTS_table =f"""
SELECT *, COUNT(*) AS Total FROM(
  SELECT
  Year,
  Quarter,
  Case_Type,
  'Cases started' AS Type,
  'n/a' AS Order_type,
  'n/a' AS Exparte_or_On_Notice,
  'n/a' AS Power_of_arrest
FROM
  fcsq.DV_APP_CASES_FINAL
WHERE
  YEAR > 2010 AND Case_Type = 'Domestic Violence')
GROUP BY
  YEAR,
  QUARTER,
  Case_Type,
  Type,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest
ORDER BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type;
"""
pydb.create_temp_table(create_DV_CASE_STARTS_table,'DV_CASE_STARTS')

#### DV_CASE_STARTS validation

In [None]:
#DV_CASE_STARTS_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_CASE_STARTS")
#DV_CASE_STARTS_count

## 33. DV_CASES_Closed table - Prepares case closed data for final CSV output <a name=DV_CASES_Closed></a>

### Create the DV_CASES_Closed table

In [None]:
create_DV_CASES_Closed_table =f"""
SELECT *, COUNT(*) AS Total FROM(
  SELECT
  Year,
  Quarter,
  Case_Type,
  'Cases concluded' AS Type,
  'n/a' AS Order_type,
  'n/a' AS Exparte_or_On_Notice,
  'n/a' AS Power_of_arrest
FROM
  fcsq.DV_ORD_CASES_FINAL
WHERE
  YEAR > 2010 AND CASE_TYPE = 'Domestic Violence')
GROUP BY
  YEAR,
  QUARTER,
  Case_Type,
  Type,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest
ORDER BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type;
"""
pydb.create_temp_table(create_DV_CASES_Closed_table,'DV_CASES_Closed')

#### DV_CASES_Closed validation

In [None]:
#DV_CASES_Closed_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_CASES_Closed")
#DV_CASES_Closed_count

## 34. DV_all_data table - Joins all data together <a name=DV_all_data></a>

### Create the DV_all_data table

In [None]:
create_DV_all_data_table =f"""
SELECT
  *
FROM
  __temp__.DV_APPS
UNION ALL
SELECT
  *
FROM 
  __temp__.DV_APP_COUNT
UNION ALL
SELECT
  *
FROM
  __temp__.DV_ORDERS
UNION ALL
SELECT
  *
FROM
  __temp__.DV_CASE_STARTS
UNION ALL
SELECT
  *
FROM
  __temp__.DV_CASES_Closed
"""
pydb.create_temp_table(create_DV_all_data_table,'DV_all_data')
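
Because every branch of the UNION ALL uses `SELECT *`, rows are combined by column position, so the five input tables need the same column order. The sketch below is one possible way to check this, assuming the column names of each table are available as lists (for example from `list(df.columns)` after reading a few rows). The helper `mismatched_columns` is hypothetical, and the column lists shown are transcribed from the queries above, not read from Athena.

```python
def mismatched_columns(columns_by_table):
    """Return the tables whose column order differs from the first table's (case-insensitive)."""
    names = list(columns_by_table)
    reference = [c.lower() for c in columns_by_table[names[0]]]
    return [
        name
        for name in names[1:]
        if [c.lower() for c in columns_by_table[name]] != reference
    ]


# Column order as written in the queries above (an assumption, not queried output)
expected = ["Year", "Quarter", "Case_Type", "Type", "Order_type",
            "Exparte_or_On_Notice", "Power_of_arrest", "Total"]
columns_by_table = {
    "DV_APPS": expected,
    "DV_APP_COUNT": expected,
    "DV_ORDERS": expected,
    "DV_CASE_STARTS": expected,
    "DV_CASES_Closed": expected,
}
print(mismatched_columns(columns_by_table))
```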

#### DV_all_data validation

In [None]:
#DV_all_data_count = pydb.read_sql_query("SELECT count(*) as count from __temp__.DV_all_data")
#DV_all_data_count

## 35. Import domestic violence 2008-2010 data 
<a name="import_domestic_violence_old_data"></a>

### Create the dv_2008_2010_data table

In [None]:
dv_2008_2010_data = pd.read_csv("s3://alpha-family-data/CSVs/Domestic_Violence/Required data/DV_2008_2010_data.csv", low_memory=False)

In [None]:
pydb.dataframe_to_temp_table(dv_2008_2010_data, "dv_2008_2010_data")

#### dv_2008_2010_data validation

In [None]:
#dv_2008_2010_data_count = pydb.read_sql_query("SELECT * from __temp__.dv_2008_2010_data")
#dv_2008_2010_data_count

## 36. DV_CSV table - Final CSV output to copy into data tab of Domestic Violence workbook <a name=DV_CSV></a>

### Drop the DV_CSV table if it already exists and remove its data from the S3 bucket

In [None]:
drop_DV_CSV = "DROP TABLE IF EXISTS fcsq.DV_CSV"
pydb.start_query_execution_and_wait(drop_DV_CSV)
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/DV_CSV").delete();

### Create the DV_CSV table

In [None]:
create_DV_CSV_temp_table =f"""
SELECT
year, 
quarter,
'n/a' as case_type,
type,
order_type,
CAST(exparte_or_on_notice AS varchar(9)) AS exparte_or_on_notice,
CAST(power_of_arrest AS varchar(6)) AS power_of_arrest,
total

FROM 
__temp__.dv_2008_2010_data

UNION 

SELECT *
FROM __temp__.DV_all_data
WHERE NOT (year = {latest_year} AND quarter = {latest_quarter});
"""
pydb.create_temp_table(create_DV_CSV_temp_table,'DV_CSV_temp')



create_DV_CSV_table =f"""
CREATE TABLE IF NOT EXISTS fcsq.DV_CSV
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/DV_CSV') AS
SELECT
  COALESCE(CAST(Year as varchar(20)), 'n/a') AS year,
  COALESCE(CAST(Quarter as varchar(20)), 'n/a') as quarter,
  COALESCE(CAST(Case_Type as varchar(20)), 'n/a') as case_Type,
  COALESCE(CAST(Type as varchar(20)), 'n/a') as type,
  COALESCE(CAST(Order_type as varchar(20)), 'n/a') as order_type,
  COALESCE(CAST(Exparte_or_On_Notice as varchar(20)), 'n/a') as exparte_or_on_notice,
  COALESCE(CAST(Power_of_arrest as varchar(20)), 'n/a') as power_of_arrest,
  COALESCE(CAST(Total as varchar(20)), 'n/a') as total
 FROM
   __temp__.DV_CSV_temp
 ORDER BY
  Type,
  Year,
  Quarter,
  Order_type,
  Exparte_or_On_Notice,
  Power_of_arrest,
  Case_Type;
"""
pydb.start_query_execution_and_wait(create_DV_CSV_table);
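
The WHERE clause in the first query removes only the rows for one specific quarter, identified by `latest_year` and `latest_quarter`, and keeps the other quarters of that year. A minimal Python sketch of the same filter follows. The function name `drop_latest_quarter` and the sample rows are hypothetical, and the values passed in stand for the variables defined earlier in the notebook.

```python
def drop_latest_quarter(rows, latest_year, latest_quarter):
    """Keep every row except those where both year and quarter match the latest period."""
    return [
        row
        for row in rows
        if not (row["year"] == latest_year and row["quarter"] == latest_quarter)
    ]


# Hypothetical rows, made up for illustration only
rows = [
    {"year": 2023, "quarter": 1, "total": 10},
    {"year": 2023, "quarter": 2, "total": 12},
    {"year": 2022, "quarter": 2, "total": 9},
]
kept = drop_latest_quarter(rows, latest_year=2023, latest_quarter=2)
print(kept)
```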

#### DV_CSV validation

In [None]:
#DV_CSV_count = pydb.read_sql_query("SELECT * from fcsq.DV_CSV")
#DV_CSV_count

In [None]:
df = pydb.read_sql_query("SELECT * from fcsq.DV_CSV ORDER BY Type, Year, Quarter, Order_type, Exparte_or_On_Notice, Power_of_arrest, Case_Type")
# Write the final output to the bulletin folder for the publication period
output_path = f"s3://alpha-family-data/CSVs/CSV_bulletin/{pub_year} Q{pub_quarter}/domestic_violence.csv"
df.to_csv(path_or_buf=output_path, index=False)