# Domestic Violence DFJ

## Contents
#### Setup
1. [import_packages](#import_packages) 
2. [define_key_variables](#define_key_variables) 

#### Stage 1 - [Orders](#orders)
3. [dom_violence_orders](#Dom_violence_Orders) - filters by year>2010 and calculates the number of orders
4. [orders_case_count_A](#orders_case_count_A) - finds the last order from each case number and court
5. [orders_case_count_B](#orders_case_count_B) - keeps the last order for each case number and derives the year and quarter from its receipt date
6. [dom_violence_cases](#Dom_violence_Cases) - filters by year>2010 and produces the count of the last order from each case number and court
7. [dom_violence_merge](#Dom_violence_Merge) - joins both the Dom_violence_Orders and Dom_violence_Cases tables
8. [dom_violence_format](#Dom_violence_Format) - formats Dom_violence_Merge table by ordering the columns and renaming column names

#### Stage 2 - [Applications](#applications)
9. [dom_violence_apps](#Dom_violence_Apps) - filters by year>2010 and calculates the number of applications
10. [apps_case_count_C](#apps_case_count_C) - finds the first application from each case number and court
11. [apps_case_count_D](#apps_case_count_D) - keeps the first application for each case number and derives the year and quarter from its receipt date
12. [dom_violence_apps_cases](#Dom_violence_Apps_Cases) - filters by year>2010 and produces the count of the first application from each case number and court
13. [dom_violence_apps_merge](#Dom_violence_Apps_Merge) - joins both the Dom_violence_Apps and Dom_violence_Apps_Cases tables
14. [dom_violence_apps_format](#Dom_violence_Apps_Format) - formats Dom_violence_Apps_Merge table by ordering the columns and renaming column names

#### Stage 3 - [Preparing the final output](#prepare_final_output)
15. [dv_court_level_append](#dv_court_level_append) - combines both Dom_violence_Format and Dom_violence_Apps_Format tables
16. [court_lookup](#court_lookup) - creates a table with court information (e.g. court codes and region)
17. [court_level_merge](#court_level_merge) - joins both the dv_court_level_append and court_lookup tables
18. [domestic_violence_dfj](#domestic_violence_dfj) - this query calculates the total number of counts and cases in each quarter and region to produce the final DFJ csv output

## 1. Import packages and set options 
<a name="import_packages"></a>

In [None]:
import pandas as pd  # a module which provides the data structures and functions to store and manipulate tables in dataframes
import pydbtools as pydb  # A module which allows SQL queries to be run on the Analytical Platform from Python, see https://github.com/moj-analytical-services/pydbtools
import boto3  # allows you to directly create, update, and delete AWS resources from Python scripts

# display options that make wide dataframes easier to view
pd.set_option("display.max_columns", 100)
pd.set_option("display.width", 900)
pd.set_option("display.max_colwidth", 200)

## 2. Define key variables to be used throughout the notebook 
<a name="define_key_variables"></a>

In [None]:
#this is the database we will be extracting from
database = "familyman_live_v4" 

#this extracts the latest snapshot from athena
get_snapshot_date = f"SELECT mojap_snapshot_date from {database}.events order by mojap_snapshot_date desc limit 1"
snapshot_date = str(pydb.read_sql_query(get_snapshot_date)['mojap_snapshot_date'].values[0])

#uncomment the line below to pin the run to a specific snapshot (here the 9 November 2022 snapshot) instead of the latest one
#snapshot_date = '2022-11-09'

#this is the athena database we will be storing our tables in
fcsq_database = "fcsq"

#this is the s3 bucket we will be saving data to
s3 = boto3.resource("s3")
bucket = s3.Bucket("alpha-family-data")

#change these to the current quarter and year, not the quarter being published; rows for this quarter are excluded when the merge tables are built
latest_quarter = 3
latest_year = 2023

# Stage 1 - Orders
<a name="orders"></a>

## 3. Dom_violence_Orders table - filters by year>2010 and calculates the number of orders
<a name="Dom_violence_Orders"></a>

### Drop the Dom_violence_Orders table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Dom_violence_Orders = f"""
DROP TABLE IF EXISTS fcsq.Dom_violence_Orders;
"""
pydb.start_query_execution_and_wait(drop_Dom_violence_Orders)

# clean up previous Dom_violence_Orders files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Dom_violence_Orders/").delete();
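
The drop-and-clean-up step above is repeated for every Athena table in this notebook. As a minimal sketch, the two strings it needs could be built from the table name. `drop_statement` and `s3_prefix` are hypothetical helpers (not part of pydbtools or boto3), and the folder layout is assumed from the cells in this notebook.

```python
# Hypothetical helpers (not part of pydbtools or boto3): build the two strings
# that each "drop and clean up" cell in this notebook needs.


def drop_statement(table, database="fcsq"):
    """Return the DROP statement for an Athena table."""
    return f"DROP TABLE IF EXISTS {database}.{table};"


def s3_prefix(table, folder="fcsq_processing/Domestic_Violence"):
    """Return the S3 prefix holding the table's files (layout assumed from this notebook)."""
    return f"{folder}/{table}/"


print(drop_statement("Dom_violence_Orders"))
print(s3_prefix("Dom_violence_Orders"))
```

The returned strings could then be passed to `pydb.start_query_execution_and_wait(...)` and `bucket.objects.filter(Prefix=...).delete()` in the same way as the cell above.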

### Create the Dom_violence_Orders table in Athena

In [None]:
create_Dom_violence_Orders = f"""
CREATE TABLE IF NOT EXISTS fcsq.Dom_violence_Orders
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Dom_violence_Orders') AS
SELECT Year, Quarter, Event_court, count(*) as count
FROM fcsq.DV_ORDS_FINAL
WHERE Year > 2010
GROUP BY Year, Quarter, Event_court
ORDER BY Year, Quarter, Event_court;
"""

pydb.start_query_execution_and_wait(create_Dom_violence_Orders);
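
To make the aggregation easier to check by eye, here is a plain-Python sketch of what the query above computes. The rows are made up and only stand in for `fcsq.DV_ORDS_FINAL`; the real table has more columns.

```python
from collections import Counter

# Made-up rows standing in for fcsq.DV_ORDS_FINAL: (year, quarter, event_court)
rows = [
    (2010, 4, "101"),
    (2011, 1, "101"),
    (2011, 1, "101"),
    (2011, 1, "202"),
    (2012, 3, "202"),
]

# WHERE Year > 2010, then GROUP BY Year, Quarter, Event_court with count(*)
order_counts = Counter(row for row in rows if row[0] > 2010)

# ORDER BY Year, Quarter, Event_court
for (year, quarter, court), count in sorted(order_counts.items()):
    print(year, quarter, court, count)
```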

#### Dom_violence_Orders validation

In [None]:
#Dom_violence_Orders_count = pydb.read_sql_query("select count(*) as count from fcsq.Dom_violence_Orders")
#Dom_violence_Orders_count

## 4. orders_case_count_A table - finds the last order from each case number and court 
<a name="orders_case_count_A"></a>

### Create the orders_case_count_A temporary table

In [None]:
create_orders_case_count_A = f"""
SELECT *,
ROW_NUMBER() OVER(PARTITION BY CASE_NUMBER
ORDER BY CASE_NUMBER, RECEIPT_DATE DESC, EVENT DESC) AS SEQ_NUM
FROM fcsq.DV_ORDS_FINAL
"""
pydb.create_temp_table(create_orders_case_count_A,'orders_case_count_A');
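
The window function above numbers the orders within each case, latest first, so that `SEQ_NUM = 1` marks the last order. A minimal plain-Python sketch of the same idea follows; the case numbers, dates and event values are invented for illustration.

```python
from collections import defaultdict
from datetime import date

# Made-up orders standing in for fcsq.DV_ORDS_FINAL: (case_number, receipt_date, event)
orders = [
    ("AB12F00001", date(2021, 1, 5), 10),
    ("AB12F00001", date(2021, 3, 9), 11),
    ("AB12F00002", date(2020, 7, 1), 20),
    ("AB12F00001", date(2021, 3, 9), 12),
]

# PARTITION BY case_number
by_case = defaultdict(list)
for order in orders:
    by_case[order[0]].append(order)

# ORDER BY receipt_date DESC, event DESC, then number the rows from 1
numbered = []
for case_number, case_orders in by_case.items():
    ranked = sorted(case_orders, key=lambda o: (o[1], o[2]), reverse=True)
    for seq_num, order in enumerate(ranked, start=1):
        numbered.append(order + (seq_num,))

# SEQ_NUM = 1 marks the last order in each case
last_orders = [row for row in numbered if row[-1] == 1]
print(last_orders)
```

When two orders share a receipt date, the higher `event` value wins, which mirrors the `EVENT DESC` tie-break in the SQL.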

#### orders_case_count_A validation

In [None]:
#orders_case_count_A_count = pydb.read_sql_query("select count(*) as count from __temp__.orders_case_count_A")
#orders_case_count_A_count

## 5. orders_case_count_B table - keeps the last order for each case number and derives the year and quarter from its receipt date
<a name="orders_case_count_B"></a>

### Create the orders_case_count_B temporary table

In [None]:
create_orders_case_count_B = f"""
SELECT
t1.CASE_NUMBER, 
Year (t1.Receipt_Date) AS YEAR,
CASE WHEN Month(t1.Receipt_Date)<4
    THEN 1
        WHEN Month(t1.Receipt_Date)<7
        THEN 2
            WHEN Month(t1.Receipt_Date)<10
            THEN 3
ELSE 4
END AS Quarter,
t1.Receipt_Date,
t1.event_court

FROM __temp__.orders_case_count_A as t1

WHERE SEQ_NUM = 1

GROUP BY t1.CASE_NUMBER,
t1.Receipt_Date,
t1.event_court;

"""

pydb.create_temp_table(create_orders_case_count_B,'orders_case_count_B');
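
The `CASE` expression above maps the month of the receipt date to a calendar quarter. A small Python sketch of the same mapping is shown below; `quarter_of` is a hypothetical name used only for this illustration.

```python
from datetime import date


def quarter_of(receipt_date):
    """Mirror the SQL CASE expression: months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, else 4."""
    month = receipt_date.month
    if month < 4:
        return 1
    if month < 7:
        return 2
    if month < 10:
        return 3
    return 4


print(quarter_of(date(2023, 3, 31)))  # 1
print(quarter_of(date(2023, 10, 1)))  # 4
```

The same result can be obtained arithmetically as `(month - 1) // 3 + 1`.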

#### orders_case_count_B validation

In [None]:
#orders_case_count_B_count = pydb.read_sql_query("select count(*) as count from __temp__.orders_case_count_B")
#orders_case_count_B_count

## 6. Dom_violence_Cases table - filters by year>2010 and produces the count of the last order from each case number and court
<a name="Dom_violence_Cases"></a>

### Drop the Dom_violence_Cases table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Dom_violence_Cases = f"""
DROP TABLE IF EXISTS fcsq.Dom_violence_Cases;
"""
pydb.start_query_execution_and_wait(drop_Dom_violence_Cases)

# clean up previous Dom_violence_Cases files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Dom_violence_Cases/").delete();

### Create the Dom_violence_Cases table in Athena

In [None]:
create_Dom_violence_Cases = f"""
CREATE TABLE IF NOT EXISTS fcsq.Dom_violence_Cases
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Dom_violence_Cases') AS
SELECT Year, Quarter, Event_court, count(*) as cases
FROM __temp__.orders_case_count_B
WHERE Year > 2010 and substring(case_number,5,1) = 'F'
GROUP BY Year, Event_court, Quarter
ORDER BY Year, Event_court, Quarter;
"""

pydb.start_query_execution_and_wait(create_Dom_violence_Cases);
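
The `substring(case_number,5,1) = 'F'` filter keeps only case numbers whose fifth character is `F`. SQL string positions start at 1, so the equivalent Python index is 4. The sketch below uses invented case numbers, and `fifth_char_is_f` is a hypothetical helper name.

```python
def fifth_char_is_f(case_number):
    """Equivalent of substring(case_number, 5, 1) = 'F' (SQL positions start at 1)."""
    # Slicing rather than indexing avoids an IndexError on short strings
    return case_number[4:5] == "F"


# Made-up case numbers for illustration only
sample_case_numbers = ["AB12F00001", "AB12P00002", "ZZ9"]
kept = [c for c in sample_case_numbers if fifth_char_is_f(c)]
print(kept)  # ['AB12F00001']
```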

#### Dom_violence_Cases validation

In [None]:
#Dom_violence_Cases_count = pydb.read_sql_query("select count(*) as count from fcsq.Dom_violence_Cases")
#Dom_violence_Cases_count

## 7. Dom_violence_Merge table - joins both the Dom_violence_Orders and Dom_violence_Cases tables
<a name="Dom_violence_Merge"></a>

### Drop the Dom_violence_Merge table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Dom_violence_Merge = f"""
DROP TABLE IF EXISTS fcsq.Dom_violence_Merge;
"""
pydb.start_query_execution_and_wait(drop_Dom_violence_Merge)

# clean up previous Dom_violence_Merge files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Dom_violence_Merge/").delete();

### Create the Dom_violence_Merge table in Athena

In [None]:
create_Dom_violence_Merge = f"""
CREATE TABLE IF NOT EXISTS fcsq.Dom_violence_Merge 
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Dom_violence_Merge') AS
Select t1.Year,
t1.Quarter,
t1.Event_court,
COALESCE(t1.Count, 0) as Count,
COALESCE(t2.Cases, 0) as Cases,
'Domestic Violence' as Category,
'End' as Stage
from fcsq.Dom_violence_Orders t1
FULL OUTER JOIN
fcsq.Dom_violence_Cases t2 
on t1.Year = t2.Year AND t1.Quarter = t2.Quarter AND t1.Event_court = t2.Event_court
WHERE NOT (t1.year = {latest_year} AND t1.quarter = {latest_quarter})

"""

pydb.start_query_execution_and_wait(create_Dom_violence_Merge);
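
As a rough illustration of the merge logic, the sketch below joins two made-up aggregates on (year, quarter, court), fills missing values with 0 and drops the current quarter. The dictionaries and values are invented and do not come from the real tables.

```python
# Made-up aggregates keyed by (year, quarter, event_court)
order_counts = {(2023, 2, "101"): 5, (2023, 2, "202"): 3, (2023, 3, "101"): 4}
case_counts = {(2023, 2, "101"): 2, (2023, 3, "101"): 1}

latest_year, latest_quarter = 2023, 3

merged = []
# Full outer join: walk over every key that appears on either side
for key in sorted(set(order_counts) | set(case_counts)):
    year, quarter, court = key
    # WHERE NOT (year = latest_year AND quarter = latest_quarter)
    if (year, quarter) == (latest_year, latest_quarter):
        continue
    merged.append({
        "Year": year,
        "Quarter": quarter,
        "Event_court": court,
        "Count": order_counts.get(key, 0),  # COALESCE(t1.Count, 0)
        "Cases": case_counts.get(key, 0),   # COALESCE(t2.Cases, 0)
        "Category": "Domestic Violence",
        "Stage": "End",
    })

for row in merged:
    print(row)
```

One difference worth noting: the sketch takes the key from whichever side has it, whereas the SQL reads Year, Quarter and Event_court from `t1` only, so keys that exist only in `t2` behave differently there.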

#### Dom_violence_Merge validation

In [None]:
#Dom_violence_Merge_count = pydb.read_sql_query("SELECT COUNT(*) as Count from fcsq.Dom_violence_Merge")
#Dom_violence_Merge_count

## 8. Dom_violence_Format table - formats Dom_violence_Merge table by ordering the columns and renaming column names
<a name="Dom_violence_Format"></a>

### Create the Dom_violence_Format temporary table

In [None]:
create_Dom_violence_Format = f"""
SELECT Category, 
Year, 
Quarter, 
Event_court as Court, 
Stage, 
Count,
Cases
FROM fcsq.Dom_violence_Merge;
"""

pydb.create_temp_table(create_Dom_violence_Format,'Dom_violence_Format');

#### Dom_violence_Format validation

In [None]:
#Dom_violence_Format_count = pydb.read_sql_query("SELECT COUNT(*) as Count from __temp__.Dom_violence_Format")
#Dom_violence_Format_count

# Stage 2 - Applications
<a name="applications"></a>

## 9. Dom_violence_Apps table - filters by year>2010 and calculates the number of applications
<a name="Dom_violence_Apps"></a>

### Drop the Dom_violence_Apps table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Dom_violence_Apps = f"""
DROP TABLE IF EXISTS fcsq.Dom_violence_Apps;
"""
pydb.start_query_execution_and_wait(drop_Dom_violence_Apps)

# clean up previous Dom_violence_Apps files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Dom_violence_Apps/").delete();

### Create the Dom_violence_Apps table in Athena

In [None]:
create_Dom_violence_Apps = f"""
CREATE TABLE IF NOT EXISTS fcsq.Dom_violence_Apps
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Dom_violence_Apps') AS
SELECT distinct Year, Quarter, Event_court, count(*) as count
FROM (SELECT distinct Year, Quarter, Event_court, event
FROM fcsq.DV_APPS_FINAL
WHERE Year > 2010)
GROUP BY Year, Quarter, Event_court
ORDER BY Year, Quarter, Event_court;
"""

pydb.start_query_execution_and_wait(create_Dom_violence_Apps);

#### Dom_violence_Apps validation

In [None]:
#Dom_violence_Apps_count = pydb.read_sql_query("SELECT COUNT(*) as Count from fcsq.Dom_Violence_Apps")
#Dom_violence_Apps_count

## 10. apps_case_count_C table - finds the first application from each case number and court
<a name="apps_case_count_C"></a>

### Create the apps_case_count_C temporary table

In [None]:
create_apps_case_count_C = f"""
SELECT *,
ROW_NUMBER() OVER(PARTITION BY CASE_NUMBER
ORDER BY CASE_NUMBER, RECEIPT_DATE ASC, EVENT ASC) AS SEQ_NUM
FROM fcsq.DV_APPS_FINAL;
"""

pydb.create_temp_table(create_apps_case_count_C,'apps_case_count_C');

#### apps_case_count_C validation

In [None]:
#apps_case_count_C_count = pydb.read_sql_query("select count(*) as count from __temp__.apps_case_count_C")
#apps_case_count_C_count

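## 11. apps_case_count_D table - keeps the first application for each case number and derives the year and quarter from its receipt date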
<a name="apps_case_count_D"></a>

### Create the apps_case_count_D temporary table

In [None]:
create_apps_case_count_D = f"""
SELECT
t1.CASE_NUMBER, 
Year (t1.Receipt_Date) AS YEAR,
CASE WHEN Month(t1.Receipt_Date)<4
    THEN 1
        WHEN Month(t1.Receipt_Date)<7
        THEN 2
            WHEN Month(t1.Receipt_Date)<10
            THEN 3
ELSE 4
END AS Quarter,
t1.Receipt_Date,
t1.event_court

FROM __temp__.apps_case_count_C as t1

WHERE SEQ_NUM = 1

GROUP BY t1.CASE_NUMBER,
t1.Receipt_Date,
t1.event_court;

"""

pydb.create_temp_table(create_apps_case_count_D,'apps_case_count_D');

#### apps_case_count_D validation

In [None]:
#apps_case_count_D_count = pydb.read_sql_query("select count(*) as count from __temp__.apps_case_count_D")
#apps_case_count_D_count

## 12. Dom_violence_Apps_Cases table - filters by year>2010 and produces the count of the first application from each case number and court
<a name="Dom_violence_Apps_Cases"></a>

### Drop the Dom_violence_Apps_Cases table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Dom_violence_Apps_Cases = f"""
DROP TABLE IF EXISTS fcsq.Dom_violence_Apps_Cases;
"""
pydb.start_query_execution_and_wait(drop_Dom_violence_Apps_Cases)

# clean up previous Dom_violence_Apps_Cases files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Dom_violence_Apps_Cases/").delete();

### Create the Dom_violence_Apps_Cases table in Athena

In [None]:
create_Dom_violence_Apps_Cases = f"""
CREATE TABLE IF NOT EXISTS fcsq.Dom_violence_Apps_Cases
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Dom_violence_Apps_Cases') AS
SELECT Year, Quarter, Event_court, count(*) as cases
FROM __temp__.apps_case_count_D
WHERE Year > 2010 and substring(case_number,5,1) = 'F'
GROUP BY Year, Event_court, Quarter
ORDER BY Year, Event_court, Quarter;
"""

pydb.start_query_execution_and_wait(create_Dom_violence_Apps_Cases);

#### Dom_violence_Apps_Cases validation

In [None]:
#Dom_violence_Apps_Cases_count = pydb.read_sql_query("select count(*) as count from fcsq.Dom_violence_Apps_Cases")
#Dom_violence_Apps_Cases_count

## 13. Dom_violence_Apps_Merge table - joins both the Dom_violence_Apps and Dom_violence_Apps_Cases tables
<a name="Dom_violence_Apps_Merge"></a>

### Drop the Dom_violence_Apps_Merge table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Dom_violence_Apps_Merge = f"""
DROP TABLE IF EXISTS fcsq.Dom_violence_Apps_Merge;
"""
pydb.start_query_execution_and_wait(drop_Dom_violence_Apps_Merge)

# clean up previous Dom_violence_Apps_Merge files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/Dom_violence_Apps_Merge/").delete();

### Create the Dom_violence_Apps_Merge table in Athena

In [None]:
create_Dom_violence_Apps_Merge = f"""
CREATE TABLE IF NOT EXISTS fcsq.Dom_violence_Apps_Merge
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/Dom_violence_Apps_Merge') AS
Select t1.Year,
t1.Quarter,
t1.Event_court,
COALESCE(t1.Count, 0) as Count,
COALESCE(t2.Cases, 0) as Cases,
'Domestic Violence' as Category,
'Start' as Stage
from fcsq.Dom_violence_Apps t1
FULL OUTER JOIN
fcsq.Dom_violence_Apps_Cases t2 
on t1.Year = t2.Year AND t1.Quarter = t2.Quarter AND t1.Event_court = t2.Event_court
WHERE NOT (t1.year = {latest_year} AND t1.quarter = {latest_quarter})

"""

pydb.start_query_execution_and_wait(create_Dom_violence_Apps_Merge);

#### Dom_violence_Apps_Merge validation

In [None]:
#Dom_violence_Apps_Merge_count = pydb.read_sql_query("SELECT COUNT(*) as Count from fcsq.Dom_violence_Apps_Merge")
#Dom_violence_Apps_Merge_count

## 14. Dom_violence_Apps_Format table - formats Dom_Violence_Apps_Merge table by ordering the columns and renaming column names
<a name="Dom_violence_Apps_Format"></a>

### Create the Dom_violence_Apps_Format temporary table

In [None]:
create_Dom_violence_Apps_Format = f"""
SELECT Category, 
Year, 
Quarter, 
Event_court as Court, 
Stage, 
Count,
Cases
FROM fcsq.Dom_Violence_Apps_Merge;
"""

pydb.create_temp_table(create_Dom_violence_Apps_Format,'Dom_violence_Apps_Format');

#### Dom_violence_Apps_Format validation

In [None]:
#Dom_violence_Apps_Format_count = pydb.read_sql_query("SELECT COUNT(*) as Count from __temp__.Dom_violence_Apps_Format")
#Dom_violence_Apps_Format_count

# Stage 3 - Preparing the final output
<a name="prepare_final_output"></a>

## 15. dv_court_level_append table - combines both Dom_violence_Format and Dom_violence_Apps_Format tables
<a name="dv_court_level_append"></a>

### Drop the dv_court_level_append table if it already exists and remove its data from the S3 bucket

In [None]:
drop_dv_court_level_append = f"""
DROP TABLE IF EXISTS fcsq.dv_court_level_append;
"""
pydb.start_query_execution_and_wait(drop_dv_court_level_append)

# clean up previous dv_court_level_append files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/dv_court_level_append/").delete();

### Create the dv_court_level_append table in Athena

In [None]:
create_dv_court_level_append = f"""
CREATE TABLE IF NOT EXISTS fcsq.dv_court_level_append 
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/dv_court_level_append') AS
SELECT * FROM __temp__.Dom_Violence_Apps_Format 
UNION 
SELECT * FROM __temp__.Dom_Violence_Format 
ORDER BY Year,Quarter,Court
"""

pydb.start_query_execution_and_wait(create_dv_court_level_append);
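
`UNION` (as opposed to `UNION ALL`) removes exact duplicate rows. Because the two inputs carry different `Stage` values ('Start' and 'End'), a row from one table can never be an exact duplicate of a row from the other. The plain-Python sketch below illustrates this with made-up rows.

```python
# Made-up rows: (category, year, quarter, court, stage, count, cases)
apps_rows = [
    ("Domestic Violence", 2023, 2, "101", "Start", 7, 6),
    ("Domestic Violence", 2023, 1, "101", "Start", 4, 4),
]
orders_rows = [
    ("Domestic Violence", 2023, 2, "101", "End", 5, 2),
]

# UNION: combine both inputs and drop exact duplicates,
# then ORDER BY Year, Quarter, Court
appended = sorted(
    set(apps_rows) | set(orders_rows),
    key=lambda r: (r[1], r[2], r[3]),
)

for row in appended:
    print(row)
```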

#### dv_court_level_append validation

In [None]:
#dv_court_level_append_count = pydb.read_sql_query("select * from fcsq.dv_court_level_append")
#dv_court_level_append_count

## 16. court_lookup table - creates a table with court information (e.g. court codes and region)
<a name="court_lookup"></a>

### Create the court_lookup temporary table

In [None]:
create_court_lookup = f"""
SELECT 
code,
Region,
Region_Pre2014,
DFJ_New
FROM fcsq.court_mv_feb21_dfj;
"""

pydb.create_temp_table(create_court_lookup,'court_lookup');

#### court_lookup validation

In [None]:
#court_lookup_count = pydb.read_sql_query("select * from __temp__.court_lookup")
#court_lookup_count

## 17. court_level_merge table - joins both the dv_court_level_append and court_lookup tables
<a name="court_level_merge"></a>

### Drop the court_level_merge table if it already exists and remove its data from the S3 bucket

In [None]:
drop_court_level_merge = f"""
DROP TABLE IF EXISTS fcsq.court_level_merge;
"""
pydb.start_query_execution_and_wait(drop_court_level_merge)

# clean up previous court_level_merge files
bucket.objects.filter(Prefix="fcsq_processing/Domestic_Violence/court_level_merge/").delete();

### Create the court_level_merge table in Athena

In [None]:
create_court_level_merge = f"""
CREATE TABLE IF NOT EXISTS fcsq.court_level_merge
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Domestic_Violence/court_level_merge') AS
SELECT
t1.Category,
t1.Stage,
t1.Year,
t1.Quarter,
t1.Count,
t1.Cases,
t2.DFJ_New,
Case when Year < 2014 then t2.Region_Pre2014
Else t2.Region
End As Final_Region
FROM 
fcsq.dv_court_level_append t1
INNER JOIN
__temp__.court_lookup t2
ON CAST(t1.court as integer) = t2.code
where CAST(t1.court as integer) in (SELECT code from __temp__.court_lookup);
"""

pydb.start_query_execution_and_wait(create_court_level_merge);
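
The join above attaches the DFJ area to each court and picks the region according to the year: `Region_Pre2014` for years before 2014, `Region` otherwise. A minimal sketch of that rule follows. The lookup contents are placeholders ("Region A" and "Region B" are not real values) and `final_region` is a hypothetical helper.

```python
# Placeholder lookup standing in for __temp__.court_lookup, keyed by integer court code
court_lookup = {
    101: {"Region": "Region A", "Region_Pre2014": "Region B", "DFJ_New": "Carlisle DFJ"},
}


def final_region(year, court, lookup):
    """Return (DFJ_New, Final_Region), or None when the court has no lookup entry."""
    entry = lookup.get(int(court))  # CAST(t1.court AS integer) = t2.code
    if entry is None:
        return None                 # INNER JOIN drops courts missing from the lookup
    region = entry["Region_Pre2014"] if year < 2014 else entry["Region"]
    return entry["DFJ_New"], region


print(final_region(2013, "101", court_lookup))
print(final_region(2014, "101", court_lookup))
print(final_region(2020, "999", court_lookup))
```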

#### court_level_merge validation

In [None]:
#court_level_merge_count = pydb.read_sql_query("select * from fcsq.court_level_merge where year=2020 and quarter=3 and dfj_new='Carlisle DFJ' order by year,quarter,dfj_new")
#court_level_merge_count

## 18. domestic_violence_dfj table - this query calculates the total number of counts and cases in each quarter and region to produce the final DFJ csv output
<a name="domestic_violence_dfj"></a>

### Drop the domestic_violence_dfj table if it already exists and remove its data from the S3 bucket

In [None]:
drop_domestic_violence_dfj = f"""
DROP TABLE IF EXISTS fcsq.domestic_violence_dfj;
"""
pydb.start_query_execution_and_wait(drop_domestic_violence_dfj)

# clean up previous domestic_violence_dfj files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/domestic_violence_dfj/").delete();

### Create the domestic_violence_dfj table in Athena

In [None]:
create_domestic_violence_dfj = f"""
CREATE TABLE IF NOT EXISTS fcsq.domestic_violence_dfj
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/domestic_violence_dfj/') AS
SELECT
Category,
Stage,
Year,
Quarter,
DFJ_New,
Final_Region as region,
SUM(count) as count,
SUM(cases) as cases
FROM fcsq.court_level_merge
group by
Category,
Stage,
Year,
Quarter,
DFJ_New,
Final_Region
order by
Category,
Stage,
Year,
Quarter,
DFJ_New,
Final_Region;
"""
pydb.start_query_execution_and_wait(create_domestic_violence_dfj);
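
The final query sums the court-level counts and cases up to DFJ level. The sketch below shows the same roll-up in plain Python on made-up rows; the region label is a placeholder and the numbers are invented.

```python
from collections import defaultdict

# Made-up rows standing in for fcsq.court_level_merge:
# (category, stage, year, quarter, dfj_new, final_region, count, cases)
rows = [
    ("Domestic Violence", "End", 2023, 2, "Carlisle DFJ", "Region A", 5, 2),
    ("Domestic Violence", "End", 2023, 2, "Carlisle DFJ", "Region A", 3, 0),
    ("Domestic Violence", "Start", 2023, 2, "Carlisle DFJ", "Region A", 7, 6),
]

# GROUP BY every column except count and cases, then SUM(count) and SUM(cases)
totals = defaultdict(lambda: [0, 0])
for *key, count, cases in rows:
    totals[tuple(key)][0] += count
    totals[tuple(key)][1] += cases

# ORDER BY Category, Stage, Year, Quarter, DFJ_New, Final_Region
for key in sorted(totals):
    print(key, totals[key])
```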

In [None]:
df = pydb.read_sql_query("select * from fcsq.domestic_violence_dfj;")
df.to_csv(path_or_buf = 's3://alpha-family-data/CSVs/DFJ/domestic_violence_dfj.csv',index=False)

#### domestic_violence_dfj validation

In [None]:
#domestic_violence_dfj_count = pydb.read_sql_query("select * from fcsq.domestic_violence_dfj ORDER BY Category,Year,Quarter,region,DFJ_New,Stage")
#domestic_violence_dfj_count