# Adoption DFJ

## Contents
#### Setup
1. [Import packages and options](#import_packages)
2. [Define key variables](#define_key_variables)

#### Stage 1 - [Disposals](#disposals)
3. [adoption_disposals_orders](#Adoption_Disposals_Orders) - filters out non-adoption records and counts the disposal orders
4. [adopt_disposal_case_count_A](#adopt_disposal_case_count_A) - finds the first disposal from each case number and court
5. [adopt_disposal_case_count_B](#adopt_disposal_case_count_B) - formats the adopt_disposal_case_count_A table by adding a quarter column
6. [adoption_disposals_case](#Adoption_Disposals_Case) - filters by year>2010 and produces the count of the first disposal from each case number and court
7. [adoption_disposals_merge](#adoption_disposals_merge) - joins both the adoption_disposals_orders and adoption_disposals_case tables
8. [adoption_disposals_format](#adoption_disposals_format) - formats adoption_disposals_merge table by ordering the columns and renaming column names

#### Stage 2 - [Applications](#applications)
9. [adoption_applications_ordera](#Adoption_Applications_Ordera) - filters by year>2010 and calculates the number of applications
10. [application_case_count_C](#Application_case_count_C) - finds the first application from each case number and court
11. [application_case_count_D](#Application_case_count_D) - formats the Application_case_count_C table by adding a quarter column
12. [adoption_applications_case](#Adoption_Applications_Case) - filters by year>2010 and produces the count of the first application from each case number and court
13. [adoption_applications_merge](#adoption_Applications_merge) - joins both the Adoption_Applications_Ordera and Adoption_Applications_Case tables
14. [adoption_applications_format](#adoption_Applications_format) - formats adoption_Applications_merge table by ordering the columns and renaming column names

#### Stage 3 - [Preparing the final output](#prepare_final_output)
15. [adopt_court_level_append](#adopt_court_level_append) - combines both adoption_Applications_format and adoption_Disposals_format tables
16. [court_lookup](#court_lookup) - creates a table with court information (e.g. court codes and region)
17. [court_level_merge](#court_level_merge) - joins both the adopt_court_level_append and court_lookup tables
18. [adopt_dfj](#adopt_dfj) - this query calculates the total number of counts and cases in each quarter and region to produce the final DFJ csv output

## 1. Import packages and set options 
<a name="import_packages"></a>

In [None]:
import pandas as pd  # for the data structures to store and manipulate tables
import pydbtools as pydb  # see https://github.com/moj-analytical-services/pydbtools
import boto3  # for working with AWS

# display options that make wide dataframes easier to view
pd.set_option("display.max_columns", 100)
pd.set_option("display.width", 900)
pd.set_option("display.max_colwidth", 200)

## 2. Define key variables to be used throughout the notebook 
<a name="define_key_variables"></a>

In [None]:
#this is the database we will be extracting from
database = "familyman_live_v4" 

#this extracts the latest snapshot from athena
get_snapshot_date = f"SELECT mojap_snapshot_date from {database}.events order by mojap_snapshot_date desc limit 1"
snapshot_date = str(pydb.read_sql_query(get_snapshot_date)['mojap_snapshot_date'].values[0])

#uncomment the line below to pin a specific snapshot (here 9 November 2022) instead of the latest one
#snapshot_date = '2022-11-09'

#this is the athena database we will be storing our tables in
fcsq_database = "fcsq"

#this is the s3 bucket we will be saving data to
s3 = boto3.resource("s3")
bucket = s3.Bucket("alpha-family-data")

#change these to the current (incomplete) quarter and year, not the quarter being published;
#the merge queries below exclude this quarter from the output
latest_quarter = 3
latest_year = 2023
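
Since `latest_quarter` and `latest_year` have to be edited by hand on each run, they could instead be derived from a date. The sketch below is an assumption rather than part of the original workflow; `current_quarter` is a hypothetical helper name, and the date passed in is only an example.

```python
from datetime import date
from typing import Tuple


def current_quarter(today: date) -> Tuple[int, int]:
    """Return (quarter, year) for the calendar quarter containing `today`."""
    quarter = (today.month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
    return quarter, today.year


# Hypothetical usage: derive the values instead of hard-coding them.
latest_quarter, latest_year = current_quarter(date(2023, 8, 15))
print(latest_quarter, latest_year)  # prints: 3 2023
```

In a real run, `date.today()` would replace the fixed example date.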

# Stage 1 - Disposals
<a name="disposals"></a>

## 3. Adoption_Disposals_Orders table - filters out non-adoption records and counts the disposal orders
<a name="Adoption_Disposals_Orders"></a>

### Drop the Adoption_Disposals_Orders table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Adoption_Disposals_Orders = f"""
DROP TABLE IF EXISTS fcsq.Adoption_Disposals_Orders;
"""
pydb.start_query_execution_and_wait(drop_Adoption_Disposals_Orders)

# clean up previous Adoption_Disposals_Orders files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/Adoption_Disposals_Orders/").delete();

### Create the Adoption_Disposals_Orders table in Athena

In [None]:
create_Adoption_Disposals_Orders = f"""
CREATE TABLE IF NOT EXISTS fcsq.Adoption_Disposals_Orders
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/Adoption_Disposals_Orders') AS
SELECT YEAR, QUARTER, Court, count(*) as Disposals
FROM __temp__.adopt_disp_details 
WHERE adoption = 'Adoption'
GROUP BY Year,Quarter,Court
ORDER BY Year,Quarter,Court;
"""

pydb.start_query_execution_and_wait(create_Adoption_Disposals_Orders);
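
The drop, clean-up and create steps above recur for every Athena table in this notebook. As a rough sketch of how the repeated strings could be generated in one place, the hypothetical helper below (`build_ctas_statements` is not part of pydbtools or the original notebook) only builds the text; it does not talk to Athena or S3. The default database, bucket and folder are taken from the cells above.

```python
def build_ctas_statements(table: str, select_sql: str,
                          database: str = "fcsq",
                          bucket: str = "alpha-family-data",
                          folder: str = "fcsq_processing/Adoption") -> dict:
    """Build the DROP statement, S3 prefix and CTAS statement for one table."""
    prefix = f"{folder}/{table}/"
    drop_sql = f"DROP TABLE IF EXISTS {database}.{table};"
    create_sql = (
        f"CREATE TABLE IF NOT EXISTS {database}.{table}\n"
        f"WITH (format = 'PARQUET', external_location = 's3://{bucket}/{prefix}') AS\n"
        f"{select_sql.strip().rstrip(';')};"
    )
    return {"drop": drop_sql, "prefix": prefix, "create": create_sql}


statements = build_ctas_statements(
    "Adoption_Disposals_Orders",
    "SELECT Year, Quarter, Court, count(*) AS Disposals "
    "FROM __temp__.adopt_disp_details WHERE adoption = 'Adoption' "
    "GROUP BY Year, Quarter, Court",
)
print(statements["drop"])
print(statements["prefix"])
```

The `drop` and `create` strings could then be passed to `pydb.start_query_execution_and_wait`, and `prefix` to `bucket.objects.filter(Prefix=...)`, in the same order as the cells above.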

#### Adoption_Disposals_Orders validation

In [None]:
#Adoption_Disposals_Orders_count = pydb.read_sql_query("select count(*) as count from fcsq.Adoption_Disposals_Orders")
#Adoption_Disposals_Orders_count

## 4. adopt_disposal_case_count_A table - finds the first disposal from each case number and court 
<a name="adopt_disposal_case_count_A"></a>

### Create the adopt_disposal_case_count_A temporary table

In [None]:
create_adopt_disposal_case_count_A = f"""
SELECT 
t1.CASE_NUMBER, 
Min(t1.Receipt_date) AS MinOfReceipt_date,
t1.Court 
FROM fcsq.adopt_disposals5_adoption as t1
GROUP BY t1.CASE_NUMBER, t1.Court;

"""

pydb.create_temp_table(create_adopt_disposal_case_count_A,'adopt_disposal_case_count_A');

#### adopt_disposal_case_count_A validation

In [None]:
#adopt_disposal_case_count_A_count = pydb.read_sql_query("select count(*) as count from __temp__.adopt_disposal_case_count_A")
#adopt_disposal_case_count_A_count

## 5. adopt_disposal_case_count_B table - formats the adopt_disposal_case_count_A table by adding a quarter column
<a name="adopt_disposal_case_count_B"></a>

### Create the adopt_disposal_case_count_B temporary table

In [None]:
create_adopt_disposal_case_count_B = f"""
SELECT 
t1.CASE_NUMBER, 
Year(t1.MinOfReceipt_date) AS YEAR,
CASE
    WHEN Month(t1.MinOfReceipt_date) < 4 THEN 1
    WHEN Month(t1.MinOfReceipt_date) < 7 THEN 2
    WHEN Month(t1.MinOfReceipt_date) < 10 THEN 3
    ELSE 4
END AS Quarter,
t1.MinOfReceipt_date,
t1.Court 
FROM __temp__.adopt_disposal_case_count_A  as t1;

"""

pydb.create_temp_table(create_adopt_disposal_case_count_B,'adopt_disposal_case_count_B')

#### adopt_disposal_case_count_B validation

In [None]:
#adopt_disposal_case_count_B_count = pydb.read_sql_query("select count(*) as count from __temp__.adopt_disposal_case_count_B")
#adopt_disposal_case_count_B_count

## 6. Adoption_Disposals_Case table - filters by year>2010 and produces the count of the first disposal from each case number and court
<a name="Adoption_Disposals_Case"></a>

### Drop the Adoption_Disposals_Case table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Adoption_Disposals_Case = f"""
DROP TABLE IF EXISTS fcsq.Adoption_Disposals_Case;
"""
pydb.start_query_execution_and_wait(drop_Adoption_Disposals_Case)

# clean up previous Adoption_Disposals_Case files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/Adoption_Disposals_Case/").delete();

### Create the Adoption_Disposals_Case table in Athena

In [None]:
create_Adoption_Disposals_Case = f"""
CREATE TABLE IF NOT EXISTS fcsq.Adoption_Disposals_Case
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/Adoption_Disposals_Case') AS
SELECT Year, Quarter, Court, count(*) as Case_End
FROM __temp__.adopt_disposal_case_count_B
WHERE year>2010
GROUP BY Year, Quarter, Court
ORDER BY Year, Quarter, Court;
"""

pydb.start_query_execution_and_wait(create_Adoption_Disposals_Case);
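
Steps 4 to 6 together take the earliest date per case and court, assign it to a year and quarter, and count the cases per year, quarter and court. The plain-Python sketch below mirrors that logic on a few made-up rows so the expected result can be checked by hand; the data and variable names are illustrative only and do not come from the source tables.

```python
from collections import Counter
from datetime import date

# Made-up rows: (case_number, court, receipt_date)
rows = [
    ("A1", "101", date(2019, 5, 2)),
    ("A1", "101", date(2019, 11, 20)),  # later event on the same case
    ("B2", "101", date(2019, 6, 30)),
    ("C3", "202", date(2009, 1, 15)),   # dropped by the year > 2010 filter
]

# Step 4: earliest date per (case_number, court)
first_date = {}
for case_number, court, receipt_date in rows:
    key = (case_number, court)
    if key not in first_date or receipt_date < first_date[key]:
        first_date[key] = receipt_date

# Steps 5 and 6: derive year and quarter, filter, then count cases
case_end = Counter()
for (case_number, court), d in first_date.items():
    quarter = (d.month - 1) // 3 + 1  # same result as the SQL CASE expression
    if d.year > 2010:
        case_end[(d.year, quarter, court)] += 1

print(dict(case_end))  # prints: {(2019, 2, '101'): 2}
```

Case A1 is counted once, in the quarter of its first date, even though it has a second event later in the year.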

#### Adoption_Disposals_Case validation

In [None]:
#Adoption_Disposals_Case_count = pydb.read_sql_query("select count(*) as count from fcsq.Adoption_Disposals_Case")
#Adoption_Disposals_Case_count

## 7. adoption_disposals_merge table - joins both the adoption_disposals_orders and adoption_disposals_case tables
<a name="adoption_disposals_merge"></a>

### Drop the adoption_disposals_merge table if it already exists and remove its data from the S3 bucket

In [None]:
drop_adoption_disposals_merge = f"""
DROP TABLE IF EXISTS fcsq.adoption_disposals_merge;
"""
pydb.start_query_execution_and_wait(drop_adoption_disposals_merge)

# clean up previous adoption_disposals_merge files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/adoption_disposals_merge/").delete();

### Create the adoption_disposals_merge table in Athena

In [None]:
create_adoption_disposals_merge = f"""
CREATE TABLE IF NOT EXISTS fcsq.adoption_disposals_merge
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/adoption_disposals_merge') AS
SELECT
COALESCE(t1.year, t2.year) AS year,
COALESCE(t1.quarter, t2.quarter) AS quarter,
COALESCE(t1.court, t2.court) AS court,
COALESCE(t1.Disposals, 0) AS Disposals,
COALESCE(t2.Case_End, 0) AS Case_End,
'Adoption' as Category,
'End' as Stage
FROM fcsq.adoption_disposals_orders t1 
FULL OUTER JOIN 
fcsq.adoption_disposals_case t2
ON t1.year = t2.year AND t1.quarter = t2.quarter AND t1.court = t2.court
WHERE NOT (COALESCE(t1.year, t2.year) = {latest_year} AND COALESCE(t1.quarter, t2.quarter) = {latest_quarter});
"""

pydb.start_query_execution_and_wait(create_adoption_disposals_merge);
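
A full outer join with zero-filling, followed by the exclusion of the current quarter, can be pictured with two dictionaries keyed by year, quarter and court. This is a minimal sketch on made-up counts, intended only to show which rows such a merge is expected to produce; it is not how Athena executes the query.

```python
# Made-up counts keyed by (year, quarter, court)
disposals = {(2023, 1, "101"): 5, (2023, 2, "101"): 4, (2023, 3, "101"): 1}
case_end = {(2023, 1, "101"): 3, (2023, 1, "202"): 2}

latest_year, latest_quarter = 2023, 3

merged = {}
for key in sorted(set(disposals) | set(case_end)):  # keys from either side
    year, quarter, court = key
    if year == latest_year and quarter == latest_quarter:
        continue  # drop the incomplete current quarter
    merged[key] = {
        "Disposals": disposals.get(key, 0),  # plays the role of COALESCE(..., 0)
        "Case_End": case_end.get(key, 0),
    }

for key, values in merged.items():
    print(key, values)
```

Court 202 appears with zero disposals and court 101 in quarter 2 with zero cases, while the quarter 3 row is left out.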

#### adoption_disposals_merge validation

In [None]:
#adoption_disposals_merge_count = pydb.read_sql_query("select * from fcsq.adoption_disposals_merge ORDER BY Year,Quarter,Court;")
#adoption_disposals_merge_count

## 8. adoption_disposals_format table - formats adoption_disposals_merge table by ordering the columns and renaming column names
<a name="adoption_disposals_format"></a>

### Create the adoption_disposals_format temporary table

In [None]:
create_adoption_disposals_format  = f"""
SELECT
Category,
Stage,
Year,
Quarter,
Court,
Disposals as Count,
Case_end as Cases
FROM
fcsq.adoption_disposals_merge;
"""

pydb.create_temp_table(create_adoption_disposals_format,'adoption_disposals_format');

#### adoption_disposals_format  validation

In [None]:
#adoption_disposals_format_count = pydb.read_sql_query("select count(*) as count from __temp__.adoption_disposals_format ")
#adoption_disposals_format_count

# Stage 2 - Applications
<a name="applications"></a>

## 9. Adoption_Applications_Ordera table - filters by year>2010 and calculates the number of applications 
<a name="Adoption_Applications_Ordera"></a>

### Drop the Adoption_Applications_Ordera table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Adoption_Applications_Ordera = f"""
DROP TABLE IF EXISTS fcsq.Adoption_Applications_Ordera;
"""
pydb.start_query_execution_and_wait(drop_Adoption_Applications_Ordera)

# clean up previous Adoption_Applications_Ordera files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/Adoption_Applications_Ordera/").delete();

### Create the Adoption_Applications_Ordera table in Athena

In [None]:
create_Adoption_Applications_Ordera = f"""
CREATE TABLE IF NOT EXISTS fcsq.Adoption_Applications_Ordera
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/Adoption_Applications_Ordera') AS
SELECT YEAR, QUARTER, Court, SUM(adoptions_total) as Apps
FROM fcsq.adopt_apps_6_adoptions_only
WHERE Year>2010
GROUP BY Year,Quarter,Court
ORDER BY Year,Quarter,Court;
"""

pydb.start_query_execution_and_wait(create_Adoption_Applications_Ordera);

#### Adoption_Applications_Ordera validation

In [None]:
#Adoption_Applications_Ordera_count = pydb.read_sql_query("select count(*) as count from fcsq.Adoption_Applications_Ordera")
#Adoption_Applications_Ordera_count

## 10. Application_case_count_C table - finds the first application from each case number and court
<a name="Application_case_count_C"></a>

### Create the Application_case_count_C temporary table

In [None]:
create_Application_case_count_C = f"""
SELECT 
t1.CASE_NUMBER, 
Min(t1.App_date) AS MinOfApp_date,
t1.Court
FROM fcsq.adopt_apps_6_adoptions_only as t1
GROUP BY t1.CASE_NUMBER, t1.Court;

"""

pydb.create_temp_table(create_Application_case_count_C,'Application_case_count_C');

#### Application_case_count_C validation

In [None]:
#Application_case_count_C_count = pydb.read_sql_query("select count(*) as count from __temp__.Application_case_count_C")
#Application_case_count_C_count

## 11. Application_case_count_D table - formats the Application_case_count_C table by adding a quarter column
<a name="Application_case_count_D"></a>

### Create the Application_case_count_D temporary table

In [None]:
create_Application_case_count_D = f"""
SELECT 
t1.CASE_NUMBER, 
Year(t1.MinOfApp_date) AS YEAR,
CASE
    WHEN Month(t1.MinOfApp_date) < 4 THEN 1
    WHEN Month(t1.MinOfApp_date) < 7 THEN 2
    WHEN Month(t1.MinOfApp_date) < 10 THEN 3
    ELSE 4
END AS Quarter,
t1.MinOfApp_date,
t1.Court
FROM __temp__.application_case_count_C  as t1;

"""

pydb.create_temp_table(create_Application_case_count_D,'Application_case_count_D');

#### Application_case_count_D validation

In [None]:
#Application_case_count_D_count = pydb.read_sql_query("select count(*) as count from __temp__.Application_case_count_D")
#Application_case_count_D_count

## 12. Adoption_Applications_Case table - filters by year>2010 and produces the count of the first application from each case number and court
<a name="Adoption_Applications_Case"></a>

### Drop the Adoption_Applications_Case table if it already exists and remove its data from the S3 bucket

In [None]:
drop_Adoption_Applications_Case = f"""
DROP TABLE IF EXISTS fcsq.Adoption_Applications_Case;
"""
pydb.start_query_execution_and_wait(drop_Adoption_Applications_Case)

# clean up previous Adoption_Applications_Case files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/Adoption_Applications_Case/").delete();

### Create the Adoption_Applications_Case table in Athena

In [None]:
create_Adoption_Applications_Case = f"""
CREATE TABLE IF NOT EXISTS fcsq.Adoption_Applications_Case
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/Adoption_Applications_Case') AS
SELECT Year, Quarter, Court, count(case_number) as Case_Count
FROM __temp__.Application_case_count_D
WHERE year>2010
GROUP BY Year,Quarter,Court
ORDER BY Year,Quarter,Court;
"""

pydb.start_query_execution_and_wait(create_Adoption_Applications_Case);

#### Adoption_Applications_Case validation

In [None]:
#Adoption_Applications_Case_count = pydb.read_sql_query("select count(*) as count from fcsq.Adoption_Applications_Case")
#Adoption_Applications_Case_count

## 13. adoption_Applications_merge table - joins both the Adoption_Applications_Ordera and Adoption_Applications_Case tables
<a name="adoption_Applications_merge"></a>

### Drop the adoption_Applications_merge table if it already exists and remove its data from the S3 bucket

In [None]:
drop_adoption_Applications_merge = f"""
DROP TABLE IF EXISTS fcsq.adoption_Applications_merge;
"""
pydb.start_query_execution_and_wait(drop_adoption_Applications_merge)

# clean up previous adoption_Applications_merge files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/adoption_Applications_merge/").delete();

### Create the adoption_Applications_merge table in Athena

In [None]:
create_adoption_Applications_merge = f"""
CREATE TABLE IF NOT EXISTS fcsq.adoption_Applications_merge
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/adoption_Applications_merge') AS
SELECT
COALESCE(t1.year, t2.year) AS year,
COALESCE(t1.quarter, t2.quarter) AS quarter,
COALESCE(t1.court, t2.court) AS court,
COALESCE(t1.apps, 0) AS apps,
COALESCE(t2.Case_Count, 0) AS Case_Count,
'Adoption' as Category,
'Start' as Stage
FROM fcsq.Adoption_Applications_Ordera t1 
FULL OUTER JOIN 
fcsq.Adoption_Applications_Case t2
ON t1.year = t2.year AND t1.quarter = t2.quarter AND t1.court = t2.court
WHERE NOT (COALESCE(t1.year, t2.year) = {latest_year} AND COALESCE(t1.quarter, t2.quarter) = {latest_quarter});
"""

pydb.start_query_execution_and_wait(create_adoption_Applications_merge);

#### adoption_Applications_merge validation

In [None]:
#adoption_Applications_merge_count = pydb.read_sql_query("select * from fcsq.adoption_Applications_merge ORDER BY Year,Quarter,Court;")
#adoption_Applications_merge_count

## 14. adoption_Applications_format table - formats adoption_Applications_merge table by ordering the columns and renaming column names
<a name="adoption_Applications_format"></a>

### Create the adoption_Applications_format temporary table

In [None]:
create_adoption_Applications_format  = f"""
SELECT
Category,
Stage,
Year,
Quarter,
Court,
apps as Count,
Case_Count as Cases
FROM
fcsq.adoption_Applications_merge;
"""

pydb.create_temp_table(create_adoption_Applications_format,'adoption_Applications_format');

#### adoption_Applications_format  validation

In [None]:
#adoption_Applications_format_count = pydb.read_sql_query("select * from __temp__.adoption_Applications_format ORDER BY Year,Quarter,Court")
#adoption_Applications_format_count

# Stage 3 - Preparing the final output
<a name="prepare_final_output"></a>

## 15. adopt_court_level_append table - combines both adoption_Applications_format and adoption_Disposals_format tables
<a name="adopt_court_level_append"></a>

### Drop the adopt_court_level_append table if it already exists and remove its data from the S3 bucket

In [None]:
drop_adopt_court_level_append = f"""
DROP TABLE IF EXISTS fcsq.adopt_court_level_append;
"""
pydb.start_query_execution_and_wait(drop_adopt_court_level_append)

# clean up previous adopt_court_level_append files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/adopt_court_level_append/").delete();

### Create the adopt_court_level_append table in Athena

In [None]:
create_adopt_court_level_append = f"""
CREATE TABLE IF NOT EXISTS fcsq.adopt_court_level_append 
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/adopt_court_level_append') AS
SELECT * FROM __temp__.adoption_Applications_format 
UNION 
SELECT * FROM __temp__.adoption_Disposals_format 
ORDER BY Year,Quarter,Court
"""

pydb.start_query_execution_and_wait(create_adopt_court_level_append);

#### adopt_court_level_append validation

In [None]:
#adopt_court_level_append_count = pydb.read_sql_query("select * from fcsq.adopt_court_level_append")
#adopt_court_level_append_count

## 16. court_lookup table - creates a table with court information (e.g. court codes and region)
<a name="court_lookup"></a>

### Create the court_lookup temporary table

In [None]:
create_court_lookup = f"""
SELECT 
code,
Region,
Region_Pre2014,
DFJ_New
FROM fcsq.court_mv_feb21_dfj;
"""

pydb.create_temp_table(create_court_lookup,'court_lookup');

#### court_lookup validation

In [None]:
#court_lookup_count = pydb.read_sql_query("select * from __temp__.court_lookup")
#court_lookup_count

## 17. court_level_merge table - joins both the adopt_court_level_append and court_lookup tables
<a name="court_level_merge"></a>

### Drop the court_level_merge table if it already exists and remove its data from the S3 bucket

In [None]:
drop_court_level_merge = f"""
DROP TABLE IF EXISTS fcsq.court_level_merge;
"""
pydb.start_query_execution_and_wait(drop_court_level_merge)

# clean up previous court_level_merge files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/court_level_merge/").delete();

### Create the court_level_merge table in Athena

In [None]:
create_court_level_merge = f"""
CREATE TABLE IF NOT EXISTS fcsq.court_level_merge
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/court_level_merge') AS
SELECT
t1.Category,
t1.Stage,
t1.Year,
t1.Quarter,
t1.Count,
t1.Cases,
t2.DFJ_New,
CASE WHEN t1.Year < 2014 THEN t2.Region_Pre2014
     ELSE t2.Region
END AS Final_Region
FROM 
fcsq.adopt_court_level_append t1
INNER JOIN
__temp__.court_lookup t2
ON CAST(t1.court as integer) = t2.code;
"""

pydb.start_query_execution_and_wait(create_court_level_merge);
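
The region assigned to each row depends on the year: rows before 2014 take `Region_Pre2014`, later rows take `Region`, and courts missing from the lookup are dropped by the inner join. The sketch below shows that rule on a single made-up lookup entry; the court code, region names and DFJ name are placeholders, and `final_region` is a hypothetical helper.

```python
# Made-up lookup keyed by court code; all values are placeholders
court_lookup = {
    101: {"Region": "Region A", "Region_Pre2014": "Old Region A", "DFJ_New": "Example DFJ"},
}


def final_region(court: str, year: int):
    """Return (DFJ_New, Final_Region), or None if the court is not in the lookup."""
    entry = court_lookup.get(int(court))  # CAST(t1.court AS integer) = t2.code
    if entry is None:
        return None  # inner join: unmatched courts are dropped
    region = entry["Region_Pre2014"] if year < 2014 else entry["Region"]
    return entry["DFJ_New"], region


print(final_region("101", 2013))  # pre-2014 region
print(final_region("101", 2014))  # current region
print(final_region("999", 2020))  # court not in the lookup
```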

#### court_level_merge validation

In [None]:
#court_level_merge_count = pydb.read_sql_query("select * from fcsq.court_level_merge where year=2020 and quarter=3 and dfj_new='Carlisle DFJ' order by year,quarter,dfj_new")
#court_level_merge_count

## 18. adopt_dfj table - this query calculates the total number of counts and cases in each quarter and region to produce the final DFJ csv output
<a name="adopt_dfj"></a>

### Drop the adopt_dfj table if it already exists and remove its data from the S3 bucket

In [None]:
drop_adopt_dfj = f"""
DROP TABLE IF EXISTS fcsq.adopt_dfj;
"""
pydb.start_query_execution_and_wait(drop_adopt_dfj)

# clean up previous adopt_dfj files
bucket.objects.filter(Prefix="fcsq_processing/Adoption/adopt_dfj/").delete();

### Create the adopt_dfj table in Athena

In [None]:
create_adopt_dfj = f"""
CREATE TABLE IF NOT EXISTS fcsq.adopt_dfj
WITH (format = 'PARQUET', external_location = 's3://alpha-family-data/fcsq_processing/Adoption/adopt_dfj/') AS
SELECT
Category,
Stage,
Year,
Quarter,
DFJ_New,
Final_Region as region,
SUM(count) as count,
SUM(cases) as cases
FROM fcsq.court_level_merge
group by
Category,
Stage,
Year,
Quarter,
DFJ_New,
Final_Region
order by
Category,
Stage,
Year,
Quarter,
DFJ_New,
Final_Region;
"""
pydb.start_query_execution_and_wait(create_adopt_dfj);
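
The final aggregation sums the court-level counts and cases within each combination of category, stage, year, quarter, DFJ and region. A minimal plain-Python sketch of that grouping, using made-up rows and placeholder names, could look like this:

```python
from collections import defaultdict

# Made-up court-level rows:
# (Category, Stage, Year, Quarter, DFJ_New, Final_Region, Count, Cases)
rows = [
    ("Adoption", "Start", 2022, 1, "Example DFJ", "Region A", 10, 8),
    ("Adoption", "Start", 2022, 1, "Example DFJ", "Region A", 5, 4),
    ("Adoption", "End", 2022, 1, "Example DFJ", "Region A", 7, 6),
]

# Running [count, cases] totals per group
totals = defaultdict(lambda: [0, 0])
for *group, count, cases in rows:
    totals[tuple(group)][0] += count
    totals[tuple(group)][1] += cases

for group in sorted(totals):
    print(group, totals[group])
```

The two "Start" rows, which stand for two courts in the same DFJ, collapse into a single row with a count of 15 and 12 cases.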

In [None]:
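# read the table back in a fixed order, since row order is not guaranteed without an ORDER BY
df = pydb.read_sql_query("select * from fcsq.adopt_dfj order by category, stage, year, quarter, dfj_new, region;")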
df.to_csv(path_or_buf = 's3://alpha-family-data/CSVs/DFJ/adopt_dfj.csv',index=False)

#### adopt_dfj validation

In [None]:
#adopt_dfj_count = pydb.read_sql_query("select * from fcsq.adopt_dfj ORDER BY Category,Year,Quarter,region,DFJ_New,Stage")
#adopt_dfj_count