# PLD
Output a demographics table for the PLD Dashboard.

**Script**
* [scripts/dash_pld/demographics.ipynb](./scripts/dash_pld/demographics.ipynb)

**Parameters**
* None

**Input**
* coh_pt
* pld_demo
* pld_geo_medical
* pld_med_ins_pivot

**Output**
* dash_pld_demographics

In [1]:
#Import libraries for this notebook
import pandas as pd  
from drg_connect import Snowflake
import numpy as np
import pickle
import qgrid
from workbook_writer import make_xlsx
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#Load connection variables to connect_dict
with open('../../out/conn/connect_dict.pickle', 'rb') as handle:
    connect_dict = pickle.load(handle)

#Create engine to connect to Snowflake
snow = Snowflake(role=connect_dict['role'],
                 warehouse=connect_dict['warehouse'],
                 database=connect_dict['database'],
                 schema=connect_dict['schema'])

#Finish engine setup
engine = snow.engine
%load_ext sql_magic
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications

# Patient Groups
Create a variable that splits the patient population however the client wants. This code is meant to be customized per project.

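In this example, the population is split by whether a patient had a surgery or radiation procedure. Surgery is checked first, so a patient with both is assigned to 'Surgery'.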
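
As a rough illustration of the grouping logic, the sketch below reproduces the order of the checks in plain Python. The patient ids, the procedure rows, and the function name `assign_patient_group` are made up for the example; only the category labels come from the query in this section.

```python
# Illustrative only: hypothetical (patient_id, cat1) procedure rows and cohort ids.
procedures = [
    (1, "Surgery"),
    (1, "Radiation"),
    (2, "Radiation"),
    (3, "Imaging"),
]
cohort = [1, 2, 3, 4]


def assign_patient_group(patient_id, procedures):
    """Return the first matching group, checking Surgery before Radiation."""
    categories = {cat for pid, cat in procedures if pid == patient_id}
    if "Surgery" in categories:
        return "Surgery"
    if "Radiation" in categories:
        return "Radiation"
    return "None"


# One group per cohort patient, including patients with no procedures at all.
pt_grp = {pid: assign_patient_group(pid, procedures) for pid in cohort}
print(pt_grp)
```

Because the checks run in a fixed order, each patient lands in exactly one group, which keeps the group counts additive.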

In [3]:
%%read_sql
--Identify all patients who had a surgery OR radiation
DROP TABLE IF EXISTS dash_pld_pt_grp;
CREATE TABLE dash_pld_pt_grp AS 
    SELECT patient_id,
           CASE WHEN patient_id IN (SELECT patient_id
                                      FROM cld_proc
                                     WHERE cat1 = 'Surgery') THEN 'Surgery'
                WHEN patient_id IN (SELECT patient_id
                                      FROM cld_proc
                                     WHERE cat1 = 'Radiation') THEN 'Radiation'                     
                ELSE 'None'
           END AS pt_grp
      FROM coh_pt pt;

--Review Counts
SELECT pt_grp,
       Count(Distinct patient_id) AS pt_cnt,
       Count(*) / (SELECT Count(*)
                     FROM dash_pld_pt_grp) AS pt_pct
  FROM dash_pld_pt_grp
 GROUP BY pt_grp;

Query started at 04:25:25 PM Eastern Daylight Time; Query executed in 0.19 mQuery started at 04:25:37 PM Eastern Daylight Time; Query executed in 0.17 mQuery started at 04:25:47 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,pt_grp,pt_cnt,pt_pct
0,Radiation,967,0.005189
1,,184727,0.991338
2,Surgery,647,0.003472


# Continuous Coverage
Continuous coverage rules are set in this section. The back end allows different continuous coverage rules to be applied to different populations.

## Import Ref
Import reference table for patient stability

### Tabs
Import the tabs and the stability rule that is associated with each tab

In [4]:
#Upload reference table from excel to snowflake and review snowflake output
df = pd.read_excel('../../in/dash/pld.xlsx', sheet_name='stability', skiprows=4, dtype=str)

#Strip white space and uppercase the string columns so they can be referenced consistently
df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
df = df.apply(lambda x: x.str.upper() if x.dtype == "object" else x)

#Upload to snowflake
snow.drop_table("dash_pld_stability_tabs")
snow.upload_dataframe(df,"dash_pld_stability_tabs")
snow.select("SELECT * FROM dash_pld_stability_tabs LIMIT 10")
del df

DROP TABLE IF EXISTS ref_db.semi_custom.dash_pld_stability_tabs;
Table ref_db.semi_custom.dash_pld_stability_tabs dropped! (╯°□°）╯︵ ┻━┻
Upload into ref_db.semi_custom.dash_pld_stability_tabs successful! ┬──┬◡ﾉ(°-°ﾉ)


Unnamed: 0,tab,rule
0,DEMOGRAPHICS,MEDICAL_2018
1,COMORBIDITY,COMORBIDITIES
2,PHARMACY,PHARMACY_ONLY
3,VISITS,MEDICAL_2018
4,SPECIALTY,MEDICAL_2018
5,DRUG_UTILIZATION,PHARMACY_ONLY


### Stability Rules
Upload the stability rules. There can be multiple different stability rules to upload.
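
To make the relationship between tabs and rules concrete, here is a minimal sketch in plain Python. The values are a subset of the reference rows shown in this notebook; the names `tab_rules`, `rule_rows`, and `requirements_for_tab` are assumptions made for the example and are not part of the pipeline.

```python
# Subset of the reference data: each tab points at one rule name.
tab_rules = {"DEMOGRAPHICS": "MEDICAL_2018", "PHARMACY": "PHARMACY_ONLY"}

# Each rule can expand to several (source, period, min_claim_cnt) requirements.
rule_rows = [
    ("PHARMACY_ONLY", "PHARMACY", "2017", 1),
    ("PHARMACY_ONLY", "PHARMACY", "2018_Q4", 1),
    ("MEDICAL_2018", "MEDICAL", "2018", 1),
]


def requirements_for_tab(tab):
    """Return every (source, period, min_claim_cnt) requirement for a tab."""
    rule = tab_rules[tab]
    return [
        (source, period, min_cnt)
        for rule_name, source, period, min_cnt in rule_rows
        if rule_name == rule
    ]


print(requirements_for_tab("PHARMACY"))
print(requirements_for_tab("DEMOGRAPHICS"))
```

A tab whose rule has several rows has several requirements, one per source and period.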

In [5]:
#Upload reference table from excel to snowflake and review snowflake output
df = pd.read_excel('../../in/dash/pld.xlsx', sheet_name='stability_rules', skiprows=4, dtype=str)

#Strip white space and uppercase the string columns so they can be referenced consistently
df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
df = df.apply(lambda x: x.str.upper() if x.dtype == "object" else x)

#Upload to snowflake
snow.drop_table("dash_pld_stability_rules")
snow.upload_dataframe(df,"dash_pld_stability_rules")
snow.select("SELECT * FROM dash_pld_stability_rules LIMIT 10")
del df

DROP TABLE IF EXISTS ref_db.semi_custom.dash_pld_stability_rules;
Table ref_db.semi_custom.dash_pld_stability_rules dropped! (╯°□°）╯︵ ┻━┻
Upload into ref_db.semi_custom.dash_pld_stability_rules successful! ┬──┬◡ﾉ(°-°ﾉ)


Unnamed: 0,rule,source,period,min_claim_cnt
0,PHARMACY_ONLY,PHARMACY,2017,1
1,PHARMACY_ONLY,PHARMACY,2018_Q4,1
2,MEDICAL_2018,MEDICAL,2018,1
3,COMORBIDITIES,MEDICAL,2016,1
4,COMORBIDITIES,MEDICAL,2017,1
5,COMORBIDITIES,MEDICAL,2018,1
6,MEDICAL_PHARMACY,PHARMACY,2017,1
7,MEDICAL_PHARMACY,PHARMACY,2018,1
8,MEDICAL_PHARMACY,PHARMACY,2018_Q4,1
9,MEDICAL_PHARMACY,MEDICAL,2016,1


## Dash Ref

In [6]:
%%read_sql
--Combine rules tables together
DROP TABLE IF EXISTS dash_pld_stability_ref;
CREATE TABLE dash_pld_stability_ref AS
    SELECT tab.tab,
           tab.rule,
           rule.source,
           rule.period,
           rule.min_claim_cnt
      FROM dash_pld_stability_tabs tab
           JOIN dash_pld_stability_rules rule
             ON tab.rule = rule.rule;

--Review table
SELECT *
  FROM dash_pld_stability_ref
 ORDER BY tab, rule;

Query started at 04:26:17 PM Eastern Daylight Time; Query executed in 0.06 mQuery started at 04:26:21 PM Eastern Daylight Time; Query executed in 0.07 mQuery started at 04:26:25 PM Eastern Daylight Time; Query executed in 0.07 m

Unnamed: 0,tab,rule,source,period,min_claim_cnt
0,COMORBIDITY,COMORBIDITIES,MEDICAL,2016,1
1,COMORBIDITY,COMORBIDITIES,MEDICAL,2017,1
2,COMORBIDITY,COMORBIDITIES,MEDICAL,2018,1
3,DEMOGRAPHICS,MEDICAL_2018,MEDICAL,2018,1
4,DRUG_UTILIZATION,PHARMACY_ONLY,PHARMACY,2017,1
5,DRUG_UTILIZATION,PHARMACY_ONLY,PHARMACY,2018_Q4,1
6,PHARMACY,PHARMACY_ONLY,PHARMACY,2017,1
7,PHARMACY,PHARMACY_ONLY,PHARMACY,2018_Q4,1
8,SPECIALTY,MEDICAL_2018,MEDICAL,2018,1
9,VISITS,MEDICAL_2018,MEDICAL,2018,1


In [7]:
%%read_sql
--Create a table with all the possible unique combinations of patient_id and tabs
DROP TABLE IF EXISTS dash_pld_stability_cnts;
CREATE TABLE dash_pld_stability_cnts AS
    SELECT coh.patient_id,
           ref.tab,
           ref.rule,
           ref.source,
           ref.period,
           ref.min_claim_cnt,
           False AS meets_cov_rule
      FROM coh_pt coh
           CROSS JOIN dash_pld_stability_ref ref;

Query started at 04:26:29 PM Eastern Daylight Time; Query executed in 0.10 mQuery started at 04:26:35 PM Eastern Daylight Time; Query executed in 0.19 m

Unnamed: 0,status
0,Table DASH_PLD_STABILITY_CNTS successfully cre...


## Med and Phar claims
Flag each patient, tab, and period row where the medical or pharmacy claim count meets the minimum threshold (min_claim_cnt).
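
The flagging step can be sketched in plain Python as follows. This is an illustration under assumed data: the claim counts, the row dictionaries, and the function name `flag_rows` are hypothetical, and the real work is done by the SQL updates in this section.

```python
# Hypothetical medical claim counts keyed by (patient_id, period).
med_claim_cnt = {(1, "2018"): 3, (2, "2018"): 0}

# One row per patient, source and period; every row starts as False.
stability_rows = [
    {"patient_id": 1, "source": "MEDICAL", "period": "2018",
     "min_claim_cnt": 1, "meets_cov_rule": False},
    {"patient_id": 2, "source": "MEDICAL", "period": "2018",
     "min_claim_cnt": 1, "meets_cov_rule": False},
    {"patient_id": 3, "source": "MEDICAL", "period": "2018",
     "min_claim_cnt": 1, "meets_cov_rule": False},  # no claim row at all
]


def flag_rows(rows, claim_cnt, source):
    """Set meets_cov_rule to True where the claim count reaches the minimum."""
    for row in rows:
        if row["source"] != source:
            continue
        cnt = claim_cnt.get((row["patient_id"], row["period"]))
        if cnt is not None and cnt >= row["min_claim_cnt"]:
            row["meets_cov_rule"] = True
    return rows


flag_rows(stability_rows, med_claim_cnt, "MEDICAL")
print([row["meets_cov_rule"] for row in stability_rows])
```

Rows with no matching claim count stay False, which mirrors how an update driven by a join leaves unmatched rows untouched.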

In [8]:
%%read_sql
--Update medical claim counts
BEGIN;
UPDATE dash_pld_stability_cnts stab
   SET stab.meets_cov_rule = TRUE
  FROM pld_med_cnt_unpivot cnt
 WHERE cnt.patient_id = stab.patient_id
       AND stab.period = cnt.period
       AND cnt.med_claim_cnt >= stab.min_claim_cnt
       AND stab.source = 'MEDICAL';
COMMIT;

--Review Counts
SELECT source,
       tab,
       period,
       Count(*) AS row_cnt,
       Sum(case when meets_cov_rule = True  THEN 1 ELSE 0 END) AS True,
       Sum(case when meets_cov_rule = False THEN 1 ELSE 0 END) AS False
  FROM dash_pld_stability_cnts
 GROUP BY source, tab, period
 ORDER BY source, tab, period;

Query started at 04:26:47 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:26:49 PM Eastern Daylight Time; Query executed in 0.18 mQuery started at 04:27:00 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:27:02 PM Eastern Daylight Time; Query executed in 0.06 m

Unnamed: 0,source,tab,period,row_cnt,true,false
0,MEDICAL,COMORBIDITY,2016,186341,154186,32155
1,MEDICAL,COMORBIDITY,2017,186341,150807,35534
2,MEDICAL,COMORBIDITY,2018,186341,144663,41678
3,MEDICAL,DEMOGRAPHICS,2018,186341,144663,41678
4,MEDICAL,SPECIALTY,2018,186341,144663,41678
5,MEDICAL,VISITS,2018,186341,144663,41678
6,PHARMACY,DRUG_UTILIZATION,2017,186341,0,186341
7,PHARMACY,DRUG_UTILIZATION,2018_Q4,186341,0,186341
8,PHARMACY,PHARMACY,2017,186341,0,186341
9,PHARMACY,PHARMACY,2018_Q4,186341,0,186341


In [9]:
%%read_sql
--Update Pharmacy Claim Counts
BEGIN;
UPDATE dash_pld_stability_cnts stab
   SET stab.meets_cov_rule = TRUE
  FROM pld_phar_cnt_unpivot phar
 WHERE stab.patient_id = phar.patient_id
       AND stab.period = phar.period
       AND phar.phar_claim_cnt >= stab.min_claim_cnt
       AND stab.source = 'PHARMACY';
COMMIT;

--Review Counts with pharmacy added
SELECT source,
       tab,
       period,
       Count(*) AS row_cnt,
       Sum(case when meets_cov_rule = True  THEN 1 ELSE 0 END) AS True,
       Sum(case when meets_cov_rule = False THEN 1 ELSE 0 END) AS False
  FROM dash_pld_stability_cnts
 GROUP BY source, tab, period
 ORDER BY source, tab, period;

Query started at 04:27:06 PM Eastern Daylight Time; Query executed in 0.06 mQuery started at 04:27:09 PM Eastern Daylight Time; Query executed in 0.19 mQuery started at 04:27:21 PM Eastern Daylight Time; Query executed in 0.04 mQuery started at 04:27:23 PM Eastern Daylight Time; Query executed in 0.07 m

Unnamed: 0,source,tab,period,row_cnt,true,false
0,MEDICAL,COMORBIDITY,2016,186341,154186,32155
1,MEDICAL,COMORBIDITY,2017,186341,150807,35534
2,MEDICAL,COMORBIDITY,2018,186341,144663,41678
3,MEDICAL,DEMOGRAPHICS,2018,186341,144663,41678
4,MEDICAL,SPECIALTY,2018,186341,144663,41678
5,MEDICAL,VISITS,2018,186341,144663,41678
6,PHARMACY,DRUG_UTILIZATION,2017,186341,80699,105642
7,PHARMACY,DRUG_UTILIZATION,2018_Q4,186341,65584,120757
8,PHARMACY,PHARMACY,2017,186341,80699,105642
9,PHARMACY,PHARMACY,2018_Q4,186341,65584,120757


## Final table
Reduce the counts to the final table: a patient is kept for a tab only if every period required by that tab's rule meets the coverage rule.
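
A small sketch of that reduction, assuming hypothetical rows of `(patient_id, tab, period, meets_cov_rule)`. The helper name `stable_patient_tabs` is invented for the example; the notebook itself does this with a delete against a temporary table.

```python
from collections import defaultdict

# Hypothetical flagged rows: (patient_id, tab, period, meets_cov_rule).
stability_cnts = [
    (1, "COMORBIDITY", "2016", True),
    (1, "COMORBIDITY", "2017", True),
    (2, "COMORBIDITY", "2016", True),
    (2, "COMORBIDITY", "2017", False),
    (2, "DEMOGRAPHICS", "2018", True),
]


def stable_patient_tabs(rows):
    """Keep a (patient_id, tab) pair only when all of its period rows pass."""
    flags = defaultdict(list)
    for patient_id, tab, _period, meets in rows:
        flags[(patient_id, tab)].append(meets)
    return sorted(key for key, values in flags.items() if all(values))


stable = stable_patient_tabs(stability_cnts)
print(stable)
```

Patient 2 fails one COMORBIDITY period and is dropped from that tab, but still qualifies for DEMOGRAPHICS.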

In [10]:
%%read_sql
--Identify patient and tab combinations that fail at least one coverage rule
CREATE OR REPLACE TEMP TABLE tmp_remove_patients AS
    SELECT patient_id,
           tab
      FROM dash_pld_stability_cnts
     WHERE meets_cov_rule = False
    GROUP BY patient_id, tab;

CREATE OR REPLACE TABLE dash_pld_stability AS
    SELECT patient_id,
           tab
      FROM dash_pld_stability_cnts
     GROUP BY patient_id, tab;

BEGIN;
DELETE 
  FROM dash_pld_stability stab
 USING tmp_remove_patients del
 WHERE stab.patient_id = del.patient_id
       AND stab.tab = del.tab;
COMMIT;

Query started at 04:27:28 PM Eastern Daylight Time; Query executed in 0.08 mQuery started at 04:27:33 PM Eastern Daylight Time; Query executed in 0.07 mQuery started at 04:27:37 PM Eastern Daylight Time; Query executed in 0.11 mQuery started at 04:27:43 PM Eastern Daylight Time; Query executed in 0.10 mQuery started at 04:27:49 PM Eastern Daylight Time; Query executed in 0.05 m

Unnamed: 0,status
0,Statement executed successfully.


In [12]:
%%read_sql
SELECT tab,
       Count(*) AS row_cnt, 
       Count(Distinct patient_id) AS pt_cnt
  FROM dash_pld_stability
 GROUP BY tab 

Query started at 04:27:54 PM Eastern Daylight Time; Query executed in 0.08 m

Unnamed: 0,tab,row_cnt,pt_cnt
0,SPECIALTY,144663,144663
1,DEMOGRAPHICS,144663,144663
2,PHARMACY,49096,49096
3,DRUG_UTILIZATION,49096,49096
4,COMORBIDITY,116846,116846
5,VISITS,144663,144663


# Denominators
Count patients by tab, age_bucket, gender, insurance and pt_grp so that later queries can use these counts as denominators.

## Patient Level
Create a denominator at the patient level for easier aggregations in the future

In [13]:
%%read_sql
DROP TABLE IF EXISTS dash_pld_den;
CREATE TRANSIENT TABLE dash_pld_den AS    
    SELECT coh.patient_id,
           demo.age_bucket,
           demo.gender,
           ins.insurance,
           stab.tab,
           grp.pt_grp
      FROM coh_pt coh
           INNER JOIN dash_pld_stability stab
                   ON coh.patient_id = stab.patient_id 
            LEFT JOIN pld_demo demo
                   ON coh.patient_id = demo.patient_id
            LEFT JOIN pld_med_ins_pivot ins
                   ON coh.patient_id = ins.patient_id
            LEFT JOIN dash_pld_pt_grp grp
                   ON coh.patient_id = grp.patient_id;

Query started at 04:27:59 PM Eastern Daylight Time; Query executed in 0.05 mQuery started at 04:28:02 PM Eastern Daylight Time; Query executed in 0.21 m

Unnamed: 0,status
0,Table DASH_PLD_DEN successfully created.


In [14]:
%%read_sql
--Review counts: expect one row per patient per tab, so row_cnt should match the total row_cnt in dash_pld_stability
SELECT Count(*) AS row_cnt,
       Count(Distinct patient_id) AS pt_cnt
  FROM dash_pld_den

Query started at 04:28:15 PM Eastern Daylight Time; Query executed in 0.05 m

Unnamed: 0,row_cnt,pt_cnt
0,649027,150942


## Aggregation
Denominator aggregated at the key levels for future use
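
As an illustration of the aggregation, the sketch below counts distinct patients per group in plain Python. The rows and the function name `aggregate_denominator` are hypothetical; the column order follows the grouping columns used in this section.

```python
from collections import defaultdict

# Hypothetical patient-level rows:
# (patient_id, tab, age_bucket, gender, insurance, pt_grp)
den_rows = [
    (1, "VISITS", "35-54", "F", "Commercial", "Surgery"),
    (1, "VISITS", "35-54", "F", "Commercial", "Surgery"),  # duplicate row
    (2, "VISITS", "35-54", "F", "Commercial", "Surgery"),
    (2, "PHARMACY", "35-54", "F", "Commercial", "Surgery"),
]


def aggregate_denominator(rows):
    """Count distinct patients for each (tab, age_bucket, gender, insurance, pt_grp)."""
    patients = defaultdict(set)
    for patient_id, *group_key in rows:
        patients[tuple(group_key)].add(patient_id)
    return {key: len(ids) for key, ids in patients.items()}


den_grp = aggregate_denominator(den_rows)
print(den_grp)
```

Collecting ids in a set plays the role of a distinct count, so a duplicated patient row does not inflate the denominator.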

In [15]:
%%read_sql
DROP TABLE IF EXISTS dash_pld_den_grp;
CREATE TRANSIENT TABLE dash_pld_den_grp AS    
    SELECT tab,
           age_bucket,
           gender,
           insurance,
           pt_grp,
           Count(Distinct patient_id) AS den
      FROM dash_pld_den 
     GROUP BY tab,
              age_bucket,
              gender,
              insurance,
              pt_grp;

Query started at 04:28:17 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:28:19 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table DASH_PLD_DEN_GRP successfully created.


# Tables
All PLD tables needed for the dashboard

## Demographics
Pull together data for the demographics tab

In [16]:
%%read_sql
DROP TABLE IF EXISTS dash_pld_demographics;
CREATE TRANSIENT TABLE dash_pld_demographics AS
    SELECT coh.age_bucket,
           coh.gender,
           coh.insurance,
           coh.pt_grp,
           geo.state_abbr AS state,
           st.interstate_regions AS region,
           Count(Distinct coh.patient_id) AS pt_cnt
      FROM dash_pld_den coh
           LEFT JOIN pld_geo_medical geo
                  ON geo.patient_id = coh.patient_id
           LEFT JOIN ref_db.analytics.state_to_region st
                  ON st.state = geo.state_abbr
     WHERE coh.patient_id IN (SELECT patient_id
                                FROM dash_pld_den
                               WHERE tab = 'DEMOGRAPHICS')
     GROUP BY coh.age_bucket,
              coh.gender,
              coh.insurance,
              coh.pt_grp,
              geo.state_abbr,
              st.interstate_regions

Query started at 04:28:25 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:28:27 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table DASH_PLD_DEMOGRAPHICS successfully created.


In [17]:
%%read_sql
CREATE OR REPLACE VIEW reportdb.reportviews.demographics_pld_vw AS
     SELECT *
       FROM ref_db.semi_custom.dash_pld_demographics;
       
GRANT SELECT ON view reportdb.reportviews.demographics_pld_vw TO ROLE tableau_restricted_role;
GRANT SELECT ON view reportdb.reportviews.demographics_pld_vw TO ROLE rwd_analytics_rw;

Query started at 04:28:32 PM Eastern Daylight Time; Query executed in 0.04 mQuery started at 04:28:35 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:28:36 PM Eastern Daylight Time; Query executed in 0.05 m

Unnamed: 0,status
0,Statement executed successfully.


## Comorbidities
Create a comorbidity table for the dashboard

In [18]:
%%read_sql
DROP TABLE IF EXISTS tmp_comorbid_cnts;
CREATE TEMP TABLE tmp_comorbid_cnts AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           com.comorbidity AS comorbidity,
           Count(Distinct den.patient_id) AS pt_cnt
      FROM dash_pld_den den
           LEFT JOIN cld_elix com
                  ON den.patient_id = com.patient_id
     WHERE den.patient_id IN (SELECT patient_id
                                FROM dash_pld_den
                               WHERE tab = 'COMORBIDITY')
     GROUP BY den.age_bucket,
              den.gender,
              den.insurance,
              den.pt_grp,
              comorbidity;

Query started at 04:28:39 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:28:41 PM Eastern Daylight Time; Query executed in 0.11 m

Unnamed: 0,status
0,Table TMP_COMORBID_CNTS successfully created.


In [19]:
%%read_sql
DROP TABLE IF EXISTS pld_dash_comorbidities;
CREATE TRANSIENT TABLE pld_dash_comorbidities AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           num.comorbidity,
           CASE WHEN num.pt_cnt IS NULL THEN 0 ELSE num.pt_cnt END AS num,
           den.den,
           CASE WHEN num.pt_cnt IS NULL THEN 0 ELSE num.pt_cnt END / den.den AS pct
      FROM dash_pld_den_grp den
           LEFT JOIN tmp_comorbid_cnts num
                  ON den.age_bucket = num.age_bucket
                     AND den.gender = num.gender
                     AND den.insurance = num.insurance
                     AND den.pt_grp = num.pt_grp
     WHERE num.comorbidity IS NOT NULL
           AND den.tab = 'COMORBIDITY';
    

Query started at 04:28:48 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:28:50 PM Eastern Daylight Time; Query executed in 0.08 m

Unnamed: 0,status
0,Table PLD_DASH_COMORBIDITIES successfully crea...
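
The numerator and denominator join in the cell above can be sketched in plain Python. The counts and comorbidity names here are invented, and `comorbidity_rates` is a hypothetical helper; a stratum stands for one combination of age_bucket, gender, insurance and pt_grp.

```python
# Hypothetical denominators per stratum (age_bucket, gender, insurance, pt_grp).
denominators = {
    ("35-54", "F", "Commercial", "Surgery"): 40,
    ("65_PLUS", "M", "Medicare Advantage", "None"): 10,
}

# Hypothetical numerators per (stratum, comorbidity).
numerators = {
    (("35-54", "F", "Commercial", "Surgery"), "Hypertension"): 10,
    (("35-54", "F", "Commercial", "Surgery"), "Diabetes"): 4,
}


def comorbidity_rates(numerators, denominators):
    """Return one row per stratum and comorbidity with num, den and pct."""
    rows = []
    for (stratum, comorbidity), num in sorted(numerators.items()):
        den = denominators.get(stratum)
        if not den:
            continue  # no matching denominator, so no rate is reported
        rows.append({"stratum": stratum, "comorbidity": comorbidity,
                     "num": num, "den": den, "pct": num / den})
    return rows


rates = comorbidity_rates(numerators, denominators)
for row in rates:
    print(row["comorbidity"], row["num"], row["den"], row["pct"])
```

Only strata that have at least one numerator row produce output, which matches the effect of filtering on a non-null comorbidity after the left join.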


In [20]:
%%read_sql
CREATE OR REPLACE VIEW reportdb.reportviews.comorbidity_pld_vw AS
     SELECT *
       FROM ref_db.semi_custom.pld_dash_comorbidities;
       
GRANT SELECT ON view reportdb.reportviews.comorbidity_pld_vw TO ROLE tableau_restricted_role;
GRANT SELECT ON view reportdb.reportviews.comorbidity_pld_vw TO ROLE rwd_analytics_rw;

Query started at 04:28:54 PM Eastern Daylight Time; Query executed in 0.04 mQuery started at 04:28:57 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:28:58 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.


## Place of Service
Identify the counts of visits to places of service for each patient

### Patient Level
A sample of patient level data 

In [22]:
%%read_sql
DROP TABLE IF EXISTS dash_pld_pos_pt;
CREATE TEMP TABLE dash_pld_pos_pt AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           pos.pos_group,
           pos.patient_id,
           Count(Distinct pos.year_of_service) AS pt_svc_cnt
      FROM dash_pld_den den
           LEFT JOIN cld_med_pos pos
                  ON pos.patient_id = den.patient_id
     WHERE pos.year_of_service BETWEEN '2017-01-01' and '2017-12-31'
           AND pos.patient_id IN (SELECT patient_id
                                    FROM dash_pld_den
                                   WHERE tab = 'VISITS')
           AND pos.pos_group IS NOT NULL
           AND den.age_bucket IS NOT NULL
           AND den.gender IS NOT NULL
           AND den.pt_grp IS NOT NULL
     GROUP BY den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           pos.pos_group,
           pos.patient_id;

Query started at 04:29:00 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:29:02 PM Eastern Daylight Time; Query executed in 0.11 m

Unnamed: 0,status
0,Table DASH_PLD_POS_PT successfully created.


In [56]:
%%read_sql
CREATE OR REPLACE TEMP TABLE ref_db.semi_custom.tmp_pt_cnt AS
    SELECT DISTINCT patient_id
      FROM ref_db.semi_custom.dash_pld_pos_pt;

CREATE OR REPLACE TEMP TABLE ref_db.semi_custom.tmp_pt_cnt_sample AS
    SELECT patient_id
      FROM ref_db.semi_custom.tmp_pt_cnt
     SAMPLE (20000 rows);

CREATE OR REPLACE TABLE ref_db.semi_custom.dash_pld_pos_pt_sample AS
    SELECT *
      FROM ref_db.semi_custom.dash_pld_pos_pt
     WHERE patient_id IN (SELECT patient_id
                            FROM ref_db.semi_custom.tmp_pt_cnt_sample);

CREATE OR REPLACE VIEW reportdb.reportviews.pos_pt_vw AS
    SELECT *
      FROM ref_db.semi_custom.dash_pld_pos_pt_sample;

GRANT ALL ON reportdb.reportviews.pos_pt_vw TO ROLE rwd_analytics_rw;
GRANT ALL ON reportdb.reportviews.pos_pt_vw TO ROLE tableau_restricted_role;



Query started at 04:39:55 PM Eastern Daylight Time; Query executed in 0.06 mQuery started at 04:39:58 PM Eastern Daylight Time; Query executed in 0.05 mQuery started at 04:40:01 PM Eastern Daylight Time; Query executed in 0.07 mQuery started at 04:40:05 PM Eastern Daylight Time; Query executed in 0.04 mQuery started at 04:40:08 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:40:09 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.


In [55]:
%%read_sql
SELECT Count(*)
  FROM ref_db.semi_custom.tmp_pt_cnt_sample;

Query started at 04:37:47 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,COUNT(*)
0,20000


In [53]:
%%read_sql
SELECT *
  FROM reportdb.reportviews.pos_pt_vw
 limit 10;

Query started at 04:37:17 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,age_bucket,gender,insurance,pt_grp,pos_group,patient_id,pt_svc_cnt
0,35-54,M,Commercial,Radiation,Other,156754094,17
1,35-54,F,Commercial,,Outpatient Hospital,99917492,7
2,65_PLUS,M,Medicare Advantage,,Office,211887507,5
3,65_PLUS,M,Medicare Advantage,,Outpatient Hospital,74659638,6
4,35-54,F,Medicaid,,Outpatient Hospital,21731026,8
5,65_PLUS,M,Medicare Advantage,,Other,20201447,11
6,65_PLUS,M,Medicare Advantage,,Home Health,70447680,5
7,65_PLUS,F,Medicare Advantage,,Inpatient Hospital,208282158,15
8,35-54,M,Commercial,,Office,404410613,7
9,35-54,F,Commercial,,Office,243359782,19
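
The sampling cells above pick patients first and then keep every row for the picked patients. A minimal Python sketch of the same idea follows; the rows, the `seed` parameter, and the function name `sample_patient_rows` are assumptions for illustration, and capping the sample at the number of available patients is a choice made in this sketch.

```python
import random


def sample_patient_rows(rows, sample_size, seed=None):
    """Sample distinct patients, then keep all rows for the sampled patients."""
    patient_ids = sorted({row["patient_id"] for row in rows})
    rng = random.Random(seed)
    # Never ask for more patients than exist.
    k = min(sample_size, len(patient_ids))
    keep = set(rng.sample(patient_ids, k))
    return [row for row in rows if row["patient_id"] in keep]


# Hypothetical data: five patients with two place-of-service rows each.
rows = [
    {"patient_id": pid, "pos_group": grp}
    for pid in range(1, 6)
    for grp in ("Office", "Other")
]
sampled = sample_patient_rows(rows, sample_size=3, seed=0)
print(len({row["patient_id"] for row in sampled}), len(sampled))
```

Sampling at the patient level rather than the row level keeps each sampled patient's place-of-service rows together.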


### Aggregation
Aggregated data to speed up processing time

In [24]:
%%read_sql
DROP TABLE IF EXISTS tmp_dash_pos;
CREATE TEMP TABLE tmp_dash_pos AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           pos.pos_group,
           pos.patient_id,
           Count(Distinct pos.patient_id, pos.year_of_service) AS pt_svc_cnt
      FROM dash_pld_den den
           LEFT JOIN cld_med_pos pos
                  ON pos.patient_id = den.patient_id
     WHERE pos.year_of_service BETWEEN '2017-01-01' and '2017-12-31'
           AND pos.pos_group IS NOT NULL
           AND pos.patient_id IN (SELECT patient_id
                                    FROM dash_pld_den
                                   WHERE tab = 'VISITS')
     GROUP BY den.age_bucket,
              den.gender,
              den.insurance,
              den.pt_grp,
              pos.pos_group,
              pos.patient_id;

Query started at 04:29:21 PM Eastern Daylight Time; Query executed in 0.02 mQuery started at 04:29:22 PM Eastern Daylight Time; Query executed in 0.14 m

Unnamed: 0,status
0,Table TMP_DASH_POS successfully created.


In [25]:
%%read_sql  
DROP TABLE IF EXISTS dash_pld_pos;
CREATE TRANSIENT TABLE dash_pld_pos AS
    SELECT num.age_bucket,
           num.gender,
           num.insurance,
           num.pt_grp,
           num.pos_group,
           Count(Distinct num.patient_id) AS pt_cnt,
           den.den
      FROM tmp_dash_pos num
           JOIN dash_pld_den_grp den
             ON den.age_bucket = num.age_bucket
                AND den.gender = num.gender
                AND den.insurance = num.insurance 
                AND den.pt_grp = num.pt_grp
     WHERE den.tab = 'VISITS'
     GROUP BY num.age_bucket,
              num.gender,
              num.insurance,
              num.pt_grp,
              num.pos_group,
              den.den

Query started at 04:29:31 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:29:32 PM Eastern Daylight Time; Query executed in 0.05 m

Unnamed: 0,status
0,Table DASH_PLD_POS successfully created.


In [26]:
%%read_sql

CREATE OR REPLACE VIEW reportdb.reportviews.pos_pld_vw AS
     SELECT *
       FROM ref_db.semi_custom.dash_pld_pos;

GRANT SELECT ON reportdb.reportviews.pos_pld_vw TO ROLE rwd_analytics_rw;
GRANT SELECT ON reportdb.reportviews.pos_pld_vw TO ROLE tableau_restricted_role;


Query started at 04:29:35 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:29:37 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:29:39 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.


## RX Groups
Summarization of the RX groups.

In [27]:
%%read_sql

DROP TABLE IF EXISTS tmp_pharmacy_cnts;
CREATE TEMP TABLE tmp_pharmacy_cnts AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           rx.class AS product_name,
           Count(Distinct den.patient_id) AS pt_cnt
      FROM dash_pld_den den
           LEFT JOIN cld_phar_grp rx
                  ON rx.patient_id = den.patient_id
     WHERE den.patient_id IN (SELECT patient_id
                                FROM dash_pld_stability
                               WHERE tab = 'PHARMACY')
     GROUP BY den.age_bucket,
              den.gender,
              den.insurance,
              den.pt_grp,
              rx.class;

Query started at 04:29:41 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:29:43 PM Eastern Daylight Time; Query executed in 0.06 m

Unnamed: 0,status
0,Table TMP_PHARMACY_CNTS successfully created.


In [28]:
%%read_sql
DROP TABLE IF EXISTS pld_dash_pharmacy_cnt;
CREATE TRANSIENT TABLE pld_dash_pharmacy_cnt AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           num.product_name,
           CASE WHEN num.pt_cnt IS NULL THEN 0 ELSE num.pt_cnt END AS num,
           den.den,
           CASE WHEN num.pt_cnt IS NULL THEN 0 ELSE num.pt_cnt END / den.den AS pct
      FROM dash_pld_den_grp den
           LEFT JOIN tmp_pharmacy_cnts num
                  ON den.age_bucket = num.age_bucket
                     AND den.gender = num.gender
                     AND den.insurance = num.insurance
                     AND den.pt_grp = num.pt_grp
     WHERE num.product_name IS NOT NULL
           AND den.tab = 'PHARMACY';

Query started at 04:29:47 PM Eastern Daylight Time; Query executed in 0.05 mQuery started at 04:29:49 PM Eastern Daylight Time; Query executed in 0.05 m

Unnamed: 0,status
0,Table PLD_DASH_PHARMACY_CNT successfully created.


In [29]:
%%read_sql
CREATE OR REPLACE VIEW reportdb.reportviews.pharmacy_cnt_vw AS
    SELECT * 
      FROM ref_db.semi_custom.pld_dash_pharmacy_cnt;

GRANT SELECT ON view reportdb.reportviews.pharmacy_cnt_vw TO ROLE tableau_restricted_role;
GRANT SELECT ON VIEW reportdb.reportviews.pharmacy_cnt_vw TO ROLE rwd_analytics_rw;

Query started at 04:29:52 PM Eastern Daylight Time; Query executed in 0.04 mQuery started at 04:29:55 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.


## Health Service Utilization
Counts of procedure groups

In [63]:
%%read_sql
DROP TABLE IF EXISTS tmp_proc_cnts;
CREATE TEMP TABLE tmp_proc_cnts AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           proc.cat3 AS procedure,
           Count(Distinct den.patient_id) AS pt_cnt,
           Count(Distinct proc.patient_id, proc.year_of_service) as claim_cnt
      FROM dash_pld_den den
            LEFT JOIN cld_proc proc
                   ON proc.patient_id = den.patient_id
     WHERE den.tab = 'VISITS'
           AND year(proc.year_of_service) = 2018
     GROUP BY den.age_bucket,
              den.gender,
              den.insurance,
              den.pt_grp,
              proc.cat3;

Query started at 04:46:02 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:46:04 PM Eastern Daylight Time; Query executed in 0.08 m

Unnamed: 0,status
0,Table TMP_PROC_CNTS successfully created.


In [64]:
%%read_sql
DROP TABLE IF EXISTS pld_dash_proc;
CREATE TRANSIENT TABLE pld_dash_proc AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           num.procedure,
           CASE WHEN num.pt_cnt IS NULL THEN 0 ELSE num.pt_cnt END AS num,
           den.den,
           CASE WHEN num.pt_cnt IS NULL THEN 0 ELSE num.pt_cnt END / den.den AS pct
      FROM dash_pld_den_grp den
           LEFT JOIN tmp_proc_cnts num
                  ON den.age_bucket = num.age_bucket
                     AND den.gender = num.gender
                     AND den.insurance = num.insurance
                     AND den.pt_grp = num.pt_grp
     WHERE num.procedure IS NOT NULL
           AND den.tab = 'VISITS';

Query started at 04:46:09 PM Eastern Daylight Time; Query executed in 0.03 mQuery started at 04:46:10 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table PLD_DASH_PROC successfully created.


In [65]:
%%read_sql
CREATE OR REPLACE VIEW reportdb.reportviews.proc_vw AS
    SELECT *
      FROM ref_db.semi_custom.pld_dash_proc;

GRANT SELECT ON reportdb.reportviews.proc_vw TO ROLE tableau_restricted_role;
GRANT SELECT ON reportdb.reportviews.proc_vw TO ROLE rwd_analytics_rw;

Query started at 04:46:16 PM Eastern Daylight Time; Query executed in 0.05 mQuery started at 04:46:18 PM Eastern Daylight Time; Query executed in 0.08 mQuery started at 04:46:23 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.


## Specialties
Determine which specialties to report.
* We'll need an input that limits each project to roughly 10 specialties.
* We'll need to parameterize this.
* I'll use either the current healthbase table or William Seale's table.
* Do we want to add a process for some manual mapping so the specialties appear the way they want them?
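
One possible way to parameterize the specialty limit is sketched below. Nothing here exists in the notebook yet: the `max_specialties` parameter, the function name `top_specialties`, and ranking by distinct patient count are all assumptions about how the open items above might be handled.

```python
from collections import Counter


def top_specialties(pairs, max_specialties=10):
    """Rank specialties by distinct patient count and keep the top N."""
    patients = {}
    for patient_id, specialty in pairs:
        patients.setdefault(specialty, set()).add(patient_id)
    counts = Counter({spec: len(ids) for spec, ids in patients.items()})
    return [spec for spec, _ in counts.most_common(max_specialties)]


# Hypothetical (patient_id, specialty) pairs.
pairs = [
    (1, "Neurologist"), (2, "Neurologist"), (3, "Neurologist"),
    (1, "Psychologist"), (2, "Psychologist"),
    (1, "Neuro Sleep"),
]
print(top_specialties(pairs, max_specialties=2))
```

A per-project value for `max_specialties` could then be read from the same input workbook as the other reference tables, if that is the direction chosen.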

### Upload

In [33]:
#Upload reference table from excel to snowflake and review snowflake output
df = pd.read_excel('../../in/dash/pld.xlsx', sheet_name='specialty', skiprows=4, dtype=str)

#Strip white space and keep only rows with a mapped specialty
df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
df = df.loc[df.specialty != 'nan',['primary_specialty','agg_specialty','specialty']]

#Upload to snowflake
snow.drop_table("dash_pld_spec_ref")
snow.upload_dataframe(df,"dash_pld_spec_ref")
snow.select("SELECT * FROM dash_pld_spec_ref")
#del df

DROP TABLE IF EXISTS ref_db.semi_custom.dash_pld_spec_ref;
Table ref_db.semi_custom.dash_pld_spec_ref dropped! (╯°□°）╯︵ ┻━┻
Upload into ref_db.semi_custom.dash_pld_spec_ref successful! ┬──┬◡ﾉ(°-°ﾉ)


Unnamed: 0,primary_specialty,agg_specialty,specialty
0,NEUROLOGY,NEUROLOGY,Neurologist
1,NEUROLOGICAL SURGERY,NEUROLOGICAL SURGERY,Neurosurgeon
2,CLINICAL NEUROPSYCHOLOGIST,CLINICAL NEUROPSYCHOLOGIST,Psychiatrist
3,CLINICAL NEUROPSYCHOLOGIST,"CLINICAL NEUROPSYCHOLOGIST, CLINICAL",Psychologist
4,NEUROLOGY,"NEUROLOGY, CLINICAL NEUROPHYSIOLOGY",Neurologist
5,CLINICAL NEUROPSYCHOLOGIST,"PSYCHOLOGIST, CLINICAL NEUROPSYCHOLOGIST",Psychologist
6,NEUROLOGY,"NEUROLOGY, INTERNAL MEDICINE",Neurologist
7,NEUROLOGY,"SLEEP MEDICINE, NEUROLOGY",Neuro Sleep
8,NEUROLOGY WITH SPECIAL QUALIFICATIONS IN CHILD...,"PEDIATRICS, NEUROLOGY WITH SPECIAL QUALIFICATI...",Neuro Pedi
9,NEUROLOGY,"INTERNAL MEDICINE, NEUROLOGY",Neurologist


### Patient Level

In [35]:
%%read_sql
DROP TABLE IF EXISTS dash_pld_spec_pt;
CREATE TEMP TABLE dash_pld_spec_pt AS
    SELECT den.age_bucket,
           den.gender,
           den.insurance,
           den.pt_grp,
           spec.specialty,
           den.patient_id,
           Count(Distinct prov.year_of_service) AS pt_svc_cnt
      FROM dash_pld_den den
           LEFT JOIN de_raven_provider prov
                  ON prov.patient_id = den.patient_id
           LEFT JOIN de_provider_affiliation aff
                  ON aff.provider_npi = prov.provider_npi
                JOIN dash_pld_spec_ref spec
                  ON spec.primary_specialty = aff.primary_specialty
                     AND spec.agg_specialty = aff.agg_specialty
     WHERE prov.year_of_service BETWEEN '2018-01-01' AND '2018-12-31'  --Need parameter
           AND den.patient_id IN (SELECT patient_id
                                    FROM dash_pld_den
                                   WHERE tab = 'SPECIALTY')
     GROUP BY den.age_bucket,
              den.gender,
              den.insurance,
              den.pt_grp,
              spec.specialty,
              den.patient_id;

Query started at 04:30:20 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 04:30:22 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table DASH_PLD_SPEC_PT successfully created.
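
The hard-coded 2018 service window in the query above could be derived from a single project parameter. The sketch below is a minimal illustration under that assumption: `service_window` and `service_filter` are hypothetical helpers, and the resulting string would still have to be spliced into the SQL text by whatever templating the project settles on.

```python
from datetime import date
from typing import Tuple


def service_window(year: int) -> Tuple[str, str]:
    """Return ISO start and end dates covering one calendar year."""
    if not 1900 <= year <= 2100:
        raise ValueError(f"unexpected service year: {year}")
    return date(year, 1, 1).isoformat(), date(year, 12, 31).isoformat()


def service_filter(year: int) -> str:
    """Build the WHERE-clause fragment used in the patient-level query."""
    start, end = service_window(year)
    return f"prov.year_of_service BETWEEN '{start}' AND '{end}'"


print(service_filter(2018))
```

Because the year is validated as an integer before it is formatted, only dates generated by `datetime.date` reach the SQL string.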


In [57]:
%%read_sql
CREATE OR REPLACE TEMP TABLE ref_db.semi_custom.tmp_pt_cnt AS
    SELECT DISTINCT patient_id
      FROM ref_db.semi_custom.dash_pld_spec_pt;

CREATE OR REPLACE TEMP TABLE ref_db.semi_custom.tmp_pt_cnt_sample AS
    SELECT patient_id
      FROM ref_db.semi_custom.tmp_pt_cnt
     SAMPLE (20000 rows);

CREATE OR REPLACE TABLE ref_db.semi_custom.dash_pld_spec_pt_sample AS
    SELECT *
      FROM ref_db.semi_custom.dash_pld_spec_pt
     WHERE patient_id IN (SELECT patient_id
                            FROM ref_db.semi_custom.tmp_pt_cnt_sample);
                            
CREATE OR REPLACE VIEW reportdb.reportviews.spec_pt_vw AS
    SELECT *
      FROM ref_db.semi_custom.dash_pld_spec_pt_sample;

GRANT SELECT ON VIEW reportdb.reportviews.spec_pt_vw TO ROLE rwd_analytics_rw;
GRANT SELECT ON VIEW reportdb.reportviews.spec_pt_vw TO ROLE tableau_restricted_role;

Query started at 04:41:02 PM Eastern Daylight Time; Query executed in 0.07 m
Query started at 04:41:06 PM Eastern Daylight Time; Query executed in 0.05 m
Query started at 04:41:09 PM Eastern Daylight Time; Query executed in 0.06 m
Query started at 04:41:13 PM Eastern Daylight Time; Query executed in 0.05 m
Query started at 04:41:16 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 04:41:18 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.
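
The sampling cell above works at the patient level: it samples distinct patient IDs first and then keeps every row for the sampled patients, so no patient ends up with a partial set of specialties. The same logic can be sketched in plain Python for checking on a small extract. `sample_patient_rows` and its `seed` argument are hypothetical; the Snowflake query itself is not seeded.

```python
import random
from typing import Dict, List, Sequence


def sample_patient_rows(rows: Sequence[Dict], n_patients: int, seed: int = 0) -> List[Dict]:
    """Sample patients, then return all rows belonging to the sampled patients."""
    # Step 1: distinct patient IDs (mirrors tmp_pt_cnt)
    patient_ids = sorted({row["patient_id"] for row in rows})
    # Step 2: sample at most n_patients IDs (mirrors tmp_pt_cnt_sample)
    rng = random.Random(seed)
    sampled = set(rng.sample(patient_ids, min(n_patients, len(patient_ids))))
    # Step 3: keep every row for a sampled patient (mirrors dash_pld_spec_pt_sample)
    return [row for row in rows if row["patient_id"] in sampled]


# Toy data: 5 patients with 2 specialty rows each
rows = [{"patient_id": p, "specialty": s}
        for p in range(5)
        for s in ("Neurologist", "Psychologist")]
subset = sample_patient_rows(rows, n_patients=2)
print(len(subset))  # 2 patients x 2 rows each
```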


### Aggregation

In [37]:
%%read_sql
DROP TABLE IF EXISTS dash_pld_spec;
CREATE TRANSIENT TABLE dash_pld_spec AS
    SELECT num.age_bucket,
           num.gender,
           num.insurance,
           num.pt_grp,
           num.specialty,
           Count(Distinct num.patient_id) AS pt_cnt,
           den.den
      FROM dash_pld_spec_pt num
           JOIN dash_pld_den_grp den
             ON den.age_bucket = num.age_bucket
                AND den.gender = num.gender
                AND den.insurance = num.insurance  
                AND den.pt_grp = num.pt_grp
     WHERE den.tab = 'SPECIALTY'
           AND num.specialty IS NOT NULL
     GROUP BY num.age_bucket,
              num.gender,
              num.insurance,
              num.pt_grp,
              num.specialty,
              --ref.visit_bucket,
              den.den

Query started at 04:30:42 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 04:30:44 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table DASH_PLD_SPEC successfully created.


In [41]:
%%read_sql

CREATE OR REPLACE VIEW reportdb.reportviews.spec_pld_vw AS
    SELECT * 
      FROM ref_db.semi_custom.dash_pld_spec;

GRANT SELECT ON VIEW reportdb.reportviews.spec_pld_vw TO ROLE tableau_restricted_role;
GRANT SELECT ON VIEW reportdb.reportviews.spec_pld_vw TO ROLE rwd_analytics_rw;

Query started at 04:30:53 PM Eastern Daylight Time; Query executed in 0.04 m
Query started at 04:30:55 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 04:30:57 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,status
0,Statement executed successfully.


# Output
Create an Excel workbook to review the data.

In [66]:
sheet_names = ['demo','comorbid','pos_pt','pos','rx_grp','utilize','spec_pt', 'spec']
sheet_titles = ['reportdb.reportviews.demographics_pld_vw',
                'reportdb.reportviews.comorbidity_pld_vw',
                'reportdb.reportviews.pos_pt_vw',
                'reportdb.reportviews.pos_pld_vw',
                'reportdb.reportviews.pharmacy_cnt_vw',
                'reportdb.reportviews.proc_vw',
                'reportdb.reportviews.spec_pt_vw',
                'reportdb.reportviews.spec_pld_vw']

In [67]:
demo     = snow.select("SELECT * FROM reportdb.reportviews.demographics_pld_vw ORDER BY age_bucket, gender, insurance, pt_grp")
comorbid = snow.select("SELECT * FROM reportdb.reportviews.comorbidity_pld_vw  ORDER BY age_bucket, gender, insurance, pt_grp")
pos_pt   = snow.select("SELECT * FROM reportdb.reportviews.pos_pt_vw           ORDER BY age_bucket, gender, insurance, pt_grp")
pos      = snow.select("SELECT * FROM reportdb.reportviews.pos_pld_vw          ORDER BY age_bucket, gender, insurance, pt_grp")
rx_grp   = snow.select("SELECT * FROM reportdb.reportviews.pharmacy_cnt_vw     ORDER BY age_bucket, gender, insurance, pt_grp")
utilize  = snow.select("SELECT * FROM reportdb.reportviews.proc_vw             ORDER BY age_bucket, gender, insurance, pt_grp")
spec_pt  = snow.select("SELECT * FROM reportdb.reportviews.spec_pt_vw          ORDER BY age_bucket, gender, insurance, pt_grp")
spec     = snow.select("SELECT * FROM reportdb.reportviews.spec_pld_vw         ORDER BY age_bucket, gender, insurance, pt_grp")
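
The eight `snow.select` calls above differ only in the view name, so the queries could be generated from the `sheet_names` and `sheet_titles` lists already defined. The sketch below only builds the SQL strings; `build_review_queries` is a hypothetical helper, and the commented-out last line assumes `snow.select` accepts a query string as it does in the cell above.

```python
from typing import Dict, Sequence

ORDER_COLS = "age_bucket, gender, insurance, pt_grp"


def build_review_queries(sheet_names: Sequence[str],
                         sheet_titles: Sequence[str],
                         order_cols: str = ORDER_COLS) -> Dict[str, str]:
    """Return one ordered SELECT per sheet, keyed by sheet name."""
    if len(sheet_names) != len(sheet_titles):
        raise ValueError("sheet_names and sheet_titles must have the same length")
    return {name: f"SELECT * FROM {view} ORDER BY {order_cols}"
            for name, view in zip(sheet_names, sheet_titles)}


queries = build_review_queries(
    ["spec_pt", "spec"],
    ["reportdb.reportviews.spec_pt_vw", "reportdb.reportviews.spec_pld_vw"],
)
for name, sql in queries.items():
    print(name, "->", sql)

# In the notebook this could become (assumption, not run here):
# data = [snow.select(sql) for sql in queries.values()]
```

The length check also catches a sheet name without a matching view before the workbook is written.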

In [68]:
make_xlsx(data=[demo, comorbid, pos_pt, pos, rx_grp, utilize, spec_pt, spec],
          xlsx_path='../../out/dash/pld_dash_data.xlsx',
          workbook_title='PLD Dashboard Data',
          sheet_names=sheet_names,
          sheet_titles=sheet_titles)