# Comorbidities
Tidy tables of each patient's custom comorbidities, identified from medical claims between two dates. All table names start with `pld_comorbid`.

**Script**
* [scripts/pld/comorbidities.ipynb](./scripts/pld/comorbidities.ipynb)

**Prior Script(s)**
* [scripts/extract/raven_diagnosis.ipynb](./scripts/extract/raven_diagnosis.ipynb)

**Parameters**
* `in/pld/comorbidities.xlsx[param]`

**Input**
* `coh_pt`
* `de_raven_diagnosis`
* `pld_comorbid_ref`: `in/pld/comorbidities.xlsx[ref]`
  
**Output**  
* `pld_comorbid`
* `pld_comorbid_pivot` (TBD)

**Review**
* [scripts/pld/comorbidities.html](./scripts/pld/comorbidities.html)

In [1]:
#Import libraries for this notebook
import pandas as pd  
from drg_connect import Snowflake
import numpy as np
import pickle
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#Load connection variables to connect_dict
with open('../../out/conn/connect_dict.pickle', 'rb') as handle:
    connect_dict = pickle.load(handle)

#Create engine to connect to snowflake
snow = Snowflake(role=connect_dict['role'],
                 warehouse=connect_dict['warehouse'],
                 database=connect_dict['database'],
                 schema=connect_dict['schema'])

#Finish engine setup
engine = snow.engine
%load_ext sql_magic
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications


# Parameters
Create Python variables from the parameters in the `param` sheet.

**Input**  
* `in/pld/comorbidities.xlsx[param]`

**Output**
* Python variables named after each parameter, holding that parameter's value

In [2]:
#Create system variables from excel into script and review values in dictionary
input_df = pd.read_excel('../../in/pld/comorbidities.xlsx', sheet_name='param', skiprows=4, dtype=str)
var_dict = dict(zip(input_df.parameter, input_df.value))
#Expose each parameter as a notebook-level variable without running exec on spreadsheet text
globals().update(var_dict)

#Check inputs
pd.DataFrame.from_dict(var_dict, orient='index')

Unnamed: 0,0
min_dt,2015-01-01
max_dt,2018-12-31
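
The two parameters above are read as strings, so a quick sanity check on the date window can catch a typo in the spreadsheet before any SQL runs. A minimal sketch, assuming both values use the `YYYY-MM-DD` format shown in the output; `check_date_window` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd


def check_date_window(params):
    """Parse min_dt and max_dt from a parameter dict and confirm min_dt <= max_dt."""
    # Assumption: both values are YYYY-MM-DD strings, as in the output above.
    start = pd.to_datetime(params["min_dt"], format="%Y-%m-%d")
    end = pd.to_datetime(params["max_dt"], format="%Y-%m-%d")
    if start > end:
        raise ValueError(f"min_dt {start.date()} is after max_dt {end.date()}")
    return start, end


# Values copied from the parameter output above.
start, end = check_date_window({"min_dt": "2015-01-01", "max_dt": "2018-12-31"})
print(start.date(), end.date(), (end - start).days)
```

In the notebook this could be called as `check_date_window(var_dict)` right after the parameters are loaded.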


# Reference
Upload reference table used to identify the custom comorbidities

**Input**  
  * `in/pld/comorbidities.xlsx[ref]`

**Output**  
* `pld_comorbid_ref`

In [3]:
#Upload reference table from excel to snowflake and review snowflake output
df = pd.read_excel('../../in/pld/comorbidities.xlsx', sheet_name='ref', skiprows=4, dtype=str)

#Strip whitespace and uppercase the columns used for matching
df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
df[['code_type','code']] =  \
    df[['code_type','code']].apply(lambda x: x.str.upper() if x.dtype == "object" else x)
#Remove punctuation such as the decimal point; regex=True is required on pandas 2.0+
df['code'] = df['code'].str.replace(r'\W+', '', regex=True).astype('str')

#Upload to snowflake
snow.drop_table("pld_comorbid_ref")
snow.upload_dataframe(df,"pld_comorbid_ref")
snow.select("SELECT * FROM pld_comorbid_ref")

DROP TABLE IF EXISTS ref_db.semi_custom.pld_comorbid_ref;
Table ref_db.semi_custom.pld_comorbid_ref dropped! (╯°□°）╯︵ ┻━┻
Upload into ref_db.semi_custom.pld_comorbid_ref successful! ┬──┬◡ﾉ(°-°ﾉ)


Unnamed: 0,cat1,cat2,code_type,code,description,source
0,sleep disorders,narcolepsy,ICD9,34700,Narcolepsy w/o cataplexy,DRG Treatment algorithms from Tamara Blutstein
1,sleep disorders,narcolepsy,ICD9,34701,Narcolepsy w/cataplexy,DRG Treatment algorithms from Tamara Blutstein
2,sleep disorders,narcolepsy,ICD9,3471,Narcolepsy in conditions classifed elsewhere,DRG Treatment algorithms from Tamara Blutstein
3,sleep disorders,narcolepsy,ICD9,34711,Narcolepsy w/cataplexy in conditions classifed...,DRG Treatment algorithms from Tamara Blutstein
4,sleep disorders,narcolepsy,ICD10,G474,Narcolepsy and cataplexy,DRG Treatment algorithms from Tamara Blutstein
5,sleep disorders,narcolepsy,ICD10,G4741,Narcolepsy,DRG Treatment algorithms from Tamara Blutstein
6,sleep disorders,narcolepsy,ICD10,G47411,Narcolepsy w/cataplexy,DRG Treatment algorithms from Tamara Blutstein
7,sleep disorders,narcolepsy,ICD10,G47419,Narcolepsy w/o cataplexy,DRG Treatment algorithms from Tamara Blutstein
8,sleep disorders,narcolepsy,ICD10,G4742,Narcolepsy in conditions classifed elsewhere,DRG Treatment algorithms from Tamara Blutstein
9,sleep disorders,narcolepsy,ICD10,G47421,Narcolepsy in conditions classifed elsewhere w...,DRG Treatment algorithms from Tamara Blutstein
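
The cleaning step above matters because the reference codes are later compared to claim diagnoses by exact equality, so stray whitespace, lower case, or a decimal point would silently drop matches. A small sketch of the same normalization on one made-up row; `normalize_codes` is a hypothetical helper, and it assumes every column is read as a string (as with `dtype=str`).

```python
import pandas as pd


def normalize_codes(ref):
    """Trim every column, uppercase code_type and code, and strip punctuation from code."""
    out = ref.copy()
    for col in out.columns:
        # Assumption: all columns hold strings, as when read with dtype=str.
        out[col] = out[col].str.strip()
    out[["code_type", "code"]] = out[["code_type", "code"]].apply(lambda s: s.str.upper())
    # \W+ removes anything that is not a letter, digit, or underscore (e.g. the ICD decimal point).
    out["code"] = out["code"].str.replace(r"\W+", "", regex=True)
    return out


# Made-up, messy input row for illustration only.
sample = pd.DataFrame(
    {"cat1": [" sleep disorders "], "code_type": ["icd10"], "code": [" g47.411 "]}
)
clean = normalize_codes(sample)
print(clean.to_dict("records"))
```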


# Comorbidities
Identify each patient's comorbidities by matching claim diagnoses to the reference codes within the `min_dt` to `max_dt` window.

**Parameters**
  * `min_dt`
  * `max_dt`
  
**Input**
  * `coh_pt`
  * `de_raven_diagnosis`
  * `pld_comorbid_ref`
  
**Output**  
* `pld_comorbid`

In [4]:
%%read_sql
--Create comorbidity table: one row per patient, cat1 and cat2
DROP TABLE IF EXISTS pld_comorbid; 
CREATE TRANSIENT TABLE pld_comorbid AS
      SELECT coh.patient_id,
             ref.cat1,
             ref.cat2
        FROM coh_pt coh
             JOIN de_raven_diagnosis dx
               ON coh.patient_id = dx.patient_id
             JOIN pld_comorbid_ref ref
               ON ref.code = dx.diagnosis
       WHERE year_of_service BETWEEN '{min_dt}' AND '{max_dt}'
       GROUP BY coh.patient_id, ref.cat1, ref.cat2

Query started at 01:17:29 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 01:17:31 PM Eastern Daylight Time; Query executed in 0.16 m

Unnamed: 0,status
0,Table PLD_COMORBID successfully created.
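
The join logic in the query above can be traced on a few rows in pandas. This is only a sketch on made-up data, not the notebook's method: it assumes `year_of_service` holds ISO date strings (the column's type is not shown in the source), and `comorbid` is a hypothetical helper. As in the SQL, codes are matched on `code` alone, without regard to `code_type`.

```python
import pandas as pd

# Toy stand-ins for coh_pt, de_raven_diagnosis and pld_comorbid_ref (made-up rows).
coh_pt = pd.DataFrame({"patient_id": [1, 2, 3]})
dx = pd.DataFrame({
    "patient_id": [1, 1, 2, 4],
    "diagnosis": ["G47411", "G47411", "F329", "G47411"],
    "year_of_service": ["2016-03-01", "2017-05-10", "2014-12-31", "2016-01-01"],
})
ref = pd.DataFrame({
    "cat1": ["sleep disorders", "psychiatric"],
    "cat2": ["narcolepsy", "depression"],
    "code": ["G47411", "F329"],
})


def comorbid(coh_pt, dx, ref, min_dt, max_dt):
    """Cohort patients joined to in-window diagnoses and reference codes, deduplicated."""
    # BETWEEN is inclusive on both ends; ISO date strings sort chronologically.
    in_window = dx[dx["year_of_service"].between(min_dt, max_dt)]
    joined = (coh_pt.merge(in_window, on="patient_id")
                    .merge(ref, left_on="diagnosis", right_on="code"))
    # GROUP BY patient_id, cat1, cat2 with no aggregates is a distinct over those columns.
    return (joined[["patient_id", "cat1", "cat2"]]
            .drop_duplicates()
            .reset_index(drop=True))


result = comorbid(coh_pt, dx, ref, "2015-01-01", "2018-12-31")
print(result)
```

In this toy data, patient 1 has two qualifying claims but yields a single row, patient 2's only claim falls before `min_dt`, and patient 4 is not in the cohort.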


In [5]:
%%read_sql
--Review counts as a sanity check
SELECT Count(*) AS row_cnt,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct cat1) AS cat1_cnt,
       Count(Distinct cat2) AS cat2_cnt
  FROM pld_comorbid;

Query started at 01:17:40 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,row_cnt,pt_cnt,cat1_cnt,cat2_cnt
0,495544,148007,2,13


In [6]:
%%read_sql
--Patient counts and share of the cohort by cat1 and cat2
SELECT cat1,
       cat2,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct patient_id) / (SELECT Count(*)
                                      FROM coh_pt) AS pct 
  FROM pld_comorbid
 GROUP BY cat1, cat2
 ORDER BY pt_cnt desc;

Query started at 01:17:42 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,cat1,cat2,pt_cnt,pct
0,sleep disorders,narcolepsy,144795,0.932585
1,sleep disorders,hypersomnia,115050,0.741006
2,sleep disorders,sleep related breathing disorders,68280,0.439773
3,psychiatric,depression,50463,0.325018
4,psychiatric,anxiety,45536,0.293285
5,sleep disorders,insomnia,38639,0.248863
6,sleep disorders,sleep related movement disorders,10608,0.068323
7,sleep disorders,periodic limb movement disorder,7300,0.047017
8,sleep disorders,parasomnia,4092,0.026355
9,sleep disorders,parasomnias,3939,0.02537
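
The output above lists both `parasomnia` and `parasomnias` as separate `cat2` values. Whether that split is intentional cannot be told from this notebook, but a quick check for labels that differ only by a trailing "s" makes such cases easy to spot in the reference sheet. A minimal sketch; `label_variants` is a hypothetical helper and the trailing-"s" rule is a deliberately crude assumption.

```python
from collections import defaultdict


def label_variants(labels):
    """Group labels that are identical after trimming, lowercasing and dropping trailing 's'."""
    groups = defaultdict(set)
    for label in labels:
        key = label.strip().lower().rstrip("s")
        groups[key].add(label)
    # Keep only keys that map to more than one distinct spelling.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# A subset of the cat2 values shown in the output above.
cat2_labels = ["narcolepsy", "hypersomnia", "insomnia", "parasomnia", "parasomnias"]
variants = label_variants(cat2_labels)
print(variants)
```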


In [8]:
%%read_sql
--Patient counts and share of the cohort by cat1
SELECT cat1,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct patient_id) / (SELECT Count(*)
                                      FROM coh_pt) AS pct 
  FROM pld_comorbid
 GROUP BY cat1
 ORDER BY pt_cnt desc;

Query started at 01:17:46 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,cat1,pt_cnt,pct
0,sleep disorders,146691,0.944797
1,psychiatric,67057,0.431896


# Comorbidities Pivot (TBD)
Pivot the comorbidities at each category level (`cat1`, `cat2`) so that each patient has one row.
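
Since this step is still TBD, the following is only one possible shape for it: a pandas sketch that turns the tidy `pld_comorbid` rows into one row per patient with a 0/1 flag per category. The data is made up, and `pivot_flags` is a hypothetical helper, not the planned implementation.

```python
import pandas as pd

# Made-up rows in the shape of pld_comorbid (patient_id, cat1, cat2).
pld_comorbid = pd.DataFrame({
    "patient_id": [1, 1, 2, 3],
    "cat1": ["sleep disorders", "psychiatric", "sleep disorders", "sleep disorders"],
    "cat2": ["narcolepsy", "depression", "narcolepsy", "insomnia"],
})


def pivot_flags(df, level):
    """One row per patient, one 0/1 column per distinct value of the chosen level."""
    # crosstab counts rows per patient and category; clip turns counts into flags.
    flags = pd.crosstab(df["patient_id"], df[level]).clip(upper=1)
    flags.columns.name = None
    return flags.reset_index()


cat1_pivot = pivot_flags(pld_comorbid, "cat1")
cat2_pivot = pivot_flags(pld_comorbid, "cat2")
print(cat1_pivot)
print(cat2_pivot)
```

Patients with no matching comorbidity do not appear in `pld_comorbid`, so a pivot built this way would need a left join from `coh_pt` if every cohort patient should have a row.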