# Comorbidities
Tidy tables of the patients' custom comorbidities identified from medical claims between two dates. All tables start with `cld_comorbid`.

**Script**
* [scripts/cld/comorbidities.ipynb](./scripts/cld/comorbidities.ipynb)

**Prior Script(s)**
* [scripts/extract/raven_diagnosis.ipynb](./scripts/extract/raven_diagnosis.ipynb)

**Parameters**
* `in/cld/comorbid_custom.xlsx[param]`

**Input**
* `coh_pt`
* `de_raven_diagnosis`
* `cld_comorbid_ref`: `in/cld/comorbid_custom.xlsx[ref]`
  
**Output**  
* `cld_comorbid`
* `cld_comorbid_pivot` (TBD)

**Review**
* [scripts/cld/comorbidities.html](./scripts/cld/comorbidities.html)

In [16]:
#Import libraries for this notebook
import pandas as pd  
from drg_connect import Snowflake
import numpy as np
import pickle
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#Load connection variables to connect_dict
with open('../../out/conn/connect_dict.pickle', 'rb') as handle:
    connect_dict = pickle.load(handle)

#Create engine to connect to snowflake
snow = Snowflake(role=connect_dict['role'],
                 warehouse=connect_dict['warehouse'],
                 database=connect_dict['database'],
                 schema=connect_dict['schema'])

#Finish engine setup
engine = snow.engine
%load_ext sql_magic
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications


The sql_magic extension is already loaded. To reload it, use:
  %reload_ext sql_magic


# Reference
Upload reference table used to identify the custom comorbidities

**Input**  
  * `in/cld/comorbidities.xlsx[ref]`

**Output**  
* `cld_comorbid_ref`

In [17]:
#Upload reference table from excel to snowflake and review snowflake output
df = pd.read_excel('../../in/cld/comorbidities.xlsx', sheet_name='ref', skiprows=4, dtype=str)

#Strip white space and make referrable columns uppercase
df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
df[['code_type','code']] =  \
    df[['code_type','code']].apply(lambda x: x.str.upper() if x.dtype == "object" else x)
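#Remove non-word characters (e.g. the dot in ICD codes); regex=True is required in pandas >= 2.0
df.code = df.code.str.replace(r'\W+', '', regex=True).astype('str')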

#Upload to snowflake
snow.drop_table("cld_comorbid_ref")
snow.upload_dataframe(df,"cld_comorbid_ref")
snow.select("SELECT * FROM cld_comorbid_ref")

DROP TABLE IF EXISTS ref_db.semi_custom.cld_comorbid_ref;
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Table ref_db.semi_custom.cld_comorbid_ref dropped! (╯°□°）╯︵ ┻━┻
Upload into ref_db.semi_custom.cld_comorbid_ref successful! ┬──┬◡ﾉ(°-°ﾉ)


Unnamed: 0,cat1,cat2,code_type,code,description,source
0,sleep disorders,narcolepsy,ICD9,34700,Narcolepsy w/o cataplexy,DRG Treatment algorithms from Tamara Blutstein
1,sleep disorders,narcolepsy,ICD9,34701,Narcolepsy w/cataplexy,DRG Treatment algorithms from Tamara Blutstein
2,sleep disorders,narcolepsy,ICD9,3471,Narcolepsy in conditions classifed elsewhere,DRG Treatment algorithms from Tamara Blutstein
3,sleep disorders,narcolepsy,ICD9,34711,Narcolepsy w/cataplexy in conditions classifed...,DRG Treatment algorithms from Tamara Blutstein
4,sleep disorders,narcolepsy,ICD10,G474,Narcolepsy and cataplexy,DRG Treatment algorithms from Tamara Blutstein
5,sleep disorders,narcolepsy,ICD10,G4741,Narcolepsy,DRG Treatment algorithms from Tamara Blutstein
6,sleep disorders,narcolepsy,ICD10,G47411,Narcolepsy w/cataplexy,DRG Treatment algorithms from Tamara Blutstein
7,sleep disorders,narcolepsy,ICD10,G47419,Narcolepsy w/o cataplexy,DRG Treatment algorithms from Tamara Blutstein
8,sleep disorders,narcolepsy,ICD10,G4742,Narcolepsy in conditions classifed elsewhere,DRG Treatment algorithms from Tamara Blutstein
9,sleep disorders,narcolepsy,ICD10,G47421,Narcolepsy in conditions classifed elsewhere w...,DRG Treatment algorithms from Tamara Blutstein
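
The clean-up applied before the upload can be checked on a small frame. The sketch below is a minimal illustration, not the notebook's own code: the helper name `normalise_ref` and the toy rows are assumptions, and it assumes the sheet has no missing values (`astype(str)` would turn them into the literal string `nan`).

```python
import pandas as pd

# Toy stand-in for the `ref` sheet; the rows are made up for illustration.
ref = pd.DataFrame({
    "cat2": [" narcolepsy ", "narcolepsy"],
    "code_type": ["icd9", " icd10"],
    "code": ["347.00", " g47.411 "],
})


def normalise_ref(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: trim whitespace, uppercase join columns, strip punctuation."""
    out = df.copy()
    # Trim leading/trailing whitespace in every column (all columns are text here).
    for col in out.columns:
        out[col] = out[col].astype(str).str.strip()
    # Uppercase the columns that are matched against claims data.
    for col in ("code_type", "code"):
        out[col] = out[col].str.upper()
    # Drop non-word characters so "347.00" matches an undotted claim code.
    out["code"] = out["code"].str.replace(r"\W+", "", regex=True)
    return out


clean = normalise_ref(ref)
print(clean)
```

Removing the dot matters because the later join compares `ref.code` to `dx.diagnosis` with plain equality, so the two sides need the same formatting.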


# Comorbidities
Identify the patients' comorbidities based on the parameters and inputs.

**Parameters**
  * NONE
  
**Input**
  * `coh_pt`
  * `de_raven_diagnosis`
  * `cld_comorbid_ref`
  
**Output**  
* `cld_comorbid`

In [18]:
%%read_sql
--Create the comorbidity table: cohort diagnoses matched to the reference codes
DROP TABLE IF EXISTS cld_comorbid; 
CREATE TRANSIENT TABLE cld_comorbid AS
      SELECT coh.patient_id,
             dx.claim_id,
             dx.year_of_service,
             ref.cat1,
             ref.cat2
        FROM coh_pt coh
             JOIN de_raven_diagnosis dx
               ON coh.patient_id = dx.patient_id
             JOIN cld_comorbid_ref ref
               ON ref.code = dx.diagnosis
       GROUP BY coh.patient_id, dx.claim_id, dx.year_of_service, ref.cat1, ref.cat2

Query started at 03:07:34 PM Eastern Daylight Time; Query executed in 0.04 m
Query started at 03:07:36 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table CLD_COMORBID successfully created.
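
The join logic can be reproduced in pandas on toy data to see which rows survive. This is a sketch under assumptions: the frames below are invented, only the column names come from the query, and `drop_duplicates` stands in for the `GROUP BY` that collapses repeated category hits on the same claim.

```python
import pandas as pd

# Toy inputs; values are illustrative, column names follow the SQL above.
coh_pt = pd.DataFrame({"patient_id": [1, 2, 3]})
de_raven_diagnosis = pd.DataFrame({
    "patient_id": [1, 1, 2, 4],
    "claim_id": ["a", "a", "b", "c"],
    "year_of_service": [2019, 2019, 2020, 2020],
    "diagnosis": ["G47411", "G47419", "F329", "G47411"],
})
cld_comorbid_ref = pd.DataFrame({
    "cat1": ["sleep disorders", "sleep disorders"],
    "cat2": ["narcolepsy", "narcolepsy"],
    "code": ["G47411", "G47419"],
})

cld_comorbid = (
    coh_pt
    # Inner join keeps only diagnoses for patients in the cohort.
    .merge(de_raven_diagnosis, on="patient_id")
    # Inner join keeps only diagnoses whose code appears in the reference.
    .merge(cld_comorbid_ref, left_on="diagnosis", right_on="code")
    [["patient_id", "claim_id", "year_of_service", "cat1", "cat2"]]
    # Equivalent of the GROUP BY: one row per claim and category.
    .drop_duplicates()
    .reset_index(drop=True)
)
print(cld_comorbid)
```

In this toy case patient 4 is dropped for not being in the cohort, patient 2 for having no reference code, and the two narcolepsy codes on claim `a` collapse to a single row.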


In [19]:
%%read_sql
--Review counts as a sanity check
SELECT Count(*) AS row_cnt,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct cat1) AS cat1_cnt,
       Count(Distinct cat2) AS cat2_cnt
  FROM cld_comorbid;

Query started at 03:07:41 PM Eastern Daylight Time; Query executed in 0.10 m

Unnamed: 0,row_cnt,pt_cnt,cat1_cnt,cat2_cnt
0,2257808,133285,2,13


In [20]:
%%read_sql
--Quick distribution of the counts
SELECT cat1,
       cat2,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct patient_id) / (SELECT Count(*)
                                      FROM coh_pt) AS pct 
  FROM cld_comorbid
 GROUP BY cat1, cat2
 ORDER BY pt_cnt desc;

Query started at 03:07:47 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,cat1,cat2,pt_cnt,pct
0,sleep disorders,narcolepsy,68409,0.387773
1,sleep disorders,hypersomnia,67046,0.380047
2,sleep disorders,sleep related breathing disorders,63861,0.361993
3,psychiatric,depression,62453,0.354012
4,psychiatric,anxiety,55934,0.317059
5,sleep disorders,insomnia,35709,0.202415
6,sleep disorders,sleep related movement disorders,11147,0.063186
7,sleep disorders,periodic limb movement disorder,5197,0.029459
8,sleep disorders,parasomnias,3575,0.020265
9,psychiatric,suicidal ideation,3443,0.019516
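
The `pct` column divides distinct patients per category by the size of the whole cohort. A pandas sketch of the same calculation follows; the toy frames are assumptions, and like the SQL it assumes `coh_pt` holds one row per patient.

```python
import pandas as pd

# Toy inputs; values are illustrative only.
coh_pt = pd.DataFrame({"patient_id": [1, 2, 3, 4]})
cld_comorbid = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3],
    "claim_id": ["a", "b", "c", "d", "e"],
    "cat1": ["sleep disorders", "sleep disorders", "sleep disorders",
             "psychiatric", "sleep disorders"],
    "cat2": ["narcolepsy", "narcolepsy", "insomnia", "anxiety", "narcolepsy"],
})

cohort_size = len(coh_pt)  # assumes one row per patient in coh_pt

dist = (
    cld_comorbid.groupby(["cat1", "cat2"])["patient_id"]
    .nunique()                      # Count(Distinct patient_id)
    .rename("pt_cnt")
    .reset_index()
    .sort_values("pt_cnt", ascending=False, ignore_index=True)
)
dist["pct"] = dist["pt_cnt"] / cohort_size
print(dist)
```

Because a patient can fall into several categories, the `pct` values are not expected to sum to 1.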


In [22]:
%%read_sql
--Quick distribution of the counts by cat1 only
SELECT cat1,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct patient_id) / (SELECT Count(*)
                                      FROM coh_pt) AS pct 
  FROM cld_comorbid
 GROUP BY cat1
 ORDER BY pt_cnt desc;

Query started at 03:07:51 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,cat1,pt_cnt,pct
0,sleep disorders,107969,0.612017
1,psychiatric,81457,0.461735
