# Elixhauser
Tidy tables of patients' Elixhauser comorbidities identified from medical claims between two dates. All tables start with `cld_elix`.

**Script**
* [scripts/cld/elixhauser.ipynb](./scripts/cld/elixhauser.ipynb)

**Prior Script(s)**
* [scripts/de/raven_diagnosis.ipynb](./scripts/de/raven_diagnosis.ipynb)

**Parameters**
* `in/cld/elixhauser.xlsx[param]`

**Input**
* `coh_pt`
* `de_raven_diagnosis`
* `rwd_db.rwd_reference_library.elixhauser_comorbidities` (To be moved soon)

**Output**  
* `cld_elix`
* `cld_elix_pivot` (TBD)

**Review**
* [scripts/cld/elixhauser.html](./scripts/cld/elixhauser.html)

In [1]:
#Import libraries for this notebook
import pandas as pd  
from drg_connect import Snowflake
import numpy as np
import pickle
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#Load connection variables to connect_dict
with open('../../out/conn/connect_dict.pickle', 'rb') as handle:
    connect_dict = pickle.load(handle)

#Create engine to connect to Snowflake
snow = Snowflake(role=connect_dict['role'],
                 warehouse=connect_dict['warehouse'],
                 database=connect_dict['database'],
                 schema=connect_dict['schema'])

#Finish engine setup
engine = snow.engine
%load_ext sql_magic
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications


# Extract Data
Extract the subset of raven diagnoses for the patients of interest within the specified date range.

**Parameters**
  * `min_dx_dt`
  * `max_dx_dt`
  
**Input**
  * `coh_pt`
  * `de_raven_diagnosis`
  * `rwd_db.rwd_reference_library.elixhauser_comorbidities`
  
**Output**  
* `cld_elix`

| Column | Description |
| --- | --- |
| patient_id | unique patient identifier |
| comorbidity | Elixhauser comorbidity |
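
The extraction described above can be sketched on toy data with an in-memory SQLite database. This is only an illustration, not the production query: the rows, the code-to-comorbidity mapping, and the `service_dt` date column are assumptions made up for the example, and plain `CREATE TABLE` stands in for Snowflake's transient tables. It shows how the two joins and the `GROUP BY` reduce repeated diagnoses to one row per patient and comorbidity, and how `min_dx_dt` and `max_dx_dt` could bound the diagnosis dates.

```python
import sqlite3

# Toy, in-memory stand-ins for the real tables. Every row, and the
# `service_dt` column name, is hypothetical and used for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coh_pt (patient_id INTEGER);
    CREATE TABLE de_raven_diagnosis (patient_id INTEGER, diagnosis TEXT, service_dt TEXT);
    CREATE TABLE elixhauser_comorbidities (code TEXT, comorbidity TEXT);

    INSERT INTO coh_pt VALUES (1), (2), (3);
    INSERT INTO de_raven_diagnosis VALUES
        (1, 'I10',  '2016-03-01'),   -- in window
        (1, 'I10',  '2016-09-15'),   -- repeat of the same comorbidity
        (1, 'E119', '2018-02-01'),   -- outside the date window
        (2, 'E119', '2015-06-30'),   -- in window
        (4, 'I10',  '2016-01-01');   -- patient not in the cohort
    INSERT INTO elixhauser_comorbidities VALUES
        ('I10',  'hypertension'),
        ('E119', 'diabetes');
""")

# Date bounds, named after the notebook parameters (values are examples).
min_dx_dt, max_dx_dt = "2015-01-01", "2017-12-31"

# Same join shape as the notebook query, plus an assumed date filter.
elix_rows = conn.execute("""
    SELECT coh.patient_id, elix.comorbidity
      FROM coh_pt coh
           JOIN de_raven_diagnosis dx
             ON coh.patient_id = dx.patient_id
           JOIN elixhauser_comorbidities elix
             ON elix.code = dx.diagnosis
     WHERE dx.service_dt BETWEEN ? AND ?
     GROUP BY coh.patient_id, elix.comorbidity
     ORDER BY coh.patient_id, elix.comorbidity
""", (min_dx_dt, max_dx_dt)).fetchall()

print(elix_rows)
```

Patient 3 has no diagnoses and patient 4 is not in the cohort, so neither appears in the result. The inner joins mean patients without any mapped comorbidity are dropped rather than kept with a null.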

In [4]:
%%read_sql
--Create Elixhauser comorbidity table
DROP TABLE IF EXISTS cld_elix; 
CREATE TRANSIENT TABLE cld_elix AS
      SELECT coh.patient_id,
             elix.comorbidity
        FROM coh_pt coh
             JOIN de_raven_diagnosis dx
               ON coh.patient_id = dx.patient_id
             JOIN rwd_db.rwd_reference_library.elixhauser_comorbidities elix
               ON elix.code = dx.diagnosis
     GROUP BY coh.patient_id, elix.comorbidity;

Query started at 01:56:30 PM Eastern Daylight Time; Query executed in 0.05 m
Query started at 01:56:32 PM Eastern Daylight Time

ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002043 (02000): SQL compilation error:
Object does not exist, or operation cannot be performed. [SQL: "CREATE TABLE pld_elix AS\n      SELECT coh.patient_id,\n             elix.comorbidity\n        FROM coh_pt coh\n             JOIN de_raven_diagnosis dx\n               ON coh.patient_id = dx.patient_id\n             JOIN rwd_db.rwd.reference_library.elixhauser_comorbidities elix\n               ON elix.code = dx.diagnosis\n       WHERE year_of_service BETWEEN '2015-01-01' AND '2017-12-31'\n     GROUP BY coh.patient_id, elix.comorbidity;"] (Background on this error at: http://sqlalche.me/e/f405)

In [None]:
%%read_sql
--Review counts as a sanity check
SELECT Count(*) AS row_cnt,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct comorbidity) AS comorbidity_cnt
  FROM cld_elix;

In [None]:
%%read_sql
--Quick distribution of the counts
SELECT comorbidity,
       Count(Distinct patient_id) AS pt_cnt,
       Count(Distinct patient_id) / (SELECT Count(*)
                                      FROM coh_pt) AS pct 
  FROM cld_elix
 GROUP BY comorbidity
 ORDER BY pt_cnt DESC;
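
As a rough guide to reading the two review queries, the sketch below runs equivalent statements against a toy `cld_elix` and `coh_pt` in an in-memory SQLite database. All rows are made up for illustration. One difference from the notebook is an assumption worth flagging: SQLite truncates integer division, so the sketch multiplies by `1.0` to get a fractional `pct`.

```python
import sqlite3

# Toy tables; every row is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coh_pt (patient_id INTEGER);
    CREATE TABLE cld_elix (patient_id INTEGER, comorbidity TEXT);

    INSERT INTO coh_pt VALUES (1), (2), (3), (4);
    INSERT INTO cld_elix VALUES
        (1, 'hypertension'), (1, 'diabetes'),
        (2, 'hypertension'), (3, 'hypertension');
""")

# Sanity-check counts: rows, distinct patients, distinct comorbidities.
row_cnt, pt_cnt, comorbidity_cnt = conn.execute("""
    SELECT COUNT(*),
           COUNT(DISTINCT patient_id),
           COUNT(DISTINCT comorbidity)
      FROM cld_elix
""").fetchone()

# Share of the full cohort carrying each comorbidity.
# The `* 1.0` forces floating-point division in SQLite.
dist = conn.execute("""
    SELECT comorbidity,
           COUNT(DISTINCT patient_id) AS pt_cnt,
           COUNT(DISTINCT patient_id) * 1.0 / (SELECT COUNT(*) FROM coh_pt) AS pct
      FROM cld_elix
     GROUP BY comorbidity
     ORDER BY pt_cnt DESC
""").fetchall()

print(row_cnt, pt_cnt, comorbidity_cnt)
print(dist)
```

Because `cld_elix` holds one row per patient and comorbidity, `row_cnt` should never be smaller than `pt_cnt`. Note that `pct` is a fraction of the whole cohort (including patients with no comorbidity), and it stays between 0 and 1 only if `coh_pt` has one row per patient.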