# Insurance Medical
Identifies each patient's payer, as defined by the OOPS team, for raven medical submits.

**Script**
* [scripts/pld/insurance_medical.ipynb](./scripts/pld/insurance_medical.ipynb)

**Prior Script(s)**
* [scripts/cld/insurance_medical.ipynb](./scripts/cld/insurance_medical.ipynb)

**Parameters**
* `in/pld/medical_insurance.xlsx[param]`

**Input**
* `cld_med_ins`
  
**Output**
* `pld_med_ins_pivot`

**Review**
* [scripts/pld/insurance_medical.html](./scripts/pld/insurance_medical.html)

## Patrick requests
Below are a few requests and questions that Patrick Cronin has for this project:

* What is the plan for maintaining the OOPS table (i.e., what about penguin)?
* A standard way to import the data was included
* Move all reference tables to `RWD_REF.ANALYTICS`
* Need to understand how the remits work
* Where is the cleanup function? (Omnya to send Patrick the script and he'll figure out how to)
* `rwd_db.rwd_reference_library.insurance_types` replaced `rwd_db.rwd_reference_library.plan_type_determination`

**Omnya To do**
* Send Patrick the cleanup function


In [None]:
#Import libraries for this notebook
import pandas as pd  
from drg_connect import Snowflake
import numpy as np
import pickle
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#Load connection variables to connect_dict
with open('../../out/conn/connect_dict.pickle', 'rb') as handle:
    connect_dict = pickle.load(handle)

#Create engine to connect to Snowflake
snow = Snowflake(role=connect_dict['role'],
                 warehouse=connect_dict['warehouse'],
                 database=connect_dict['database'],
                 schema=connect_dict['schema'])

#Finish engine setup
engine = snow.engine
%load_ext sql_magic
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications


# Parameters
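Import the run parameters from the `param` sheet of the parameter workbook and expose each one as a Python variable. The pivot query later references `med_start_dt` and `med_end_dt` from this sheet.

**Input**
* `in/pld/medical_insurance.xlsx[param]`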

**Output**
* Python variables named after parameters with the value

In [None]:
#Read the 'param' sheet (header row follows 4 skipped rows) with every value kept as a string
df = pd.read_excel('../../in/pld/medical_insurance.xlsx', sheet_name='param', skiprows=4, dtype=str)
#Map each parameter name to its value, then publish the pairs as notebook-level variables
var_dict = dict(zip(df.parameter, df.value))
globals().update(var_dict)

#Check inputs
pd.DataFrame.from_dict(var_dict, orient='index')
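
The parameter step above can be sketched without the workbook. This is a minimal illustration, not the notebook's code: the `rows` list stands in for the `param` sheet, the values `2018` and `2020` are made up, and `build_params` is a hypothetical helper. It keeps the parameters in a dictionary, rejects names that are not valid Python identifiers, and fills the same `{med_start_dt}` and `{med_end_dt}` placeholders that the pivot query uses.

```python
# Hypothetical (parameter, value) rows standing in for the 'param' sheet.
# The values are invented for illustration only.
rows = [("med_start_dt", "2018"), ("med_end_dt", "2020")]


def build_params(rows):
    """Return a {parameter: value} dict, rejecting unusable names."""
    params = {}
    for name, value in rows:
        # A parameter name must be a valid identifier to work as a
        # Python variable or as a {placeholder} in a query template.
        if not name.isidentifier():
            raise ValueError(f"invalid parameter name: {name!r}")
        params[name] = value
    return params


params = build_params(rows)

# Fill the placeholders of a query fragment from the dictionary.
template = "WHERE year_of_service BETWEEN '{med_start_dt}' AND '{med_end_dt}'"
where_clause = template.format(**params)
print(where_clause)
```

Keeping the values in a dictionary makes it easy to see which names came from the workbook, at the cost of passing the dictionary explicitly wherever the values are needed.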

# Pivot Coverage Type
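Count each patient's claims by insurance group for service years between `med_start_dt` and `med_end_dt`, pivot the counts into one column per insurance group, and assign a single insurance type per patient.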

In [None]:
#Identify the unique insurance groups to put in the dynamic SQL (quoted IN list and column aliases)
unique_values = snow.select("SELECT DISTINCT ins_group FROM cld_med_ins WHERE ins_group IS NOT NULL ORDER BY ins_group")
yr_pivot_values = ",".join("'" + x +"'" for x in unique_values.values.flatten())
unique_values = unique_values.ins_group.replace({' / ':'_'},regex=True)
yr_pivot_values2 = ",".join(x for x in unique_values.values.flatten())
yr_pivot_values
yr_pivot_values2
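
To make the two strings concrete, here is a small sketch of the same transformation on a plain list. The group names are assumed for illustration (the real ones come from `cld_med_ins`), and `build_pivot_lists` is a hypothetical helper, not part of the notebook.

```python
def build_pivot_lists(groups):
    """Build the quoted IN list and the column aliases for a PIVOT."""
    # Quoted literals for PIVOT(... FOR ins_group IN (<here>))
    in_list = ",".join("'" + g + "'" for g in groups)
    # Column aliases: ' / ' becomes '_' so each name is a plain identifier
    aliases = ",".join(g.replace(" / ", "_") for g in groups)
    return in_list, aliases


# Assumed example values; the notebook reads these from Snowflake.
groups = ["commercial", "medicaid", "medicare", "va / other"]
in_list, aliases = build_pivot_lists(groups)
print(in_list)   # 'commercial','medicaid','medicare','va / other'
print(aliases)   # commercial,medicaid,medicare,va_other
```

The two lists must stay in the same order, because the aliases in `AS p (patient_id, ...)` are matched to the pivoted columns by position.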

In [None]:
%%read_sql
--Create subset of claim counts by patient and insurance group
CREATE OR REPLACE TEMP TABLE tmp_pivot_ins AS
    SELECT patient_id,
           ins_group,
           Count(Distinct patient_id, claim_id, year_of_service) AS claim_cnt
     FROM cld_med_ins
    WHERE year_of_service BETWEEN '{med_start_dt}' And '{med_end_dt}'
   GROUP BY patient_id, ins_group;

--Pivot on insurance group to get one count column per group
DROP TABLE IF EXISTS pld_med_ins_pivot;
CREATE TRANSIENT TABLE pld_med_ins_pivot AS
    SELECT *
      FROM tmp_pivot_ins
           PIVOT(Sum(claim_cnt) for ins_group IN ({yr_pivot_values}))
           AS p (patient_id, {yr_pivot_values2}); 

--Add final insurance column
BEGIN;
ALTER TABLE pld_med_ins_pivot
        ADD insurance VARCHAR(50);
COMMIT;

--Update data to a single insurance type
BEGIN;
UPDATE pld_med_ins_pivot
   SET insurance = CASE WHEN medicare > 0 AND commercial > 0 THEN 'Medicare Advantage'
                        WHEN medicare > 0 THEN 'Medicare'
                        WHEN medicaid > 0 THEN 'Medicaid'
                        WHEN commercial > 0 THEN 'Commercial'
                        WHEN va_other > 0 THEN 'VA Other'
                        ELSE 'Unknown'
                    END;
COMMIT;
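
The `CASE` expression above is order-sensitive: the first matching branch wins. The following sketch mirrors that hierarchy in Python so the precedence can be checked on a few rows. `classify_insurance` is a hypothetical helper, and `None` stands in for a SQL `NULL` count, which fails every `> 0` test.

```python
def classify_insurance(counts):
    """Mirror the CASE hierarchy for one patient's pivoted claim counts."""

    def has(group):
        # A missing or None count behaves like SQL NULL: not > 0.
        return (counts.get(group) or 0) > 0

    if has("medicare") and has("commercial"):
        return "Medicare Advantage"
    if has("medicare"):
        return "Medicare"
    if has("medicaid"):
        return "Medicaid"
    if has("commercial"):
        return "Commercial"
    if has("va_other"):
        return "VA Other"
    return "Unknown"


print(classify_insurance({"medicare": 3, "commercial": 1}))  # Medicare Advantage
print(classify_insurance({"medicaid": 2, "commercial": 5}))  # Medicaid
print(classify_insurance({"va_other": None}))                # Unknown
```

Note that a patient with both Medicaid and commercial claims is labeled Medicaid, because the Medicaid branch is tested first.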

In [None]:
%%read_sql
--Review Counts
SELECT insurance,
       Count(*) as cnt,
       Count(*) / (SELECT Count(*)
                     FROM pld_med_ins_pivot) AS pct
  FROM pld_med_ins_pivot
 GROUP BY insurance
 ORDER BY cnt DESC;
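
For a quick check outside Snowflake, the same review can be sketched over a list of insurance labels. The labels below are invented, and `summarize` is a hypothetical helper: it returns each label with its count and its share of all rows, most frequent first.

```python
from collections import Counter


def summarize(labels):
    """Return (label, count, share) tuples, most frequent label first."""
    counts = Counter(labels)
    total = len(labels)
    return [(label, cnt, cnt / total) for label, cnt in counts.most_common()]


# Invented labels for illustration only.
labels = ["Medicare", "Medicare", "Commercial", "Unknown"]
summary = summarize(labels)
for label, cnt, pct in summary:
    print(f"{label:<12}{cnt:>4}{pct:>8.2%}")
```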