# Pharmacy Dedup
This notebook takes the master pharmacy table, removes reversed and rejected claims, and deduplicates claims based on a unique service ID.

**Script**
* [scripts/dc/raven_pharmacy_dedup.ipynb](./scripts/dc/raven_pharmacy_dedup.ipynb)

**Prior Script(s)**
* [scripts/de/raven_pharmacy.ipynb](./scripts/de/raven_pharmacy.ipynb)

**Parameters**
* none

**Input**
* `de_raven_pharmacy`
* `rwd_db.rwd_reference_library.insurance_types`
  
**Output**
* `dc_rxdedup_final`

**Review**
* [scripts/dc/raven_pharmacy_dedup.html](./scripts/dc/raven_pharmacy_dedup.html)

In [1]:
#Import libraries for this notebook
import pandas as pd  
from drg_connect import Snowflake
import numpy as np
import pickle
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

#Load connection variables to connect_dict
with open('../../out/conn/connect_dict.pickle', 'rb') as handle:
    connect_dict = pickle.load(handle)

#Create engine to connect to Snowflake
snow = Snowflake(role=connect_dict['role'],
                 warehouse=connect_dict['warehouse'],
                 database=connect_dict['database'],
                 schema=connect_dict['schema'])

#Finish engine setup
engine = snow.engine
%load_ext sql_magic
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications


# Master Table
Create a copy of `de_raven_pharmacy` as the basis of this analysis

In [2]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_rx_master;
CREATE TRANSIENT TABLE dc_rxdedup_rx_master AS
    SELECT phar.*, 
           ins.type_coverage       AS coverage_type, 
           ins.ici_insurance_group AS insurance_type  
      FROM de_raven_pharmacy phar 
           LEFT JOIN rwd_db.rwd_reference_library.insurance_types ins --PRC: I switched to this table as it's the most up to date
                  ON phar.type_of_payment = ins.type_coverage; 

Query started at 02:13:18 PM Eastern Daylight Time; Query executed in 0.09 m
Query started at 02:13:24 PM Eastern Daylight Time; Query executed in 0.15 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_MASTER successfully created.


In [3]:
%%read_sql
--Check row counts
SELECT Count(*) AS row_cnt
  FROM dc_rxdedup_rx_master

Query started at 02:13:33 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,row_cnt
0,20154748


# Data Cleaning
Clean up rejects and days supply

## Delete Rejects
Delete rejected claims from the master table

In [4]:
%%read_sql
--Drop bad rows based on rejection codes
BEGIN;
DELETE FROM dc_rxdedup_rx_master 
WHERE  reject_code_1 IS NOT NULL 
        OR reject_code_2 IS NOT NULL 
        OR reject_code_3 IS NOT NULL 
        OR reject_code_4 IS NOT NULL 
        OR reject_code_5 IS NOT NULL; 
COMMIT;

Query started at 02:13:35 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:13:37 PM Eastern Daylight Time; Query executed in 0.18 m
Query started at 02:13:47 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.


In [5]:
%%read_sql
--Review counts post deletion of rows with rejection codes
SELECT Count(*) AS row_cnt
  FROM dc_rxdedup_rx_master

Query started at 02:13:49 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,row_cnt
0,14655881
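
As an illustration of the reject filter above, here is a minimal Python sketch. Only the `reject_code_1` to `reject_code_5` column names come from the notebook; the sample rows and the `is_rejected` helper are hypothetical. A claim is dropped when any of the five reject-code fields is populated, mirroring the `IS NOT NULL` checks in the `DELETE`.

```python
# Column names taken from the DELETE statement; everything else is illustrative.
REJECT_COLUMNS = [f"reject_code_{i}" for i in range(1, 6)]


def is_rejected(claim):
    """Return True if any reject-code field is populated (not None)."""
    return any(claim.get(col) is not None for col in REJECT_COLUMNS)


# Hypothetical claims; missing keys are treated like SQL NULLs.
claims = [
    {"claim_id": "c1", "reject_code_1": None, "reject_code_2": None},
    {"claim_id": "c2", "reject_code_1": "75", "reject_code_2": None},
    {"claim_id": "c3", "reject_code_3": "70"},
]

kept = [c["claim_id"] for c in claims if not is_rejected(c)]
print(kept)
```

In the run recorded above, the row count falls from 20,154,748 to 14,655,881, so 5,498,867 rows (roughly 27%) carried at least one reject code.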


## Days supply 
Clean up the `days_supply` values by stripping the leading zeros

In [6]:
%%read_sql
--PRC: need to rewrite as a regular expression
BEGIN;
UPDATE dc_rxdedup_rx_master 
SET    days_supply = CASE 
                       WHEN days_supply = '0' THEN '0' 
                       WHEN LEFT(days_supply, 2) = '00' THEN RIGHT(days_supply, 1) -- If 007, take 7 
                       WHEN LEFT(days_supply, 1) = '0' THEN RIGHT(days_supply, 2) -- If 030, take 30 
                       ELSE days_supply 
                     END ;
COMMIT;

Query started at 02:13:51 PM Eastern Daylight Time; Query executed in 0.04 m
Query started at 02:13:53 PM Eastern Daylight Time; Query executed in 0.14 m
Query started at 02:14:01 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,status
0,Statement executed successfully.
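
The PRC note in the cell above suggests rewriting the `CASE` as a regular expression. The following is a minimal sketch of that idea in Python, not the notebook's actual implementation; `strip_leading_zeros` is a hypothetical helper. It removes leading zeros while keeping a single `0` for all-zero values.

```python
import re


def strip_leading_zeros(days_supply):
    """Remove leading zeros but keep one '0' when the value is all zeros."""
    if days_supply is None:
        return None
    # Drop zeros at the start only when at least one more digit follows them.
    return re.sub(r"^0+(?=\d)", "", days_supply)


examples = ["0", "007", "030", "30", "000", "05"]
cleaned = [strip_leading_zeros(v) for v in examples]
print(cleaned)
```

One difference worth checking before adopting this: for a two-character value such as `05`, the `CASE` expression falls into the `RIGHT(days_supply, 2)` branch and leaves `05` unchanged, whereas the regular expression yields `5`.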


In [7]:
%%read_sql
--Review distribution to be sure values make sense
SELECT days_supply,
       Count(*) AS row_cnt,
       Count(*) / (SELECT Count(*)
                     FROM dc_rxdedup_rx_master) AS pct
  FROM dc_rxdedup_rx_master
 GROUP BY days_supply
 ORDER BY row_cnt DESC
 LIMIT 30;

Query started at 02:14:03 PM Eastern Daylight Time; Query executed in 0.05 m

Unnamed: 0,days_supply,row_cnt,pct
0,30,9123057,0.622484
1,90,1725489,0.117734
2,28,514197,0.035085
3,10,409178,0.027919
4,7,341569,0.023306
5,15,286473,0.019547
6,5,267019,0.018219
7,25,184480,0.012587
8,1,167344,0.011418
9,14,161038,0.010988


# Impute IDs
Impute `patient_id` values for claims that do not have them

## Abandoned claims
Create a table of the abandoned claims

Steps to add valid patient IDs to the reversed claims:
1. Add a patient ID to each abandoned claim based on the non-abandoned claims
2. Reversed claims have junk patient IDs - XXX -% 

Get all abandoned claims

In [8]:
%%read_sql
--PRC: I'm not sure why this is a select *.  I don't think we need all the columns
DROP TABLE IF EXISTS dc_rxdedup_rx_Aban_01;
CREATE TRANSIENT TABLE dc_rxdedup_rx_Aban_01 AS
    SELECT * 
    FROM   dc_rxdedup_rx_master 
    WHERE  transaction_code = 'B2' 
           AND response_code = 'A'; 

Query started at 02:14:06 PM Eastern Daylight Time; Query executed in 0.02 m
Query started at 02:14:08 PM Eastern Daylight Time; Query executed in 0.15 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_ABAN_01 successfully created.


In [9]:
%%read_sql
--Review table size
SELECT Count(*),
       Count(Distinct patient_id) AS pt_cnt,
       Sum(CASE WHEN patient_id IS NULL THEN 1 ELSE 0 END) as null_pt_cnt
  FROM dc_rxdedup_rx_Aban_01;

Query started at 02:14:17 PM Eastern Daylight Time; Query executed in 0.12 m

Unnamed: 0,COUNT(*),pt_cnt,null_pt_cnt
0,4752,285,110



## Non-abandoned claims
Create a table of the non-abandoned claims

In [10]:
%%read_sql 
DROP TABLE IF EXISTS dc_rxdedup_rx_NonAban_01;
CREATE TRANSIENT TABLE dc_rxdedup_rx_NonAban_01 AS
SELECT * 
FROM   dc_rxdedup_rx_master 
WHERE  transaction_code <> 'B2' 
       AND patient_id IS NOT NULL 
;

Query started at 02:14:24 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:14:25 PM Eastern Daylight Time; Query executed in 0.12 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_NONABAN_01 successfully cr...


In [11]:
%%read_sql
--Review non-abandoned row counts
SELECT Count(*) AS row_cnt
  FROM dc_rxdedup_rx_NonAban_01;

Query started at 02:14:33 PM Eastern Daylight Time; Query executed in 0.03 m

Unnamed: 0,row_cnt
0,14651115


## Patient ID for abandoned claims
Get the patient ID for each abandoned claim 
  
1. Unique mapping key: service_provider_id + prescription_or_service_reference_number + fill_number + product_or_service_id (the query below uses `provider_id` and `ndc` for the first and last components)
2. Check whether the unique mapping key maps to a single patient, so that it can be used as a derived key to fetch the patient ID for the reversed claims from the non-reversed claims
     

In [12]:
%%read_sql 

DROP TABLE IF EXISTS dc_rxdedup_rx_NonAban_Abn_Mapping_01;
CREATE TRANSIENT TABLE dc_rxdedup_rx_NonAban_Abn_Mapping_01 AS
SELECT provider_id, 
       prescription_or_service_reference_number, 
       fill_number, 
       ndc, 
       Count (DISTINCT patient_id) AS Pat_Vol 
FROM   dc_rxdedup_rx_nonaban_01 
GROUP  BY provider_id, 
          prescription_or_service_reference_number, 
          fill_number, 
          ndc 
HAVING pat_vol = 1; 
  

Query started at 02:14:34 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:14:36 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_NONABAN_ABN_MAPPING_01 suc...
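
The uniqueness check in the cell above (`GROUP BY` the mapping key, `HAVING pat_vol = 1`) can be sketched in plain Python as follows. The sample tuples are hypothetical; only the four key columns and `patient_id` come from the query.

```python
from collections import defaultdict

# Hypothetical non-abandoned claims:
# (provider_id, prescription_or_service_reference_number, fill_number, ndc, patient_id)
non_abandoned = [
    ("prov1", "rx100", "00", "ndcA", "pat1"),
    ("prov1", "rx100", "00", "ndcA", "pat1"),  # same key, same patient
    ("prov2", "rx200", "01", "ndcB", "pat2"),
    ("prov2", "rx200", "01", "ndcB", "pat3"),  # same key, different patient
]

# Collect the distinct patients seen for each mapping key.
patients_by_key = defaultdict(set)
for provider_id, rx_number, fill_number, ndc, patient_id in non_abandoned:
    patients_by_key[(provider_id, rx_number, fill_number, ndc)].add(patient_id)

# Keep only keys that resolve to exactly one patient (Pat_Vol = 1).
unique_keys = {
    key: next(iter(patients))
    for key, patients in patients_by_key.items()
    if len(patients) == 1
}
print(unique_keys)
```

Keys that resolve to more than one patient are left out of the mapping.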


## Assign ID to abandoned claims
The following logic tries to assign the most relevant patient ID to each abandoned claim 
  


Get the closest non-reversed claim to identify the valid patient ID (works for 95% of the cases). 
This applies to claims whose unique mapping key maps 1-to-1 to a patient, 
i.e. the key is present in the table above


In [13]:
%%read_sql 

DROP TABLE IF EXISTS dc_rxdedup_rx_NonAban_Abn_Mapping;
CREATE TRANSIENT TABLE dc_rxdedup_rx_NonAban_Abn_Mapping AS
WITH abondt1 
     AS (SELECT AB2.claim_id, 
                Datediff(day, To_date(AB2.date_of_service), To_date(RB1.date_of_service)) AS DiffDate,
                AB2.date_of_service                                                       AS ABDT,
                AB2.provider_id, 
                AB2.prescription_or_service_reference_number, 
                AB2.fill_number, 
                AB2.ndc, 
                AB2.insurance_type, 
                AB2.coverage_type,  
                AB2.data_source, 
                RB1.date_of_service                                                       AS RBDT,
                RB1.claim_id                                                              AS Non_Abn_claim_id,
                RB1.patient_id                                                            AS Non_Abn_patient_id,
                RB1.response_code                                                         AS Non_Abn_Response_Code,
                --RB1.patient_gender,  
                --RB1.patient_age,  
                CASE 
                  WHEN RB1.patient_id IS NULL THEN 'Null_Exclude' 
                  WHEN inc.prescription_or_service_reference_number IS NULL THEN 'Not_Unique_Exclude'
                  ELSE 'Include' 
                END                                                                       AS Mapping_Flag
         FROM   dc_rxdedup_rx_aban_01 AB2 
                LEFT JOIN dc_rxdedup_rx_nonaban_01 RB1 
                       ON AB2.provider_id = RB1.provider_id 
                          AND AB2.prescription_or_service_reference_number = 
                              RB1.prescription_or_service_reference_number 
                          AND AB2.fill_number = RB1.fill_number 
                          AND AB2.ndc = RB1.ndc 
                LEFT JOIN dc_rxdedup_rx_nonaban_abn_mapping_01 Inc 
                       ON AB2.provider_id = Inc.provider_id 
                          AND AB2.prescription_or_service_reference_number = 
                              Inc.prescription_or_service_reference_number 
                          AND AB2.fill_number = Inc.fill_number 
                          AND AB2.ndc = Inc.ndc), 
     abondt2 
     AS (SELECT *, 
                Row_number () 
                  OVER ( 
                    partition BY claim_id 
                    ORDER BY Abs(diffdate) ASC) AS Rnk1 
         FROM   abondt1) 
SELECT * 
FROM   abondt2 

Query started at 02:14:42 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:14:43 PM Eastern Daylight Time; Query executed in 0.15 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_NONABAN_ABN_MAPPING succes...
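
The `Mapping_Flag` expression in the cell above sorts each abandoned claim into one of three buckets. A minimal Python sketch of that classification, under the assumption that the two inputs stand for the results of the two `LEFT JOIN`s (`mapping_flag` and its parameters are hypothetical names):

```python
def mapping_flag(matched_patient_id, key_is_unique):
    """Mirror the CASE expression: the first matching branch wins.

    matched_patient_id: patient_id from the non-abandoned join, or None if no match.
    key_is_unique: True if the key appears in the 1-to-1 mapping table.
    """
    if matched_patient_id is None:
        return "Null_Exclude"
    if not key_is_unique:
        return "Not_Unique_Exclude"
    return "Include"


flags = [
    mapping_flag(None, False),    # no non-abandoned claim shares the key
    mapping_flag("pat2", False),  # key matches, but maps to several patients
    mapping_flag("pat1", True),   # key matches exactly one patient
]
print(flags)
```

Only `Include` rows are used directly; the other two flags feed the fallback queries that follow.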


## Find closest non-reversed claim
Get the closest non-reversed claim to identify the valid patient ID without `fill_number`

 
This is done on the subset of claims that was not mapped by the logic above. The join condition is less stringent because `fill_number` is excluded, but a claim only qualifies if a matching record exists within a 30-day window for the given join keys


In [14]:
%%read_sql

DROP TABLE IF EXISTS dc_rxdedup_rx_NonAban_Abn_Mapping_1;
CREATE TRANSIENT TABLE dc_rxdedup_rx_NonAban_Abn_Mapping_1 AS
WITH abondt1 
     AS (SELECT AB2.claim_id, 
                Datediff(day, To_date(AB2.abdt), To_date(RB1.date_of_service)) AS DiffDate, 
                AB2.abdt, 
                AB2.provider_id, 
                AB2.prescription_or_service_reference_number, 
                AB2.fill_number, 
                AB2.ndc, 
                AB2.insurance_type, 
                AB2.coverage_type, 
                AB2.data_source, 
                RB1.date_of_service                                            AS RBDT, 
                RB1.claim_id                                                AS Non_Abn_claim_id,
                RB1.patient_id                                              AS Non_Abn_patient_id,
                RB1.response_code                                           AS Non_Abn_Response_Code
                --RB1.patient_gender, 
                --RB1.patient_age 
         FROM   dc_rxdedup_rx_nonaban_abn_mapping AB2 
                INNER JOIN dc_rxdedup_rx_nonaban_01 RB1 
                        ON AB2.provider_id = RB1.provider_id 
                           AND AB2.prescription_or_service_reference_number = 
                               RB1.prescription_or_service_reference_number 
                           AND AB2.ndc = RB1.ndc 
         WHERE  mapping_flag = 'Null_Exclude'), 
     abondt2 
     AS (SELECT *, 
                Row_number () 
                  OVER ( 
                    partition BY claim_id 
                    ORDER BY Abs(diffdate) ASC) AS Rnk1 
         FROM   abondt1) 
SELECT * 
FROM   abondt2 
WHERE  rnk1 = 1 
       AND Abs(diffdate) <= 30; 

Query started at 02:14:52 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:14:54 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_NONABAN_ABN_MAPPING_1 succ...
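
The "closest claim within 30 days" rule from the cell above can be sketched in Python as below. This is an illustration under assumptions: `closest_within_window` and the sample dates are hypothetical, and ties are resolved by list order here, whereas `Row_number()` over tied values gives no such guarantee.

```python
from datetime import date

WINDOW_DAYS = 30  # matches the Abs(diffdate) <= 30 filter in the query


def closest_within_window(abandoned_date, candidates, window_days=WINDOW_DAYS):
    """Return the patient_id of the nearest candidate, or None if it is too far away.

    candidates is a list of (date_of_service, patient_id) tuples.
    """
    if not candidates:
        return None
    # Rank first (nearest date wins), then apply the window to that single row.
    service_date, patient_id = min(
        candidates, key=lambda c: abs((c[0] - abandoned_date).days)
    )
    if abs((service_date - abandoned_date).days) > window_days:
        return None
    return patient_id


near = closest_within_window(
    date(2020, 3, 10), [(date(2020, 1, 1), "p1"), (date(2020, 3, 1), "p2")]
)
far = closest_within_window(date(2020, 3, 10), [(date(2019, 1, 1), "p3")])
print(near, far)
```

Because the window is applied after ranking, a claim whose nearest match lies outside the window gets no patient ID at all.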


For claims flagged `Not_Unique_Exclude` (the mapping key is not unique to one patient), get the closest paid non-reversed claim 
within a 30-day window to identify the valid patient ID. Note that the query below still joins on `fill_number`.


In [15]:
%%read_sql 

DROP TABLE IF EXISTS dc_rxdedup_rx_NonAban_Abn_Mapping_2;
CREATE TRANSIENT TABLE dc_rxdedup_rx_NonAban_Abn_Mapping_2 AS
WITH abondt1 
     AS (SELECT AB2.claim_id, 
                Datediff(day, To_date(AB2.abdt), To_date(RB1.date_of_service)) AS DiffDate, 
                AB2.abdt, 
                AB2.provider_id, 
                AB2.prescription_or_service_reference_number, 
                AB2.fill_number, 
                AB2.ndc, 
                AB2.insurance_type, 
                AB2.coverage_type, 
                AB2.data_source, 
                RB1.date_of_service                                           AS RBDT, 
                RB1.claim_id                                                  AS Non_Abn_claim_id,
                RB1.patient_id                                                AS Non_Abn_patient_id,
                RB1.response_code                                             AS Non_Abn_Response_Code
                --RB1.patient_gender, 
                --RB1.patient_age 
         FROM   dc_rxdedup_rx_nonaban_abn_mapping AB2 
                INNER JOIN dc_rxdedup_rx_nonaban_01 RB1 
                        ON AB2.provider_id = RB1.provider_id 
                           AND AB2.prescription_or_service_reference_number = 
                               RB1.prescription_or_service_reference_number 
                           AND AB2.fill_number = RB1.fill_number 
                           AND AB2.ndc = RB1.ndc 
         WHERE  mapping_flag = 'Not_Unique_Exclude' 
                AND RB1.response_code = 'P'), 
     abondt2 
     AS (SELECT *, 
                Row_number () 
                  OVER ( 
                    partition BY claim_id 
                    ORDER BY Abs(diffdate) ASC) AS Rnk1 
         FROM   abondt1) 
SELECT * 
FROM   abondt2 
WHERE  rnk1 = 1 
       AND Abs(diffdate) <= 30 
       AND non_abn_patient_id IS NOT NULL; 

Query started at 02:14:57 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:14:59 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_NONABAN_ABN_MAPPING_2 succ...


## Combine data
Combine all the subsets of reversed claims, now with the relevant patient ID added



Abandoned claims mapped to a valid patient ID 


In [16]:
%%read_sql 

DROP TABLE IF EXISTS dc_rxdedup_rx_NonAban_Abn_Mapping_3;
CREATE TRANSIENT TABLE dc_rxdedup_rx_NonAban_Abn_Mapping_3 AS
    SELECT claim_id, 
           non_abn_patient_id--, 
           --patient_gender, 
           --patient_age 
    FROM   dc_rxdedup_rx_nonaban_abn_mapping 
    WHERE  mapping_flag = 'Include' 
    UNION 
    SELECT claim_id, 
           non_abn_patient_id--, 
           --patient_gender, 
           --patient_age 
    FROM   dc_rxdedup_rx_nonaban_abn_mapping_1 
    UNION 
    SELECT claim_id, 
           non_abn_patient_id--, 
           --patient_gender, 
           --patient_age 
    FROM   dc_rxdedup_rx_nonaban_abn_mapping_2; 



Query started at 02:15:01 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:15:03 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,status
0,Table DC_RXDEDUP_RX_NONABAN_ABN_MAPPING_3 succ...


Combine the abandoned claims and the non-reversed records to recreate the table with valid patient IDs
  

In [17]:
%%read_sql 

create or replace table dc_rxdedup_final as

SELECT b.non_abn_patient_id, 
       --b.patient_gender, 
       --b.patient_age, 
       a.claim_id, 
       prescription_or_service_reference_number, 
       date_prescription_written, 
       gross_amount_due_submitted, 
       patient_paid_amount_submitted, 
       patient_pay_amount, 
       total_amount_paid, 
       script_id, 
       date_of_service, 
       coverage_type, 
       insurance_type, 
       ndc, 
       days_supply, 
       fill_number, 
       quantity_dispensed, 
       number_of_refills_authorized, 
       transaction_code, 
       response_code, 
       provider_id, 
       provider_npi, 
       date_authorized, 
       time_authorized, 
       --timestamp, 
       data_source 
FROM   dc_rxdedup_rx_aban_01 a 
       INNER JOIN dc_rxdedup_rx_nonaban_abn_mapping_3 b 
               ON a.claim_id = b.claim_id 
UNION 
SELECT mast.patient_id, 
       --mast.patient_gender, 
       --mast.patient_age, 
       mast.claim_id, 
       mast.prescription_or_service_reference_number, 
       mast.date_prescription_written, 
       mast.gross_amount_due_submitted, 
       mast.patient_paid_amount_submitted, 
       mast.patient_pay_amount, 
       mast.total_amount_paid, 
       mast.script_id, 
       mast.date_of_service, 
       mast.coverage_type, 
       mast.insurance_type, 
       mast.ndc, 
       mast.days_supply, 
       mast.fill_number, 
       mast.quantity_dispensed, 
       mast.number_of_refills_authorized, 
       mast.transaction_code, 
       mast.response_code, 
       mast.provider_id, 
       mast.provider_npi, 
       mast.date_authorized, 
       mast.time_authorized, 
       --mast.timestamp, 
       mast.data_source 
FROM   dc_rxdedup_rx_master mast
       LEFT JOIN dc_rxdedup_rx_aban_01 abon
              ON mast.claim_id = abon.claim_id 
WHERE  abon.claim_id IS NULL; 

Query started at 02:15:05 PM Eastern Daylight Time; Query executed in 0.29 m

Unnamed: 0,status
0,Table DC_RXDEDUP_FINAL successfully created.


# Rx Life Cycle 
Logic for teasing out the Rx life cycle (rejection / reversal / abandonment / approval / dispensed claims) 
    
Abandonment flag: 
1. Patient ID + Rx Num + Fill Number + NDC  is the unique script 
2. Look for the Latest Reversed Claim and the Latest Paid claim
3. If the Latest Reversal date >= Latest Paid claim then 


Step 1: Consider same-day reversals (this holds true in most cases). The paid claim is labeled 'Approved' and the reversal is labeled 'Abandoned'. 


## Step 1

In [18]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v2;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v2 AS
WITH aban_t1 
     AS (SELECT *, 
                Row_number () 
                  OVER ( 
                    partition BY non_abn_patient_id, ndc, prescription_or_service_reference_number, fill_number
                    ORDER BY date_of_service DESC) AS Final_Reversal 
         FROM   dc_rxdedup_final 
         WHERE  transaction_code = 'B2' 
                AND response_code = 'A'), 
     paid_t1 
     AS (SELECT *, 
                Row_number () 
                  OVER ( 
                    partition BY non_abn_patient_id, ndc, prescription_or_service_reference_number, fill_number
                    ORDER BY date_of_service DESC) AS Final_Paid 
         FROM   dc_rxdedup_final 
         WHERE  response_code = 'P'), 
     abn_paid_t1 
     AS (SELECT ab.*, 
                Datediff(day, ab.date_of_service, pd.date_of_service) AS DiffDate, 
                pd.date_of_service                                 AS paid_date_of_service, 
                pd.claim_id                                 AS paid_claim_id, 
                pd.response_code                                AS paid_response_code 
         FROM   aban_t1 ab 
                INNER JOIN paid_t1 pd 
                        ON ab.non_abn_patient_id = pd.non_abn_patient_id 
                           AND ab.ndc = pd.ndc 
                           AND ab.prescription_or_service_reference_number = pd.prescription_or_service_reference_number
                           AND ab.fill_number = pd.fill_number 
                           AND ab.date_of_service = pd.date_of_service), 
     sameday_reverse_paidt1 
     AS (SELECT non_abn_patient_id, 
                ndc, 
                prescription_or_service_reference_number, 
                fill_number, 
                paid_claim_id, 
                paid_response_code, 
                paid_date_of_service, 
                CASE 
                  WHEN diffdate = 0 THEN 'Approved' 
                END AS Modified_Response_cod 
         FROM   abn_paid_t1 
         WHERE  paid_response_code = 'P' 
         UNION 
         SELECT non_abn_patient_id, 
                ndc, 
                prescription_or_service_reference_number, 
                fill_number, 
                claim_id, 
                response_code, 
                date_of_service, 
                CASE 
                  WHEN diffdate = 0 THEN 'Abandoned' 
                END AS Modified_Response_cod 
         FROM   abn_paid_t1 
         WHERE  response_code = 'A'), 
     only_paidt1 
     AS (SELECT pd.*, 
                Datediff(day, ab.date_of_service, pd.date_of_service) AS DiffDate, 
                pd.date_of_service                                 AS paid_date_of_service, 
                'Dispensed'                                     AS Modified_Response_cod 
         FROM   paid_t1 pd 
                LEFT JOIN aban_t1 ab 
                       ON ab.non_abn_patient_id = pd.non_abn_patient_id 
                          AND ab.ndc = pd.ndc 
                          AND ab.prescription_or_service_reference_number = pd.prescription_or_service_reference_number
                          AND ab.fill_number = pd.fill_number 
         WHERE  ab.non_abn_patient_id IS NULL) 
SELECT paid_claim_id, 
       modified_response_cod 
FROM   sameday_reverse_paidt1 
UNION 
SELECT claim_id, 
       modified_response_cod 
FROM   only_paidt1; 

Query started at 02:15:23 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:15:25 PM Eastern Daylight Time; Query executed in 0.22 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V2 successfully ...
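
To make the Step 1 labeling easier to follow, here is a minimal Python sketch of the same idea. It is a simplification under assumptions: `label_step1` is a hypothetical helper, and `key` stands for the script key (patient ID + NDC + Rx number + fill number) used in the joins above.

```python
def label_step1(claims):
    """Label same-day paid/reversal pairs and paid claims with no reversal."""
    reversals = [
        c for c in claims
        if c["transaction_code"] == "B2" and c["response_code"] == "A"
    ]
    paid = [c for c in claims if c["response_code"] == "P"]

    reversal_days = {(c["key"], c["date_of_service"]) for c in reversals}
    paid_days = {(c["key"], c["date_of_service"]) for c in paid}
    reversal_keys = {c["key"] for c in reversals}

    labels = {}
    for p in paid:
        if (p["key"], p["date_of_service"]) in reversal_days:
            labels[p["claim_id"]] = "Approved"   # paid, then reversed the same day
        elif p["key"] not in reversal_keys:
            labels[p["claim_id"]] = "Dispensed"  # paid and never reversed
    for r in reversals:
        if (r["key"], r["date_of_service"]) in paid_days:
            labels[r["claim_id"]] = "Abandoned"  # same-day reversal of a paid claim
    return labels


# Hypothetical claims for three scripts.
claims = [
    {"claim_id": "p1", "key": "k1", "date_of_service": "2020-01-01", "transaction_code": "B1", "response_code": "P"},
    {"claim_id": "r1", "key": "k1", "date_of_service": "2020-01-01", "transaction_code": "B2", "response_code": "A"},
    {"claim_id": "p2", "key": "k2", "date_of_service": "2020-01-02", "transaction_code": "B1", "response_code": "P"},
    {"claim_id": "p3", "key": "k3", "date_of_service": "2020-01-01", "transaction_code": "B1", "response_code": "P"},
    {"claim_id": "r3", "key": "k3", "date_of_service": "2020-01-05", "transaction_code": "B2", "response_code": "A"},
]
labels = label_step1(claims)
print(labels)
```

Claims such as `p3` and `r3`, where the reversal falls on a different day from the paid claim, receive no label in this step and are left for the rule in Step 2.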


## Step 2 
The same-day reversal rule covers 98% of the records; for the remaining claims, the following rule is applied  


In [19]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v3;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v3 AS
WITH exclusion_recordst1 
     AS (SELECT * 
         FROM   dc_rxdedup_final a 
                LEFT JOIN dc_rxdedup_lifecycle_rx_v2 exc 
                       ON a.claim_id = exc.paid_claim_id 
         WHERE  exc.paid_claim_id IS NULL 
                AND a.response_code <> 'R'), 
     latest_record_t1 
     AS (SELECT *, 
                Dense_rank () 
                  OVER ( 
                    partition BY non_abn_patient_id, ndc, prescription_or_service_reference_number, fill_number
                    ORDER BY date_of_service DESC) AS Final_Event 
         FROM   exclusion_recordst1), 
     latest_record_t2 
     AS (SELECT DISTINCT a.claim_id, 
                         a.response_code 
         FROM   latest_record_t1 a 
         WHERE  final_event = 1) 
SELECT claim_id, 
       CASE 
         WHEN response_code = 'P' THEN 'Dispensed' 
         WHEN response_code = 'A' THEN 'Abandoned' 
       END AS Modified_Response_cod 
FROM   latest_record_t2 
UNION ALL 
SELECT claim_id, 
       'Rejected' AS Modified_Response_cod 
FROM   dc_rxdedup_final 
WHERE  response_code = 'R'; 

Query started at 02:15:38 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:15:40 PM Eastern Daylight Time; Query executed in 0.13 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V3 successfully ...


## Step 3
Intermediate transaction-status mapping table


In [20]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v4;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v4 AS
    SELECT * 
    FROM   dc_rxdedup_lifecycle_rx_v2 
    UNION 
    SELECT * 
    FROM   dc_rxdedup_lifecycle_rx_v3; 


Query started at 02:15:47 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:15:49 PM Eastern Daylight Time; Query executed in 0.09 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V4 successfully ...


## Step 4
Adding the Modified Response codes to the Base Rx Table 

In [21]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v5;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v5 AS
SELECT DISTINCT a.*, 
                CASE 
                  WHEN b.modified_response_cod IS NULL 
                       AND ( response_code = 'A' 
                             AND transaction_code = 'B2' ) THEN 'Abandoned' 
                  ELSE b.modified_response_cod 
                END AS Modified_Response_cod 
FROM   dc_rxdedup_final a 
       LEFT JOIN dc_rxdedup_lifecycle_rx_v4 b 
              ON a.claim_id = b.paid_claim_id; 


Query started at 02:15:54 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:15:56 PM Eastern Daylight Time; Query executed in 0.29 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V5 successfully ...


## Step 5
This step addresses situations in which reversed claims are paid on the same day.

The assumption is that the abandonment-to-approval ratio should be 1:1, i.e. for each abandoned claim there is a paid claim (from the payer) and a corresponding reversal from the pharmacy. 

So, if there are more paid claims than reversals, the last transaction is assumed to be dispensed even though a reversal and an approval (paid from payer) appear on the same day. 

For such instances, the latest transaction is identified as the one with the latest date/time authorized at the unique script payer/channel level.

Last transaction for the Rx number + fill number combination



In [22]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v6;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v6 AS
WITH aban_t1 
     AS (SELECT *, 
                Dense_rank () 
                  OVER ( 
                    partition BY non_abn_patient_id, ndc, prescription_or_service_reference_number, fill_number
                    ORDER BY date_of_service DESC) AS Final_transaction 
         FROM   dc_rxdedup_final 
         WHERE  ( transaction_code = 'B2' 
                  AND response_code = 'A' ) 
                 OR response_code = 'P'), 
     final_transactiont1 
     AS (SELECT DISTINCT ab.claim_id, 
                         ab.non_abn_patient_id, 
                         ab.prescription_or_service_reference_number, 
                         ab.fill_number, 
                         ab.date_of_service, 
                         ab.response_code, 
                         ab.transaction_code, 
                         ab.ndc 
         FROM   aban_t1 ab 
         WHERE  final_transaction = 1), 
     abandonment_check_t1 
     AS (SELECT ab.non_abn_patient_id, 
                ab.prescription_or_service_reference_number, 
                ab.fill_number, 
                ab.date_of_service, 
                ab.ndc, 
                Count (DISTINCT CASE 
                                  WHEN ab.response_code = 'A' THEN claim_id 
                                END) AS Abandonment_claims, 
                Count (DISTINCT CASE 
                                  WHEN ab.response_code = 'P' THEN claim_id 
                                END) AS Paid_claims 
         FROM   final_transactiont1 ab 
         GROUP  BY ab.non_abn_patient_id, 
                   ab.prescription_or_service_reference_number, 
                   ab.fill_number, 
                   ab.date_of_service, 
                   ab.ndc), 
     morepaidvsabnt1 
     AS (SELECT *, 
                CASE 
                  WHEN paid_claims > abandonment_claims THEN 'Dispensed' 
                END AS Dispense_Abandonment_Flag 
         FROM   abandonment_check_t1 
         WHERE  paid_claims > abandonment_claims), 
     morepaidvsabnt1_finaltransctiont1 
     AS (SELECT cl.non_abn_patient_id, 
                claim_id, 
                cl.date_of_service, 
                cl.prescription_or_service_reference_number, 
                cl.ndc, 
                cl.fill_number, 
                days_supply, 
                response_code, 
                transaction_code, 
                gross_amount_due_submitted, 
                patient_pay_amount, 
                cl.total_amount_paid, 
                --cl.insurance_type, 
                --cl.coverage_type, 
                cl.modified_response_cod, 
                dispense_abandonment_flag, 
                --cl.timestamp, 
                Row_number () 
                  OVER ( 
                    partition BY cl.non_abn_patient_id, cl.prescription_or_service_reference_number
                  , cl.ndc, cl.fill_number, 
                  cl.date_of_service--, --cl.coverage_type --cl.insurance_type 
                    ORDER BY cl.date_authorized DESC, cl.time_authorized DESC) AS Final_transaction_rnk -- latest authorization first
         FROM   dc_rxdedup_lifecycle_rx_v5 cl 
                INNER JOIN morepaidvsabnt1 ab 
                        ON ab.non_abn_patient_id = cl.non_abn_patient_id 
                           AND ab.ndc = cl.ndc 
                           AND ab.prescription_or_service_reference_number = cl.prescription_or_service_reference_number
                           AND ab.fill_number = cl.fill_number 
                           AND ab.date_of_service = cl.date_of_service 
         WHERE  cl.response_code = 'P' 
                AND dispense_abandonment_flag <> modified_response_cod) 
SELECT * 
FROM   morepaidvsabnt1_finaltransctiont1 
WHERE  final_transaction_rnk = 1; 

Query started at 02:16:13 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:16:15 PM Eastern Daylight Time; Query executed in 0.23 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V6 successfully ...
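
The Step 5 rule can be sketched in Python as below. This is a simplified illustration, not the query itself: `dispensed_override` is a hypothetical helper, the `authorized` field is an assumed combination of `date_authorized` and `time_authorized`, and the sketch skips the query's extra filter that excludes paid claims already labeled 'Dispensed'.

```python
def dispensed_override(claims):
    """Return the claim_id to relabel 'Dispensed', or None if the 1:1 ratio holds.

    claims: all paid ('P') and abandoned ('A') claims for one script on its
    latest date of service.
    """
    paid = [c for c in claims if c["response_code"] == "P"]
    abandoned = [c for c in claims if c["response_code"] == "A"]
    # With a 1:1 ratio every paid claim has its own reversal, so nothing changes.
    if len(paid) <= len(abandoned):
        return None
    # Otherwise the paid claim with the latest authorization is taken as dispensed.
    return max(paid, key=lambda c: c["authorized"])["claim_id"]


# Hypothetical script with two paid claims and one reversal on the same day.
script_claims = [
    {"claim_id": "p1", "response_code": "P", "authorized": "2020-01-01 09:00:00"},
    {"claim_id": "r1", "response_code": "A", "authorized": "2020-01-01 09:30:00"},
    {"claim_id": "p2", "response_code": "P", "authorized": "2020-01-01 10:15:00"},
]
override = dispensed_override(script_claims)
balanced = dispensed_override(script_claims[:2])
print(override, balanced)
```

Here `p2` is the extra paid claim with the latest authorization, so it is the one treated as dispensed; the balanced case returns nothing.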


## Step 6
Adding the modified response_code from the v6 table 

In [23]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v7;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v7 AS
SELECT DISTINCT cl.*, 
                CASE 
                  WHEN rx.dispense_abandonment_flag IS NOT NULL THEN rx.dispense_abandonment_flag 
                  ELSE cl.modified_response_cod 
                END AS Final_RESPONSE_CODE 
FROM   dc_rxdedup_lifecycle_rx_v5 cl 
       LEFT JOIN dc_rxdedup_lifecycle_rx_v6 rx 
              ON cl.claim_id = rx.claim_id; 

Query started at 02:16:29 PM Eastern Daylight Time; Query executed in 0.04 m
Query started at 02:16:31 PM Eastern Daylight Time; Query executed in 0.41 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V7 successfully ...


## Step 7
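Add sequence numbers to the transactions to track the order of events. `Transcation_Sequence_Number` ranks events within each patient and NDC by date of service, `script_id`, and event score; `Event_Sequence_Number` ranks dates of service within each patient.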

In [24]:
%%read_sql

DROP TABLE IF EXISTS dc_rxdedup_lifecycle_Rx_v8;
CREATE TRANSIENT TABLE dc_rxdedup_lifecycle_Rx_v8 AS
WITH scriptidt1 
     AS (SELECT *, 
                CASE 
                  WHEN final_response_code = 'Dispensed' THEN 4 
                  WHEN final_response_code = 'Abandoned' THEN 3 
                  WHEN final_response_code = 'Approved' THEN 2 
                  WHEN final_response_code = 'Rejected' THEN 1 
                END AS Event_Score 
         FROM   dc_rxdedup_lifecycle_rx_v7 cl 
         WHERE  final_response_code IS NOT NULL 
                AND script_id IS NOT NULL) 
SELECT DISTINCT *, 
                Dense_rank () 
                  OVER ( 
                    partition BY non_abn_patient_id, ndc 
                    ORDER BY date_of_service ASC, script_id ASC, event_score ASC) AS Transcation_Sequence_Number,
                Dense_rank () 
                  OVER ( 
                    partition BY non_abn_patient_id 
                    ORDER BY date_of_service ASC)                                        AS Event_Sequence_Number
FROM   scriptidt1; 


Query started at 02:16:56 PM Eastern Daylight Time; Query executed in 0.04 m
Query started at 02:16:58 PM Eastern Daylight Time; Query executed in 0.21 m

Unnamed: 0,status
0,Table DC_RXDEDUP_LIFECYCLE_RX_V8 successfully ...
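
To make the two `DENSE_RANK` windows in Step 7 concrete, here is a small pure-Python sketch. The rows are invented toy data, and `dense_rank` is a hypothetical helper written for this illustration, not part of the notebook. The event scores are copied from the `CASE` expression above.

```python
# Event scores copied from the Step 7 CASE expression.
EVENT_SCORE = {"Rejected": 1, "Approved": 2, "Abandoned": 3, "Dispensed": 4}

# Toy rows: values are invented; column names follow the SQL.
rows = [
    {"non_abn_patient_id": "p1", "ndc": "A", "date_of_service": "2020-01-05",
     "script_id": 10, "final_response_code": "Rejected"},
    {"non_abn_patient_id": "p1", "ndc": "A", "date_of_service": "2020-01-05",
     "script_id": 10, "final_response_code": "Dispensed"},
    {"non_abn_patient_id": "p1", "ndc": "B", "date_of_service": "2020-01-05",
     "script_id": 11, "final_response_code": "Dispensed"},
    {"non_abn_patient_id": "p1", "ndc": "A", "date_of_service": "2020-02-01",
     "script_id": 12, "final_response_code": "Dispensed"},
]


def dense_rank(rows, partition_key, order_key):
    """Return one dense rank per row: ties share a rank and ranks have no gaps."""
    groups = {}
    for i, row in enumerate(rows):
        groups.setdefault(partition_key(row), []).append(i)
    ranks = [0] * len(rows)
    for idxs in groups.values():
        # Distinct sort keys in ascending order define the rank positions.
        distinct_keys = sorted({order_key(rows[i]) for i in idxs})
        position = {key: n for n, key in enumerate(distinct_keys, start=1)}
        for i in idxs:
            ranks[i] = position[order_key(rows[i])]
    return ranks


# Per patient and NDC, ordered by date, script_id, then event score.
txn_seq = dense_rank(
    rows,
    partition_key=lambda r: (r["non_abn_patient_id"], r["ndc"]),
    order_key=lambda r: (r["date_of_service"], r["script_id"],
                         EVENT_SCORE[r["final_response_code"]]),
)
# Per patient, ordered by date of service only.
event_seq = dense_rank(
    rows,
    partition_key=lambda r: r["non_abn_patient_id"],
    order_key=lambda r: r["date_of_service"],
)
print(txn_seq, event_seq)
```

Because the event sequence is ordered by date of service alone, every claim a patient has on the same day shares one `Event_Sequence_Number`, regardless of NDC. The transaction sequence restarts for each NDC.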


# Create final table
Final refined Rx table for the LoT analysis:
1. Dispensed claims only
2. One record per prescription (unique `script_id`). When more than one payer is involved, claims are ranked by insurance type and then by `total_amount_paid`, and only the top-ranked claim is kept.


In [25]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_final;
CREATE TRANSIENT TABLE dc_rxdedup_final AS
WITH t1 AS (SELECT * 
              FROM dc_rxdedup_lifecycle_rx_v8 
             WHERE final_response_code = 'Dispensed' ) , 
t2 AS (SELECT   * , 
                row_number () OVER (partition BY script_id ORDER BY 
                CASE 
                  WHEN insurance_type= 'Medicare' THEN 6 
                  WHEN insurance_type= 'Medicaid' THEN 5 
                  WHEN insurance_type= 'Commercial' THEN 4 
                  WHEN insurance_type= 'VA / Other' THEN 3 
                  WHEN insurance_type ilike '%Other Plan Type%' THEN 2 
                  WHEN insurance_type = 'None' THEN 1 
                END DESC , total_amount_paid DESC ) AS rnk 
             FROM     t1 ) 
    SELECT * 
    FROM   t2 
    WHERE  rnk = 1;
    
BEGIN;
ALTER TABLE dc_rxdedup_final RENAME COLUMN non_abn_patient_id TO patient_id;
COMMIT;

Query started at 02:17:10 PM Eastern Daylight Time; Query executed in 0.05 m
Query started at 02:17:14 PM Eastern Daylight Time; Query executed in 0.21 m
Query started at 02:17:26 PM Eastern Daylight Time; Query executed in 0.08 m
Query started at 02:17:31 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:33 PM Eastern Daylight Time; Query executed in 0.04 m

Unnamed: 0,status
0,Statement executed successfully.
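
The payer ranking used to keep one claim per `script_id` can be sketched in plain Python as follows. The rows are invented toy data, and `payer_priority` and `pick_one_claim_per_script` are hypothetical helper names. One assumption differs from the SQL and is worth flagging: the sketch gives an unmatched insurance type the lowest priority (0), whereas in the SQL the `CASE` yields NULL for such rows, and where NULLs land under `ORDER BY ... DESC` depends on Snowflake's NULL-ordering rules.

```python
def payer_priority(insurance_type):
    """Priority copied from the CASE expression; higher wins."""
    exact = {"Medicare": 6, "Medicaid": 5, "Commercial": 4,
             "VA / Other": 3, "None": 1}
    if insurance_type in exact:
        return exact[insurance_type]
    # Stand-in for: insurance_type ILIKE '%Other Plan Type%'
    if insurance_type is not None and "other plan type" in insurance_type.lower():
        return 2
    # Assumption for this sketch only: unmatched types rank last.
    return 0


def pick_one_claim_per_script(rows):
    """Keep dispensed claims, then the best-ranked claim per script_id."""
    best = {}
    for row in rows:
        if row["final_response_code"] != "Dispensed":
            continue
        key = (payer_priority(row["insurance_type"]), row["total_amount_paid"])
        sid = row["script_id"]
        if sid not in best or key > best[sid][0]:
            best[sid] = (key, row)
    return [row for _, row in best.values()]


# Toy rows: values are invented; column names follow the SQL.
claims = [
    {"script_id": 1, "insurance_type": "Commercial", "total_amount_paid": 50.0,
     "final_response_code": "Dispensed"},
    {"script_id": 1, "insurance_type": "Medicare", "total_amount_paid": 10.0,
     "final_response_code": "Dispensed"},
    {"script_id": 2, "insurance_type": "Medicaid", "total_amount_paid": 5.0,
     "final_response_code": "Dispensed"},
    {"script_id": 2, "insurance_type": "Medicaid", "total_amount_paid": 20.0,
     "final_response_code": "Dispensed"},
    {"script_id": 3, "insurance_type": "Medicare", "total_amount_paid": 0.0,
     "final_response_code": "Abandoned"},
]

final_rows = pick_one_claim_per_script(claims)
for row in final_rows:
    print(row["script_id"], row["insurance_type"], row["total_amount_paid"])
```

For script 1 the Medicare claim wins even though the Commercial claim paid more, because insurance type is compared before the amount. Script 3 drops out entirely because it has no dispensed claim. If rows with an unmatched or missing `insurance_type` exist in the real data, it may be worth checking how they sort in the SQL and whether `NULLS LAST` is needed.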


In [26]:
%%read_sql
--Review final counts
SELECT Count(*) AS row_cnt,
       Count(DISTINCT patient_id) AS pt_cnt
FROM   dc_rxdedup_final;

Query started at 02:17:35 PM Eastern Daylight Time; Query executed in 0.08 m

Unnamed: 0,row_cnt,pt_cnt
0,12120273,121006


# Drop Tables
Drop intermediate tables

In [27]:
%%read_sql
DROP TABLE IF EXISTS dc_rxdedup_final_rx;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v2;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v3;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v4;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v5;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v6;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v7;
DROP TABLE IF EXISTS dc_rxdedup_lifecycle_rx_v8;
DROP TABLE IF EXISTS dc_rxdedup_rx_aban_01;
DROP TABLE IF EXISTS dc_rxdedup_rx_final_01;
DROP TABLE IF EXISTS dc_rxdedup_rx_master;
DROP TABLE IF EXISTS dc_rxdedup_rx_nonaban_01;
DROP TABLE IF EXISTS dc_rxdedup_rx_nonaban_abn_mapping;
DROP TABLE IF EXISTS dc_rxdedup_rx_nonaban_abn_mapping_01;
DROP TABLE IF EXISTS dc_rxdedup_rx_nonaban_abn_mapping_1;
DROP TABLE IF EXISTS dc_rxdedup_rx_nonaban_abn_mapping_2;
DROP TABLE IF EXISTS dc_rxdedup_rx_nonaban_abn_mapping_3;


Query started at 02:17:40 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:42 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:44 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:45 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:47 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:49 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:51 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:17:53 PM Eastern Daylight Time; Query executed in 0.12 m
Query started at 02:18:00 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:18:02 PM Eastern Daylight Time; Query executed in 0.05 m
Query started at 02:18:05 PM Eastern Daylight Time; Query executed in 0.04 m
Query started at 02:18:07 PM Eastern Daylight Time; Query executed in 0.03 m
Query started at 02:18:09 PM Eastern Daylight Time; Query executed in 0.03 m
Query starte

Unnamed: 0,status
0,DC_RXDEDUP_RX_NONABAN_ABN_MAPPING_3 successful...
