# Objective
Client requirements:
1. Growth hormone-related disorders
2. Growth hormone - US only
3. Hypoparathyroidism - US and Europe
4. Metrics: 
    Market share(value and volume), by brand and by indication, currently and historical trends. 
    Data on the payer landscape/contracts by brand and by indication would also be very helpful. 
    For HP, TRx and NRx trends would be helpful.

## Analysis plan
Look into Albatross, Pelican and Raven

# Initialization steps

## Importing relevant modules

In [2]:
import pandas as pd
from drg_connect import Snowflake
import qgrid 
from datetime import timedelta, datetime
import math

import warnings
warnings.filterwarnings('ignore')

## Connecting to snowflake

In [3]:
##defining parameters of snowflake
snow = Snowflake(role = 'RWD_ANALYTICS_RW',database='SANDBOX_ANALYTICS',schema = 'SANDBOX')
engine = snow.engine

%reload_ext sql_magic
%config SQL.output_result = True  #Enable output to std out
%config SQL.notify_result = False #disable browser notifications
%config SQL.conn_name = 'engine'  #Set the sql_magic connection engine

# Reference tables
For relevant ICD codes, number of EHR records and number of claims

## ICD grouper table

In [3]:
snow.select("select * from RWD_DB.RWD.ICD_GROUPER limit 3")

Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...


Unnamed: 0,id,level_1,level_1_description,level_2,level_2_description,level_3,level_3_description,level_4,icd9_mapped_codes,icd9_description,level_4_short_description_icd10,level_4_long_description_icd10,create_ts,update_ts
0,1,A00 - B999,Certain infectious and parasitic diseases,A00-A09,Intestinal infectious diseases,A00-A009,Cholera,A00,No map,No map,Cholera,Cholera,2017-11-02,2017-11-02
1,2,A00 - B999,Certain infectious and parasitic diseases,A00-A09,Intestinal infectious diseases,A00-A009,Cholera,A000,0010,Cholera due to vibrio cholerae,Cholera due to Vibrio cholerae 01 biovar cholerae,Cholera due to Vibrio cholerae 01 biovar cholerae,2017-11-02,2017-11-02
2,3,A00 - B999,Certain infectious and parasitic diseases,A00-A09,Intestinal infectious diseases,A00-A009,Cholera,A001,0011,Cholera due to vibrio cholerae el tor,Cholera due to Vibrio cholerae 01 biovar eltor,Cholera due to Vibrio cholerae 01 biovar eltor,2017-11-02,2017-11-02


In [4]:
%%read_sql

select
    *
from
    RWD_DB.RWD.ICD_GROUPER
where
    level_4_short_description_icd10 ilike '%hypoparathyroidism%'

Query started at 01:24:33 PM India Standard Time; Query executed in 0.06 m

Unnamed: 0,id,level_1,level_1_description,level_2,level_2_description,level_3,level_3_description,level_4,icd9_mapped_codes,icd9_description,level_4_short_description_icd10,level_4_long_description_icd10,create_ts,update_ts
0,4300,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E20,No map,No map,Hypoparathyroidism,Hypoparathyroidism,2017-11-02,2017-11-02
1,4301,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E200,No map,No map,Idiopathic hypoparathyroidism,Idiopathic hypoparathyroidism,2017-11-02,2017-11-02
2,4302,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E201,27549,Other disorders of calcium metabolism,Pseudohypoparathyroidism,Pseudohypoparathyroidism,2017-11-02,2017-11-02
3,4303,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E208,No map,No map,Other hypoparathyroidism,Other hypoparathyroidism,2017-11-02,2017-11-02
4,4304,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E209,2521,Hypoparathyroidism,Hypoparathyroidism unspecified,Hypoparathyroidism unspecified,2017-11-02,2017-11-02
5,4811,E00 - E8989,Endocrine nutritional and metabolic diseases,E89-E89,Postprocedural endocrine and metabolic complic...,E89-E898,Postproc endocrine and metabolic comp and diso...,E892,No map,No map,Postprocedural hypoparathyroidism,Postprocedural hypoparathyroidism,2017-11-02,2017-11-02
6,27440,P00 - P969,Certain conditions originating in the perinata...,P70-P74,Transitory endocrine and metabolic disorders s...,P71-P719,Transitory neonatal disorders of calcium and m...,P714,7754,Hypocalcemia and hypomagnesemia of newborn,Transitory neonatal hypoparathyroidism,Transitory neonatal hypoparathyroidism,2017-11-02,2017-11-02


In [6]:
%%read_sql

select
    *
from
    RWD_DB.RWD.ICD_GROUPER
where
    level_2_description ilike '%disorders%'
    and level_2_description ilike '%endocrine%'
    and level_2_description ilike '%glands%'

Query started at 01:33:08 PM India Standard Time; Query executed in 0.06 m

Unnamed: 0,id,level_1,level_1_description,level_2,level_2_description,level_3,level_3_description,level_4,icd9_mapped_codes,icd9_description,level_4_short_description_icd10,level_4_long_description_icd10,create_ts,update_ts
0,4300,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E20,No map,No map,Hypoparathyroidism,Hypoparathyroidism,2017-11-02,2017-11-02
1,4301,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E200,No map,No map,Idiopathic hypoparathyroidism,Idiopathic hypoparathyroidism,2017-11-02,2017-11-02
2,4302,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E201,27549,Other disorders of calcium metabolism,Pseudohypoparathyroidism,Pseudohypoparathyroidism,2017-11-02,2017-11-02
3,4303,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E208,No map,No map,Other hypoparathyroidism,Other hypoparathyroidism,2017-11-02,2017-11-02
4,4304,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E20-E209,Hypoparathyroidism,E209,2521,Hypoparathyroidism,Hypoparathyroidism unspecified,Hypoparathyroidism unspecified,2017-11-02,2017-11-02
5,4305,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E21-E215,Hyperparathyroidism and other disorders of par...,E21,No map,No map,Hyperparathyroidism and other disorders of par...,Hyperparathyroidism and other disorders of par...,2017-11-02,2017-11-02
6,4306,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E21-E215,Hyperparathyroidism and other disorders of par...,E210,25201,Primary hyperparathyroidism,Primary hyperparathyroidism,Primary hyperparathyroidism,2017-11-02,2017-11-02
7,4307,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E21-E215,Hyperparathyroidism and other disorders of par...,E211,25202,Secondary hyperparathyroidism non-renal,Secondary hyperparathyroidism not elsewhere cl...,Secondary hyperparathyroidism not elsewhere cl...,2017-11-02,2017-11-02
8,4308,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E21-E215,Hyperparathyroidism and other disorders of par...,E212,25208,Other hyperparathyroidism,Other hyperparathyroidism,Other hyperparathyroidism,2017-11-02,2017-11-02
9,4309,E00 - E8989,Endocrine nutritional and metabolic diseases,E20-E35,Disorders of other endocrine glands,E21-E215,Hyperparathyroidism and other disorders of par...,E213,25200,Hyperparathyroidism unspecified,Hyperparathyroidism unspecified,Hyperparathyroidism unspecified,2017-11-02,2017-11-02


### Relevant ICD codes?
Exclude E21, E22? Because they refer to over-production?

## Albatross
Looking for hypoparathyroidism

In [7]:
%%read_sql
create or replace table st_adpkd1 as
select 
    left(encrypted_key_1, 8)||left(encrypted_key_2, 8) as patient_id,
    icd9,
    HIPAA_ICD9,
    icd10,
    HIPAA_ICD10,
    recordeddttm,
    left(recordeddttm, 4) as year
    
from RWD_DB.RWD.albatross_EHR_problems
where 
   left (icd10, 3) in ('E20')
or ICD9 in ('2521')

Query started at 02:26:20 PM India Standard Time; Query executed in 14.04 m

Unnamed: 0,status
0,Table ST_ADPKD1 successfully created.


In [9]:
snow.select ("select count (distinct patient_id) from st_adpkd1")

Unnamed: 0,COUNT (DISTINCT PATIENT_ID)
0,17037


## Pelican
Using table: PELICAN_DIAGNOSIS

### Swarali notes: 
There are multiple diagnosis tables. Not sure if we should spend time on getting all relevant records at this stage. Let me know what you think. 

## Raven

In [4]:
%%read_sql
create or replace table ST_gh1_diag as
select
left(encrypted_key_1, 8)||left(encrypted_key_2, 8) as patient_id,
claim_number,
diagnosis,
diagnosis_sequence,
year_of_service
from
RWD_DB.RWD.RAVEN_CLAIMS_SUBMITS_DIAGNOSIS
where
left (diagnosis, 3) in ('E20')
or diagnosis in ('2521')
and year_of_service > '2016-12-31'
and year_of_service < '2018-01-01'

Query started at 12:00:33 PM GMT Daylight TimeInitiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...


KeyboardInterrupt: 

In [6]:
snow.select ("select count (distinct patient_id) from ST_gh1_diag where left(patient_id, 5) != 'XXX -' and diagnosis_sequence = '1'")

Unnamed: 0,COUNT (DISTINCT PATIENT_ID)
0,28532
