## Create a unique index for all medical code types (diag, proc, revenue, and GPI) from the provided input training table



The final index table is used to generate the training data and the OOT (Out-of-Time) data in IHAN format.
To create the OOT modelling data, use the index table built from the corresponding input training table to create the IHAN format data.
The final index table is also used in the interpretation step to retrieve the actual code and its description for the output table.


1. Update the ihan_preg_high_risk_mother_idx_dict_build_tbl.sh file:

    a. Set the target to the correct SQL filename (for example, ihan_preg_high_risk_mother_idx_dict_build_tbl.sql).
    b. If required, give a new log file name.
    c. Provide the correct input parameters:
        * iter: Specify the iteration version, for example, iter4_v1_training.
        * input_tbl: Provide the input training table, for example, NON_CRTFD_AIFS.DL_TS_STAR.piad_tmp_wgs_cob_high_risk_mom_final_df_iter_4_v1_traning.

2. Enter the Snowflake password when prompted.

3. Note down the name of the generated final index table.

Output: Final index table (for example, NON_CRTFD_AIFS.DL_TS_STAR.ihan_high_risk_preg_mother_medical_code_idx_final_iter4_v1_training)
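
The index-building SQL itself is not reproduced in this README, so the following is only a minimal Python sketch of the idea: every distinct code of each type receives one unique integer. The function name `build_code_index`, the ordering of the code types, the choice to leave index 0 unused, and the placeholder codes are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Code types named in this README; the processing order is an assumption.
CODE_TYPES = ("diag", "proc", "revenue", "gpi")


def build_code_index(codes_by_type: Dict[str, Iterable[str]]) -> Dict[Tuple[str, str], int]:
    """Assign one unique integer to every (code_type, code) pair.

    Index 0 is left unused so it could serve as padding (an assumption,
    not something the README states).
    """
    index: Dict[Tuple[str, str], int] = {}
    next_id = 1
    for code_type in CODE_TYPES:
        # De-duplicate and sort so the numbering is reproducible across runs.
        for code in sorted(set(codes_by_type.get(code_type, []))):
            index[(code_type, code)] = next_id
            next_id += 1
    return index


# Placeholder codes, not real medical codes.
example_codes = {
    "diag": ["D002", "D001", "D002"],
    "proc": ["P001"],
    "revenue": ["R001"],
    "gpi": ["G001"],
}
code_index = build_code_index(example_codes)
```

Keying the index on the pair (type, code) keeps two code types from colliding when they happen to share the same code string.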

## ihan_preg_high_risk_mother_data_givenIndexTbl.sql (steps in the SQL code for medical code and static feature extraction for pregnancy high-risk prediction)

1. Creation of Medical Diagnosis (DX) Code Table (&dx_t1)
2. Creation of Procedure (PROC) Code Table (&proc_t1)
3. Creation of Revenue (RVNU) Code Table (&rvnu_t1)
4. Creation of Medical RX (GPI) Code Table (&gpi_t1)
5. Merging Diagnosis Codes (&dx_t1) and Index (&ihan_final_idx) to create &dx_t3
6. Merging Procedure Codes (&proc_t1) and Index (&ihan_final_idx) to create &proc_t3
7. Merging Revenue Codes (&rvnu_t1) and Index (&ihan_final_idx) to create &rvnu_t3
8. Merging GPI Codes (&gpi_t1) and Index (&ihan_final_idx) to create &gpi_t3
9. Creating Final Diagnosis, Procedure, Revenue, and GPI Code Tables
10. Creating the Static Feature Table, along with the Target and other features (&static_tbl)
11. Merging All Tables (&ihan_tbl&iter): the final step merges the static feature table (&static_tbl) with the four medical code tables (&final_dx_data, &final_proc_data, &final_rvnu_data, &final_gpi_data) to create a comprehensive table, &ihan_tbl&iter.


The final output of this SQL script is the &ihan_tbl&iter table, which contains the complete set of features (static and medical codes) for each MCID, along with the target variable.
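
The merge of a code table with the index table, followed by assembling the codes per MCID, can be sketched in Python as below. This is a hedged illustration, not the SQL used by the script: the function name `encode_member_codes`, the inner-join behaviour (codes missing from the index are dropped), the ISO date strings, and the in-memory layout of the result are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def encode_member_codes(
    rows: Iterable[Tuple[str, str, str]],
    code_index: Dict[str, int],
) -> Dict[str, Tuple[List[str], List[List[int]]]]:
    """Map codes to indices and group them by member and date of service.

    rows holds (mcid, date_of_service, code) tuples for one code type.
    Returns mcid -> (sorted dates, list of index lists, one per date).
    """
    by_member: Dict[str, Dict[str, List[int]]] = defaultdict(lambda: defaultdict(list))
    for mcid, dos, code in rows:
        idx = code_index.get(code)
        if idx is None:
            # Inner-join behaviour (assumption): codes absent from the index are dropped.
            continue
        by_member[mcid][dos].append(idx)

    encoded = {}
    for mcid, by_dos in by_member.items():
        dates = sorted(by_dos)  # ISO-formatted dates sort chronologically as strings
        encoded[mcid] = (dates, [sorted(by_dos[d]) for d in dates])
    return encoded


# Placeholder members, dates, and codes for illustration only.
rows = [
    ("M1", "2023-02-01", "D002"),
    ("M1", "2023-01-15", "D001"),
    ("M1", "2023-02-01", "D001"),
    ("M1", "2023-03-01", "D999"),  # not in the index, so it is dropped
    ("M2", "2023-01-20", "D002"),
]
encoded = encode_member_codes(rows, {"D001": 1, "D002": 2})
```

Because the lookup uses the training index, a code that first appears in OOT data has no index and falls out of the result under this sketch's inner-join assumption.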


## Create the training or OOT data in IHAN format using the generated index table

1. Update the ihan_preg_high_risk_mother_data_givenIndexTbl.sh file:

    a. Set the target to the correct SQL filename (for example, ihan_preg_high_risk_mother_data_givenIndexTbl.sql).
    b. If required, give a new log file name.
    c. Provide the correct input parameters:

        1. iter: Specify the iteration version, for example, iter4_v1_training.
        2. input_tbl: Provide the input training table name, for example: NON_CRTFD_AIFS.DL_TS_STAR.piad_tmp_wgs_cob_high_risk_mom_final_df_iter_4_v1_traning.
        3. ihan_final_idx: Provide the training index table name, for example: NON_CRTFD_AIFS.DL_TS_STAR.ihan_high_risk_preg_mother_medical_code_idx_final_iter4_v1_training.

2. Enter the Snowflake password when prompted.

3. Note down the name of the generated training or test table.

The script ihan_preg_high_risk_mother_data_givenIndexTbl.sh generates the IHAN format training or test data table based on the provided index table.
Make a note of the generated table name, as it is needed in the subsequent steps.

Output: Training data or test data table

Examples:

* Training table: NON_CRTFD_AIFS.DL_TS_STAR.ihan_high_risk_preg_mother_final_data_iter_4_v1_training
* OOT table: NON_CRTFD_AIFS.DL_TS_STAR.ihan_high_risk_preg_mother_final_data_iter_4_v1_OOT

Columns:

* MCID: Subscriber key
* Target: Target column
* Datlist1 & Dos1: Diag medical code
* Datlist2 & Dos2: Proc medical code
* Datlist3 & Dos3: Revenue medical code
* Datlist4 & Dos4: GPI medical code
          

## Input Data Preparation for IHAN Modeling (input_data_prep.ipynb)


1. Balance the input data using a specified balance ratio.

2. Split the balanced data into training, validation, and test sets based on the fraction variable. Save the train and validation files in CSV format.

3. Save the OOT data in CSV format.

4. Save the dictionary index table in CSV format. It can be used to interpret the model's outputs.
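
The notebook code is not shown here, so the balancing and splitting steps are sketched below under stated assumptions: the balance ratio is read as "negatives kept per positive", balancing is done by downsampling negatives, and the fraction variable is taken to be a (train, validation, test) tuple. The names `balance` and `split` and the `target` key are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, int]


def balance(rows: Sequence[Row], balance_ratio: float, seed: int = 0) -> List[Row]:
    """Downsample negatives to at most balance_ratio negatives per positive."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    positives = [r for r in rows if r["target"] == 1]
    negatives = [r for r in rows if r["target"] == 0]
    keep = min(len(negatives), int(balance_ratio * len(positives)))
    balanced = positives + rng.sample(negatives, keep)
    rng.shuffle(balanced)
    return balanced


def split(rows: Sequence[Row], fractions: Tuple[float, float, float]) -> Tuple[List[Row], List[Row], List[Row]]:
    """Cut already-shuffled rows into train, validation, and test parts."""
    n_train = int(fractions[0] * len(rows))
    n_valid = int(fractions[1] * len(rows))
    rows = list(rows)
    # Whatever remains after train and validation goes to the test part.
    return rows[:n_train], rows[n_train:n_train + n_valid], rows[n_train + n_valid:]


# Synthetic rows: 10 positives and 90 negatives.
data = [{"id": i, "target": 1 if i < 10 else 0} for i in range(100)]
balanced = balance(data, balance_ratio=2.0)
train, valid, test = split(balanced, (0.8, 0.1, 0.1))
```

Only the training-period data would be balanced in this sketch; OOT data is usually left at its natural class ratio so that evaluation reflects real prevalence.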

## High-Risk Newborn Mothers Prediction IHAN Model 

Example: ihan_binary_high_risk_preg_IHAN201.ipynb


Interpretable Hierarchical Attention Network for Medical Condition Identification

IHAN (Interpretable Hierarchical Attention Network) is a deep learning model we developed that both predicts and interprets its predictions. IHAN uses a hierarchical attention structure that matches the structure of medical history data and reflects the sequence of a patient's encounters (dates of service). The attention structure consists of three levels: (1) attention on the medical code types (diagnosis codes, procedure codes, lab test results, and prescription drugs), (2) attention on the sequential medical encounters within a type, and (3) attention on the individual medical codes within an encounter and type.

The model uses the medical data in its rawest form: sequentially ordered medical codes, without any further aggregation, transformation, or mapping. This greatly simplifies data preparation, reduces the chance of error, eliminates the post-modeling work needed for traditional model explanation, and, more importantly, generates results that medical personnel can appreciate.

In V1, only sequential codes are allowed. In V2, sequential code-value pairs and static features are also allowed.
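
To make the three attention levels concrete, here is a small pure-Python sketch of nested attention pooling: codes are pooled into an encounter vector, encounters into a type vector, and types into a patient vector. It is not the IHAN implementation. The dot-product scoring, the fixed query vectors, and the names `attend` and `hierarchical_pool` are assumptions; a trained model would learn its embeddings and attention parameters.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors: Sequence[Vector], query: Vector) -> Tuple[Vector, List[float]]:
    """Attention pooling: score each vector against the query, then average by weight."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    weights = softmax(scores)
    pooled = [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(len(query))]
    return pooled, weights


def hierarchical_pool(patient, q_code: Vector, q_enc: Vector, q_type: Vector):
    """patient: list of code types -> list of encounters -> list of code embeddings."""
    type_vectors = []
    for encounters in patient:
        # Level 3: codes within an encounter.
        enc_vectors = [attend(codes, q_code)[0] for codes in encounters]
        # Level 2: encounters within a code type.
        type_vectors.append(attend(enc_vectors, q_enc)[0])
    # Level 1: code types; the weights show how much each type contributed.
    return attend(type_vectors, q_type)


# Toy 2-dimensional embeddings; values are arbitrary.
patient = [
    [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]],  # type 0: two encounters
    [[[0.2, 0.8]]],                            # type 1: one encounter
]
query = [1.0, 0.0]
patient_vector, type_weights = hierarchical_pool(patient, query, query, query)
```

The weights returned at each level are what make this kind of model interpretable: they can be read as the relative contribution of a code, an encounter, or a code type to the final representation.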



## Python environment

ihan_env.yml is the Python environment file. Create the same environment by running: conda env create -f ihan_env.yml

The model notebook performs the following steps:

1. Import the necessary packages.

2. Initialize the number of unique medical codes.

3. Load the train and validation data from the specified paths.

4. Set the training hyperparameters.

5. Train the model.

6. Evaluate on out-of-time (OOT) test data using the trained model.

7. Interpret the model.

8. Save the interpretation results to Snowflake.
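
For the interpretation step, the saved dictionary index table is what turns model indices back into readable codes. The sketch below shows one way such a lookup could work. The CSV column names (`idx`, `code`, `description`), the function names, and the attention weights are hypothetical; the actual dictionary layout is not specified in this README.

```python
import csv
import io
from typing import Dict, List, Tuple


def load_index_dict(csv_text: str) -> Dict[int, Tuple[str, str]]:
    """Read the dictionary index table: index -> (code, description)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["idx"]): (row["code"], row["description"]) for row in reader}


def top_codes(
    weights_by_idx: Dict[int, float],
    index_dict: Dict[int, Tuple[str, str]],
    k: int = 3,
) -> List[Tuple[str, str, float]]:
    """Return the k highest-weighted indices as (code, description, weight)."""
    ranked = sorted(weights_by_idx.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Indices missing from the dictionary are skipped rather than raising.
    return [(index_dict[i][0], index_dict[i][1], w) for i, w in ranked if i in index_dict]


# Placeholder dictionary content and made-up attention weights.
csv_text = (
    "idx,code,description\n"
    "1,D001,placeholder diagnosis\n"
    "2,P001,placeholder procedure\n"
    "3,G001,placeholder drug\n"
)
index_dict = load_index_dict(csv_text)
top = top_codes({1: 0.1, 2: 0.7, 3: 0.2}, index_dict, k=2)
```

In practice the CSV would be read from the file saved during input data preparation instead of an inline string.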