# ICIS Claim Data Processing Tutorial

## Step-by-Step Guide for ICIS Claim Data Processing

**Author**: Seokhoon Joo  

## Table of Contents
* [1. Setup and Data Loading](#1.-Setup-and-Data-Loading)
    * [1.1 Import Required Libraries](#1.1-Import-Required-Libraries)
    * [1.2 Load ICIS Claim Data](#1.2-Load-ICIS-Claim-Data)
    * [1.3 Load Main Disease Classification Data](#1.3-Load-Main-Disease-Classification-Data)
    * [1.4 Initialize ICIS Processor](#1.4-Initialize-ICIS-Processor)
* [2. Step-by-Step Processing](#2.-Step-by-Step-Processing)
    * [2.1 Data Validation](#2.1-Data-Validation)
    * [2.2 Data Cleansing](#2.2-Data-Cleansing)
    * [2.3 Data Preparation](#2.3-Data-Preparation)
    * [2.4 Data Calculations](#2.4-Data-Calculations)
    * [2.5 Merge Calculated Data](#2.5-Merge-Calculated-Data)
* [3. Complete Pipeline Processing](#3.-Complete-Pipeline-Processing)
    * [3.1 Pipeline Execution](#3.1-Pipeline-Execution)
    * [3.2 Results Validation](#3.2-Results-Validation)
    * [Appendix: Error Handling](#Appendix:-Error-Handling)

## 1. Setup and Data Loading

### 1.1 Import Required Libraries

In [1]:
import pandas as pd
from IPython.display import display  # explicit import so display() also works outside a notebook
from underwriter.icis import ICIS

### 1.2 Load ICIS Claim Data

In [None]:
claim = pd.read_csv('data/claim.csv')
print("Initial claim data:")
print("Shape:", claim.shape)
print("\nColumns:", claim.columns.tolist())
print("\nFirst few rows:")
display(claim.head())
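`pd.read_csv` infers column types by default, so identifiers with leading zeros may be read as integers and dates stay as strings. The sketch below shows one way to pin the types at load time. It uses a small in-memory CSV, and the column names `id`, `clm_date`, and `kcd` are assumptions for illustration; the real `data/claim.csv` may be laid out differently.

```python
import io

import pandas as pd

# Toy CSV standing in for data/claim.csv (the column layout is assumed).
csv_text = "id,clm_date,kcd\n001,2020-01-01,A00\n002,2020-01-15,B01\n"

claim_toy = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"id": str, "kcd": str},  # keep identifiers and codes as text
    parse_dates=["clm_date"],       # parse the claim date into datetime64
)

print(claim_toy.dtypes)
```

Whether explicit dtypes are needed depends on how the ICIS class parses its inputs, which this tutorial does not show.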

### 1.3 Load Main Disease Classification Data

In [None]:
main = pd.read_csv('data/main.csv')
print("\nMain disease classification data:")
print("Shape:", main.shape)
print("\nColumns:", main.columns.tolist())
print("\nFirst few rows:")
display(main.head())

### 1.4 Initialize ICIS Processor

In [4]:
icis = ICIS(claim=claim, main=main)

## 2. Step-by-Step Processing

### 2.1 Data Validation

In [None]:
print("\n2.1 Data Validation")
print("-----------------")
try:
    icis.validate_columns()
    print("✓ Column validation successful")
except ValueError as e:
    print(f"✗ Validation error: {e}")
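The internals of `validate_columns()` are not shown in this tutorial. As a rough illustration of what a column check of this kind could look like, here is a minimal sketch. The helper `check_required_columns` and the list `REQUIRED_CLAIM_COLUMNS` are hypothetical and not part of the ICIS API.

```python
import pandas as pd

# Hypothetical list of required columns; the real ICIS class may require others.
REQUIRED_CLAIM_COLUMNS = ["id", "clm_date", "hos_edate"]


def check_required_columns(df, required):
    """Raise ValueError naming every required column missing from df."""
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")


# Toy frame that lacks 'hos_edate', so the check should fail.
toy = pd.DataFrame({"id": [1], "clm_date": ["2020-01-01"]})
try:
    check_required_columns(toy, REQUIRED_CLAIM_COLUMNS)
except ValueError as err:
    message = str(err)
    print(message)
```

Reporting every missing column in one message, rather than stopping at the first, saves a round trip when several columns are absent.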

### 2.2 Data Cleansing

In [None]:
print("\n2.2 Data Cleansing")
print("----------------")

print("• Initial claim data shape:", icis.claim.shape)
display(icis.claim.head())

print("\n1) Removing duplicates...")
icis.drop_duplicates()
print("• Shape after deduplication:", icis.claim.shape)
display(icis.claim.head())

print("\n2) Forward filling KCD codes...")
icis.fill_kcd_forward()
print("• Shape after forward fill:", icis.filled.shape)
display(icis.filled.head())

print("\n3) Filtering by claim date...")
icis.filter_by_clm_date()
print("• Shape after date filtering:", icis.filled.shape)
display(icis.filled.head())
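The three cleansing steps above can be approximated in plain pandas. The sketch below is an illustration on toy data, not the ICIS implementation. The `kcd` column name, the per-`id` grouping for the forward fill, and the cutoff date are all assumptions.

```python
import pandas as pd

toy = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "clm_date": pd.to_datetime(
        ["2020-01-01", "2020-01-01", "2020-02-01", "2020-01-15", "2020-03-01"]
    ),
    "kcd": ["A00", "A00", None, "B01", None],
})

# 1) Drop rows that are exact duplicates.
deduped = toy.drop_duplicates().reset_index(drop=True)

# 2) Forward fill missing codes within each id so values never leak across ids.
deduped["kcd"] = deduped.groupby("id")["kcd"].ffill()

# 3) Keep claims on or before a cutoff date (the cutoff is hypothetical).
cutoff = pd.Timestamp("2020-02-15")
filtered = deduped[deduped["clm_date"] <= cutoff]

print(filtered)
```

Grouping before the forward fill matters: an ungrouped `ffill()` would copy the last code of one person into the first empty row of the next.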

### 2.3 Data Preparation

In [None]:
print("\n2.3 Data Preparation")
print("------------------")

print("1) Setting medical care types...")
icis.set_type()
print("• Data with medical care types:")
display(icis.filled[['id', 'clm_date', 'type']].head())

print("\n2) Modifying hospital end dates...")
icis.set_hos_edate_mod()
print("• Data with modified hospital end dates:")
display(icis.filled[['id', 'hos_edate', 'hos_edate_mod']].head())

print("\n3) Converting to long format...")
icis.melt()
print("• Melted data shape:", icis.melted.shape)
display(icis.melted.head())

print("\n4) Processing KCD information...")
icis.set_sub_kcd()
icis.merge_main_info()
icis.filter_sub_kcd()
print("• Shape after KCD processing:", icis.melted.shape)
display(icis.melted.head())
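Step 3 above converts the data to long format. For readers unfamiliar with that reshaping, the sketch below shows the general idea with `DataFrame.melt` on toy data. The wide columns `kcd_1` and `kcd_2` and the output names `kcd_slot` and `kcd` are assumptions; the columns that `icis.melt()` actually reshapes are not shown in this tutorial.

```python
import pandas as pd

# Wide layout: one row per claim, one column per diagnosis slot (assumed).
wide = pd.DataFrame({
    "id": [1, 2],
    "clm_date": ["2020-01-01", "2020-01-15"],
    "kcd_1": ["A00", "B01"],
    "kcd_2": ["C02", None],
})

# Long layout: one row per (claim, diagnosis code) pair.
long_df = (
    wide.melt(
        id_vars=["id", "clm_date"],
        value_vars=["kcd_1", "kcd_2"],
        var_name="kcd_slot",
        value_name="kcd",
    )
    .dropna(subset=["kcd"])  # discard empty diagnosis slots
    .reset_index(drop=True)
)

print(long_df)
```

The long layout makes later per-code operations, such as joining classification data on the code, a single merge instead of one merge per slot.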

### 2.4 Data Calculations

In [None]:
print("\n2.4 Data Calculations")
print("------------------")

print("1) Setting date ranges...")
icis.set_date_range()

print("\n2) Calculating hospitalization days...")
icis.calc_hos_day()
print("• Hospitalized data shape:", icis.hospitalized.shape)
display(icis.hospitalized.head())

print("\n3) Calculating surgery counts...")
icis.calc_sur_cnt()
print("• Surgery data shape:", icis.underwent.shape)
display(icis.underwent.head())

print("\n4) Calculating elapsed days...")
icis.calc_elp_day()
print("• Elapsed days data shape:", icis.elapsed.shape)
display(icis.elapsed.head())
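To make the hospitalization-day and elapsed-day calculations more concrete, here is a minimal sketch on toy data. It is not the ICIS implementation. It assumes a hypothetical `hos_sdate` start column, counts both endpoints of a stay (inclusive days), does not merge overlapping stays, and measures elapsed days from the latest discharge to an arbitrary reference date.

```python
import pandas as pd

stays = pd.DataFrame({
    "id": [1, 1, 2],
    "hos_sdate": pd.to_datetime(["2020-01-01", "2020-02-10", "2020-03-05"]),
    "hos_edate": pd.to_datetime(["2020-01-03", "2020-02-10", "2020-03-09"]),
})

# Inclusive length of each stay: a same-day admission counts as 1 day (assumed).
stays["hos_day"] = (stays["hos_edate"] - stays["hos_sdate"]).dt.days + 1

# Total hospitalization days per person.
hos_day = stays.groupby("id")["hos_day"].sum()

# Days elapsed from each person's latest discharge to a reference date.
reference_date = pd.Timestamp("2020-04-01")  # hypothetical reference date
elp_day = (reference_date - stays.groupby("id")["hos_edate"].max()).dt.days

print(hos_day)
print(elp_day)
```

If stays can overlap, summing their lengths double counts the shared days, so a production calculation would typically merge overlapping intervals first.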

### 2.5 Merge Calculated Data

In [None]:
print("\n2.5 Merge Calculated Data")
print("-------------------------")
step_result = icis.merge_calculated()
print("• Final result shape:", step_result.shape)
print("• Final columns:", step_result.columns.tolist())
display(step_result.head())
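Combining several per-person summaries into one table is usually a chain of merges. The sketch below illustrates that pattern with toy frames. The join key `id`, the column names `hos_day`, `sur_cnt`, and `elp_day`, and the choice of an outer join are assumptions; `merge_calculated()` may use different keys or join rules.

```python
from functools import reduce

import pandas as pd

hospitalized = pd.DataFrame({"id": [1, 2], "hos_day": [4, 5]})
underwent = pd.DataFrame({"id": [1, 3], "sur_cnt": [2, 1]})
elapsed = pd.DataFrame({"id": [1, 2, 3], "elp_day": [51, 23, 10]})

# Outer joins keep people who appear in only some of the summaries.
merged = reduce(
    lambda left, right: left.merge(right, on="id", how="outer"),
    [hospitalized, underwent, elapsed],
)

# A missing count means "none recorded", so fill those with 0 (assumed rule).
merged = merged.fillna({"hos_day": 0, "sur_cnt": 0})

print(merged)
```

An inner join would silently drop anyone without both a hospitalization and a surgery record, which is why the sketch uses `how="outer"`.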

## 3. Complete Pipeline Processing

### 3.1 Pipeline Execution

In [None]:
print("\n3.1 Pipeline Execution")
print("--------------------")

# Initialize new ICIS instance
icis_pipeline = ICIS(claim=claim, main=main)

# Process ICIS claim data using complete pipeline
print("Processing ICIS claim data using icis.process()...")
pipeline_result = icis_pipeline.process()
print("\n✓ Processing completed successfully!")
print("• Final result shape:", pipeline_result.shape)

### 3.2 Results Validation

In [None]:
print("\n3.2 Results Validation")
print("--------------------")
# Compare results
print("\nResults Comparison:")
print("• Step-by-step shape:", step_result.shape)
print("• Pipeline shape:", pipeline_result.shape)

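are_equal = step_result.equals(pipeline_result)
status = "✓" if are_equal else "✗"
print(f"\n{status} Results are identical: {are_equal}")

if not are_equal:
    # Symmetric difference: columns present in only one of the two results
    print("\nColumns present in only one result:")
    print(set(step_result.columns) ^ set(pipeline_result.columns))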
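`DataFrame.equals` is strict: it returns `False` when row order, column order, or dtypes differ, even if the values agree. When the comparison above fails, a more tolerant check with a readable diagnostic can help. The sketch below uses `pandas.testing.assert_frame_equal`; the helper `frames_match` and the sort key are hypothetical.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_match(left, right, sort_by):
    """Compare two frames after normalizing row and column order.

    Returns (True, "") on a match, otherwise (False, diagnostic message).
    """
    try:
        left_norm = left.sort_values(sort_by).reset_index(drop=True)
        right_norm = right[left.columns].sort_values(sort_by).reset_index(drop=True)
        assert_frame_equal(left_norm, right_norm, check_dtype=False)
    except (AssertionError, KeyError) as exc:
        return False, str(exc)
    return True, ""


a = pd.DataFrame({"id": [1, 2], "x": [1.0, 2.0]})
b = pd.DataFrame({"x": [2.0, 1.0], "id": [2, 1]})  # same data, different order

print(a.equals(b))
print(frames_match(a, b, sort_by=["id"]))
```

With the tutorial's objects, this could be called as `frames_match(step_result, pipeline_result, sort_by=["id"])`, assuming `id` is present in both results.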

### Appendix: Error Handling

In [None]:
print("\nAppendix: Error Handling")
print("----------------")
# Example of error handling with invalid data
print("Testing error handling with invalid input...")

try:
    # Drop the 'id' column to simulate invalid input
    invalid_claim = claim.drop(columns=['id'])
    invalid_icis = ICIS(claim=invalid_claim, main=main)
    invalid_result = invalid_icis.process()
except ValueError as e:
    print(f"\n✓ Validation error caught successfully: {e}")
except RuntimeError as e:
    print(f"\n✓ Processing error caught successfully: {e}")
except Exception as e:
    print(f"\n✓ Unexpected error caught: {type(e).__name__}: {e}")
else:
    # Reaching this branch means the invalid input was not rejected
    print("\n✗ No error was raised for invalid input")