# RSNA Intracranial Hemorrhage Detection
In this competition, your challenge is to build an algorithm to detect acute intracranial hemorrhage and its subtypes.

## Intracranial hemorrhage

![subtype](sample/subtypes-of-hemorrhage.png)

## Contents
- Data types
    - <a href='#hemo_label'>Label</a>: CSV file
    - <a href='#Image_Folder'>Image Folder</a>: DICOM file
        
- Working with DICOM files
    - <a href='#image'>`Image-wise steps`</a>
        - Step1: Inspect the File Meta Information
        - Step2-1: Inspect the DICOM image
        - Step2-2: Transforming to Hounsfield Units
        - Step2-3: Image Windowing
    - <a href='#slice'>`Slice-wise steps`</a>
        - Step1-1: Load CT-scans per patient
        - Step1-2: Visualize
        - Step2: Slices Windowing
    - <a href='#voxel'>`Voxel-wise steps`</a>
        - Step1: The voxel size
        - Step2: Slice Thickness
        - Step3: Resampling the voxel size

In [2]:
from IPython.display import Image
import os
import numpy as np
import pandas as pd
import scipy
import seaborn as sns
import pydicom

<a id='hemo_label'></a>
## Label
- Each label is the probability that the given sub-type of hemorrhage (or, for `any`, any hemorrhage at all) is present in the indicated image.
- Subtypes
    - epidural_hemorrhage
    - intraparenchymal_hemorrhage
    - intraventricular_hemorrhage
    - subarachnoid_hemorrhage
    - subdural_hemorrhage
    - any

In [3]:
!pwd

/home/CT


In [7]:
!tree ./input/rsna-intracranial-hemorrhage-detection -L 1

./input/rsna-intracranial-hemorrhage-detection
├── stage_2_sample_submission.csv
├── stage_2_test
├── stage_2_train
└── stage_2_train.csv

2 directories, 2 files


In [13]:
basepath = "./input/rsna-intracranial-hemorrhage-detection/"
os.listdir(basepath)

['stage_2_train.csv',
 'stage_2_train',
 'stage_2_sample_submission.csv',
 'stage_2_test']

- `ID` : `[Image Id]_[Sub-type Name]`, for example `ID_4a85a3a3f_epidural`
- `Label` : probability that the given sub-type of hemorrhage is present in that image

In [16]:
train = pd.read_csv(basepath + "stage_2_train.csv")

In [18]:
train.tail(6)

| | ID | Label |
|---|---|---|
| 4516836 | ID_4a85a3a3f_epidural | 0 |
| 4516837 | ID_4a85a3a3f_intraparenchymal | 0 |
| 4516838 | ID_4a85a3a3f_intraventricular | 0 |
| 4516839 | ID_4a85a3a3f_subarachnoid | 0 |
| 4516840 | ID_4a85a3a3f_subdural | 0 |
| 4516841 | ID_4a85a3a3f_any | 0 |


In [20]:
train.Label.isnull().sum()

0
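Because each row encodes both the image and the sub-type in `ID`, it is often convenient to split that column and reshape the table to one row per image. The following is a minimal sketch on made-up rows, not the notebook's own processing; the names `toy`, `image_id`, `subtype`, and `wide` are assumptions introduced here for illustration.

```python
import pandas as pd

# Toy rows in the same long format as stage_2_train.csv (IDs and labels are made up)
toy = pd.DataFrame({
    "ID": [
        "ID_aaaaaaaaa_epidural", "ID_aaaaaaaaa_any",
        "ID_bbbbbbbbb_epidural", "ID_bbbbbbbbb_any",
    ],
    "Label": [0, 1, 0, 0],
})

# Split on the LAST underscore: the left part is the image id, the right part the sub-type
parts = toy["ID"].str.rsplit("_", n=1, expand=True)
toy["image_id"] = parts[0]
toy["subtype"] = parts[1]

# One row per image, one column per sub-type; "max" collapses any duplicated rows
wide = toy.pivot_table(index="image_id", columns="subtype",
                       values="Label", aggfunc="max")
print(wide)
```

The same two steps should apply to the full `train` frame, assuming every `ID` follows the `[Image Id]_[Sub-type Name]` pattern described above.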

<a id='Image_Folder'></a> 
## Image Folder
- The name of each image is given by: `ID_[Image Id].dcm`
- Example
    - ID_4a85a3a3f.dcm
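Given this naming rule, moving between an image id and its file path is a matter of string handling. A small sketch follows; the helper names `image_id_to_path` and `path_to_image_id` are hypothetical and not part of the original notebook.

```python
import os


def image_id_to_path(image_id, folder):
    """Build the .dcm path for an image id such as 'ID_4a85a3a3f'."""
    return os.path.join(folder, image_id + ".dcm")


def path_to_image_id(path):
    """Recover the image id from a path by dropping the folder and the extension."""
    return os.path.splitext(os.path.basename(path))[0]


example_file = image_id_to_path("ID_4a85a3a3f", "stage_2_train")
print(example_file)
print(path_to_image_id(example_file))
```

This makes it possible to look up the label rows for a file on disk, or the file for a given row in the CSV.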

In [21]:
!tree ./input/rsna-intracranial-hemorrhage-detection

./input/rsna-intracranial-hemorrhage-detection
├── stage_2_sample_submission.csv
├── stage_2_test
│   ├── ID_000000e27.dcm
│   ├── ID_000009146.dcm
│   ├── ID_00007b8cb.dcm
│   ├── ID_000134952.dcm
│   └── ID_000176f2a.dcm
├── stage_2_train
│   ├── ID_000012eaf.dcm
│   ├── ID_000039fa0.dcm
│   ├── ID_00005679d.dcm
│   ├── ID_00008ce3c.dcm
│   └── ID_0000950d7.dcm
└── stage_2_train.csv

2 directories, 12 files


In [22]:
IMG_EXTENSION = ['.dcm', '.DCM']

In [23]:
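def check_extension(filename):
    # True if the file name ends with one of the extensions in IMG_EXTENSION
    return filename.endswith(tuple(IMG_EXTENSION))

def load_scans_path(folder_path):
    """
    Recursively collect the paths of all files under folder_path
    whose extension is listed in IMG_EXTENSION.
    Returns a list of path strings.
    """
    img_paths = []
    assert os.path.isdir(folder_path), '%s is not a valid directory' % folder_path

    # sorted() orders the walk by directory path only; file names inside
    # a directory keep whatever order the file system returns
    for root, _, fnames in sorted(os.walk(folder_path)):
        for fname in fnames:
            if check_extension(fname):
                path = os.path.join(root, fname)
                img_paths.append(path)
    return img_paths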
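The same directory walk can also be written with `pathlib`. The sketch below is an alternative under stated assumptions, not the notebook's own helper: `find_dicom_paths` and `DICOM_SUFFIXES` are hypothetical names, the suffix check is case-insensitive, and the result is sorted so the order is reproducible across file systems.

```python
import tempfile
from pathlib import Path

DICOM_SUFFIXES = {".dcm"}  # hypothetical name; compared against the lower-cased suffix


def find_dicom_paths(folder_path):
    """Return a sorted list of DICOM file paths found anywhere under folder_path."""
    folder = Path(folder_path)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder_path} is not a valid directory")
    return sorted(
        str(p) for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() in DICOM_SUFFIXES
    )


# Usage on a throwaway directory with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ["ID_b.dcm", "ID_a.DCM", "notes.txt"]:
        (Path(tmp) / name).touch()
    found = [Path(p).name for p in find_dicom_paths(tmp)]

print(found)
```

Raising an exception instead of using `assert` keeps the directory check active even when Python runs with assertions disabled.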

In [24]:
#trainpath = "/home/Samples/3/44173799/Thick CT"
trainpath = "./input/rsna-intracranial-hemorrhage-detection/stage_2_train"

train_img_paths = load_scans_path(trainpath)

In [25]:
train_img_paths

['./input/rsna-intracranial-hemorrhage-detection/stage_2_train/ID_00005679d.dcm',
 './input/rsna-intracranial-hemorrhage-detection/stage_2_train/ID_000039fa0.dcm',
 './input/rsna-intracranial-hemorrhage-detection/stage_2_train/ID_000012eaf.dcm',
 './input/rsna-intracranial-hemorrhage-detection/stage_2_train/ID_00008ce3c.dcm',
 './input/rsna-intracranial-hemorrhage-detection/stage_2_train/ID_0000950d7.dcm']

### Working with DICOM files

In [26]:
example_path = train_img_paths[0]

In [27]:
example = pydicom.dcmread(example_path, force=True) # force=True reads the file even if no File Meta Information header is found

In [28]:
print(example)

Dataset.file_meta -------------------------------
(0002, 0000) File Meta Information Group Length  UL: 176
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: CT Image Storage
(0002, 0003) Media Storage SOP Instance UID      UI: 9999.59849366137338388474655966667577915843
(0002, 0010) Transfer Syntax UID                 UI: Explicit VR Little Endian
(0002, 0012) Implementation Class UID            UI: 1.2.40.0.13.1.1.1
(0002, 0013) Implementation Version Name         SH: 'dcm4che-1.4.38'
-------------------------------------------------
(0008, 0018) SOP Instance UID                    UI: ID_00005679d
(0008, 0060) Modality                            CS: 'CT'
(0010, 0020) Patient ID                          LO: 'ID_18f2d431'
(0020, 000d) Study Instance UID                  UI: ID_b5c26cda09
(0020, 000e) Series Instance UID                 UI: ID_203cd6ec46
(0020, 0010) Study ID                            SH: ''
(0020, 003

In [29]:
ex_img = example.pixel_array

In [30]:
ex_img.shape, type(ex_img), ex_img.dtype

((512, 512), numpy.ndarray, dtype('int16'))