# SIIM-FISABIO-RSNA COVID-19 Detection &nbsp;📌📌📌

## <font color=Green>Identify and localize COVID-19 abnormalities on chest radiographs</font> &nbsp;⛳⛳⛳

#### <font color=#07d2f3>The contents of this notebook:</font>
##### &nbsp; 1. Display .dcm medical images on the notebook  &nbsp;🎈
##### &nbsp; 2. Convert .dcm to .jpg  &nbsp;🎈
##### &nbsp; 3. Label distribution  &nbsp;🎈
##### &nbsp; 4. Example of displaying training set labels  &nbsp;🎈
##### &nbsp; 5. Training set regression box distribution  &nbsp;🎈
##### &nbsp; 6. Related papers  &nbsp;🎈

## 1. Display .dcm medical images on the notebook

In [None]:
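import pydicom
import pylab

filePath = '../input/siim-covid19-detection/test/00508faccd39/d39fc1121992/951211f8e1bb.dcm'
# dcmread replaces read_file, which is deprecated in recent pydicom releases
ds = pydicom.dcmread(filePath)

# List the DICOM attribute names that contain "pat" (patient-related fields)
print(ds.dir("pat"))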

In [None]:
pix = ds.pixel_array
print(pix)
print(pix.shape)

In [None]:
pylab.imshow(ds.pixel_array, cmap=pylab.cm.bone)
pylab.show()

## 2. Convert .dcm to .jpg

This step takes about an hour to run and can be skipped: the converted .jpg images are already available in the siim-covid19-dete-jpg dataset.
Reference: https://www.kaggle.com/xhlulu/siim-covid-19-convert-to-jpg-256px

In [None]:
!conda install gdcm -c conda-forge -y

In [None]:
import os

import numpy as np
import pandas as pd
import pydicom
from PIL import Image
from pydicom.pixel_data_handlers.util import apply_voi_lut
from tqdm.auto import tqdm

In [None]:
def read_xray(path, voi_lut = True, fix_monochrome = True):
    # Original from: https://www.kaggle.com/raddar/convert-dicom-to-np-array-the-correct-way
    dicom = pydicom.dcmread(path)
    
    # VOI LUT (if available by DICOM device) is used to transform raw DICOM data to 
    # "human-friendly" view
    if voi_lut:
        data = apply_voi_lut(dicom.pixel_array, dicom)
    else:
        data = dicom.pixel_array
               
    # depending on this value, X-ray may look inverted - fix that:
    if fix_monochrome and dicom.PhotometricInterpretation == "MONOCHROME1":
        data = np.amax(data) - data
        
    data = data - np.min(data)
    data = data / np.max(data)
    data = (data * 255).astype(np.uint8)
        
    return data
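
The inversion and rescaling steps in `read_xray` can be tried without a DICOM file. The following is a minimal sketch on a synthetic 12-bit array; `to_uint8` is a hypothetical helper written for this illustration, and the guard against a constant image is an addition that the original function does not have.

```python
import numpy as np

def to_uint8(data, invert=False):
    """Min-max scale an array to 0..255, optionally inverting it first.

    Hypothetical helper mirroring the last steps of read_xray.
    """
    data = data.astype(np.float64)
    if invert:
        # MONOCHROME1 stores bright tissue as low values, so flip the scale
        data = data.max() - data
    data = data - data.min()
    peak = data.max()
    if peak > 0:  # assumption: leave a constant image as all zeros
        data = data / peak
    return (data * 255).astype(np.uint8)

# Synthetic 12-bit pixel values standing in for dicom.pixel_array
raw = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)
normal = to_uint8(raw)
inverted = to_uint8(raw, invert=True)
print(normal)
print(inverted)
```

Both outputs span the full 0 to 255 range; in the inverted result the darkest and brightest pixels swap places.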

In [None]:
def resize(array, size, keep_ratio=False, resample=Image.LANCZOS):
    # Original from: https://www.kaggle.com/xhlulu/vinbigdata-process-and-resize-to-image
    im = Image.fromarray(array)
    
    if keep_ratio:
        im.thumbnail((size, size), resample)
    else:
        im = im.resize((size, size), resample)
    
    return im
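
To see what `keep_ratio` changes, here is a small sketch on a synthetic image that is twice as tall as it is wide. The array shape and variable names are made up for the example; only the Pillow calls already used in `resize` appear.

```python
import numpy as np
from PIL import Image

# Synthetic stand-in for an X-ray: 400 rows by 200 columns
array = np.zeros((400, 200), dtype=np.uint8)
im = Image.fromarray(array)

# keep_ratio=False path: force a square output, distorting the image
stretched = im.resize((256, 256), Image.LANCZOS)

# keep_ratio=True path: thumbnail() shrinks in place and preserves the ratio
fitted = im.copy()
fitted.thumbnail((256, 256), Image.LANCZOS)

print(stretched.size)  # PIL reports (width, height)
print(fitted.size)
```

The stretched image is 256 x 256, while the thumbnail keeps the 1:2 ratio and only its longer side reaches 256 pixels. Note that `thumbnail` never enlarges an image that is already smaller than the requested size.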

In [None]:
train = pd.read_csv('../input/siim-covid19-detection/train_image_level.csv')

In [None]:
# Load a single training file as a quick check that the DICOMs are readable
path = '../input/siim-covid19-detection/train/ae3e63d94c13/288554eb6182/e00f9fe0cce5.dcm'
dicom = pydicom.dcmread(path)

In [None]:
image_id = []
dim0 = []
dim1 = []
splits = []

for split in ['test', 'train']:
    save_dir = f'/kaggle/tmp/{split}/'

    os.makedirs(save_dir, exist_ok=True)
    
    for dirname, _, filenames in tqdm(os.walk(f'../input/siim-covid19-detection/{split}')):
        for file in filenames:
            xray = read_xray(os.path.join(dirname, file))
            # pass keep_ratio=True to resize() to preserve the original aspect ratio
            im = resize(xray, size=256)
            im.save(os.path.join(save_dir, file.replace('.dcm', '.jpg')))

            image_id.append(file.replace('.dcm', ''))
            dim0.append(xray.shape[0])
            dim1.append(xray.shape[1])
            splits.append(split)

In [None]:
%%time
!tar -zcf train.tar.gz -C "/kaggle/tmp/train/" .
!tar -zcf test.tar.gz -C "/kaggle/tmp/test/" .

In [None]:
df = pd.DataFrame.from_dict({'image_id': image_id, 'dim0': dim0, 'dim1': dim1, 'split': splits})
df.to_csv('meta.csv', index=False)
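
One plausible use of the saved `dim0` and `dim1` columns (an assumption, the notebook does not state it) is to map coordinates from the original image onto the 256 px version. A minimal sketch with made-up rows, where the `scale_y` and `scale_x` columns are hypothetical names:

```python
import pandas as pd

# Toy rows in the same layout as meta.csv; the values are invented
meta = pd.DataFrame({
    'image_id': ['img_a', 'img_b'],
    'dim0': [2000, 3000],   # original height in pixels (rows)
    'dim1': [2500, 3000],   # original width in pixels (columns)
    'split': ['train', 'test'],
})

size = 256  # side length used in the resize step above

# Factors that convert original pixel coordinates to resized ones,
# valid when the image was resized without keeping the aspect ratio
meta['scale_y'] = size / meta['dim0']
meta['scale_x'] = size / meta['dim1']
print(meta)
```

Multiplying an original x coordinate by `scale_x` and a y coordinate by `scale_y` would give its position in the resized image.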

## 3. Label distribution

To be continued in a later update.

## 4. Example of displaying training set labels

## 5. Training set regression box distribution

## 6. Related papers