# Demo - Interacting With MIDRC CT Scan Images

This demo shows how to import MIDRC imaging data, convert CT scan images from DICOM (.dcm) format to PNG and JPEG formats, and view the images. It also shows how to extract file and patient metadata from the headers of DICOM (.dcm) files.

### Import Data And Packages
Import the packages pydicom, pillow, and dicom_csv, as well as pandas, os, numpy, and zipfile. If any of these packages are not already installed in your workspace, you can run one of the following:
- 'pip install < package >' in the workspace terminal
- '!pip install < package >' in a notebook cell

In [None]:
!pip install pydicom -q
!pip install pillow -q
!pip install dicom-csv -q

In [None]:
import pydicom
import numpy as np
from PIL import Image
import pandas as pd
import os
from dicom_csv import join_tree
import zipfile

Download the CT scan data objects with the Gen3 SDK, then unzip the files.

In [None]:
!gen3 --commons_url data.midrc.org drs-pull object dg.MD1R/52ed5c59-1910-499b-a80e-00329209e148

In [None]:
zip_image_path = 'A840445/1.3.6.1.4.1.14519.5.2.1.99.1071.22152686345791690835528908062918/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786.zip'

def unzip_all(zip_filepath, extract_to_dir):
    with zipfile.ZipFile(zip_filepath, 'r') as zip_ref:
        zip_ref.extractall(extract_to_dir)

extract_to_dir = 'COVID-19-NY-SBU'
unzip_all(zip_image_path, extract_to_dir)

All data objects are now stored under the folder 'COVID-19-NY-SBU'.
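
To check what was extracted, the folder can be walked for DICOM files. The sketch below uses only the standard library; `list_dicom_files` is a hypothetical helper name, not part of the original notebook, and it assumes the files carry a `.dcm` extension.

```python
import os

def list_dicom_files(root_dir):
    """Return sorted paths of all .dcm files found under root_dir (hypothetical helper)."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            # Match the extension case-insensitively (.dcm or .DCM)
            if name.lower().endswith('.dcm'):
                found.append(os.path.join(dirpath, name))
    return sorted(found)

# The result is an empty list if the folder does not exist yet
dicom_files = list_dicom_files('COVID-19-NY-SBU')
print(len(dicom_files), 'DICOM files found')
```

The returned paths can be passed directly to `pydicom.dcmread`.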

### View Image

Read the dcm image using the relative file path.

In [None]:
image_path = 'COVID-19-NY-SBU/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786/1-105.dcm'
ds = pydicom.dcmread(image_path)

Get the pixel array for the image and cast it to float.

In [None]:
new_image = ds.pixel_array.astype(float)
new_image

Clip negative values to 0, scale the pixel array to the range 0-255, and convert it to 8-bit unsigned integers (uint8).

In [None]:
scaled_image = (np.maximum(new_image, 0) / new_image.max()) * 255.0
scaled_image = np.uint8(scaled_image)
scaled_image
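
The scaling step divides by the array's maximum, which fails for an image whose maximum is 0. The sketch below wraps the same arithmetic in a function with a guard for that case; `scale_to_uint8` is a hypothetical name and the sample array is made up for illustration.

```python
import numpy as np

def scale_to_uint8(pixels):
    """Clip negatives to 0, scale to 0-255, and return a uint8 array (hypothetical helper)."""
    arr = np.maximum(np.asarray(pixels, dtype=float), 0)
    peak = arr.max()
    if peak == 0:
        # Nothing above zero: return a black image instead of dividing by zero
        return np.zeros(arr.shape, dtype=np.uint8)
    # Converting to uint8 truncates the fractional part
    return np.uint8(arr / peak * 255.0)

# Made-up values standing in for a pixel array
demo = np.array([[-100.0, 0.0], [500.0, 1000.0]])
scaled_demo = scale_to_uint8(demo)
print(scaled_demo)
```

Note that this is plain min-max style scaling, as in the cell above; it does not apply any CT-specific intensity windowing.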

Use Pillow's Image module to convert the array to an image and show it.

In [None]:
final_image = Image.fromarray(scaled_image)
final_image.show()

### Convert Images
Convert images from dcm format to jpeg and png formats and save each converted image in the original image folder.

In [None]:
def view_dicom_image(image_path):
    
    ds = pydicom.dcmread(image_path)
    
    new_image = ds.pixel_array.astype(float)
    
    scaled_image = np.uint8((np.maximum(new_image, 0) / new_image.max()) * 255.0)
    
    final_image = Image.fromarray(scaled_image)

    final_image.show()

def dcm_to_png(image_path):
    
    ds = pydicom.dcmread(image_path)
    
    new_image = ds.pixel_array.astype(float)
    
    scaled_image = np.uint8((np.maximum(new_image, 0) / new_image.max()) * 255.0)
    
    final_image = Image.fromarray(scaled_image)

    # Save next to the source file: same folder and name, .png extension
    final_image.save(os.path.splitext(image_path)[0] + '.png')
    

def dcm_to_jpeg(image_path):
    
    ds = pydicom.dcmread(image_path)
    
    new_image = ds.pixel_array.astype(float)
    
    scaled_image = np.uint8((np.maximum(new_image, 0) / new_image.max()) * 255.0)
    
    final_image = Image.fromarray(scaled_image)

    # Save next to the source file: same folder and name, .jpg extension
    final_image.save(os.path.splitext(image_path)[0] + '.jpg')


Convert dicom image to png and save.

In [None]:
image_path = 'COVID-19-NY-SBU/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786/1-103.dcm'
dcm_to_png(image_path)

Convert dicom image to jpg and save.

In [None]:
image_path = 'COVID-19-NY-SBU/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786/1-088.dcm'
dcm_to_jpeg(image_path)
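
One way to keep a converted image next to its source file is to derive the output path from the full DICOM path. A minimal sketch, where `converted_path` is a hypothetical helper (not part of the notebook) and the `series` folder name is a placeholder:

```python
import os

def converted_path(image_path, extension):
    """Return image_path with its file extension swapped for `extension` (hypothetical helper)."""
    # splitext only splits at the last dot of the file name,
    # so dots in folder names (such as series UIDs) are left alone
    root, _old_ext = os.path.splitext(image_path)
    return root + '.' + extension.lstrip('.')

# 'series' is a placeholder for the real series folder name
png_path = converted_path('COVID-19-NY-SBU/series/1-103.dcm', 'png')
print(png_path)
```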

Display a few dicom images.

In [None]:
image_path = 'COVID-19-NY-SBU/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786/1-001.dcm'
view_dicom_image(image_path)

In [None]:
image_path = 'COVID-19-NY-SBU/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786/1-006.dcm'
view_dicom_image(image_path)

In [None]:
image_path = 'COVID-19-NY-SBU/1.3.6.1.4.1.14519.5.2.1.99.1071.32717876047095240098568067022786/1-053.dcm'
view_dicom_image(image_path)

### Extract Metadata

The following function will extract the file and patient metadata from the header of each dicom (.dcm) file within a given folder and place the collected metadata into a pandas dataframe.

In [None]:
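def extract_metadata(base_folder):
    # Collect the DICOM header metadata of every subfolder into one dataframe
    df = pd.DataFrame()
    file_folders = os.listdir(path=base_folder)

    for folder in file_folders:
        # join_tree gathers the metadata of the DICOM files found under this path
        path = os.path.join(base_folder, folder)
        meta = join_tree(path, verbose=2)
        df = pd.concat([df, meta])

    return df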

In [None]:
base_folder =  'COVID-19-NY-SBU'
metadata = extract_metadata(base_folder)
metadata

This metadata includes important pieces of file and patient data, such as the body part examined and the patient's sex.

In [None]:
metadata.columns[40:60]

In [None]:
metadata.BodyPartExamined

In [None]:
metadata.PatientSex
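
A column such as `PatientSex` is often easier to read as a frequency table. The sketch below uses pandas' `value_counts` on a small made-up dataframe; `summarize_column` is a hypothetical helper and the toy rows are placeholders, not MIDRC data. The counts are per dataframe row, which may correspond to files rather than patients.

```python
import pandas as pd

def summarize_column(df, column):
    """Count how often each value occurs in `column`, including missing values."""
    return df[column].value_counts(dropna=False)

# Made-up rows for illustration only
toy = pd.DataFrame({
    'PatientSex': ['F', 'M', 'F'],
    'BodyPartExamined': ['CHEST', 'CHEST', None],
})
sex_counts = summarize_column(toy, 'PatientSex')
print(sex_counts)
```

With the real data, the same call would be `summarize_column(metadata, 'PatientSex')`.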