## This notebook performs the following steps for inference:
* Read pixel data from DICOM images and check that each file's attributes match the data our model was trained on
* Preprocess the images with normalization and standardization, then reshape them to 224 x 224 x 3 to suit the model input format
* Load our trained model and predict whether the patient has pneumonia from their chest X-ray

In [1]:
# Import dependencies
import numpy as np
import pandas as pd
import pydicom
%matplotlib inline
import matplotlib.pyplot as plt
import keras 
from skimage.transform import rescale 

Using TensorFlow backend.


In [2]:
sample = pydicom.dcmread('test1.dcm')

In [3]:
print(sample)

(0008, 0016) SOP Class UID                       UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.3.6.1.4.1.11129.5.5.110503645592756492463169821050252582267888
(0008, 0060) Modality                            CS: 'DX'
(0008, 1030) Study Description                   LO: 'No Finding'
(0010, 0020) Patient ID                          LO: '2'
(0010, 0040) Patient's Sex                       CS: 'M'
(0010, 1010) Patient's Age                       AS: '81'
(0018, 0015) Body Part Examined                  CS: 'CHEST'
(0018, 5100) Patient Position                    CS: 'PA'
(0020, 000d) Study Instance UID                  UI: 1.3.6.1.4.1.11129.5.5.112507010803284478207522016832191866964708
(0020, 000e) Series Instance UID                 UI: 1.3.6.1.4.1.11129.5.5.112630850362182468372440828755218293352329
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows       

In [4]:
def check_dicom(filename):
    """Read a .dcm file, check its attributes, and return the imaging data as a numpy array.

    Returns None if any checked attribute is outside what the model expects.
    """

    print('Load file {} ...'.format(filename))
    ds = pydicom.dcmread(filename)

    # Report the study description if the file has one (avoids an AttributeError when missing)
    study_description = getattr(ds, 'StudyDescription', None)
    if study_description:
        print(f'Radiologist thinks: {study_description}')

    ## The attributes to check depend on the data you want your algorithm to be used on.
    ## Here we accept DX or CT modality, an AP or PA patient position,
    ## and a CHEST or RIBCAGE body part.
    if ds.Modality not in ['DX', 'CT']:
        print('Invalid modality...')
        return None

    if ds.PatientPosition not in ['AP', 'PA']:
        print('Invalid patient position...')
        return None

    if ds.BodyPartExamined not in ['CHEST', 'RIBCAGE']:
        print('Invalid body part examined...')
        return None

    # Only decode the pixel data once the file has passed every check
    return ds.pixel_array
    
def preprocess_image(img, img_mean, img_std, img_size):
    """Take the numpy array returned by check_dicom and run the pre-processing needed for our model input."""

    # Scale pixel values to [0, 1] (assumes 8-bit pixel data), then standardize
    img = img / 255.0
    gray_img = (img - img_mean) / img_std

    # 0.21875 = 224 / 1024, so this assumes a 1024 x 1024 source image
    gray_img = rescale(gray_img, 0.21875, anti_aliasing=True)

    # Copy the grayscale image into three identical channels: (224, 224, 3)
    proc_img = np.stack([gray_img] * 3, axis=-1)

    # Add the batch dimension; reshape raises an error if the sizes do not match img_size
    proc_img = proc_img.reshape(img_size)

    return proc_img
  
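def load_model(model_path, weight_path):
    """Load our trained model architecture from JSON and attach the trained weights."""

    # The architecture is stored as JSON; the weights are stored in a separate file
    with open(model_path, 'r') as json_file:
        loaded_model_json = json_file.read()

    my_model = keras.models.model_from_json(loaded_model_json)
    my_model.load_weights(weight_path)

    # The model is not compiled here; predict() does not require compilation
    return my_model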

# This function uses our device's threshold parameter to turn the model output into a label
def predict_image(model, img, thresh):
    """Predict whether or not the image shows the presence of pneumonia using our trained model"""

    # model.predict returns one row per image in the batch; we pass a single image
    result = model.predict(img)
    predict = result[0]
    prediction = 'No Pneumonia'

    # Scores above the threshold are labeled as pneumonia
    if predict > thresh:
        prediction = 'Pneumonia'

    return prediction
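
The channel and batch handling in `preprocess_image` can be checked in isolation. The following is a minimal sketch, not the notebook's exact code: it skips the `rescale` step, uses a synthetic 224 x 224 array in place of a real X-ray, and the helper name `to_model_input` is hypothetical. It only illustrates how a standardized grayscale image becomes a `(1, 224, 224, 3)` batch.

```python
import numpy as np

def to_model_input(gray_img, img_mean, img_std):
    """Standardize a 2-D grayscale image (values assumed to be in [0, 1])
    and shape it as a (1, H, W, 3) batch."""
    standardized = (gray_img - img_mean) / img_std
    # Repeat the single channel three times along a new last axis.
    rgb_like = np.stack([standardized] * 3, axis=-1)
    # Add the leading batch dimension that model.predict expects.
    return rgb_like[np.newaxis, ...]

# Synthetic stand-in for an already-resized X-ray (assumption, not real data).
fake_img = np.full((224, 224), 0.53)
batch = to_model_input(fake_img, img_mean=0.53, img_std=0.24)
print(batch.shape)  # (1, 224, 224, 3)
```

The scale factor used in the notebook follows the same arithmetic: 1024 * 0.21875 = 224, so the hard-coded value only produces a 224 x 224 image when the source image is 1024 x 1024.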

In [5]:
test_dicoms = ['test1.dcm','test2.dcm','test3.dcm','test4.dcm','test5.dcm','test6.dcm']

model_path = 'my_model.json'
weight_path = 'xray_class_my_model.best.hdf5'

IMG_SIZE=(1,224,224,3) # This might be different if you did not use vgg16
img_mean = 0.53
img_std = 0.24

my_model = load_model(model_path, weight_path)
thresh = 0.6703287

## Use the .dcm files to test your prediction
for dicom_file in test_dicoms:

    img = check_dicom(dicom_file)

    # Skip files whose attributes failed validation
    if img is None:
        continue

    img_proc = preprocess_image(img, img_mean, img_std, IMG_SIZE)
    pred = predict_image(my_model, img_proc, thresh)

    print(f'Algorithm predicts: {pred}')

Load file test1.dcm ...
Radiologist thinks: No Finding
Algorithm predicts: No Pneumonia
Load file test2.dcm ...
Radiologist thinks: Cardiomegaly
Algorithm predicts: No Pneumonia
Load file test3.dcm ...
Radiologist thinks: Effusion
Algorithm predicts: Pneumonia
Load file test4.dcm ...
Radiologist thinks: No Finding
Algorithm predicts: No Pneumonia
Load file test5.dcm ...
Radiologist thinks: No Finding
Algorithm predicts: No Pneumonia
Load file test6.dcm ...
Radiologist thinks: No Finding
Invalid patient position...


**We were able to get predictions from our trained network. Detecting pneumonia is hard even for experienced radiologists, so it is acceptable that our model is not perfect. Note that test6.dcm does not have a Patient Position value that our model was trained on. Even though it looks similar to what we expect from 'AP' or 'PA' positioned patients, it is better not to use our model to make any inference on it. We can also see that our model labeled the effusion case (test3.dcm) as pneumonia. This confusion is quite common in a clinical setting, since the distinguishing features of effusion and pneumonia are quite similar.**
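
The attribute gating described above can also be written so that it reports which attribute caused a rejection. This is a minimal sketch under stated assumptions: a plain dict stands in for the DICOM header so the example runs without pydicom, the allow-lists mirror the ones in `check_dicom`, and the names `ALLOWED` and `find_invalid_attributes` as well as the header values are made up for illustration.

```python
# Allow-lists mirroring the checks in check_dicom (names are hypothetical).
ALLOWED = {
    "Modality": {"DX", "CT"},
    "PatientPosition": {"AP", "PA"},
    "BodyPartExamined": {"CHEST", "RIBCAGE"},
}

def find_invalid_attributes(attributes):
    """Return a list of (name, value) pairs that fail the allow-lists.

    `attributes` is a plain dict standing in for a DICOM header.
    """
    problems = []
    for name, allowed_values in ALLOWED.items():
        value = attributes.get(name)  # None when the attribute is missing
        if value not in allowed_values:
            problems.append((name, value))
    return problems

# Made-up header with an unexpected patient position.
header = {"Modality": "DX", "PatientPosition": "XX", "BodyPartExamined": "CHEST"}
problems = find_invalid_attributes(header)
print(problems)  # [('PatientPosition', 'XX')]
```

An empty list means the file passed every check. Returning the failing pairs rather than printing and returning `None` makes it easier to log why a file was excluded from inference.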