<div class='alert alert-info' style='text-align: center'><h1>Standardizing chest x-ray dataset exports</h1>
    - yet another chest x-ray processing notebook -
</div>

#### In this competition, the DICOMs come from multiple modalities, with different bit depths and various post-processing filters applied.
#### This causes equalization and normalization anomalies: some images are much brighter, others more blurred, and so on.

- The goal here is to make the images look a little more similar across image types.
- The important thing is to do this processing before reducing the pixels to 8-bit.
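
To see why the order matters, here is a minimal sketch on synthetic data. It assumes a 12-bit image whose useful detail sits in a narrow band of values plus two outlier pixels, and it uses a simple percentile stretch as a stand-in for the processing in this notebook. The helper names `to_uint8` and `stretch` are made up for this illustration.

```python
import numpy as np

def to_uint8(pixels):
    # Min-max scale to 0..255, then truncate to 8-bit
    pixels = pixels.astype(np.float64)
    pixels = pixels - pixels.min()
    peak = pixels.max()
    if peak > 0:
        pixels = pixels / peak
    return (pixels * 255).astype(np.uint8)

def stretch(pixels, low_pct=1, high_pct=99):
    # Clip to a percentile window and rescale to 0..1
    # (a stand-in for real processing, not the notebook's pipeline)
    pixels = pixels.astype(np.float64)
    lo, hi = np.percentile(pixels, [low_pct, high_pct])
    if hi <= lo:
        return np.zeros(pixels.shape, dtype=np.float64)
    return (np.clip(pixels, lo, hi) - lo) / (hi - lo)

# Synthetic 12-bit image: detail lives in 1000..1511, plus two outlier pixels
rng = np.random.default_rng(0)
raw = rng.integers(1000, 1512, size=(64, 64))
raw[0, 0] = 0
raw[0, 1] = 4095

# Order A: reduce to 8-bit first, then process
levels_8bit_first = np.unique(to_uint8(stretch(to_uint8(raw)))).size

# Order B: process at full depth, then reduce to 8-bit
levels_8bit_last = np.unique(to_uint8(stretch(raw))).size

print(levels_8bit_first, levels_8bit_last)
```

Order A can never show more gray levels than survived the first 8-bit conversion, which is a few dozen at most in this example. Order B spreads the same detail over most of the 8-bit range.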

Here's a notebook that exports the entire SIIM-FISABIO COVID-19 DICOM dataset with this processing applied -> https://www.kaggle.com/davidbroberts/export-processed-jpg-512

This is the dataset I created using the above notebook -> https://www.kaggle.com/davidbroberts/siimcovid-jpg-512-processed

In [None]:
# You may need to uncomment and run the conda install, then restart the notebook if GDCM still fails to load.
#!conda install gdcm -c conda-forge -y

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
from os import path
import pydicom
from skimage.filters import unsharp_mask
from skimage import exposure

In [None]:
# Display up to 15 test images in a 3x5 grid, optionally with processing applied
def get_some_images(process):

    count = 0
    base_path = "../input/siim-covid19-detection/test"
    fig = plt.figure(figsize=(15, 6))

    for study_dir in sorted(os.listdir(base_path)):
        for series_dir in os.listdir(os.path.join(base_path, study_dir)):
            series_path = os.path.join(base_path, study_dir, series_dir)
            for image in os.listdir(series_path):
                # Stop before exceeding the 15 slots of the 3x5 grid
                if count >= 15:
                    break

                img = pydicom.dcmread(os.path.join(series_path, image))
                pixels = img.pixel_array

                # MONOCHROME1 stores inverted grayscale, so flip it
                if img.PhotometricInterpretation == "MONOCHROME1":
                    pixels = np.amax(pixels) - pixels

                if process:
                    pixels = remove_borders(pixels)
                    pixels = process_image(pixels)

                # Min-max scale to 0..255; only now do we drop to 8-bit
                pixels = pixels - np.min(pixels)
                peak = np.max(pixels)
                if peak > 0:
                    pixels = pixels / peak
                pixels = (pixels * 255).astype(np.uint8)

                count += 1
                fig.add_subplot(3, 5, count)
                plt.imshow(pixels, cmap='gray')

        if count > 14:
            break
    plt.show()

In [None]:
# Apply unsharp mask and hist equalization
def process_image(pixels):

    # Tweak the radius and amount for more/less sharpening
    unsharp = unsharp_mask(pixels, radius=5, amount=2)
    equalized = exposure.equalize_hist(unsharp)
    
    return equalized
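
For intuition about what `amount` does, unsharp masking computes `original + amount * (original - blurred)`. The following is a one-dimensional sketch of that formula, not the scikit-image implementation: a box blur stands in for the Gaussian blur that `unsharp_mask` uses, `radius` here is the half-width of that box rather than a Gaussian sigma, and the name `unsharp_1d` is made up for this illustration.

```python
import numpy as np

def unsharp_1d(signal, radius=2, amount=1.0):
    # Sharpen a 1-D signal: original + amount * (original - blurred)
    signal = np.asarray(signal, dtype=np.float64)
    width = 2 * radius + 1
    kernel = np.ones(width) / width  # box blur (assumption, not Gaussian)
    # Pad with edge values so the blur does not darken the ends
    padded = np.pad(signal, radius, mode="edge")
    blurred = np.convolve(padded, kernel, mode="valid")
    return signal + amount * (signal - blurred)

# A step edge from dark (0) to bright (1)
edge = np.array([0.0] * 8 + [1.0] * 8)

soft = unsharp_1d(edge, radius=2, amount=1.0)
hard = unsharp_1d(edge, radius=2, amount=2.0)

print(soft.min(), soft.max())
print(hard.min(), hard.max())
```

Flat regions are left alone, while values next to the edge overshoot on both sides. A larger `amount` increases that overshoot, which reads as stronger sharpening.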

#### The next function crops rows and columns off the image edges when every pixel in them has the same value.

- More information about this technique can be found in this notebook I made -> https://www.kaggle.com/davidbroberts/cropping-chest-x-rays
- This notebook shows how to retain bounding box (BB) info when cropping -> https://www.kaggle.com/davidbroberts/bounding-boxes-on-cropped-images

*Note: this notebook does not save the BB info. You'll need to add that to your export notebook, or to this one.*
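
As a rough sketch of how a crop offset could be carried over to bounding boxes (the linked notebook has the author's own approach), the crop can return where its top-left corner sat in the original image. The helper names `crop_with_offsets` and `shift_boxes`, and the `(x, y, width, height)` box format, are assumptions made for this illustration.

```python
import numpy as np

def crop_with_offsets(pixels):
    # Hypothetical helper: crop uniform border rows, then uniform border
    # columns, and also return the (x0, y0) offset of the crop
    rows = np.flatnonzero(np.any(pixels != pixels[:, :1], axis=1))
    if rows.size == 0:
        return pixels, (0, 0)  # completely uniform image, nothing to crop
    y0, y1 = rows[0], rows[-1] + 1
    band = pixels[y0:y1]

    cols = np.flatnonzero(np.any(band != band[:1, :], axis=0))
    if cols.size == 0:
        return band, (0, int(y0))
    x0, x1 = cols[0], cols[-1] + 1
    return band[:, x0:x1], (int(x0), int(y0))

def shift_boxes(boxes, offset):
    # Translate (x, y, width, height) boxes into cropped-image coordinates
    x0, y0 = offset
    return [(x - x0, y - y0, w, h) for x, y, w, h in boxes]

# Tiny synthetic image: a 3x4 block of detail inside a black border
img = np.zeros((6, 8))
img[2:5, 3:7] = np.arange(1, 13).reshape(3, 4)

cropped, offset = crop_with_offsets(img)
shifted = shift_boxes([(4, 3, 2, 1)], offset)

print(cropped.shape, offset, shifted)
```

Box sizes are unchanged because cropping only translates the image. Boxes that extend past the cropped area would still need clipping, which this sketch leaves out.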

In [None]:
# Try to remove uniform borders (rows/columns where every pixel has the same value)
def remove_borders(pixels):

    # Pass 1 trims the top and bottom rows. The image is then rotated 90 degrees
    # so that pass 2 trims what were the left and right columns.
    for _ in range(2):
        h = pixels.shape[0]
        top = 0
        bottom = h

        # First row from the top that is not a single color
        for i in range(h):
            if not np.all(pixels[i] == pixels[i][0]):
                top = i
                break

        # Last row from the bottom that is not a single color (kept in the crop)
        for i in range(h - 1, -1, -1):
            if not np.all(pixels[i] == pixels[i][0]):
                bottom = i + 1
                break

        pixels = pixels[top:bottom]
        pixels = np.rot90(pixels)

    # Two 90-degree turns so far; two more restore the original orientation
    return np.rot90(pixels, 2)

#### Take a look at a sample of images without and then with processing
- The first set shows the raw pydicom pixel data with nothing applied except min-max scaling to 8-bit. This is ugly!
- We'll add processing to the same images using the `process=True` argument.

In [None]:
get_some_images(process=False)

In [None]:
get_some_images(process=True)

<div class="alert alert-info">
    <h2> - Now that's a good looking dataset!</h2>
</div>

- Setting the `process=True` flag enables border cropping, unsharp masking and histogram equalization.
- If you look at each set as a single image, the second set has more consistent gray levels and some cropping has happened.
- Notice the image in the lower right corner: it was useless before we equalized it.
- This works well on most of the images in the competition dataset. Some are already over-processed though.
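
For intuition about the equalization step, here is a simplified NumPy sketch of the idea: each pixel is replaced by the fraction of pixels that are no brighter than it. `exposure.equalize_hist` is the function this notebook actually uses, and it differs in details such as binning. The helper name `equalize_sketch` and the tiny test image are made up for illustration.

```python
import numpy as np

def equalize_sketch(pixels):
    # Replace each pixel with its cumulative distribution value (0..1)
    flat = np.asarray(pixels).ravel()
    values, inverse, counts = np.unique(
        flat, return_inverse=True, return_counts=True
    )
    cdf = np.cumsum(counts) / flat.size
    return cdf[inverse].reshape(np.shape(pixels))

# A mostly dark image with one very bright outlier pixel
img = np.array([[100, 100],
                [101, 4000]])

# Plain min-max scaling: the outlier pushes the dark pixels to almost 0
minmax = (img - img.min()) / (img.max() - img.min())

# Equalization: values become 0.5, 0.5, 0.75 and 1.0
equalized = equalize_sketch(img)

print(minmax)
print(equalized)
```

A single bright outlier is enough to squash everything else under min-max scaling, which may be part of why some unprocessed images look nearly blank.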

In [None]:
# Load a specific image manually to verify and tweak the processing settings.
img = pydicom.dcmread('../input/siim-covid19-detection/test/00d63957bc3a/07919a1b758c/dbae9b9b9500.dcm')

pixels = img.pixel_array

if img.PhotometricInterpretation == "MONOCHROME1":
    pixels = np.amax(pixels) - pixels

cropped = remove_borders(pixels)
cropped = process_image(cropped)

plt.figure(figsize=(15,5))
plt.subplot(1, 2, 1)
plt.title('Before')
plt.imshow(pixels,cmap="gray");

plt.subplot(1, 2, 2)
plt.title('After')
plt.imshow(cropped,cmap="gray");

### Conclusion

- By making a simple processing pipeline, we can create more cohesive image sets.

**Here are some other processing notebooks I made:**
- Applying filters to x-rays -> https://www.kaggle.com/davidbroberts/applying-filters-to-chest-x-rays
- Rib suppression on Chest X-Rays -> https://www.kaggle.com/davidbroberts/rib-suppression-poc
- Manual DICOM VOI LUT -> https://www.kaggle.com/davidbroberts/manual-dicom-voi-lut
- Apply Unsharp Mask to Chest X-Rays -> https://www.kaggle.com/davidbroberts/unsharp-masking-chest-x-rays
- Cropping Chest X-Rays -> https://www.kaggle.com/davidbroberts/cropping-chest-x-rays
- Bounding Boxes on Cropped Images -> https://www.kaggle.com/davidbroberts/bounding-boxes-on-cropped-images
- Visualizing Chest X-Ray bit planes -> https://www.kaggle.com/davidbroberts/visualizing-chest-x-ray-bitplanes
- DICOM full range pixels as CNN input -> https://www.kaggle.com/davidbroberts/dicom-full-range-pixels-as-cnn-input