# Save RSNA dataset as 1024x1024 PNGs.

This notebook is adapted (with minor modifications) from:  

https://www.kaggle.com/code/christofhenkel/se-resnext50-full-gpu-decoding

The notebook uses DALI for fast GPU processing of DICOM images and should run in around 6 hours. The resulting dataset is stored in:  

https://www.kaggle.com/datasets/lucasrr/rsna-1024x1024-pngs-small


## Introduction

This kernel uses a recent pip wheel of DALI to decode DICOMs on the GPU. It works for all JPEG 2000 images and most JPEG-lossless images.

The decoding work is strongly based on the kernels of Theo Viel (@theoviel) and David Austin (@tivfrvqhs5).

***WARNING***: Although GPU decoding works for all train images, a few of the JPEG-lossless DICOMs (TransferSyntaxUID == '1.2.840.10008.1.2.4.70') in the hidden test set cannot be decoded. It is therefore crucial to have a CPU fallback in place so that the notebook won't throw an exception in the submission re-run.
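
One way to structure such a fallback is sketched below. This is a minimal illustration of the control flow, not code from the original kernel: `decode_with_fallback`, `gpu_decode`, and `cpu_decode` are hypothetical names, and the stand-in decoders only simulate a failure. In practice the CPU decoder could be a pydicom- or dicomsdl-based reader.

```python
from typing import Callable, Optional, Tuple


def decode_with_fallback(
    path: str,
    gpu_decode: Callable[[str], object],
    cpu_decode: Callable[[str], object],
) -> Tuple[Optional[object], str]:
    """Try the fast GPU decoder first; fall back to the CPU decoder on any error.

    Returns (image, backend), where backend is "gpu", "cpu", or "failed".
    """
    try:
        return gpu_decode(path), "gpu"
    except Exception as gpu_error:
        print(f"GPU decoding failed for {path}: {gpu_error}; trying CPU")
    try:
        return cpu_decode(path), "cpu"
    except Exception as cpu_error:
        print(f"CPU decoding failed for {path}: {cpu_error}")
        return None, "failed"


# Stand-in decoders, used only to demonstrate the control flow.
def fake_gpu_decode(path: str) -> str:
    if path.endswith("lossless.dcm"):
        raise RuntimeError("unsupported bitstream")
    return f"gpu:{path}"


def fake_cpu_decode(path: str) -> str:
    return f"cpu:{path}"


print(decode_with_fallback("a_jpeg2000.dcm", fake_gpu_decode, fake_cpu_decode))
print(decode_with_fallback("b_lossless.dcm", fake_gpu_decode, fake_cpu_decode))
```

Returning the backend name alongside the image makes it easy to count how often the fallback was needed.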

## Requirements

We start by installing the pip requirements.

In [1]:
!pip install -q timm==0.6.5 --no-index --find-links=/kaggle/input/rsna-bc-pip-requirements
!pip install -q albumentations==1.2.1 --no-index --find-links=/kaggle/input/rsna-bc-pip-requirements
!pip install -q pylibjpeg-libjpeg==1.3.1 --no-index --find-links=/kaggle/input/rsna-bc-pip-requirements
!pip install -q pydicom==2.0.0 --no-index --find-links=/kaggle/input/rsna-bc-pip-requirements
!pip install -q python-gdcm==3.0.20 --no-index --find-links=/kaggle/input/rsna-bc-pip-requirements
!pip install -q dicomsdl==0.109.1 --no-index --find-links=/kaggle/input/rsna-bc-pip-requirements

Then we install a nightly DALI wheel, which we will use for GPU decoding.

In [2]:
!pip install -q /kaggle/input/nvidia-dali-nightly-cuda110-1230dev/nvidia_dali_nightly_cuda110-1.23.0.dev20230203-7187866-py3-none-manylinux2014_x86_64.whl

Next, we import all the packages we need and patch a function to allow for INT16 support.

In [3]:
import timm
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os, sys
from copy import copy
import gc
import shutil 
import time

import glob
from scipy.special import expit

import albumentations as A
import cv2
cv2.setNumThreads(0)

import dicomsdl
import pydicom
from pydicom.filebase import DicomBytesIO

from os.path import join

from tqdm import tqdm

from joblib import Parallel, delayed
import multiprocessing as mp

from types import SimpleNamespace
from typing import Any, Dict

import torch
import torch.nn.functional as F
from torch import nn
from torch.nn.parameter import Parameter
from torch.utils.data import Dataset, DataLoader
from torch.cuda.amp import GradScaler, autocast


import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali import pipeline_def
from nvidia.dali.types import DALIDataType

In [4]:
# We need to patch DALI for INT16 support

from nvidia.dali.backend import TensorGPU, TensorListGPU
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
from nvidia.dali import types
from nvidia.dali.plugin.base_iterator import _DaliBaseIterator
from nvidia.dali.plugin.base_iterator import LastBatchPolicy
import torch
import torch.utils.dlpack as torch_dlpack
import ctypes
import numpy as np
import torch.nn.functional as F
import pydicom

to_torch_type = {
    types.DALIDataType.FLOAT:   torch.float32,
    types.DALIDataType.FLOAT64: torch.float64,
    types.DALIDataType.FLOAT16: torch.float16,
    types.DALIDataType.UINT8:   torch.uint8,
    types.DALIDataType.INT8:    torch.int8,
    types.DALIDataType.UINT16:  torch.int16,  # patched entry: UINT16 data is copied into an int16 tensor
    types.DALIDataType.INT16:   torch.int16,
    types.DALIDataType.INT32:   torch.int32,
    types.DALIDataType.INT64:   torch.int64
}


def feed_ndarray(dali_tensor, arr, cuda_stream=None):
    """
    Copy contents of DALI tensor to PyTorch's Tensor.

    Parameters
    ----------
    `dali_tensor` : nvidia.dali.backend.TensorCPU or nvidia.dali.backend.TensorGPU
                    Tensor from which to copy
    `arr` : torch.Tensor
            Destination of the copy
    `cuda_stream` : torch.cuda.Stream, cudaStream_t or any value that can be cast to cudaStream_t.
                    CUDA stream to be used for the copy
                    (if not provided, an internal user stream will be selected)
                    In most cases, using pytorch's current stream is expected (for example,
                    if we are copying to a tensor allocated with torch.zeros(...))
    """
    dali_type = to_torch_type[dali_tensor.dtype]

    assert dali_type == arr.dtype, ("The element type of DALI Tensor/TensorList"
                                    " doesn't match the element type of the target PyTorch Tensor: "
                                    "{} vs {}".format(dali_type, arr.dtype))
    assert dali_tensor.shape() == list(arr.size()), \
        ("Shapes do not match: DALI tensor has size {0}, but PyTorch Tensor has size {1}".
            format(dali_tensor.shape(), list(arr.size())))
    cuda_stream = types._raw_cuda_stream(cuda_stream)

    # turn raw int to a c void pointer
    c_type_pointer = ctypes.c_void_p(arr.data_ptr())
    if isinstance(dali_tensor, (TensorGPU, TensorListGPU)):
        stream = None if cuda_stream is None else ctypes.c_void_p(cuda_stream)
        dali_tensor.copy_to_external(c_type_pointer, stream, non_blocking=True)
    else:
        dali_tensor.copy_to_external(c_type_pointer)
    return arr
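
The `to_torch_type` table above maps `UINT16` to `torch.int16`. Assuming the copy into the PyTorch tensor is a raw byte copy, each pixel's 16 bits are then read as a signed value. The pure-Python sketch below shows the consequence: values up to 32767 are unchanged, while larger values wrap around to negative numbers. The helper names are illustrative and not part of DALI or PyTorch.

```python
import struct


def uint16_bits_as_int16(value: int) -> int:
    """Reinterpret the 16 bits of an unsigned value as a signed 16-bit integer."""
    return struct.unpack("<h", struct.pack("<H", value))[0]


def int16_bits_as_uint16(value: int) -> int:
    """Undo the reinterpretation and recover the original unsigned value."""
    return value & 0xFFFF


for raw in (0, 1000, 32767, 32768, 65535):
    signed = uint16_bits_as_int16(raw)
    print(raw, "->", signed, "->", int16_bits_as_uint16(signed))
```

This only matters for images whose decoded pixel values exceed 32767. Whether any image in this dataset does is not checked here.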

Next, we set the main variables: the competition and data folders, the number of CPU cores, and the device. We also load the training metadata and build each image's relative DICOM path.

In [5]:
# Params

COMP_FOLDER = '/kaggle/input/rsna-breast-cancer-detection/'
DATA_FOLDER = COMP_FOLDER + 'train_images/'

N_CORES = mp.cpu_count()
MIXED_PRECISION = False
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

train_df = pd.read_csv('/kaggle/input/rsna-breast-cancer-detection/train.csv')

# file names:
train_df["fns"] = train_df['patient_id'].astype(str) + '/' + train_df['image_id'].astype(str) + '.dcm'

print(f'Len df : {len(train_df)}')
print(f'Unique patients: {train_df["patient_id"].nunique()}')

Len df : 54706
Unique patients: 11913


Next, we define the functions for GPU-based decoding with DALI and for processing the DICOM images.
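
The conversion function `convert_dicom_to_jpg` relies on one idea: search the DICOM pixel data for a known byte marker and keep everything from that offset onward. The sketch below isolates this step on synthetic bytes. `extract_bitstream` is a hypothetical helper, and returning `None` when the marker is absent is an added assumption, since the notebook's function does not handle that case explicitly.

```python
from typing import Optional

JPEG2000_MARKER = b"\x00\x00\x00\x0C"       # marker searched for in JPEG 2000 data
JPEG_LOSSLESS_MARKER = b"\xff\xd8\xff\xe0"  # marker searched for in JPEG-lossless data


def extract_bitstream(pixel_data: bytes, marker: bytes) -> Optional[bytes]:
    """Return pixel_data from the first occurrence of marker, or None if absent."""
    offset = pixel_data.find(marker)
    if offset == -1:
        return None
    return pixel_data[offset:]


# Synthetic example: 8 bytes of made-up wrapper data, then a fake image payload.
wrapper = b"\xfe\xff\x00\xe0\x00\x00\x00\x00"
payload = JPEG_LOSSLESS_MARKER + b"fake-image-bytes"
print(extract_bitstream(wrapper + payload, JPEG_LOSSLESS_MARKER) == payload)
print(extract_bitstream(b"no marker here", JPEG2000_MARKER))
```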

In [6]:
def convert_dicom_to_jpg(file, save_folder=""):
    """Extract the compressed image bitstream from a DICOM file and save it as a .jpg file.

    Only JPEG 2000 (1.2.840.10008.1.2.4.90) and JPEG lossless (1.2.840.10008.1.2.4.70)
    files are handled; files with any other transfer syntax are skipped.
    """
    patient = file.split('/')[-2]
    image = file.split('/')[-1][:-4]
    dcmfile = pydicom.dcmread(file)
    syntax = dcmfile.file_meta.TransferSyntaxUID

    if syntax == '1.2.840.10008.1.2.4.90':
        marker = b"\x00\x00\x00\x0C"  # <---- the jpeg2000 header info we're looking for
    elif syntax == '1.2.840.10008.1.2.4.70':
        marker = b"\xff\xd8\xff\xe0"  # <---- the jpeg lossless header info we're looking for
    else:
        return

    # Keep everything from the marker onward, dropping the bytes that precede it
    offset = dcmfile.PixelData.find(marker)
    with open(save_folder + f"{patient}_{image}.jpg", "wb") as binary_file:
        binary_file.write(dcmfile.PixelData[offset:])

            
@pipeline_def
def jpg_decode_pipeline(jpgfiles):
    jpegs, _ = fn.readers.file(files=jpgfiles)  # nvidia.dali.fn
    images = fn.experimental.decoders.image(jpegs, device='mixed', output_type=types.ANY_DATA, dtype=DALIDataType.UINT16)
    return images

def parse_window_element(elem):
    if type(elem)==list:
        return float(elem[0])
    if type(elem)==str:
        return float(elem)
    if type(elem)==float:
        return elem
    if type(elem)==pydicom.dataelem.DataElement:
        try:
            return float(elem[0])
        except Exception:
            # single-valued element: indexing fails, so use the value directly
            return float(elem.value)
    return None

def linear_window(data, center, width):
    lower, upper = center - width // 2, center + width // 2
    data = torch.clamp(data, min=lower, max=upper)
    return data 

def process_dicom(img, dicom):
    try:
        invert = getattr(dicom, "PhotometricInterpretation", None) == "MONOCHROME1"
    except:
        invert = False
        
    center = parse_window_element(dicom["WindowCenter"]) 
    width = parse_window_element(dicom["WindowWidth"])
        
    if center is not None and width is not None:
        img = linear_window(img, center, width)

    img = (img - img.min()) / (img.max() - img.min())
    if invert:
        img = 1 - img
    return img
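
To make the windowing and normalization steps concrete, here is a small pure-Python sketch of the same arithmetic on a handful of made-up pixel values. `window_and_normalize` is a hypothetical helper that mirrors `linear_window` followed by the min-max scaling in `process_dicom`. The guard for constant images is an addition and is not in the functions above.

```python
from typing import List


def window_and_normalize(
    pixels: List[float], center: float, width: float, invert: bool = False
) -> List[float]:
    """Clamp pixels to the window, rescale to [0, 1], and optionally invert."""
    lower, upper = center - width // 2, center + width // 2
    clamped = [min(max(p, lower), upper) for p in pixels]
    lo, hi = min(clamped), max(clamped)
    if hi == lo:
        # Constant image: avoid dividing by zero (an added guard).
        scaled = [0.0 for _ in clamped]
    else:
        scaled = [(p - lo) / (hi - lo) for p in clamped]
    return [1 - s for s in scaled] if invert else scaled


pixels = [0, 500, 1000, 1500, 4000]
print(window_and_normalize(pixels, center=1000, width=1000))
print(window_and_normalize(pixels, center=1000, width=1000, invert=True))
```

With a window of center 1000 and width 1000, values below 500 and above 1500 are clipped before scaling, so the outlier 4000 no longer compresses the rest of the range. The inverted variant corresponds to the MONOCHROME1 case.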

In [7]:
cfg = SimpleNamespace(**{})
cfg.img_size = 1024
# cfg.backbone = 'seresnext50_32x4d'
# cfg.pretrained=False
# cfg.in_channels = 1
cfg.classes = ['cancer']
cfg.batch_size = 8
cfg.data_folder = "/tmp/output/"
cfg.val_aug = A.CenterCrop(always_apply=False, p=1.0, height=cfg.img_size, width=cfg.img_size)
cfg.device = DEVICE



We will process the DICOMs in chunks so that disk space does not become an issue.

In [8]:
SAVE_SIZE = int(cfg.img_size * 1.125)  # 1152 when img_size is 1024
SAVE_FOLDER = cfg.data_folder
os.makedirs(SAVE_FOLDER, exist_ok=True)
N_CHUNKS = len(train_df["fns"]) // 2000 if len(train_df["fns"]) > 2000 else 1
CHUNKS = [(len(train_df["fns"]) / N_CHUNKS * k, len(train_df["fns"]) / N_CHUNKS * (k + 1)) for k in range(N_CHUNKS)]
CHUNKS = np.array(CHUNKS).astype(int)
JPG_FOLDER = "/tmp/jpg/"

In [9]:
print(N_CHUNKS)

27
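
The chunk boundaries can be reproduced without the dataset. The sketch below is a variant of the computation above that uses integer arithmetic instead of float division followed by a cast, an assumption made here to rule out rounding at the boundaries. `chunk_bounds` is a hypothetical helper name.

```python
from typing import List, Tuple


def chunk_bounds(n_items: int, target_size: int = 2000) -> List[Tuple[int, int]]:
    """Split range(n_items) into contiguous (start, stop) chunks of about target_size."""
    n_chunks = n_items // target_size if n_items > target_size else 1
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    return [
        (n_items * k // n_chunks, n_items * (k + 1) // n_chunks)
        for k in range(n_chunks)
    ]


bounds = chunk_bounds(54706)
sizes = [stop - start for start, stop in bounds]
print(len(bounds), bounds[0], bounds[-1])
print(sorted(set(sizes)), sum(sizes))
```

For the 54706 training images this gives 27 chunks of 2026 or 2027 files each, which together cover every file exactly once.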


In [10]:
SAVE_FOLDER = '/kaggle/working/pngs/'
os.makedirs(SAVE_FOLDER, exist_ok=True)  # does not fail if the folder already exists

In [11]:
t0 = time.time()

for i_chunk, chunk in enumerate(CHUNKS):
    print(f'chunk {i_chunk+1} of {len(CHUNKS)} chunks')
    os.makedirs(JPG_FOLDER, exist_ok=True)

    print("Converting dicom to jpg...")
    _ = Parallel(n_jobs=2)(
        delayed(convert_dicom_to_jpg)(f'{DATA_FOLDER}/{img}', save_folder=JPG_FOLDER)
        for img in train_df["fns"].tolist()[chunk[0]: chunk[1]]
    )
    print(f"jpg files saved to {JPG_FOLDER}")
    
    jpgfiles = glob.glob(JPG_FOLDER + "*.jpg")


    pipe = jpg_decode_pipeline(jpgfiles, batch_size=1, num_threads=2, device_id=0)
    pipe.build()

    for i, f in enumerate(tqdm(jpgfiles)):
        
        patient, dicom_id = f.split('/')[-1][:-4].split('_')
        dicom = pydicom.dcmread(DATA_FOLDER + f"/{patient}/{dicom_id}.dcm")
        try:
            out = pipe.run()
            # Dali -> Torch
            img = out[0][0]
            img_torch = torch.empty(img.shape(), dtype=torch.int16, device="cuda")
            feed_ndarray(img, img_torch, cuda_stream=torch.cuda.current_stream(device=0))
            img = img_torch.float()

            #apply dicom preprocessing
            img = process_dicom(img, dicom)

            #resize the torch image
            img = F.interpolate(img.view(1, 1, img.size(0), img.size(1)), (SAVE_SIZE, SAVE_SIZE), mode="bilinear")[0, 0]

            img = (img * 255).clip(0,255).to(torch.uint8).cpu().numpy()
            out_file_name = SAVE_FOLDER + f"{patient}_{dicom_id}.png"
            saved = cv2.imwrite(out_file_name, img)
            if not saved:
                print(f"Could not save {out_file_name}")
    
        except Exception as e:
            print(i, e)
            pipe = jpg_decode_pipeline(jpgfiles[i+1:], batch_size=1, num_threads=2, device_id=0)
            pipe.build()
            continue

    print(f"pngs saved to {SAVE_FOLDER}")
    print(f"Elapsed time: {time.time()-t0:.2f}s")
    
    shutil.rmtree(JPG_FOLDER)
    
print('DALI Raw image load complete')

chunk 1 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [06:01<00:00,  5.61it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 476.92s
chunk 2 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:47<00:00,  5.83it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 945.90s
chunk 3 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:33<00:00,  6.08it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 1394.89s
chunk 4 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:38<00:00,  5.98it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 1852.67s
chunk 5 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:40<00:00,  5.95it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 2313.83s
chunk 6 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:43<00:00,  5.89it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 2781.42s
chunk 7 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2027/2027 [05:36<00:00,  6.03it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 3233.08s
chunk 8 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:42<00:00,  5.91it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 3696.84s
chunk 9 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:51<00:00,  5.77it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 4175.55s
chunk 10 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:31<00:00,  6.11it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 4631.96s
chunk 11 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:42<00:00,  5.91it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 5091.04s
chunk 12 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:36<00:00,  6.02it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 5545.99s
chunk 13 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:44<00:00,  5.88it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 6005.27s
chunk 14 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2027/2027 [05:54<00:00,  5.72it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 6488.57s
chunk 15 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:53<00:00,  5.73it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 6960.08s
chunk 16 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:57<00:00,  5.67it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 7443.71s
chunk 17 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:49<00:00,  5.79it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 7916.06s
chunk 18 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:34<00:00,  6.06it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 8365.84s
chunk 19 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:34<00:00,  6.05it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 8819.79s
chunk 20 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:45<00:00,  5.87it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 9287.10s
chunk 21 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2027/2027 [05:37<00:00,  6.00it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 9741.65s
chunk 22 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:46<00:00,  5.85it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 10206.46s
chunk 23 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:56<00:00,  5.68it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 10682.46s
chunk 24 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [06:00<00:00,  5.62it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 11167.17s
chunk 25 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:46<00:00,  5.85it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 11638.95s
chunk 26 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2026/2026 [05:59<00:00,  5.64it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 12121.81s
chunk 27 of 27 chunks
Converting dicom to jpg...
jpg files saved to /tmp/jpg/


100%|██████████| 2027/2027 [05:31<00:00,  6.11it/s]


pngs saved to /kaggle/working/pngs/
Elapsed time: 12572.86s
DALI Raw image load complete
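
Because decoding errors in the loop are only printed, it can be useful to confirm afterwards that every expected PNG was written. The sketch below is one possible check and is not part of the original notebook: `find_missing_pngs` is a hypothetical helper, it assumes the `<patient>/<image>.dcm` naming used for `train_df["fns"]`, and the demonstration uses made-up IDs in a temporary folder.

```python
import os
import tempfile
from typing import Iterable, List


def find_missing_pngs(file_names: Iterable[str], png_folder: str) -> List[str]:
    """Return the expected '<patient>_<image>.png' names that are absent from png_folder."""
    existing = set(os.listdir(png_folder))
    missing = []
    for name in file_names:  # each name looks like "<patient>/<image>.dcm"
        patient, image = name[:-4].split("/")
        png_name = f"{patient}_{image}.png"
        if png_name not in existing:
            missing.append(png_name)
    return missing


# Demonstration on a temporary folder with made-up IDs.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "1_11.png"), "wb").close()
    print(find_missing_pngs(["1/11.dcm", "1/12.dcm", "2/21.dcm"], folder))
```

In this notebook the check could be called as `find_missing_pngs(train_df["fns"], SAVE_FOLDER)`.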
