# Implement and test image reader

## Imports and configuration

In [1]:
!ln -s ./../lung_cancer_detection

ln: ./lung_cancer_detection: File exists


In [2]:
from pathlib import Path

import numpy as np
import pandas as pd
from monai.config import print_config
from monai.transforms import LoadImage, LoadImaged
from monai.data.image_reader import NumpyReader, ImageReader
from lung_cancer_detection.data.image_reader import LIDCReader

pd.set_option('display.max_columns', None)

DATA_DIR = Path("/Volumes/LaCie/data/lung-cancer-detection/lidc-idri/processed")

In [3]:
%load_ext autoreload
%autoreload 2

In [4]:
print_config()

MONAI version: 0.4.0
Numpy version: 1.19.4
Pytorch version: 1.7.1
MONAI flags: HAS_EXT = False, USE_COMPILED = False
MONAI rev id: 0563a4467fa602feca92d91c7f47261868d171a1

Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: NOT INSTALLED or UNKNOWN VERSION.
scikit-image version: 0.18.0
Pillow version: 8.0.1
Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.8.2
ITK version: 5.1.2
tqdm version: 4.56.0
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: NOT INSTALLED or UNKNOWN VERSION.

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies



## Specify requirements

In [5]:
?LoadImage

Loading an image requires a `monai.data.image_reader.ImageReader` object at runtime.

In [21]:
??NumpyReader

The default `NumpyReader` does not work for our use case because we want to support the `monai` transformations, which rely on metadata from medical images, such as orientation.

In [7]:
??ImageReader

In order to create our own `ImageReader`, we have to implement three methods:

- `verify_suffix`: takes one or more filenames and returns a boolean indicating whether the file format is supported
- `read`: reads raw image data from the specified file(s)
- `get_data`: extracts the data array and metadata from the loaded image and returns both
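
To make the three methods concrete, here is a minimal sketch of a reader with the same shape. It is not the project's `LIDCReader`, and it deliberately avoids importing `monai` so that it runs on its own. The class name `SketchNpyReader`, the `meta_df` argument (a DataFrame indexed by patient ID), and the choice to return metadata as a plain dict are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, Sequence, Tuple, Union

import numpy as np
import pandas as pd


class SketchNpyReader:
    """Hypothetical reader mirroring the three-method interface.

    Assumes .npy files named after the patient ID and a metadata
    DataFrame (meta_df) indexed by that same ID.
    """

    def __init__(self, data_dir: Union[str, Path], meta_df: pd.DataFrame):
        self.data_dir = Path(data_dir)
        self.meta_df = meta_df

    def verify_suffix(self, filename: Union[str, Sequence[str]]) -> bool:
        # Accept a single name or a sequence; every name must end in .npy.
        # Only the suffix is checked, not whether the file exists.
        names = [filename] if isinstance(filename, (str, Path)) else list(filename)
        return all(Path(name).suffix == ".npy" for name in names)

    def read(self, data: str) -> Tuple[np.ndarray, pd.Series]:
        # Load the array, then look up the metadata row by file stem.
        # np.load raises FileNotFoundError if the file is missing.
        path = self.data_dir / data
        arr = np.load(path)
        meta = self.meta_df.loc[path.stem]
        return arr, meta

    def get_data(self, img: Tuple[np.ndarray, pd.Series]) -> Tuple[np.ndarray, Dict]:
        # Split the loaded pair into the array and a plain metadata dict.
        arr, meta = img
        return arr, meta.to_dict()
```

A real implementation would subclass `monai.data.image_reader.ImageReader` and would likely need to map the DataFrame columns to whatever metadata keys the downstream transforms expect.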

Several particularities of our dataset have to be considered:

- Each file is named after a patient ID (for example, `LIDC-IDRI-0001.npy`)
- Loading image and mask requires separate calls to the `ImageReader` class
- In addition to reading data from the `npy` files, we need to extract metadata from the dataframe
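
The first two points can be illustrated with a small sketch that uses only the standard library. It derives the patient ID from a filename and builds one dictionary per patient with separate image and mask entries. The helper names `patient_id` and `build_items` are hypothetical, and the `masks` folder name is an assumption: only the `images` folder appears in this notebook.

```python
from pathlib import Path
from typing import Dict, List


def patient_id(filename: str) -> str:
    # The file stem doubles as the patient ID, e.g. "LIDC-IDRI-0001".
    return Path(filename).stem


def build_items(patient_ids: List[str],
                image_dir: str = "images",
                mask_dir: str = "masks") -> List[Dict[str, str]]:
    # One dict per patient. Image and mask stay separate entries because
    # each one is loaded by its own call to the reader.
    # NOTE: the "masks" folder name is an assumption for illustration.
    return [
        {"image": f"{image_dir}/{pid}.npy", "mask": f"{mask_dir}/{pid}.npy"}
        for pid in patient_ids
    ]


items = build_items(["LIDC-IDRI-0001"])
print(items[0]["image"])             # images/LIDC-IDRI-0001.npy
print(patient_id(items[0]["mask"]))  # LIDC-IDRI-0001
```

Dictionaries of this shape are what a dictionary transform such as `LoadImaged` with `keys=["image", "mask"]` would consume, assuming the reader resolves the relative paths against its data directory.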

## Initialize image reader

In [10]:
reader = LIDCReader(DATA_DIR)

In [11]:
filename = "images/unknown.npy"
reader.verify_suffix(filename)

True

In [14]:
img, meta = reader.read("images/LIDC-IDRI-0001.npy")
print(img.shape)
print(type(meta))

(512, 512, 133)
<class 'pandas.core.series.Series'>


In [16]:
meta

StudyID                    1.3.6.1.4.1.14519.5.2.1.6279.6001.298806137288...
SeriesID                   1.3.6.1.4.1.14519.5.2.1.6279.6001.179049373636...
SliceThickness                                                      2.500000
SliceSpacing                                                        2.500000
PixelSpacing                                                        0.703125
ContrastUsed                                                            True
ImagePositionPatient                  [-166.000000, -171.699997, -10.000000]
ImageOrientationPatient    [1.000000, 0.000000, 0.000000, 0.000000, 1.000...
Rows                                                                     512
Columns                                                                  512
RescaleIntercept                                                -1024.000000
RescaleSlope                                                        1.000000
WindowCenter                                                            -600

In [18]:
meta.name

'LIDC-IDRI-0001'

In [20]:
img, meta = reader.read("images/example.npy")

FileNotFoundError: [Errno 2] No such file or directory: '/Volumes/LaCie/data/lung-cancer-detection/lidc-idri/processed/images/example.npy'