How to represent different types of data with tensors. Types covered: images, tabular, time series, text.

## Working with images

#### Loading

In [16]:
import imageio

img_arr = imageio.imread("../data/p1ch4/image-dog/bobby.jpg")
img_arr.shape # H x W x C

(1280, 855, 3)

In [17]:
import torch

img = torch.from_numpy(img_arr)
out = img.permute(2, 0, 1) # Permute to fit C x H x W dimension ordering
out.shape

# Note: permute does not copy the image data. It returns a view that shares the original
# tensor's storage; only the size and stride information changes.

torch.Size([3, 1280, 855])
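
The note about `permute` can be checked directly. The following is a minimal sketch that assumes a small synthetic tensor in place of the dog image (the values and the 4 x 5 x 3 shape are arbitrary stand-ins). It compares storage pointers and strides, then writes through the permuted view.

```python
import torch

# Synthetic stand-in for an H x W x C image (values are arbitrary).
img = torch.arange(4 * 5 * 3, dtype=torch.uint8).reshape(4, 5, 3)
out = img.permute(2, 0, 1)  # C x H x W view of the same storage

# Both tensors point at the same memory; only the strides are reordered.
shares_storage = out.data_ptr() == img.data_ptr()
print(img.stride(), out.stride())
print(shares_storage, out.is_contiguous())

# Writing through the view changes the original tensor too.
out[0, 0, 0] = 255
print(img[0, 0, 0].item())

# .contiguous() makes an independent copy laid out in C x H x W order.
out_copy = out.contiguous()
```

If an independent tensor is needed, for example before modifying it in place, calling `.contiguous()` or `.clone()` on the permuted view is one way to get it.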

In [18]:
"""
To create a dataset of multiple images to use as input for our neural networks, we store the images
in a batch along the first dimension to obtain an N x C x H x W tensor.
"""

# An efficient way to create a batch is pre-allocation followed by loading from a directory

batch_size = 3
batch = torch.zeros(batch_size, 3, 256, 256, dtype=torch.uint8)

In [19]:
import os

data_dir = '../data/p1ch4/image-cats/'
filenames = [name for name in os.listdir(data_dir)
             if os.path.splitext(name)[-1] == ".png"] # Condition ensures images used are of a desired format.

for i, filename in enumerate(filenames):
    img_arr = imageio.imread(os.path.join(data_dir, filename))
    img_t = torch.from_numpy(img_arr)
    img_t = img_t.permute(2, 0, 1)  # H x W x C -> C x H x W
    img_t = img_t[:3]  # Keep only the first three channels (RGB), dropping any alpha channel
    batch[i] = img_t
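
The pre-allocate-then-fill pattern can be tried without any image files. This is a sketch under stated assumptions: `fake_images` is a hypothetical list of random 256 x 256 RGBA arrays standing in for the decoded PNGs, so the only thing it demonstrates is the shape handling and the channel slice.

```python
import numpy as np
import torch

batch_size = 3
batch = torch.zeros(batch_size, 3, 256, 256, dtype=torch.uint8)

# Hypothetical stand-ins for decoded image files: H x W x 4 (RGBA) arrays.
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(256, 256, 4), dtype=np.uint8)
               for _ in range(batch_size)]

for i, img_arr in enumerate(fake_images):
    img_t = torch.from_numpy(img_arr).permute(2, 0, 1)  # C x H x W
    batch[i] = img_t[:3]  # keep R, G, B; drop the alpha channel

print(batch.shape)
```

Note that the assignment into `batch[i]` only works when every image already matches the pre-allocated 256 x 256 size; images of other sizes would need resizing first.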

#### Normalising

In [20]:
batch[0]

tensor([[[ 90,  91,  93,  ..., 191, 191, 191],
         [ 91,  91,  93,  ..., 191, 191, 191],
         [ 91,  92,  93,  ..., 192, 192, 192],
         ...,
         [206, 210, 213,  ..., 220, 219, 218],
         [209, 214, 214,  ..., 221, 220, 219],
         [212, 212, 212,  ..., 219, 218, 218]],

        [[108, 109, 111,  ..., 201, 201, 201],
         [109, 109, 111,  ..., 201, 201, 201],
         [109, 110, 111,  ..., 202, 202, 202],
         ...,
         [198, 202, 205,  ..., 214, 213, 212],
         [201, 206, 206,  ..., 213, 212, 211],
         [204, 204, 204,  ..., 211, 210, 210]],

        [[120, 121, 123,  ..., 210, 210, 210],
         [121, 121, 123,  ..., 210, 210, 210],
         [121, 122, 123,  ..., 211, 211, 211],
         ...,
         [198, 202, 205,  ..., 214, 213, 212],
         [201, 206, 206,  ..., 213, 212, 211],
         [204, 204, 204,  ..., 211, 210, 210]]], dtype=torch.uint8)

In [21]:
# Neural networks typically train best when input values lie roughly in the range [0, 1] or [-1, 1]

batch = batch.float()
batch /= 255.0
batch[0]

tensor([[[0.3529, 0.3569, 0.3647,  ..., 0.7490, 0.7490, 0.7490],
         [0.3569, 0.3569, 0.3647,  ..., 0.7490, 0.7490, 0.7490],
         [0.3569, 0.3608, 0.3647,  ..., 0.7529, 0.7529, 0.7529],
         ...,
         [0.8078, 0.8235, 0.8353,  ..., 0.8627, 0.8588, 0.8549],
         [0.8196, 0.8392, 0.8392,  ..., 0.8667, 0.8627, 0.8588],
         [0.8314, 0.8314, 0.8314,  ..., 0.8588, 0.8549, 0.8549]],

        [[0.4235, 0.4275, 0.4353,  ..., 0.7882, 0.7882, 0.7882],
         [0.4275, 0.4275, 0.4353,  ..., 0.7882, 0.7882, 0.7882],
         [0.4275, 0.4314, 0.4353,  ..., 0.7922, 0.7922, 0.7922],
         ...,
         [0.7765, 0.7922, 0.8039,  ..., 0.8392, 0.8353, 0.8314],
         [0.7882, 0.8078, 0.8078,  ..., 0.8353, 0.8314, 0.8275],
         [0.8000, 0.8000, 0.8000,  ..., 0.8275, 0.8235, 0.8235]],

        [[0.4706, 0.4745, 0.4824,  ..., 0.8235, 0.8235, 0.8235],
         [0.4745, 0.4745, 0.4824,  ..., 0.8235, 0.8235, 0.8235],
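         [0.4745, 0.4784, 0.4824,  ..., 0.8275, 0.8275, 0.8275],
         ...,
         [0.7765, 0.7922, 0.8039,  ..., 0.8392, 0.8353, 0.8314],
         [0.7882, 0.8078, 0.8078,  ..., 0.8353, 0.8314, 0.8275],
         [0.8000, 0.8000, 0.8000,  ..., 0.8275, 0.8235, 0.8235]]])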

In [22]:
"""
We may also want to compute the mean and standard deviation of the input data and scale it
so that the output has zero mean and unit standard deviation across each channel.

Torch provides functions for calculating these for tensors.
"""

n_channels = batch.shape[1]
for c in range(n_channels):
    mean = torch.mean(batch[:, c])
    std = torch.std(batch[:, c])
    batch[:, c] = (batch[:, c] - mean) / std
    
# NOTE: it's good practice to compute the mean and stdev on all training data in
# advance and then subtract and divide by these fixed, precomputed quantities
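
One way the precomputed-statistics idea might look in practice is sketched below. It assumes random tensors as hypothetical stand-ins for the training and validation data (`train` and `val` are not part of the notebook). The per-channel mean and standard deviation are computed once on the training data and then reused, via broadcasting, on any other batch.

```python
import torch

torch.manual_seed(0)
# Hypothetical stand-ins: random "training" and "validation" batches in [0, 1].
train = torch.rand(8, 3, 16, 16)
val = torch.rand(4, 3, 16, 16)

# Per-channel statistics over the batch, height and width dimensions.
mean = train.mean(dim=(0, 2, 3), keepdim=True)  # shape 1 x C x 1 x 1
std = train.std(dim=(0, 2, 3), keepdim=True)

# The same fixed statistics are applied to both sets.
train_norm = (train - mean) / std
val_norm = (val - mean) / std
```

Only `train_norm` is guaranteed to have zero mean and unit standard deviation per channel; `val_norm` will be close to that only if the validation data follows a similar distribution.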

### 3D images

In some domains, images come as sequences of 2D slices stacked along the head-to-foot axis, e.g. the slices in a CT scan.

By stacking individual 2D slices into a 3D tensor, we can build _volumetric data_ representing the 3D anatomy of a subject. Storing volumetric data is just like storing image data, except that an extra dimension, _depth_, comes after the standard channel dimension, resulting in a 5D tensor of shape `N x C x D x H x W`.

In [29]:
# Loading the specialised format
# Volumetric data can be downloaded from: https://github.com/deep-learning-with-pytorch/dlwpt-code/tree/master/data/p1ch4/volumetric-dicom/2-LUNG%203.0%20%20B70f-04083

import imageio
dir_path = "../data/p1ch4/volumetric-dicom/2-LUNG 3.0  B70f-04083"
vol_arr = imageio.volread(dir_path, 'DICOM')
vol_arr.shape

Reading DICOM (examining files): 99/99 files (100.0%)
  Found 1 correct series.
Reading DICOM (loading data): 99/99  (100.0%)


(99, 512, 512)
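
The loaded array has shape `D x H x W`, with no channel or batch dimension yet. A minimal sketch of getting from there to the `N x C x D x H x W` layout described above follows; it assumes a small zero-filled array as a hypothetical stand-in for `vol_arr`, so it runs without the DICOM files.

```python
import numpy as np
import torch

# Hypothetical stand-in for the loaded volume (D x H x W);
# the real array above is 99 x 512 x 512.
vol_arr = np.zeros((4, 8, 8), dtype=np.int16)

vol = torch.from_numpy(vol_arr).float()
vol = torch.unsqueeze(vol, 0)   # add a channel dimension: C x D x H x W
vol_batch = vol.unsqueeze(0)    # add a batch dimension: N x C x D x H x W

print(vol.shape, vol_batch.shape)
```

A batch of several volumes could instead be built by stacking the `C x D x H x W` tensors along a new first dimension, in the same way as the 2D image batch.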

### Representing tabular data