In [1]:
import numpy as np
import torch
import imageio
import os

### Reading a Single Image
We will use the *Imageio* library to load images. *TorchVision* is also a great default for image loading.

In [2]:
img_arr = imageio.imread('../chap04/data/pug_img.jpg')
img_arr.shape

(1070, 1500, 3)

Imageio's `imread` function returns a NumPy array-like object that we can convert to a tensor. However, this array is laid out as HEIGHT x WIDTH x CHANNELS (RGB), whereas PyTorch expects image tensors to be laid out as CHANNELS x HEIGHT x WIDTH.

### Changing the Image Array's Layout
We can use the PyTorch tensor method `permute` to reorder the dimensions of a tensor.

In [3]:
img = torch.from_numpy(img_arr)
img = img.permute(2, 0, 1)  # new order: channels (old dim 2), height (old dim 0), width (old dim 1)
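To see what `permute` does without needing an image file, the sketch below applies it to a small synthetic array. The 4 x 5 size and the `fake_*` names are arbitrary stand-ins, not the pug image. It also illustrates that `permute` returns a view that shares memory with the original array rather than a copy.

```python
import numpy as np
import torch

# Synthetic stand-in for an image: height 4, width 5, 3 channels (H x W x C)
fake_img_arr = np.zeros((4, 5, 3), dtype=np.uint8)

fake_img = torch.from_numpy(fake_img_arr)   # shares memory with the NumPy array
fake_chw = fake_img.permute(2, 0, 1)        # reorder to C x H x W

print(fake_img.shape)   # torch.Size([4, 5, 3])
print(fake_chw.shape)   # torch.Size([3, 4, 5])

# Writing through the permuted view changes the original array as well
fake_chw[0, 1, 2] = 255                     # channel 0, row 1, column 2
print(fake_img_arr[1, 2, 0])                # 255
```

If an independent, contiguous copy is needed, calling `.contiguous()` on the permuted tensor produces one.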

### Reading Multiple Images
We have three images of cats. Rather than creating three separate tensors and then combining them with `stack`, we can allocate one tensor of the appropriate size up front and fill it image by image. This avoids keeping every individual image tensor around until a final copy is made.

In [4]:
# list the PNG files first so the batch size follows the number of images found
data_dir = '../chap04/data/cats/'
file_names = [name for name in os.listdir(data_dir) if os.path.splitext(name)[-1] == '.png']
batch_size = len(file_names)


In [5]:
# preallocate the holder tensor (assumes every image is 256 x 256 pixels)
batch = torch.zeros(batch_size, 3, 256, 256, dtype=torch.uint8)  # samples, channels, height, width

In [27]:
for i, filename in enumerate(file_names):
    img_arr = imageio.imread(os.path.join(data_dir, filename))
    img_t = torch.from_numpy(img_arr)
    img_t = img_t.permute(2, 0, 1)  # reorder to PyTorch's C x H x W layout

    # PNG images sometimes include an alpha channel; keep only the first three (RGB) channels
    img_t = img_t[:3]
    batch[i] = img_t  # copy the image into its slot in the holder tensor

batch = batch.float()  # convert from uint8 to float32 for the normalization step
batch.dtype

torch.float32
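The two approaches discussed in this section, filling a preallocated tensor and calling `stack` on a list of tensors, produce the same batch. The sketch below checks this on synthetic data. The random `fake_images` list is a stand-in for the cat images, which are assumed here to be 256 x 256 RGB.

```python
import torch

torch.manual_seed(0)

# Synthetic stand-ins for three already-permuted RGB images (C x H x W, uint8)
fake_images = [
    torch.randint(0, 256, (3, 256, 256)).to(torch.uint8) for _ in range(3)
]

# Approach used in this notebook: preallocate, then fill slot by slot
filled = torch.zeros(len(fake_images), 3, 256, 256, dtype=torch.uint8)
for i, img_t in enumerate(fake_images):
    filled[i] = img_t

# Alternative: stack the list of tensors along a new leading dimension
stacked = torch.stack(fake_images)

print(filled.shape)                  # torch.Size([3, 3, 256, 256])
print(torch.equal(filled, stacked))  # True
```

With `stack`, all individual tensors must exist at the same time as the stacked result, whereas the preallocated version only needs one image in memory beyond the batch itself.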

### Normalizing the Data
Neural networks tend to train best when the input data is scaled to roughly the range 0 to 1 or -1 to 1. A common alternative is to compute the mean and standard deviation of the input data for each channel, then subtract the mean and divide by the standard deviation so that each channel has zero mean and unit standard deviation.

*Note:* Standardizing this way does not confine the values to a fixed range such as 0 to 1. Instead, each channel ends up with a *mean* of 0 and a *standard deviation* of 1.
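For the 0 to 1 range mentioned above, a minimal sketch for 8-bit images is to divide by 255, the largest value a `uint8` can hold. The `fake_batch` tensor and its sizes are hypothetical and only serve to make the example self-contained.

```python
import torch

# Synthetic uint8 batch: 2 images, 3 channels, 4 x 4 pixels (hypothetical sizes).
# Channel 0 is all 0, channel 1 is all 128, channel 2 is all 255.
fake_batch = (
    torch.tensor([0, 128, 255], dtype=torch.uint8)
    .view(1, 3, 1, 1)
    .expand(2, 3, 4, 4)
)

scaled = fake_batch.float() / 255.0   # uint8 [0, 255] -> float32 [0.0, 1.0]

print(scaled.dtype)                               # torch.float32
print(scaled.min().item(), scaled.max().item())   # 0.0 1.0
```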

In [28]:
n_channels = batch.shape[1]  # number of channels (3 for RGB)

for c in range(n_channels):
    mean = torch.mean(batch[:, c])  # mean of channel c over all images and pixels
    std = torch.std(batch[:, c])  # standard deviation of channel c over all images and pixels
    batch[:, c] = (batch[:, c] - mean) / std  # standardize channel c in place

print(batch.mean(), batch.std())

tensor(-2.8147e-08) tensor(1.0000)
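The per-channel loop can also be written without an explicit loop by computing all channel statistics at once and relying on broadcasting. The following is a sketch on synthetic data, where `fake_batch` stands in for the cat images. Unlike the loop, it builds a new tensor instead of modifying the batch in place.

```python
import torch

torch.manual_seed(0)

# Synthetic float batch standing in for the cat images: N x C x H x W
fake_batch = torch.rand(3, 3, 256, 256) * 255.0

n_channels = fake_batch.shape[1]

# Gather all values of each channel into one row: C x (N*H*W)
per_channel = fake_batch.transpose(0, 1).reshape(n_channels, -1)
mean = per_channel.mean(dim=1)   # one mean per channel
std = per_channel.std(dim=1)     # one standard deviation per channel

# Reshape to 1 x C x 1 x 1 so the statistics broadcast over N, H and W
normalized = (fake_batch - mean.view(1, -1, 1, 1)) / std.view(1, -1, 1, 1)

# Each channel should now have a mean close to 0 and a std close to 1
for c in range(n_channels):
    print(c, normalized[:, c].mean().item(), normalized[:, c].std().item())
```

As in the notebook's own output, the means come out very close to zero rather than exactly zero because of floating-point rounding.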
