1. PyTorch provides a core data structure, the tensor, which is a multidimensional array
that shares many similarities with NumPy arrays.
2. Deep learning exchanges the need to handcraft features for an increase in data and
computational requirements.
3. Data processing is needed before the training data even reaches our model.
First we need to physically get the data, most often from some sort of storage that serves as the data source.
Then we need to convert each sample from our data into something PyTorch can actually handle: tensors. The bridge between our custom data (in whatever format it might be) and a standardized
PyTorch tensor is the Dataset class that PyTorch provides in torch.utils.data.Dataset.
https://github.com/pytorch/pytorch/tree/master/torch/utils/data

<b>torch.utils.data.Dataset class</b><br>
torch.utils.data.Dataset is an abstract class representing a dataset. Your custom dataset should inherit from Dataset and override the following magic methods:<br>
**__len__** so that len(dataset) returns the size of the dataset.<br>
**__getitem__** to support indexing, so that dataset[i] returns the i-th sample.
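
As a minimal sketch of the two methods described above, the following wraps an in-memory tensor of images and a tensor of labels. The class name `TensorImageDataset`, the synthetic data, and the label values are assumptions made for illustration; they are not part of the original notebook.

```python
import torch
from torch.utils.data import Dataset


class TensorImageDataset(Dataset):
    """Hypothetical dataset wrapping an N x C x H x W tensor and its labels."""

    def __init__(self, images, labels):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels

    def __len__(self):
        # len(dataset) returns the number of samples
        return len(self.images)

    def __getitem__(self, i):
        # dataset[i] returns the i-th (image, label) pair
        return self.images[i], self.labels[i]


# Synthetic stand-in data: four 3 x 8 x 8 images with made-up labels
images = torch.zeros(4, 3, 8, 8, dtype=torch.uint8)
labels = torch.tensor([0, 1, 0, 1])

dataset = TensorImageDataset(images, labels)
sample_image, sample_label = dataset[2]
```

A dataset written this way can be handed to torch.utils.data.DataLoader for batching and shuffling.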

In [1]:
import torch
import imageio

We’ll use imageio throughout the notebook because it handles different data types with a uniform API. For many purposes, using **TorchVision** is a great default choice to deal with image and video data. We go with imageio here for somewhat lighter exploration.

In [2]:
img_arr = imageio.imread('D:/Research/Pytorch/images/0.jpg')
img_arr.shape, type(img_arr) 

((1558, 1722, 3), imageio.core.util.Array)

In [7]:
img = torch.from_numpy(img_arr)
# Change the layout with permute. Given an input tensor of shape H × W × C as obtained previously,
# we get the C × H × W layout by putting dimension 2 first, followed by dimensions 0 and 1.
out = img.permute(2, 0, 1)
out.shape, type(out)

(torch.Size([3, 1558, 1722]), torch.Tensor)
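
The effect of permute can be checked on a small synthetic array, as sketched below. The 4 × 5 × 3 array is an assumption standing in for a decoded image. The sketch also shows that permute returns a view, not a copy, so writing through the permuted tensor changes the original data.

```python
import numpy as np
import torch

# Synthetic H x W x C image (4 x 5 x 3) standing in for a decoded file
img_arr = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
img = torch.from_numpy(img_arr)

out = img.permute(2, 0, 1)  # now C x H x W

# The pixel at row 1, column 2, channel 0 sits at index [0, 1, 2] after permute
same_value = bool(img[1, 2, 0] == out[0, 1, 2])

# permute returns a view: writing through `out` also changes `img`
# (and `img_arr`, because from_numpy shares memory with the NumPy array)
out[0, 0, 0] = 255
shares_storage = img[0, 0, 0].item() == 255
```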

To create a dataset of multiple images to use as input for our neural networks, we store the images in a batch along the first dimension to obtain an N × C × H × W tensor. As a slightly more efficient alternative to using stack to build up the tensor, we can preallocate a tensor of the appropriate size and fill it with images loaded from a directory, like so:

In [11]:
batch_size = 8
batch = torch.zeros(batch_size, 3, 250, 250, dtype=torch.uint8)
batch

tensor([[[[0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          ...,
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0]],

         [[0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          ...,
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0]],

         [[0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          ...,
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0]]],


        [[[0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          ...,
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0]],

         [[0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0

In [12]:
import os
import cv2

data_dir = 'D:/Research/Pytorch/images'
filenames = [name for name in os.listdir(data_dir) if os.path.splitext(name)[-1] == '.jpg']
print(filenames)
for i, filename in enumerate(filenames):
    # cv2.imread returns an H x W x C uint8 array with channels in BGR order
    img = cv2.imread(os.path.join(data_dir, filename))
    # Convert to RGB, since our network expects RGB input
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Resize the image to 250 x 250 to match the preallocated batch
    res = cv2.resize(img, dsize=(250, 250), interpolation=cv2.INTER_CUBIC)
    img_t = torch.from_numpy(res)
    img_t = img_t.permute(2, 0, 1)
    # Keep only the first three channels. cv2.imread returns three channels by default,
    # so this slice only guards against an alpha channel if the images are loaded another way.
    img_t = img_t[:3]
    batch[i] = img_t

['0.jpg', '1.jpg', '10.jpg', '2.jpg', '3.jpg', '4.jpg', '7.jpg', '9.jpg']
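
For comparison, the stack-based alternative mentioned earlier can be sketched next to the preallocation approach. The random tensors below are an assumption standing in for images loaded from disk, and the small 8 × 8 size is chosen only to keep the example light.

```python
import torch

torch.manual_seed(0)

# Synthetic C x H x W images standing in for files loaded from disk
images = [torch.randint(0, 256, (3, 8, 8)).to(torch.uint8) for _ in range(4)]

# Alternative 1: stack the list along a new first dimension
stacked = torch.stack(images)

# Alternative 2: preallocate the batch and fill it one image at a time
batch = torch.zeros(len(images), 3, 8, 8, dtype=torch.uint8)
for i, img_t in enumerate(images):
    batch[i] = img_t
```

Both routes produce the same N × C × H × W tensor. Preallocation avoids holding every image in a Python list before the batch is built.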


In [13]:
batch

tensor([[[[ 41,  39,  41,  ...,   3,   3,   3],
          [ 37,  37,  40,  ...,   3,   3,   3],
          [ 41,  41,  40,  ...,  19,   2,   3],
          ...,
          [  0,   0,   0,  ...,   0,   0,   0],
          [  0,   0,   0,  ...,   0,   0,   0],
          [  0,   0,   0,  ...,   0,   0,   0]],

         [[ 41,  39,  41,  ...,   3,   3,   3],
          [ 37,  37,  40,  ...,   3,   3,   3],
          [ 41,  41,  40,  ...,  19,   2,   3],
          ...,
          [  0,   0,   0,  ...,   0,   0,   0],
          [  0,   0,   0,  ...,   0,   0,   0],
          [  0,   0,   0,  ...,   0,   0,   0]],

         [[ 41,  39,  41,  ...,   3,   3,   3],
          [ 37,  37,  40,  ...,   3,   3,   3],
          [ 41,  41,  40,  ...,  19,   2,   3],
          ...,
          [  0,   0,   0,  ...,   0,   0,   0],
          [  0,   0,   0,  ...,   0,   0,   0],
          [  0,   0,   0,  ...,   0,   0,   0]]],


        [[[ 11,  11,  12,  ...,   1,   3,   3],
          [ 12,  11,  12,  ...,   1

**Normalization** <br>

Neural networks tend to train best when the input data ranges roughly from 0 to 1, or from -1 to 1, so a typical thing we’ll want to do is cast a tensor to floating-point and normalize the values of the pixels. Casting to floating-point is easy, but normalization is trickier,
as it depends on what range of the input we decide should lie between 0 and 1 (or -1
and 1).
1. One possibility is to just divide the values of the pixels by 255 (the maximum representable number in 8-bit unsigned).

2. Another possibility is to compute the mean and standard deviation of the input data and scale it so that the output has zero mean and unit standard deviation across each channel:



In [15]:
batch.shape

torch.Size([8, 3, 250, 250])

In [22]:
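# Approach 1: scale the pixel values to [0, 1]
# batch = batch.float()
# batch /= 255.0
# Approach 2: standardize each channel to zero mean and unit standard deviation
batch = batch.float()
n_channels = batch.shape[1]
for c in range(n_channels):
    mean = torch.mean(batch[:, c])
    std = torch.std(batch[:, c])
    batch[:, c] = (batch[:, c] - mean) / std
    # Check the result: the mean should be close to 0 and the std close to 1
    print(batch[:, c].mean())
    print(batch[:, c].std())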

tensor(-2.8906e-07)
tensor(1.)
tensor(-1.2656e-06)
tensor(1.)
tensor(-1.7188e-06)
tensor(1.0000)


In [25]:
print(batch[:,2].mean())
print(batch[:,2].std())

tensor(1.5625e-08)
tensor(1.)
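
Both normalization options can also be written without a Python loop over channels. The sketch below assumes a synthetic uint8 batch in place of the real images and reduces over the N, H and W dimensions in a single call.

```python
import torch

torch.manual_seed(0)

# Synthetic uint8 batch (N x C x H x W) standing in for the image batch
raw = torch.randint(0, 256, (8, 3, 16, 16)).to(torch.uint8)

# Option 1: scale the pixel values to [0, 1]
scaled = raw.float() / 255.0

# Option 2: per-channel standardization, reducing over N, H and W.
# keepdim=True leaves shape 1 x C x 1 x 1, which broadcasts against the batch.
x = raw.float()
mean = x.mean(dim=(0, 2, 3), keepdim=True)
std = x.std(dim=(0, 2, 3), keepdim=True)
standardized = (x - mean) / std
```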


NOTE:  Here, we normalized just a single batch of images because we do not
know yet how to operate on an entire dataset. In working with images, it is good
practice to compute the mean and standard deviation on all the training data
in advance and then subtract and divide by these fixed, precomputed quantities.
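
One way to follow this practice is to accumulate per-channel sums over the training data once and reuse the resulting statistics for every batch. The sketch below is written under assumptions: `channel_stats` and `normalize` are hypothetical helper names, the training data is synthetic, and the standard deviation is the population version (dividing by the element count), which differs slightly from the default of torch.std.

```python
import torch


def channel_stats(batches):
    """Per-channel mean and std over an iterable of N x C x H x W batches."""
    count = 0
    total = None
    total_sq = None
    for b in batches:
        b = b.double()  # accumulate in float64 to limit rounding error
        s = b.sum(dim=(0, 2, 3))
        sq = (b * b).sum(dim=(0, 2, 3))
        total = s if total is None else total + s
        total_sq = sq if total_sq is None else total_sq + sq
        count += b.shape[0] * b.shape[2] * b.shape[3]
    if count == 0:
        raise ValueError("no data to compute statistics from")
    mean = total / count
    # Population variance: E[x^2] - E[x]^2, clamped against tiny negative values
    var = (total_sq / count - mean ** 2).clamp(min=0)
    return mean.float(), var.sqrt().float()


def normalize(batch, mean, std):
    """Subtract and divide by fixed, precomputed per-channel statistics."""
    return (batch - mean[None, :, None, None]) / std[None, :, None, None]


torch.manual_seed(0)
# Synthetic training data: five float batches of shape 8 x 3 x 16 x 16
train_batches = [torch.rand(8, 3, 16, 16) for _ in range(5)]

mean, std = channel_stats(train_batches)
all_train = torch.cat(train_batches)
normalized = normalize(all_train, mean, std)
```

The same `mean` and `std` would then be applied unchanged to validation and test batches, so that all data passes through an identical transformation.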