### 15. Problem: Inconsistency of `transforms.Normalize` across platforms

#### Problem Statement
* `torchvision.transforms.Normalize` behaves differently across platforms.

* When the transform `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` is applied to images with a single color channel (e.g. grayscale images), it either runs fine or raises an error like `RuntimeError: output with shape [1, 28, 28] doesn't match the broadcast shape [3, 28, 28]`, depending on the platform you use.

This is the code snippet we are focusing on:

import torch
from torchvision import datasets, transforms

transform = transforms.Compose([transforms.ToTensor(),
                            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
                              ])

# Download and load the test data
dataset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=False, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True)

img, label = next(iter(dataloader))
img.shape

**TL;DR: The discrepancy is caused by differences in torchvision versions installed**
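
Since the behaviour depends on the installed versions, it helps to print them on each platform before comparing results. Below is a minimal sketch that uses only the standard library. The helper name `installed_version` is made up for this example, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Print the versions that matter for this investigation
for name in ("torch", "torchvision"):
    print(f"{name}: {installed_version(name)}")
```

If both packages import fine, `torch.__version__` and `torchvision.__version__` give the same information.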

#### Resources

* Three platforms are taken into consideration:
    1. **Udacity workspace** (torch 0.4.0, torchvision 0.2.1)
        <p style="font-size:20px;font-weight:bolder;text-align:center">Udacity Workspace</p>
        <img src="../../Screenshots/normalize_udacity_workspace.png">

    2. **Local machine** (torch 1.0.1, torchvision 0.2.2)
        <p style="font-size:20px;font-weight:bolder;text-align:center">Local Machine</p>
        <img src="../../Screenshots/normalize_local_machine.png">

    3. **Google Colab** (torch 1.1.0, torchvision 0.3.0)
        <p style="font-size:20px;font-weight:bolder;text-align:center">Google Colaboratory</p>
        <img src="../../Screenshots/normalize_colab.png">
    
    
* Dataset used - **Fashion MNIST**

#### Get a sample image

transform = transforms.Compose([transforms.ToTensor()])
dataset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=False, transform=transform)

img, label = dataset[0]
print(img.shape, type(img))

#### Investigate source code

Use this code to get the source code:
>>> import inspect
>>> print(inspect.getsource(transforms.transforms.Normalize)) # Normalize is a class, hence the capitalized name
>>> print(inspect.getsource(transforms.functional.normalize)) # normalize is a function, hence the lowercase name

NB: For more Python conventions on naming, see [PEP-8 Style Guide](https://www.python.org/dev/peps/pep-0008/#naming-conventions)

Let's take a deep dive into the source code for the `Normalize` class in `torchvision.transforms.transforms`:

##### 1) `transforms.transforms.Normalize`

NB: Some of the comments and parts irrelevant to our discussion have been removed for brevity and to keep the focus on the problem at hand

class Normalize(object):
    """
    Args:
        mean (sequence): Sequence of means for each channel.
        std (sequence): Sequence of standard deviations for each channel.
    """

    def __init__(self, mean, std, inplace=False):
        self.mean = mean
        self.std = std
        self.inplace = inplace

    def __call__(self, tensor):
        """
        Args:
            tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        Returns:
            Tensor: Normalized Tensor image.
        """
        return F.normalize(tensor, self.mean, self.std, self.inplace)

* **Confused about how `__init__` and `__call__` differ? See [this](https://stackoverflow.com/a/54881816/9734484) answer on Stack Overflow**


* **The values are stored as `self.mean`, `self.std` and `self.inplace` because they are instance attributes: `__init__` saves them on the object so that `__call__` can read them later**

* `tensor`, the argument to `__call__`, is the tensor on which we call `Normalize`
* The class calls `F.normalize(tensor, self.mean, self.std, self.inplace)` internally
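
To make the `__init__` / `__call__` split concrete, here is a toy sketch in plain Python. The class `Scale` is hypothetical and not part of torchvision. It only mirrors the pattern used by `Normalize`: store the configuration once, then apply it on every call.

```python
class Scale:
    """Toy callable class (hypothetical, not part of torchvision)."""

    def __init__(self, factor):
        # Runs once, when the object is created: store the configuration
        self.factor = factor

    def __call__(self, values):
        # Runs every time the object is used like a function
        return [v * self.factor for v in values]


double = Scale(2)          # calls __init__
print(double([1, 2, 3]))   # calls __call__ -> [2, 4, 6]
```

This is why a transform can be built once inside `transforms.Compose` and then be applied to every image.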

**Here `F` doesn't refer to the `torch.nn.functional` module**; it refers to `torchvision.transforms.functional`


    Inputs:
        tensor - Tensor image of size (C, H, W) to be normalized (C - Channel, H - Height, W - Width)
        mean - Sequence of means for each channel
        std - Sequence of standard deviations for each channel
        inplace - whether the operation should update the original tensor or return a new tensor
        

##### 2) `normalize` function in `transforms.functional`

As a refresher, `normalize` computes
<p style="font-size:20px;text-align:center">$\frac{x - mean}{std}$</p>

> for all x (values) in the tensor

Here, `mean` and `std` denote the mean and standard deviation to be applied
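
As a quick worked example of the formula, the sketch below applies it to single values with `mean = 0.5` and `std = 0.5`, the values used in this section. The function name `normalize_value` is made up for illustration. Since `ToTensor()` scales pixel values to the range [0, 1], these settings map that range onto [-1, 1].

```python
def normalize_value(x, mean, std):
    """Apply (x - mean) / std to a single value."""
    return (x - mean) / std


# Check the two ends and the middle of the [0, 1] range
for x in (0.0, 0.5, 1.0):
    print(x, "->", normalize_value(x, mean=0.5, std=0.5))
```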


* Now, this `normalize` function is implemented in two different ways, depending on the torchvision version

**A) Colab and Local Machine**

def normalize1(tensor, mean, std, inplace=False):
    """Normalize a tensor image with mean and standard deviation.
    Args:
        tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        mean (sequence): Sequence of means for each channel.
        std (sequence): Sequence of standard deviations for each channel.

    Returns:
        Tensor: Normalized Tensor image.
    """
    mean = torch.as_tensor(mean, dtype=torch.float32, device=tensor.device) # create mean tensor
    std = torch.as_tensor(std, dtype=torch.float32, device=tensor.device) # create standard deviation tensor
    tensor.sub_(mean[:, None, None]).div_(std[:, None, None]) # subtract mean from the tensor and divide by std (in-place operation)
    return tensor

Let's see what's happening

print(f"Image tensor shape: {img.shape}\n") # sample image (tensor) to work on

mean = (0.5, 0.5, 0.5)
std = (0.5, 0.5, 0.5)

mean = torch.as_tensor(mean, dtype=torch.float32, device=img.device) # img.device - cpu or gpu
std = torch.as_tensor(std, dtype=torch.float32, device=img.device)

print(f"mean tensor: {mean} \nshape of mean: {mean.shape}")

print(mean[:, None, None].shape)
print(mean[:, None, None])

*The code below is the part that causes the error (on some platforms)*

tensor0 = img.clone() # create a new image clone as we are going to do inplace operation

tensor0.sub_(mean[:, None, None]) 

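Here, `sub_` throws an error. `mean[:, None, None]` has shape [3, 1, 1] and the image tensor has shape [1, 28, 28], so broadcasting the two gives a result of shape [3, 28, 28]. An in-place operation has to write that result back into the original tensor, and a [3, 28, 28] result does not fit into a [1, 28, 28] tensor.

NB: By convention, in-place operations in PyTorch are postfixed with a `_`, like `.add_()` or `.mul_()`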
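
The difference between the out-of-place and the in-place version can be checked in isolation. This is a small sketch using only `torch`, with a zero tensor standing in for a Fashion MNIST image. The variable names are chosen for this example.

```python
import torch

image = torch.zeros(1, 28, 28)                        # single-channel image, like Fashion MNIST
mean = torch.tensor([0.5, 0.5, 0.5])[:, None, None]   # shape [3, 1, 1]

# Out-of-place: a new tensor of the broadcast shape is allocated
result = image - mean
print(result.shape)  # torch.Size([3, 28, 28])

# In-place: the result must fit into `image`, which it does not
try:
    image.sub_(mean)
    inplace_error = None
except RuntimeError as err:
    inplace_error = str(err)
print(inplace_error)
```

The out-of-place subtraction succeeds but silently produces three channels, while the in-place call fails with the broadcast-shape error quoted in the problem statement.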
		

**B) Udacity workspace**

def normalize2(tensor, mean, std):
    """Normalize a tensor image with mean and standard deviation.
    Args:
        tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        mean (sequence): Sequence of means for each channel.
        std (sequence): Sequence of standard deviations for each channel.

    Returns:
        Tensor: Normalized Tensor image.
    """
    for t, m, s in zip(tensor, mean, std):
        t.sub_(m).div_(s)
    return tensor

			
Here, `sub_` runs to completion without any error because of the `zip` function.
Let's see how the `zip` function works:

fruits = ['apple', 'banana', 'orange']
colors = ['red', 'yellow', 'orange']
vitamins = ['A', 'B', 'C']

for fruit, color, vitamin in zip(fruits, colors, vitamins):
    print(fruit, color, vitamin)

Now, if the arguments to `zip` are of different lengths:

fruits = ['apple', 'banana']
colors = ['red', 'yellow', 'orange']
vitamins = ['A', 'B', 'C']

for fruit, color, vitamin in zip(fruits, colors, vitamins):
    print(fruit, color, vitamin)

**The code runs to completion without errors, but it iterates only up to the length of the shortest argument**
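
To see exactly what `zip` drops, the standard library's `itertools.zip_longest` can be used for comparison. This is only an illustrative sketch and is not something torchvision does: `zip_longest` pads the shorter argument with a fill value instead of stopping early.

```python
from itertools import zip_longest

fruits = ['apple', 'banana']
colors = ['red', 'yellow', 'orange']

# zip stops at the shortest argument: 'orange' is silently dropped
pairs = list(zip(fruits, colors))
print(pairs)

# zip_longest pads the shorter argument instead, which makes the mismatch visible
padded = list(zip_longest(fruits, colors, fillvalue=None))
print(padded)
```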

Taking inspiration from the examples above

tensor1 = img.clone()

print(f"Shapes:\n\ttensor1: {tensor1.shape}, Mean: {mean.shape} , Standard Dev: {std.shape}\n")
print(f"Few values in tensor: {tensor1[0][0][:5]}\n")

for t, m, s in zip(tensor1, mean, std):
    t.sub_(m).div_(s)
        


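Here, `tensor1` has only one channel, so `zip` stops after the first iteration and only the first mean and standard deviation are used. `m` and `s` are scalars and can be broadcast to a tensor of **[28, 28]**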

<p style="font-size:20px">AND THIS IS WHY <code>Normalize</code> RAN FINE ON UDACITY'S WORKSPACE BUT NOT ON COLAB OR MY LOCAL MACHINE FOR THAT MATTER</p>

<hr>