<a href="https://colab.research.google.com/github/ajayrfhp/LearningDeepLearning/blob/main/pooling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install d2l


## Pooling
- A pooling operator condenses the information in a local window into a single value.
- A pooling layer has no learnable parameters.
- Max pooling is generally preferred over average pooling.
   - An object may appear in only part of a window. Averaging weights the positions where the object is present and absent equally, which dilutes the signal, whereas max pooling keeps the strongest response.
- Deep learning frameworks typically default the stride to the pooling window size, so windows do not overlap.
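
As a rough illustration of the points above, the sketch below applies PyTorch's `nn.MaxPool2d` and `nn.AvgPool2d` to a small feature map with a single strong activation. The toy input `feat` is made up for this example. It also checks that the pooling layer holds no learnable parameters and that the stride falls back to the window size when it is not given.

```python
import torch
from torch import nn

# Hypothetical 4x4 feature map: one strong activation, zeros elsewhere.
feat = torch.zeros(1, 1, 4, 4)
feat[0, 0, 1, 1] = 8.0

max_pool = nn.MaxPool2d(kernel_size=2)  # stride not given, so it defaults to kernel_size
avg_pool = nn.AvgPool2d(kernel_size=2)

max_out = max_pool(feat)  # keeps the peak value of each 2x2 window
avg_out = avg_pool(feat)  # dilutes the peak with the surrounding zeros

print(max_out)
print(avg_out)

# Pooling layers have nothing to learn.
n_params = sum(p.numel() for p in max_pool.parameters())
print(n_params, max_pool.stride)
```

In the top-left window, max pooling keeps the activation at 8 while average pooling reduces it to 2, which is the dilution described in the list above.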

In [None]:
import torch
from torch import nn
import torchvision
from d2l import torch as d2l
import numpy as np

In [None]:
x = torch.tensor(([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]]))

def pool(x, p, padding_type='same'):
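  """
  Max pooling with stride 1.
  Args
    x - (c, h, w) or (h, w) tensor
    p - int pooling size
    padding_type - 'same' keeps (h, w); windows at the bottom/right edge are truncated
                   'valid' only uses full windows
  Returns
    o - (c, h, w) for 'same', (c, h-p+1, w-p+1) for 'valid'
  """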
  if x.ndim == 2:
    x = x.unsqueeze(dim=0)
  c, h, w = x.shape[0], x.shape[1], x.shape[2]
  o = torch.zeros((c, h, w))
  if padding_type == 'valid':
    o = torch.zeros((c, h-p+1, w-p+1))

  for i in range(o.shape[1]):
    for j in range(o.shape[2]):
      # Max over the spatial window only, so each channel is pooled separately
      o[:, i, j] = x[:, i:i+p, j:j+p].amax(dim=(1, 2))

  return o

pool(x, 2)

tensor([[[5., 6., 6.],
         [8., 9., 9.],
         [8., 9., 9.]]])

In [None]:
pool(x, 2, padding_type='valid')

tensor([[[5., 6.],
         [8., 9.]]])

In [None]:
x3d = torch.arange(0, 27).reshape((3, 3, 3))
pool(x3d,p=2)

tensor([[[ 4.,  5.,  5.],
         [ 7.,  8.,  8.],
         [ 7.,  8.,  8.]],

        [[13., 14., 14.],
         [16., 17., 17.],
         [16., 17., 17.]],

        [[22., 23., 23.],
         [25., 26., 26.],
         [25., 26., 26.]]])

In [None]:
x3d

tensor([[[ 0,  1,  2],
         [ 3,  4,  5],
         [ 6,  7,  8]],

        [[ 9, 10, 11],
         [12, 13, 14],
         [15, 16, 17]],

        [[18, 19, 20],
         [21, 22, 23],
         [24, 25, 26]]])

In [None]:
nn.MaxPool2d(kernel_size=3)(x3d)

tensor([[[ 8]],

        [[17]],

        [[26]]])

- For a pooling layer, the number of output channels equals the number of input channels. Each channel is pooled independently; we typically do not pool across channels.
- Pooling is simple and works reasonably well, but using self-attention to learn weights over the inputs often works better.
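
A small check of the channel behaviour described above, assuming a made-up input `inp` with 3 channels. It compares the joint result with pooling one channel in isolation; this is a sketch, not a proof for every layer configuration.

```python
import torch
from torch import nn

# Assumed toy input: batch of 1, 3 channels, 4x4 spatial size.
inp = torch.arange(48, dtype=torch.float32).reshape(1, 3, 4, 4)

pool_layer = nn.MaxPool2d(kernel_size=2)
out = pool_layer(inp)

# The channel count is unchanged while height and width are halved.
print(inp.shape, out.shape)

# Pooling channel 1 alone matches channel 1 of the joint result,
# i.e. channels do not interact.
ch1_alone = pool_layer(inp[:, 1:2])
channels_independent = torch.equal(out[:, 1:2], ch1_alone)
print(channels_independent)
```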

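- What is the computational cost of pooling?
  - Input is $(c, h, w)$ and the pooling window is $p \times p$.
  - With stride 1 and no padding (valid), the output is $(c, h-p+1, w-p+1)$ and each output reads $p^2$ inputs, so the cost is $c \cdot (h-p+1) \cdot (w-p+1) \cdot p^2$.
  - With stride 1 and same padding, the output is $(c, h, w)$, so the cost is $c \cdot h \cdot w \cdot p^2$.
  - With stride $p$ in both directions, the output is about $(c, h/p, w/p)$, so the cost is about $c \cdot (h/p) \cdot (w/p) \cdot p^2 = c \cdot h \cdot w$.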
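
To make the cost estimate concrete, here is a minimal sketch that counts how many input elements a naive pooling loop visits. The helper `pooling_window_reads` is hypothetical, and it assumes every output position reads a full `p x p` window, including padded positions when `same=True`.

```python
def pooling_window_reads(c, h, w, p, stride=1, same=False):
    """Count input elements visited by a naive p x p pooling loop.

    Hypothetical helper for illustration. With same=True the output keeps
    the (h, w) size, the stride is taken to be 1, and padded positions are
    counted as reads.
    """
    if same:
        out_h, out_w = h, w
    else:
        out_h = (h - p) // stride + 1
        out_w = (w - p) // stride + 1
    return c * out_h * out_w * p * p

c, h, w, p = 3, 8, 8, 2
valid_cost = pooling_window_reads(c, h, w, p)              # stride 1, no padding
same_cost = pooling_window_reads(c, h, w, p, same=True)    # stride 1, same padding
strided_cost = pooling_window_reads(c, h, w, p, stride=p)  # non-overlapping windows
print(valid_cost, same_cost, strided_cost)
```

With non-overlapping windows every input element is read exactly once, so the strided count equals `c * h * w`.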

- Relu(x) = Max(0, x). Implement Max(a, b) using Relu(x)
  - Total = a + b
  - Relu(a-b) + Relu(b-a) = |a-b|, so Smaller = (Total - Relu(a-b) - Relu(b-a))/2
  - Larger = Total - Smaller
  - Equivalently, Max(a, b) = b + Relu(a-b)
- Max(a1 ... an) can be implemented with n-1 pairwise Max calls in a single pass
  - max = a1
  - for i in 2 ... n:
    - max = Max(max, ai)
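
The ReLU construction above can be sketched in plain Python. The function names `relu`, `max_via_relu` and `max_of_list` are chosen for this example, and the list version simply folds the pairwise maximum over the sequence.

```python
from functools import reduce

def relu(x):
    """ReLU on a plain Python number."""
    return x if x > 0 else 0.0

def max_via_relu(a, b):
    """max(a, b) built from sums and ReLU, following the steps above."""
    total = a + b
    smaller = (total - relu(a - b) - relu(b - a)) / 2
    return total - smaller

def max_of_list(values):
    """Fold the pairwise max over a non-empty sequence (n - 1 pairwise calls)."""
    return reduce(max_via_relu, values)

print(max_via_relu(3.0, 7.0))
print(max_via_relu(-2.0, -5.0))
print(max_of_list([4.0, -1.0, 9.0, 2.5]))
```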