**Removing Computational Bottlenecks Using `CONV(1x1, s:1, p:0)`**

1x1 convolutional layers serve as dimension-reduction modules: they shrink the channel count before an expensive convolution, which removes the computational bottleneck, and, when followed by an activation, they add non-linearity. For example, convolving a feature map of size 100 x 100 x $C$ with $k$ 1x1 filters produces a feature map of size 100 x 100 x $k$.
![](images/conv-1x1.png)
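
Because a 1x1 filter covers a single pixel, the operation amounts to multiplying every pixel's $C$-vector by the same $C \times k$ weight matrix. A minimal NumPy sketch of that view, assuming a channels-last layout and random placeholder weights (the function name `conv1x1_channels_last` and the sizes `C = 64`, `k = 16` are illustrative and not taken from the notes above):

```python
import numpy as np

def conv1x1_channels_last(x, w, b):
    """Apply k 1x1 filters to an (H, W, C) feature map.

    x: (H, W, C) input, w: (C, k) weights, b: (k,) biases.
    Every pixel's C-vector is mapped to a k-vector by the same matrix.
    """
    return x @ w + b

rng = np.random.default_rng(0)
C, k = 64, 16                      # illustrative channel counts
x = rng.standard_normal((100, 100, C))
w = rng.standard_normal((C, k))
b = np.zeros(k)

y = conv1x1_channels_last(x, w, b)
print(y.shape)  # (100, 100, 16)
```

The spatial size is untouched; only the last (channel) axis changes from `C` to `k`.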

---

- Suppose a convolutional layer outputs a tensor of feature maps of size ($N$, $F$, $H$, $W$), where $N$ is the batch size, $F$ is the number of convolutional filters, and $H$ and $W$ are the height and width of the feature maps. If this output is fed into a convolutional layer with $f$ 1x1 filters, padding 0 and stride 1, the output tensor has size ($N$, $f$, $H$, $W$). A 1x1 convolutional layer therefore changes only the channel dimension (the number of feature maps):
    - If $f > F$, the dimensionality is increased
    - If $f < F$, the dimensionality is decreased
    
![](images/bottleneck-comparison.png)

**Naive Convolutions** *(CS231n Assignment 2: my solution)*
![](images/im2col.png)

---
**im2col For Loop**
![](images/im2col-for-loop.png)
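
The loop in the figure handles one output position at a time. For comparison, here is a sketch that first gathers every patch into one column matrix and then performs a single matrix product. It assumes a single image of shape (C, H, W), and the helper names `im2col_single` and `conv_im2col_single` are hypothetical:

```python
import numpy as np

def im2col_single(x, HH, WW, stride=1, pad=0):
    """Unroll one (C, H, W) image into a (C*HH*WW, H_out*W_out) matrix."""
    C, H, W = x.shape
    x_padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode='constant')
    H_out = 1 + (H + 2 * pad - HH) // stride
    W_out = 1 + (W + 2 * pad - WW) // stride
    cols = np.empty((C * HH * WW, H_out * W_out))
    for i in range(H_out):
        for j in range(W_out):
            top, left = i * stride, j * stride
            patch = x_padded[:, top:top + HH, left:left + WW]
            # One column per output position, flattened in (C, HH, WW) order
            cols[:, i * W_out + j] = patch.ravel()
    return cols, H_out, W_out

def conv_im2col_single(x, w, b, stride=1, pad=0):
    """Convolve one image with filters w of shape (F, C, HH, WW)."""
    F, C, HH, WW = w.shape
    cols, H_out, W_out = im2col_single(x, HH, WW, stride, pad)
    # (F, C*HH*WW) @ (C*HH*WW, H_out*W_out) -> (F, H_out*W_out)
    out = w.reshape(F, -1) @ cols + b[:, None]
    return out.reshape(F, H_out, W_out)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 5, 5))
w = rng.standard_normal((2, 3, 3, 3))
b = rng.standard_normal(2)
out = conv_im2col_single(x, w, b, stride=1, pad=1)
print(out.shape)  # (2, 5, 5)
```

The filters and the patches are flattened in the same (C, HH, WW) order, which is what makes the matrix product equal to the convolution.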

In [223]:
import numpy as np

def conv_forward_naive(x, w, b, conv_param):
    """
    A naive implementation of the forward pass for a convolutional layer.

    The input consists of N data points, each with C channels, height H and width
    W. We convolve each input with F different filters, where each filter spans
    all C channels and has height HH and width WW.

    Input:
    - x: Input data of shape (N, C, H, W)
    - w: Filter weights of shape (F, C, HH, WW)
    - b: Biases, of shape (F,)
    - conv_param: A dictionary with the following keys:
      - 'stride': The number of pixels between adjacent receptive fields in the
        horizontal and vertical directions.
      - 'pad': The number of pixels that will be used to zero-pad the input.

    Returns a tuple of:
    - out: Output data, of shape (N, F, H', W') where H' and W' are given by
    H' = 1 + (H + 2 * pad - HH) / stride
    W' = 1 + (W + 2 * pad - WW) / stride
    - cache: (x, w, b, conv_param)
    """
    out = None
    #############################################################################
    # TODO: Implement the convolutional forward pass.                           #
    # Hint: you can use the function np.pad for padding.                        #
    #############################################################################
    N, C, H, W = x.shape
    F, C, HH, WW = w.shape
    stride = conv_param['stride']
    pad = conv_param['pad']
    
    # Pad input
    x_padded = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode='constant')
    
    # Calculate output dimensions
    H_out = 1 + (H + 2 * pad - HH) // stride  # integer division: used as an array shape
    W_out = 1 + (W + 2 * pad - WW) // stride
    
    # Create 'out' array of output data shape filled with zeros
    out = np.zeros((N, F, H_out, W_out))
    
    ##----- im2col implementation - CS231n: winter1516_lecture_11.pdf -----##
    # Calculate new size = K * K * C
    filter_new_size = HH * WW * C 
    
    # Reshape Filter: New shape = # of Filters x (K * K * C)
    filter_reshaped = np.reshape(w, (F, filter_new_size))
    #print 'Filter Reshaped Size: ', filter_reshaped.shape
    
    # Convolution Steps
    for i in range(H_out):
        top = i * stride # Top index
        bottom = top + HH # Bottom index = Top index + Filter Height
        
        for j in range(W_out):
            left = j * stride # Left index
            right = left + WW # Right index = Left index + Filter Width
            
            # Slice x_padded as per top to bottom range and left to right range 
            # NOTE: Resulting shape = N x C x K x K
            x_slice = x_padded[:, :, top:bottom, left:right]
            
            # Flatten each sample's patch, then transpose: New shape = (K * K * C) x N
            # (reshaping straight to (K * K * C, N) would mix samples when N > 1)
            x_slice_reshaped = x_slice.reshape(N, filter_new_size).T
            #print 'X Slice Reshaped Size: ', x_slice_reshaped.shape
            
            # Calculate: [# of Filters x (K * K * C) . (K * K * C) x N] + b, i.e. y = w'x + b
            temp_y = filter_reshaped.dot(x_slice_reshaped).T + b
            # print 'Dot Product + Sum Shape: ', temp_y.shape
            out[:, :, i, j] = temp_y
    ##---------------------------------------------------------------------## 
    
    #############################################################################
    #                             END OF YOUR CODE                              #
    #############################################################################
    cache = (x, w, b, conv_param)
    return out, cache

In [285]:
# Test: Conv(1x1, s:1, p:0)

np.set_printoptions(precision=3)

x_shape = (1, 20, 5, 5)
w_shape = (1, 20, 1, 1)
stride = 1
padding = 0

x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=w.shape[0])
conv_param = {'stride': stride, 'pad': padding}

out, _ = conv_forward_naive(x, w, b, conv_param=conv_param)
print('X shape: {} with Stride: {} and Padding: {} returns out shape: {}'.format(x.shape, stride, padding, out.shape))
print('Number of output elements: {}'.format(out.size))
print('Output Feature Map:')
print(out)

X shape: (1, 20, 5, 5) with Stride: 1 and Padding: 0 returns out shape: (1, 1, 5, 5)
Number of output elements: 25
Output Feature Map: 
[[[[ 0.612  0.613  0.614  0.615  0.616]
   [ 0.618  0.619  0.62   0.621  0.622]
   [ 0.624  0.625  0.626  0.627  0.628]
   [ 0.63   0.631  0.632  0.633  0.634]
   [ 0.636  0.637  0.638  0.639  0.64 ]]]]
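
As a cross-check, a 1x1 convolution with stride 1 and no padding is just a contraction over the channel axis, so a single `np.einsum` call should give the same numbers. A sketch that rebuilds the test inputs from the cell above (the name `out_einsum` is introduced here for illustration):

```python
import numpy as np

x_shape = (1, 20, 5, 5)
w_shape = (1, 20, 1, 1)
x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=w.shape[0])

# Contract over channels: out[n, f, h, w] = sum_c x[n, c, h, w] * w[f, c] + b[f]
out_einsum = np.einsum('nchw,fc->nfhw', x, w[:, :, 0, 0]) + b[None, :, None, None]
print(out_einsum.shape)  # (1, 1, 5, 5)
print(np.round(out_einsum[0, 0, 0], 3))  # first row of the feature map
```

The first row rounds to 0.612, 0.613, 0.614, 0.615, 0.616, matching the feature map printed above.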


In [293]:
# Bottleneck vs Normal Computation Time comparison
def run_bottleneck(N, C, H, W, f1, s1, p1, f2, k2, s2, p2):
    """
    Run Conv(1x1, f1 filters, stride s1, pad p1) followed by
    Conv(k2 x k2, f2 filters, stride s2, pad p2) on an input of shape (N, C, H, W).
    """
    x_1_shape = (N, C, H, W)
    w_1_shape = (f1, C, 1, 1)
    x_1 = np.linspace(-0.1, 0.5, num=np.prod(x_1_shape)).reshape(x_1_shape)
    w_1 = np.linspace(-0.2, 0.3, num=np.prod(w_1_shape)).reshape(w_1_shape)
    b_1 = np.linspace(-0.1, 0.2, num=w_1.shape[0])

    w_2_shape = (f2, f1, k2, k2)
    w_2 = np.linspace(-0.2, 0.3, num=np.prod(w_2_shape)).reshape(w_2_shape)
    b_2 = np.linspace(-0.1, 0.2, num=w_2.shape[0])

    conv_param_1 = {'stride': s1, 'pad': p1}
    conv_param_2 = {'stride': s2, 'pad': p2}
    out, _ = conv_forward_naive(x_1, w_1, b_1, conv_param=conv_param_1)
    out, _ = conv_forward_naive(out, w_2, b_2, conv_param=conv_param_2)
    return out

def run_normal(N, C, H, W, f1, k, s, p):
    """
    Run a single Conv(k x k, f1 filters, stride s, pad p) on an input of shape (N, C, H, W).
    """
    x_1_shape = (N, C, H, W)
    w_1_shape = (f1, C, k, k)
    x_1 = np.linspace(-0.1, 0.5, num=np.prod(x_1_shape)).reshape(x_1_shape)
    w_1 = np.linspace(-0.2, 0.3, num=np.prod(w_1_shape)).reshape(w_1_shape)
    b_1 = np.linspace(-0.1, 0.2, num=w_1.shape[0])
    conv_param_1 = {'stride': s, 'pad': p}
    out, _ = conv_forward_naive(x_1, w_1, b_1, conv_param=conv_param_1)
    return out

In [297]:
# Test Bottleneck: Input (256 depth) -> [Conv(1x1, s:1, p:0, 64 depth) -> Conv(3x3, s:1, p:1, 256 depth)]
%timeit -n 100 run_bottleneck(1, 256, 96, 96, 64, 1, 0, 256, 3, 1, 1)

100 loops, best of 3: 1.05 s per loop


In [298]:
# Test Normal: Input (256 depth) -> Conv(3x3, s:1, p:1, 256 depth)
%timeit -n 100 run_normal(1, 256, 96, 96, 256, 3, 1, 1)

100 loops, best of 3: 3.4 s per loop
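
The gap between the two timings can be sanity-checked by counting multiply-accumulate operations (MACs). The sketch below ignores bias additions and memory traffic, and the helper `conv_macs` is a hypothetical name introduced for this estimate:

```python
def conv_macs(H_out, W_out, C_in, F, K):
    """MACs for a K x K convolution producing F feature maps of size H_out x W_out."""
    return H_out * W_out * F * (K * K * C_in)

H = W = 96  # both configurations preserve the 96 x 96 spatial size

# Normal: 256 -> Conv(3x3, 256)
normal = conv_macs(H, W, C_in=256, F=256, K=3)
# Bottleneck: 256 -> Conv(1x1, 64) -> Conv(3x3, 256)
bottleneck = conv_macs(H, W, C_in=256, F=64, K=1) + conv_macs(H, W, C_in=64, F=256, K=3)

print(normal)               # 5435817984
print(bottleneck)           # 1509949440
print(normal / bottleneck)  # 3.6
```

The estimated 3.6x reduction in arithmetic is roughly in line with the measured speed-up of about 3.2x (3.4 s versus 1.05 s per loop).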
