# ResNet
In ResNet50, ResNet101, and ResNet152, the learner component consists of four convolutional groups. The first group, which takes its input from the stem component, uses a nonstrided convolutional layer for the projection shortcut in its first convolutional block. The other three convolutional groups use a strided convolutional layer (feature pooling) in the projection shortcut of their first convolutional block. Figure 5.12 shows this arrangement.
<img src="img_2.png" />
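
To see what the strided and nonstrided shortcuts mean for feature-map size, the short sketch below tracks the spatial resolution through the four groups. It assumes a 224x224 input that the stem has already reduced to 56x56, and "same"-style padding so that a stride of 2 halves each dimension. Both are assumptions made for illustration, not details taken from the text above, and `feature_map_sizes` is a hypothetical helper name.

```python
import math

# Strides of the projection shortcut in the first block of each group:
# the first group is nonstrided, the other three are strided.
GROUP_STRIDES = [1, 2, 2, 2]

def feature_map_sizes(stem_output_size, group_strides):
    """Return the spatial size (height == width) after each group."""
    sizes = []
    size = stem_output_size
    for stride in group_strides:
        # With "same" padding, a stride-s convolution yields ceil(size / s).
        size = math.ceil(size / stride)
        sizes.append(size)
    return sizes

# Assumed stem output of 56x56 (for a 224x224 input image).
sizes = feature_map_sizes(56, GROUP_STRIDES)
print(sizes)  # [56, 28, 14, 7]
```

Under these assumptions the first group keeps the resolution it receives from the stem, and each later group halves it.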

In the learner code that follows, we pop the first group's configuration off the list and handle it separately. We do this because the first group starts with a residual block that has a nonstrided projection shortcut, while all the remaining groups start with a strided projection shortcut. Alternatively, we could have used a configuration attribute to indicate whether the first residual block is strided, which would eliminate the special case (coding a separate block construction).
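
The alternative mentioned above can be sketched as follows. Here each group's configuration carries its own `strides` attribute, so the learner needs no special case for the first group. The `group()` below is a hypothetical stand-in that only records its arguments, and the `blocks` and `n_filters` keys are assumed names chosen for illustration.

```python
def group(inputs, blocks, strides=(2, 2)):
    """Hypothetical stand-in for group(): records the block count and strides."""
    return inputs + [(len(blocks), strides)]

def learner(inputs, groups):
    """Construct the learner with no special case for the first group."""
    outputs = inputs
    for group_params in groups:
        # Each group's configuration says whether its first block is strided.
        outputs = group(outputs, **group_params)
    return outputs

# Assumed configuration layout for a ResNet50-style learner.
groups = [
    {"blocks": [{"n_filters": 64}] * 3, "strides": (1, 1)},
    {"blocks": [{"n_filters": 128}] * 4, "strides": (2, 2)},
    {"blocks": [{"n_filters": 256}] * 6, "strides": (2, 2)},
    {"blocks": [{"n_filters": 512}] * 3, "strides": (2, 2)},
]

trace = learner([], groups)
print(trace)
```

The trade-off is a slightly more verbose configuration in exchange for a learner that treats every group the same way.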

<img src="img_3.png" />

In [19]:


def learner(inputs, groups):
    """
    Construct the learner.
    :param inputs: input to the learner
    :param groups: list of per-group parameters (keyword arguments for group())
    :return: output of the last group
    """
    outputs = inputs
    # Pop the first group's parameters; note that this mutates the caller's list.
    group_params = groups.pop(0)
    # The first residual group is not strided.
    outputs = group(outputs, **group_params, strides=(1, 1))

    # The remaining residual groups use a strided projection shortcut.
    for group_params in groups:
        outputs = group(outputs, **group_params, strides=(2, 2))
    return outputs



In [20]:
# (number of filters, number of blocks) for each of the four groups, keyed by depth
resnet_groups = { 50 : [ (64, 3), (128, 4), (256, 6),  (512, 3) ],   # ResNet50
                  101: [ (64, 3), (128, 4), (256, 23), (512, 3) ],   # ResNet101
                  152: [ (64, 3), (128, 8), (256, 36), (512, 3) ]    # ResNet152
                }
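
As a quick check of the block counts above, the model names can be recovered by counting layers. This sketch assumes the common bottleneck design with three convolutional layers per residual block, plus one convolutional layer in the stem and one dense layer in the classifier, and it follows the usual convention of not counting the projection shortcut convolutions. Those counts are assumptions not stated in this section. The dictionary is repeated under the hypothetical name `groups_by_depth` so that the block runs on its own.

```python
# (n_filters, n_blocks) per group, repeated here so the example is self-contained.
groups_by_depth = {
    50:  [(64, 3), (128, 4), (256, 6),  (512, 3)],
    101: [(64, 3), (128, 4), (256, 23), (512, 3)],
    152: [(64, 3), (128, 8), (256, 36), (512, 3)],
}

CONVS_PER_BLOCK = 3    # assumed: bottleneck residual block (1x1, 3x3, 1x1)
STEM_LAYERS = 1        # assumed: one convolution in the stem
CLASSIFIER_LAYERS = 1  # assumed: one dense layer in the classifier

def count_layers(groups):
    """Count the layers implied by a list of (n_filters, n_blocks) pairs."""
    n_blocks = sum(blocks for _, blocks in groups)
    return STEM_LAYERS + CONVS_PER_BLOCK * n_blocks + CLASSIFIER_LAYERS

for depth, groups in groups_by_depth.items():
    print(depth, count_layers(groups))
```

For ResNet50, for example, the groups hold 3 + 4 + 6 + 3 = 16 blocks, which gives 1 + 3 × 16 + 1 = 50 layers.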

While ResNets continue to be used today as a stock model for the image classification backbone, the 50-layer ResNet50, depicted in figure 5.13, is the standard. At 50 layers, the model gives high accuracy at reasonable size and performance. The larger ResNets at 101 and 152 layers provide only minor increases in accuracy, at the cost of a substantial increase in size and a reduction in performance.

Each group follows the same pattern:
1. It starts with a residual block with a linear projection shortcut.
2. This is followed by one or more residual blocks with an identity shortcut.
3. All the residual blocks in a group have the same number of output filters.
4. Each group successively doubles the number of output filters.
5. The residual block with a linear projection shortcut doubles the number of filters from the input to the group.
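
The pattern in the list above can be made concrete by expanding a per-group configuration into the sequence of blocks it describes. This is a minimal sketch; the `(n_filters, n_blocks)` tuple layout and the function name `expand_groups` are assumptions made for illustration.

```python
def expand_groups(groups):
    """Expand (n_filters, n_blocks) pairs into a list of blocks per group."""
    expanded = []
    for n_filters, n_blocks in groups:
        # First block uses a projection shortcut; the rest use identity shortcuts.
        kinds = ["projection"] + ["identity"] * (n_blocks - 1)
        # Every block in a group shares the same number of filters.
        expanded.append([(kind, n_filters) for kind in kinds])
    return expanded

resnet50 = [(64, 3), (128, 4), (256, 6), (512, 3)]
layout = expand_groups(resnet50)

for blocks in layout:
    print(blocks[0], "+", len(blocks) - 1, "identity blocks")

# Each group doubles the filter count of the previous group.
filters = [blocks[0][1] for blocks in layout]
assert all(b == 2 * a for a, b in zip(filters, filters[1:]))
```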


<img src="img_4.png">

ResNets (for example, ResNet50, ResNet101, and ResNet152) consist of four convolutional groups; the output filters for the four groups follow the doubling convention, starting at 64, then 128, 256, and finally 512. The number in the name (for example, 50) refers to the number of convolutional layers, which determines the number of convolutional blocks in each convolutional group. The following is an example application of the skeleton template for coding the convolutional group of a ResNet50. In the group() function, we pop off the first block’s configuration attributes, which we know for a ResNet describe a projection block, and then iterate through the remaining blocks as identity blocks:

<img src="img_5.png"/>

In [22]:
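def group(inputs, blocks, strides=(2, 2)):
    """
    Construct a residual group.
    :param inputs: input into the group
    :param blocks: list of block parameters, one entry per block in the group
    :param strides: strides of the projection block; (1, 1) means nonstrided
    :return: output of the last block in the group
    """
    outputs = inputs
    # The first block in the group is the projection block.
    # Note that pop() mutates the caller's list.
    block_params = blocks.pop(0)
    outputs = projection_block(outputs, strides=strides, **block_params)

    # The remaining blocks are identity blocks.
    for block_params in blocks:
        outputs = identity_block(outputs, **block_params)
    return outputs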
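
One practical detail of the functions above: `blocks.pop(0)` removes the first entry from the caller's list, so a configuration cannot be reused for a second model without copying it first. The sketch below illustrates this with hypothetical stand-ins for `projection_block()` and `identity_block()` that only record the call order. The `n_filters` keyword is likewise an assumed parameter name.

```python
import copy

def projection_block(inputs, strides=(2, 2), n_filters=64):
    """Hypothetical stand-in: records a projection block."""
    return inputs + [("projection", n_filters, strides)]

def identity_block(inputs, n_filters=64):
    """Hypothetical stand-in: records an identity block."""
    return inputs + [("identity", n_filters)]

def group(inputs, blocks, strides=(2, 2)):
    outputs = inputs
    block_params = blocks.pop(0)  # mutates the list passed in
    outputs = projection_block(outputs, strides=strides, **block_params)
    for block_params in blocks:
        outputs = identity_block(outputs, **block_params)
    return outputs

config = [{"n_filters": 64}, {"n_filters": 64}, {"n_filters": 64}]

# Pass a deep copy so the original configuration stays intact.
trace = group([], copy.deepcopy(config), strides=(1, 1))
print(trace)
size_after_copy_call = len(config)    # still 3

# Passing the list directly consumes its first entry.
group([], config, strides=(1, 1))
size_after_direct_call = len(config)  # now 2
print(size_after_copy_call, size_after_direct_call)
```

If the configuration is built fresh for every model, the mutation is harmless; copying only matters when the same list is passed in more than once.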