CNN Architectures Trained With ImageNet

Gabriel Falcao edited this page Jan 15, 2023 · 3 revisions

Outline

  1. AlexNet
  2. ResNet-18
  3. VGG-16

AlexNet

input: 3x227x227 (RGB image)
Convolutional Layer kernel: 3x96x11x11 stride=4; padding=0
Batch Normalization features: 96
Max Pooling kernel: 3x3 stride=2
ReLU (non-linearity)
Convolutional Layer kernel: 96x256x5x5 stride=1; padding=2
Batch Normalization features: 256
Max Pooling kernel: 3x3 stride=2
ReLU (non-linearity)
Convolutional Layer kernel: 256x384x3x3 stride=1; padding=1
Batch Normalization features: 384
ReLU (non-linearity)
Convolutional Layer kernel: 384x384x3x3 stride=1; padding=1
Batch Normalization features: 384
ReLU (non-linearity)
Convolutional Layer kernel: 384x256x3x3 stride=1; padding=1
Batch Normalization features: 256
Max Pooling kernel: 3x3 stride=2
ReLU (non-linearity)
reshape: 256x6x6 => 9216x1
Fully Connected Layer kernel: 9216x4096
Batch Normalization features: 4096
ReLU (non-linearity)
Dropout probability: 0.5
Fully Connected Layer kernel: 4096x4096
Batch Normalization features: 4096
ReLU (non-linearity)
Dropout probability: 0.5
Fully Connected Layer kernel: 4096x1000
Batch Normalization features: 1000
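
The `256x6x6 => 9216x1` reshape can be checked by following the spatial size through the AlexNet layers listed above. The sketch below is only a sanity check, not the project's code: `out_size` and `trace_alexnet` are hypothetical helper names, the pooling layers are assumed to use no padding, and the third convolution is taken as stride=1, padding=1, the values that make the result agree with the reshape.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for every layer that changes the spatial size.
# Assumptions: pooling uses padding=0; conv3 uses stride=1, padding=1.
ALEXNET_SPATIAL_LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]


def trace_alexnet(size=227):
    """Return [(layer name, spatial size after that layer), ...]."""
    sizes = []
    for name, kernel, stride, padding in ALEXNET_SPATIAL_LAYERS:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


alexnet_sizes = trace_alexnet()
alexnet_final = alexnet_sizes[-1][1]
# 256 channels leave the last convolution, so the flattened vector has
# 256 * final * final entries.
alexnet_flattened = 256 * alexnet_final * alexnet_final

for name, size in alexnet_sizes:
    print(f"{name}: {size}x{size}")
print("flattened features:", alexnet_flattened)
```

With a 227x227 input the trace ends at 6x6, which gives the 9216 inputs of the first fully connected layer.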

ResNet-18

input: 3x227x227 (RGB image)
Convolutional Layer kernel: 3x64x7x7 stride=2; padding=3
Batch Normalization features: 64
ReLU
Max Pooling kernel: 3x3 stride=2; padding=1
First Group
Basic Block
Convolutional Layer kernel: 64x64x3x3 stride=1; padding=1
Batch Normalization features: 64
ReLU
Convolutional Layer kernel: 64x64x3x3 stride=1; padding=1
Batch Normalization features: 64
ReLU
Basic Block
Convolutional Layer kernel: 64x64x3x3 stride=1; padding=1
Batch Normalization features: 64
ReLU
Convolutional Layer kernel: 64x64x3x3 stride=1; padding=1
Batch Normalization features: 64
ReLU
Second Group
Basic Block
Convolutional Layer kernel: 64x128x3x3 stride=2; padding=1
Batch Normalization features: 128
ReLU
Convolutional Layer kernel: 128x128x3x3 stride=1; padding=1
Batch Normalization features: 128
(Downsample) kernel: 64x128x1x1 stride=2; padding=0
ReLU
Basic Block
Convolutional Layer kernel: 128x128x3x3 stride=1; padding=1
Batch Normalization features: 128
ReLU
Convolutional Layer kernel: 128x128x3x3 stride=1; padding=1
Batch Normalization features: 128
ReLU
Third Group
Basic Block
Convolutional Layer kernel: 128x256x3x3 stride=2; padding=1
Batch Normalization features: 256
ReLU
Convolutional Layer kernel: 256x256x3x3 stride=1; padding=1
Batch Normalization features: 256
(Downsample) kernel: 128x256x1x1 stride=2; padding=0
ReLU
Basic Block
Convolutional Layer kernel: 256x256x3x3 stride=1; padding=1
Batch Normalization features: 256
ReLU
Convolutional Layer kernel: 256x256x3x3 stride=1; padding=1
Batch Normalization features: 256
ReLU
Fourth Group
Basic Block
Convolutional Layer kernel: 256x512x3x3 stride=2; padding=1
Batch Normalization features: 512
ReLU
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
(Downsample) kernel: 256x512x1x1 stride=2; padding=0
ReLU
Basic Block
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Average Pooling kernel: 7x7 stride=7
reshape: 512x1x1 => 512x1
Fully Connected Layer kernel: 512x1000

Note: ResNet-34 and ResNet-50 are also implemented in code but not yet explored.
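
The role of the (Downsample) entries in the ResNet-18 listing can be illustrated with a shape check. The sketch below assumes the standard ResNet arrangement, in which the block input, passed through the 1x1 Downsample convolution whenever the shape changes, is added to the output of the two 3x3 convolutions before the final ReLU. Under that assumption both paths must produce the same shape. `out_size`, `basic_block_shape` and `trace_resnet18` are hypothetical names used only for this illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def basic_block_shape(channels_in, size, channels_out, stride):
    """Return (channels, size) after one basic block.

    Checks that the main path and the shortcut yield identical shapes,
    which is required before the two can be added together.
    """
    # Main path: 3x3 conv (given stride, padding=1), then 3x3 conv (stride=1, padding=1).
    main = out_size(size, 3, stride, 1)
    main = out_size(main, 3, 1, 1)
    # Shortcut: identity, or the 1x1 Downsample conv (padding=0) when the shape changes.
    if stride != 1 or channels_in != channels_out:
        shortcut = (channels_out, out_size(size, 1, stride, 0))
    else:
        shortcut = (channels_in, size)
    assert shortcut == (channels_out, main), "shortcut and main path must match"
    return channels_out, main


def trace_resnet18(size=227):
    """Return ([(channels, size) after each group], size after average pooling)."""
    size = out_size(size, 7, 2, 3)  # first 7x7 convolution
    size = out_size(size, 3, 2, 1)  # 3x3 max pooling
    channels = 64
    shapes = []
    # (output channels, stride of the first basic block) for the four groups.
    for channels_out, stride in [(64, 1), (128, 2), (256, 2), (512, 2)]:
        channels, size = basic_block_shape(channels, size, channels_out, stride)
        channels, size = basic_block_shape(channels, size, channels_out, 1)
        shapes.append((channels, size))
    pooled = out_size(size, 7, 7, 0)  # 7x7 average pooling, stride 7
    return shapes, pooled


resnet_shapes, resnet_pooled = trace_resnet18()
for group, (channels, size) in enumerate(resnet_shapes, start=1):
    print(f"group {group}: {channels}x{size}x{size}")
print(f"after average pooling: 512x{resnet_pooled}x{resnet_pooled}")
```

For a 227x227 input the fourth group outputs 512x8x8, and the 7x7 average pooling with stride 7 reduces it to 512x1x1, consistent with the reshape in the listing.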

VGG-16

input: 3x227x227 (RGB image)
Convolutional Layer kernel: 3x64x3x3 stride=1; padding=1
Batch Normalization features: 64
ReLU
Convolutional Layer kernel: 64x64x3x3 stride=1; padding=1
Batch Normalization features: 64
ReLU
Max Pooling kernel: 2x2 stride=2
Convolutional Layer kernel: 64x128x3x3 stride=1; padding=1
Batch Normalization features: 128
ReLU
Convolutional Layer kernel: 128x128x3x3 stride=1; padding=1
Batch Normalization features: 128
ReLU
Max Pooling kernel: 2x2 stride=2
Convolutional Layer kernel: 128x256x3x3 stride=1; padding=1
Batch Normalization features: 256
ReLU
Convolutional Layer kernel: 256x256x3x3 stride=1; padding=1
Batch Normalization features: 256
ReLU
Convolutional Layer kernel: 256x256x3x3 stride=1; padding=1
Batch Normalization features: 256
ReLU
Max Pooling kernel: 2x2 stride=2
Convolutional Layer kernel: 256x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Max Pooling kernel: 2x2 stride=2
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Convolutional Layer kernel: 512x512x3x3 stride=1; padding=1
Batch Normalization features: 512
ReLU
Max Pooling kernel: 2x2 stride=2
Adaptive Average Pooling output size: 7x7
reshape: 512x7x7 => 25088x1
Fully Connected Layer kernel: 25088x4096
ReLU
Fully Connected Layer kernel: 4096x4096
ReLU
Fully Connected Layer kernel: 4096x1000

Note: VGG-11, VGG-13 and VGG-19 are also implemented in code but not yet explored.
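
The kernel shapes in the VGG-16 listing also determine how many weights each part of the network holds. The sketch below simply multiplies out those shapes; it counts weights only and leaves out biases and batch normalization parameters, which the listing does not specify. The names `VGG16_CONVS`, `VGG16_FCS` and `conv_weights` are hypothetical and exist only for this illustration.

```python
# (in_channels, out_channels) of the thirteen 3x3 convolutions, in listing order.
VGG16_CONVS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (in_features, out_features) of the three fully connected layers.
VGG16_FCS = [(25088, 4096), (4096, 4096), (4096, 1000)]


def conv_weights(c_in, c_out, kernel=3):
    """Number of weights in a c_in x c_out x kernel x kernel convolution."""
    return c_in * c_out * kernel * kernel


vgg16_conv_total = sum(conv_weights(c_in, c_out) for c_in, c_out in VGG16_CONVS)
vgg16_fc_total = sum(f_in * f_out for f_in, f_out in VGG16_FCS)

print("convolution weights:    ", vgg16_conv_total)
print("fully connected weights:", vgg16_fc_total)
print("share held by FC layers:",
      round(vgg16_fc_total / (vgg16_conv_total + vgg16_fc_total), 3))
```

Most of the weights sit in the fully connected layers, and the 25088x4096 layer alone accounts for the largest part of them.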