CNN Architectures Trained With ImageNet
Gabriel Falcao edited this page Jan 15, 2023
AlexNet (with batch normalization)
input: 3x227x227 (RGB image) | ||
Convolutional Layer | kernel: 3x96x11x11 | stride=4; padding=0 |
Batch Normalization | features: 96 | |
Max Pooling | kernel: 3x3 | stride=2 |
ReLU (non-linearity) | ||
Convolutional Layer | kernel: 96x256x5x5 | stride=1; padding=2 |
Batch Normalization | features: 256 | |
Max Pooling | kernel: 3x3 | stride=2 |
ReLU (non-linearity) | ||
Convolutional Layer | kernel: 256x384x3x3 | stride=1; padding=1 |
Batch Normalization | features: 384 | |
ReLU (non-linearity) | ||
Convolutional Layer | kernel: 384x384x3x3 | stride=1; padding=1 |
Batch Normalization | features: 384 | |
ReLU (non-linearity) | ||
Convolutional Layer | kernel: 384x256x3x3 | stride=1; padding=1 |
Batch Normalization | features: 256 | |
Max Pooling | kernel: 3x3 | stride=2 |
ReLU (non-linearity) | ||
reshape: 256x6x6 => 9216x1 | ||
Fully Connected Layer | kernel: 9216x4096 | |
Batch Normalization | features: 4096 | |
ReLU (non-linearity) | ||
Dropout | probability: 0.5 | |
Fully Connected Layer | kernel: 4096x4096 | |
Batch Normalization | features: 4096 | |
ReLU (non-linearity) | ||
Dropout | probability: 0.5 | |
Fully Connected Layer | kernel: 4096x1000 | |
Batch Normalization | features: 1000 |
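The `reshape: 256x6x6 => 9216x1` row can be checked from the kernel, stride and padding values in the table. The sketch below is a minimal shape trace, not the repository's code. It assumes floor rounding in the output-size formula, and stride=1 with padding=1 for the third convolution, which is the setting consistent with the 6x6 result. The names `out_size`, `ALEXNET_FEATURES` and `trace_shapes` are hypothetical.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor rounding assumed)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, out_channels or None for pooling, kernel, stride, padding), read off the table.
ALEXNET_FEATURES = [
    ("conv1", 96, 11, 4, 0),
    ("pool1", None, 3, 2, 0),
    ("conv2", 256, 5, 1, 2),
    ("pool2", None, 3, 2, 0),
    ("conv3", 384, 3, 1, 1),  # assumption: stride=1, padding=1 (consistent with 256x6x6)
    ("conv4", 384, 3, 1, 1),
    ("conv5", 256, 3, 1, 1),
    ("pool3", None, 3, 2, 0),
]


def trace_shapes(layers, channels=3, size=227):
    """Return (name, channels, spatial size) after each conv or pooling layer."""
    shapes = []
    for name, out_channels, kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:  # pooling keeps the channel count
            channels = out_channels
        shapes.append((name, channels, size))
    return shapes


shapes = trace_shapes(ALEXNET_FEATURES)
for name, c, s in shapes:
    print(f"{name}: {c}x{s}x{s}")

_, last_channels, last_size = shapes[-1]
flattened = last_channels * last_size * last_size
print("flattened:", flattened)  # flattened: 9216
```

Batch normalization, ReLU and dropout do not change tensor shapes, so they are left out of the trace. The spatial size goes 227, 55, 27, 27, 13, 13, 13, 13, 6, and 256 x 6 x 6 = 9216 matches the input width of the first fully connected layer.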
ResNet-18
input: 3x227x227 (RGB image) | ||||
Convolutional Layer | kernel: 3x64x7x7 | stride=2; padding=3 | ||
Batch Normalization | features: 64 | |||
ReLU | ||||
Max Pooling | kernel: 3x3 | stride=2; padding=1 | ||
First Group |
Basic Block | Convolutional Layer | kernel: 64x64x3x3 | stride=1; padding=1 |
Batch Normalization | features: 64 | |||
ReLU | ||||
Convolutional Layer | kernel: 64x64x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 64 | |||
ReLU | ||||
Basic Block | Convolutional Layer | kernel: 64x64x3x3 | stride=1; padding=1 | |
Batch Normalization | features: 64 | |||
ReLU | ||||
Convolutional Layer | kernel: 64x64x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 64 | |||
ReLU | ||||
Second Group |
Basic Block | Convolutional Layer | kernel: 64x128x3x3 | stride=2; padding=1 |
Batch Normalization | features: 128 | |||
ReLU | ||||
Convolutional Layer | kernel: 128x128x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 128 | |||
(Downsample) | kernel: 64x128x1x1 | stride=2; padding=0 | ||
ReLU | ||||
Basic Block | Convolutional Layer | kernel: 128x128x3x3 | stride=1; padding=1 | |
Batch Normalization | features: 128 | |||
ReLU | ||||
Convolutional Layer | kernel: 128x128x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 128 | |||
ReLU | ||||
Third Group |
Basic Block | Convolutional Layer | kernel: 128x256x3x3 | stride=2; padding=1 |
Batch Normalization | features: 256 | |||
ReLU | ||||
Convolutional Layer | kernel: 256x256x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 256 | |||
(Downsample) | kernel: 128x256x1x1 | stride=2; padding=0 | ||
ReLU | ||||
Basic Block | Convolutional Layer | kernel: 256x256x3x3 | stride=1; padding=1 | |
Batch Normalization | features: 256 | |||
ReLU | ||||
Convolutional Layer | kernel: 256x256x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 256 | |||
ReLU | ||||
Fourth Group |
Basic Block | Convolutional Layer | kernel: 256x512x3x3 | stride=2; padding=1 |
Batch Normalization | features: 512 | |||
ReLU | ||||
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 512 | |||
(Downsample) | kernel: 256x512x1x1 | stride=2; padding=0 | ||
ReLU | ||||
Basic Block | Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 | |
Batch Normalization | features: 512 | |||
ReLU | ||||
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 | ||
Batch Normalization | features: 512 | |||
ReLU | ||||
Average Pooling | kernel: 7x7 | stride=7 | ||
reshape: 512x1x1 => 512x1 | ||||
Fully Connected Layer | kernel: 512x1000 |
Note: ResNet-34 and ResNet-50 are also implemented in the code but have not yet been explored.
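In the residual table, every block that starts with a stride-2 convolution has a `(Downsample)` row. The 1x1, stride-2 convolution on the shortcut has to produce the same spatial size as the two 3x3 convolutions on the main path, otherwise the two tensors cannot be added. The sketch below traces the sizes for a 227x227 input and checks that match in each block. It assumes floor rounding throughout, and the helper names are hypothetical rather than taken from the repository.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor rounding assumed)."""
    return (size + 2 * padding - kernel) // stride + 1


def basic_block_sizes(size, stride):
    """Return (main_path_size, shortcut_size) for one basic block."""
    main = out_size(size, 3, stride, 1)  # first 3x3 conv carries the stride
    main = out_size(main, 3, 1, 1)       # second 3x3 conv keeps the size
    # Shortcut: identity when stride is 1, otherwise the 1x1 downsample conv.
    shortcut = size if stride == 1 else out_size(size, 1, stride, 0)
    return main, shortcut


def trace_resnet18(size=227):
    """Return (stage, channels, spatial size) for each stage in the table."""
    trace = []
    size = out_size(size, 7, 2, 3)
    trace.append(("stem conv", 64, size))
    size = out_size(size, 3, 2, 1)
    trace.append(("max pool", 64, size))
    groups = [(64, 1), (128, 2), (256, 2), (512, 2)]  # (channels, stride of first block)
    for index, (channels, first_stride) in enumerate(groups, start=1):
        for stride in (first_stride, 1):  # two basic blocks per group
            main, shortcut = basic_block_sizes(size, stride)
            assert main == shortcut, "main path and shortcut must align before the add"
            size = main
        trace.append((f"group {index}", channels, size))
    size = out_size(size, 7, 7, 0)  # 7x7 average pooling, stride 7
    trace.append(("avg pool", 512, size))
    return trace


trace = trace_resnet18()
for stage, channels, size in trace:
    print(f"{stage}: {channels}x{size}x{size}")
```

Under these assumptions the spatial size goes 114, 57, 57, 29, 15, 8 and finally 1, which agrees with the `reshape: 512x1x1 => 512x1` row. With a 227x227 input the last feature map is 8x8 rather than 7x7, so with floor rounding the 7x7 average pooling yields a single window and the last row and column do not contribute.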
VGG-16 (with batch normalization)
input: 3x227x227 (RGB image) | ||
Convolutional Layer | kernel: 3x64x3x3 | stride=1; padding=1 |
Batch Normalization | features: 64 | |
ReLU | ||
Convolutional Layer | kernel: 64x64x3x3 | stride=1; padding=1 |
Batch Normalization | features: 64 | |
ReLU | ||
Max Pooling | kernel: 2x2 | stride=2 |
Convolutional Layer | kernel: 64x128x3x3 | stride=1; padding=1 |
Batch Normalization | features: 128 | |
ReLU | ||
Convolutional Layer | kernel: 128x128x3x3 | stride=1; padding=1 |
Batch Normalization | features: 128 | |
ReLU | ||
Max Pooling | kernel: 2x2 | stride=2 |
Convolutional Layer | kernel: 128x256x3x3 | stride=1; padding=1 |
Batch Normalization | features: 256 | |
ReLU | ||
Convolutional Layer | kernel: 256x256x3x3 | stride=1; padding=1 |
Batch Normalization | features: 256 | |
ReLU | ||
Convolutional Layer | kernel: 256x256x3x3 | stride=1; padding=1 |
Batch Normalization | features: 256 | |
ReLU | ||
Max Pooling | kernel: 2x2 | stride=2 |
Convolutional Layer | kernel: 256x512x3x3 | stride=1; padding=1 |
Batch Normalization | features: 512 | |
ReLU | ||
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 |
Batch Normalization | features: 512 | |
ReLU | ||
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 |
Batch Normalization | features: 512 | |
ReLU | ||
Max Pooling | kernel: 2x2 | stride=2 |
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 |
Batch Normalization | features: 512 | |
ReLU | ||
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 |
Batch Normalization | features: 512 | |
ReLU | ||
Convolutional Layer | kernel: 512x512x3x3 | stride=1; padding=1 |
Batch Normalization | features: 512 | |
ReLU | ||
Max Pooling | kernel: 2x2 | stride=2 |
Adaptive Average Pooling | output size: 7x7 | |
reshape: 512x7x7 => 25088x1 | ||
Fully Connected Layer | kernel: 25088x4096 | |
ReLU | ||
Fully Connected Layer | kernel: 4096x4096 | |
ReLU | ||
Fully Connected Layer | kernel: 4096x1000 |
Note: VGG-11, VGG-13 and VGG-19 are also implemented in the code but have not yet been explored.
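The VGG table repeats one pattern: 3x3 convolutions with padding 1 keep the spatial size, and each 2x2 max pooling halves it. That makes it easy to express as a compact configuration list and to count weights from the kernel sizes. The sketch below is an illustration under stated assumptions, not the repository's code: `VGG16_CFG` and `summarize_vgg` are hypothetical names, pooling uses floor rounding, and only the kernel weights listed in the table are counted (no biases, no batch normalization parameters).

```python
# Output channels of each 3x3 convolution; "M" marks a 2x2 max pooling with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def summarize_vgg(cfg, in_channels=3, size=227, pooled=7, fc_sizes=(4096, 4096, 1000)):
    """Trace the spatial size and count kernel weights for a VGG-style config."""
    channels = in_channels
    conv_weights = 0
    for item in cfg:
        if item == "M":
            size = size // 2  # 2x2 pooling, stride 2, floor rounding
        else:
            conv_weights += channels * item * 3 * 3  # 3x3 conv, padding 1 keeps the size
            channels = item
    flattened = channels * pooled * pooled  # after adaptive average pooling to 7x7
    fc_weights = 0
    fan_in = flattened
    for fan_out in fc_sizes:
        fc_weights += fan_in * fan_out
        fan_in = fan_out
    return {
        "final_map": size,
        "flattened": flattened,
        "conv_weights": conv_weights,
        "fc_weights": fc_weights,
    }


summary = summarize_vgg(VGG16_CFG)
for key, value in summary.items():
    print(f"{key}: {value}")

fc_share = summary["fc_weights"] / (summary["conv_weights"] + summary["fc_weights"])
print(f"fully connected share of weights: {fc_share:.1%}")
```

With a 227x227 input the feature map shrinks 227, 113, 56, 28, 14, 7, so the adaptive average pooling to 7x7 leaves it unchanged and the flattened size is 512 x 7 x 7 = 25088. Under the counting assumptions above, the three fully connected layers hold roughly 89% of the kernel weights.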