CNN Architectures Trained With ImageNet

Outline

AlexNet

input: 3x227x227 (RGB image)
Convolutional Layer	kernel: 3x96x11x11	stride=4; padding=0
Batch Normalization	features: 96
Max Pooling	kernel: 3x3	stride=2
ReLU (non-linearity)
Convolutional Layer	kernel: 96x256x5x5	stride=1; padding=2
Batch Normalization	features: 256
Max Pooling	kernel: 3x3	stride=2
ReLU (non-linearity)
Convolutional Layer	kernel: 256x384x3x3	stride=1; padding=1
Batch Normalization	features: 384
ReLU (non-linearity)
Convolutional Layer	kernel: 384x384x3x3	stride=1; padding=1
Batch Normalization	features: 384
ReLU (non-linearity)
Convolutional Layer	kernel: 384x256x3x3	stride=1; padding=1
Batch Normalization	features: 256
Max Pooling	kernel: 3x3	stride=2
ReLU (non-linearity)
reshape: 256x6x6 => 9216x1
Fully Connected Layer	kernel: 9216x4096
Batch Normalization	features: 4096
ReLU (non-linearity)
Dropout	probability: 0.5
Fully Connected Layer	kernel: 4096x4096
Batch Normalization	features: 4096
ReLU (non-linearity)
Dropout	probability: 0.5
Fully Connected Layer	kernel: 4096x1000
Batch Normalization	features: 1000
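
The reshape step (256x6x6 => 9216x1) follows from the kernel, stride and padding values listed above. The sketch below traces the spatial size through the convolution and pooling layers to check that arithmetic. It is only an illustration: `conv_out`, `ALEXNET_LAYERS` and `trace_alexnet` are hypothetical names, not taken from the repository code, and stride=1, padding=1 is assumed for the third convolution because that is the setting that keeps the 13x13 map needed to arrive at 6x6.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling window (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, out_channels); None means channels unchanged.
# Batch normalization, ReLU and dropout do not change the shape, so they are omitted.
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),  # assumption: stride=1, padding=1
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def trace_alexnet(size=227, channels=3):
    """Return a list of (layer name, channels, spatial size) tuples."""
    shapes = [("input", channels, size)]
    for name, kernel, stride, padding, out_ch in ALEXNET_LAYERS:
        size = conv_out(size, kernel, stride, padding)
        if out_ch is not None:
            channels = out_ch
        shapes.append((name, channels, size))
    return shapes


for name, c, hw in trace_alexnet():
    print(f"{name:6s} {c}x{hw}x{hw}")
```

Under these assumptions the last pooling layer yields 256x6x6, which flattens to the 9216 inputs of the first fully connected layer.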

ResNet-18

input: 3x227x227 (RGB image)
Convolutional Layer			kernel: 3x64x7x7	stride=2; padding=3
Batch Normalization			features: 64
ReLU
Max Pooling			kernel: 3x3	stride=2; padding=1
First Group	Basic Block	Convolutional Layer	kernel: 64x64x3x3	stride=1; padding=1
		Batch Normalization	features: 64
		ReLU
		Convolutional Layer	kernel: 64x64x3x3	stride=1; padding=1
		Batch Normalization	features: 64
		ReLU
	Basic Block	Convolutional Layer	kernel: 64x64x3x3	stride=1; padding=1
		Batch Normalization	features: 64
		ReLU
		Convolutional Layer	kernel: 64x64x3x3	stride=1; padding=1
		Batch Normalization	features: 64
		ReLU
Second Group	Basic Block	Convolutional Layer	kernel: 64x128x3x3	stride=2; padding=1
		Batch Normalization	features: 128
		ReLU
		Convolutional Layer	kernel: 128x128x3x3	stride=1; padding=1
		Batch Normalization	features: 128
		(Downsample)	kernel: 64x128x1x1	stride=2; padding=0
		ReLU
	Basic Block	Convolutional Layer	kernel: 128x128x3x3	stride=1; padding=1
		Batch Normalization	features: 128
		ReLU
		Convolutional Layer	kernel: 128x128x3x3	stride=1; padding=1
		Batch Normalization	features: 128
		ReLU
Third Group	Basic Block	Convolutional Layer	kernel: 128x256x3x3	stride=2; padding=1
		Batch Normalization	features: 256
		ReLU
		Convolutional Layer	kernel: 256x256x3x3	stride=1; padding=1
		Batch Normalization	features: 256
		(Downsample)	kernel: 128x256x1x1	stride=2; padding=0
		ReLU
	Basic Block	Convolutional Layer	kernel: 256x256x3x3	stride=1; padding=1
		Batch Normalization	features: 256
		ReLU
		Convolutional Layer	kernel: 256x256x3x3	stride=1; padding=1
		Batch Normalization	features: 256
		ReLU
Fourth Group	Basic Block	Convolutional Layer	kernel: 256x512x3x3	stride=2; padding=1
		Batch Normalization	features: 512
		ReLU
		Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
		Batch Normalization	features: 512
		(Downsample)	kernel: 256x512x1x1	stride=2; padding=0
		ReLU
	Basic Block	Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
		Batch Normalization	features: 512
		ReLU
		Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
		Batch Normalization	features: 512
		ReLU
Average Pooling			kernel: 7x7	stride=7
reshape: 512x1x1 => 512x1
Fully Connected Layer			kernel: 512x1000

Note: ResNet-34 and ResNet-50 are also implemented in code but not yet explored.
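
In the ResNet-18 table, the first basic block of the second, third and fourth groups carries a (Downsample) 1x1 convolution so that the shortcut has the same shape as the main path before the two are added. The sketch below checks that shape agreement and traces the feature-map size for the 227x227 input. It is a rough illustration only: `conv_out`, `basic_block_shape` and `trace_resnet18` are hypothetical helpers, not the repository's implementation.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling window (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def basic_block_shape(in_ch, size, out_ch, stride):
    """Return (channels, size) after one basic block, checking the shortcut shape."""
    # Main path: 3x3 conv (given stride, padding 1), then 3x3 conv (stride 1, padding 1).
    main = conv_out(conv_out(size, 3, stride, 1), 3, 1, 1)
    if stride != 1 or in_ch != out_ch:
        # Downsample shortcut: 1x1 conv with the same stride and no padding.
        shortcut = (out_ch, conv_out(size, 1, stride, 0))
    else:
        # Identity shortcut: the input is added unchanged.
        shortcut = (in_ch, size)
    assert shortcut == (out_ch, main), "shortcut and main path must match to be added"
    return out_ch, main


def trace_resnet18(size=227):
    """Return a list of (stage name, channels, spatial size) tuples."""
    size = conv_out(size, 7, 2, 3)  # first convolution
    size = conv_out(size, 3, 2, 1)  # max pooling
    ch = 64
    trace = [("stem", ch, size)]
    groups = [("group1", 64, 1), ("group2", 128, 2), ("group3", 256, 2), ("group4", 512, 2)]
    for name, out_ch, stride in groups:
        ch, size = basic_block_shape(ch, size, out_ch, stride)  # first block of the group
        ch, size = basic_block_shape(ch, size, out_ch, 1)       # second block of the group
        trace.append((name, ch, size))
    size = conv_out(size, 7, 7, 0)  # 7x7 average pooling, stride 7
    trace.append(("avgpool", ch, size))
    return trace


for name, c, hw in trace_resnet18():
    print(f"{name:8s} {c}x{hw}x{hw}")
```

With a 227x227 input the fourth group produces an 8x8 map, and the 7x7 average pooling with stride 7 still reduces it to 512x1x1. Calling `trace_resnet18(224)` gives a 7x7 map at the same point.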

VGG-16

input: 3x227x227 (RGB image)
Convolutional Layer	kernel: 3x64x3x3	stride=1; padding=1
Batch Normalization	features: 64
ReLU
Convolutional Layer	kernel: 64x64x3x3	stride=1; padding=1
Batch Normalization	features: 64
ReLU
Max Pooling	kernel: 2x2	stride=2
Convolutional Layer	kernel: 64x128x3x3	stride=1; padding=1
Batch Normalization	features: 128
ReLU
Convolutional Layer	kernel: 128x128x3x3	stride=1; padding=1
Batch Normalization	features: 128
ReLU
Max Pooling	kernel: 2x2	stride=2
Convolutional Layer	kernel: 128x256x3x3	stride=1; padding=1
Batch Normalization	features: 256
ReLU
Convolutional Layer	kernel: 256x256x3x3	stride=1; padding=1
Batch Normalization	features: 256
ReLU
Convolutional Layer	kernel: 256x256x3x3	stride=1; padding=1
Batch Normalization	features: 256
ReLU
Max Pooling	kernel: 2x2	stride=2
Convolutional Layer	kernel: 256x512x3x3	stride=1; padding=1
Batch Normalization	features: 512
ReLU
Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
Batch Normalization	features: 512
ReLU
Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
Batch Normalization	features: 512
ReLU
Max Pooling	kernel: 2x2	stride=2
Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
Batch Normalization	features: 512
ReLU
Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
Batch Normalization	features: 512
ReLU
Convolutional Layer	kernel: 512x512x3x3	stride=1; padding=1
Batch Normalization	features: 512
ReLU
Max Pooling	kernel: 2x2	stride=2
Adaptive Average Pooling	output size: 7x7
reshape: 512x7x7 => 25088x1
Fully Connected Layer	kernel: 25088x4096
ReLU
Fully Connected Layer	kernel: 4096x4096
ReLU
Fully Connected Layer	kernel: 4096x1000

Note: VGG-11, VGG-13 and VGG-19 are also implemented in code but not yet explored.
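
The kernel shapes in the VGG-16 table are enough to work out where the weights are concentrated. The sketch below follows the spatial size through the five pooling stages and sums the convolutional and fully connected weights. It is a back-of-the-envelope illustration: `VGG16_CFG`, `FC_SIZES` and `vgg16_summary` are hypothetical names, and bias and batch normalization parameters are left out because the table does not list them.

```python
# Output channels of each 3x3 convolution; "M" marks a 2x2 max pooling with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]


def vgg16_summary(size=227, in_ch=3, pooled=7):
    """Return (final spatial size, flattened features, conv weights, FC weights)."""
    conv_weights = 0
    for item in VGG16_CFG:
        if item == "M":
            size = (size - 2) // 2 + 1            # 2x2 max pooling, stride 2
        else:
            conv_weights += in_ch * item * 3 * 3  # 3x3 conv; padding 1 keeps the size
            in_ch = item
    flattened = in_ch * pooled * pooled           # adaptive average pooling to 7x7, then reshape
    fc_weights = 0
    features = flattened
    for out_features in FC_SIZES:
        fc_weights += features * out_features
        features = out_features
    return size, flattened, conv_weights, fc_weights


size, flattened, conv_w, fc_w = vgg16_summary()
print(f"spatial size before adaptive pooling: {size}x{size}")
print(f"flattened features: {flattened}")
print(f"conv weights: {conv_w:,}  fc weights: {fc_w:,}")
```

Under these assumptions the three fully connected layers account for about 89% of the weights, with the 25088x4096 layer alone contributing over 100 million.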
