The network architecture is illustrated in Figure. It consists of a contracting path (left side) and an expansive path (right side).
The contracting path follows the typical architecture of a convolutional network. It consists of:
- Convolutions (kernel size =
3x3
, stride =1
padding =1
) - Batch Normalization (added in here, does not exist in original implementation)
- ReLU
- Convolutions (kernel size =
3x3
, stride =1
padding =1
) - Batch Normalization (added in here, does not exist in original implementation)
- ReLU
- Downsampling: Max pooling operation (kernel size =
2x2
, stride =2
)
Every step in the expansive path consists of:
- Upsampling: Transpose Convolution (kernel size =
2x2
, stride =2
, padding =0
) - Convolutions (filter size =
2x2
) - Concatenation with the correspondingly cropped feature map from the contracting path
- Convolutions (kernel size =
3x3
) - ReLU
- Convolutions (kernel size =
3x3
) - ReLU
At the final layer a 1x1
convolution is used to map each 64-component feature vector to the desired number of classes. In total the network has 23 convolutional layers.