MBConvBlockWithoutDepthwise stride implemented in 1x1 projection, wasting expansion arithmetic #660

@andravin

MBConvBlockWithoutDepthwise implements stride in the 1x1 projection convolution. When stride=2, the projection discards three quarters of the activations produced by the expansion. Implementing the stride in the 3x3 expansion convolution instead would be equivalent, and would reduce the block's total arithmetic by almost a factor of 4.

    self._expand_conv = tf.layers.Conv2D(
        filters,
        kernel_size=[3, 3],
        strides=[1, 1],
        kernel_initializer=conv_kernel_initializer,
        padding='same',
        use_bias=False)
    self._bn0 = self._batch_norm(
        axis=self._channel_axis,
        momentum=self._batch_norm_momentum,
        epsilon=self._batch_norm_epsilon)
    # Output phase:
    filters = self._block_args.output_filters
    self._project_conv = tf.layers.Conv2D(
        filters,
        kernel_size=[1, 1],
        strides=self._block_args.strides,
        kernel_initializer=conv_kernel_initializer,
        padding='same',
        use_bias=False)
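
As a rough check on the "almost a factor of 4" estimate, here is a minimal sketch that counts multiply-accumulates (MACs) for the two layouts. The helpers `conv_macs` and `block_macs` and the block shape are hypothetical, chosen only for illustration; they are not taken from the repository. The count ignores batch norm and activations, and assumes that `padding='same'` gives an output size of ceil(input / stride).

```python
def conv_macs(out_h, out_w, kernel, in_ch, out_ch):
    """MACs for a dense 2D convolution that produces an out_h x out_w map."""
    return out_h * out_w * kernel * kernel * in_ch * out_ch


def block_macs(h, w, in_ch, expand_ch, out_ch, stride, stride_in_expand):
    """MACs for a 3x3 expansion followed by a 1x1 projection.

    stride_in_expand=False models the current layout (stride on the projection).
    stride_in_expand=True models the proposed layout (stride on the expansion).
    """
    # Assumption: with padding='same', output size is ceil(input / stride).
    out_h, out_w = -(-h // stride), -(-w // stride)
    if stride_in_expand:
        # Expansion is evaluated only at the strided output positions.
        expand = conv_macs(out_h, out_w, 3, in_ch, expand_ch)
    else:
        # Expansion is evaluated at every input position.
        expand = conv_macs(h, w, 3, in_ch, expand_ch)
    # The projection produces the strided output map in both layouts.
    project = conv_macs(out_h, out_w, 1, expand_ch, out_ch)
    return expand + project


# Hypothetical block shape, for illustration only.
shape = dict(h=56, w=56, in_ch=32, expand_ch=128, out_ch=48, stride=2)
current = block_macs(stride_in_expand=False, **shape)
proposed = block_macs(stride_in_expand=True, **shape)
ratio = current / proposed
print(f"current={current} proposed={proposed} ratio={ratio:.2f}")
```

For this shape the ratio is about 3.57. In general, with stride 2 it is (36 * in_ch + out_ch) / (9 * in_ch + out_ch), which approaches 4 as out_ch becomes small relative to in_ch, because the 1x1 projection costs the same in both layouts.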
