Choosing activation layer data type #323

Open
phborba opened this issue Apr 18, 2020 · 2 comments

phborba commented Apr 18, 2020

Hi, first of all I would like to thank you for your work, @qubvel. The package is awesome and very easy to use.

I was trying to use TensorFlow mixed precision, but I could not get it working with segmentation_models. First, I tried just enabling

tf.config.optimizer.set_experimental_options(
    {"auto_mixed_precision": True}
)

and defining a LossScaleOptimizer

opt = tf.keras.optimizers.Adam(learning_rate=0.01)
opt = tf.keras.mixed_precision.experimental.LossScaleOptimizer(opt, "dynamic")

but the training was a mess. I tried with the same dataset I was using before and the same number of epochs and batch size, but it did not converge at all. So, after some reading, I discovered that the TensorFlow docs suggest the output activations should have dtype='float32' when using mixed precision.
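For reference, here is a minimal sketch of the pattern the TensorFlow mixed precision guide recommends, using the newer (non-experimental) tf.keras.mixed_precision API; the toy model below is just for illustration, it is not segmentation_models code:

from tensorflow import keras
from tensorflow.keras import layers, mixed_precision

# Compute in float16, keep variables in float32
mixed_precision.set_global_policy('mixed_float16')

inputs = keras.Input(shape=(128, 128, 3))
x = layers.Conv2D(16, 3, padding='same', activation='relu')(inputs)  # float16 compute
x = layers.Conv2D(1, 1, padding='same')(x)
# Keep the final activation in float32 for numerical stability
outputs = layers.Activation('sigmoid', dtype='float32')(x)
model = keras.Model(inputs, outputs)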

Is it possible to add an optional parameter to choose the activation data type when building models? The activation default could remain the same, and if the activation_data_type parameter were set, it would assign that dtype to the activation layer.

The code snippet below exemplifies my suggestion, applied to the conv2d_bn method:

from tensorflow.keras import backend, layers  # stand-ins for the submodules the library injects


def conv2d_bn(x,
              filters,
              kernel_size,
              strides=1,
              padding='same',
              activation='relu',
              activation_data_type=None,
              use_bias=False,
              name=None):
    """Utility function to apply conv + BN.
    # Arguments
        x: input tensor.
        filters: filters in `Conv2D`.
        kernel_size: kernel size as in `Conv2D`.
        strides: strides in `Conv2D`.
        padding: padding mode in `Conv2D`.
        activation: activation in `Conv2D`.
        activation_data_type: optional dtype for the activation layer;
            if `None`, the current behaviour is kept unchanged.
        use_bias: whether to use a bias in `Conv2D`.
        name: name of the ops; will become `name + '_ac'` for the activation
            and `name + '_bn'` for the batch norm layer.
    # Returns
        Output tensor after applying `Conv2D` and `BatchNormalization`.
    """
    x = layers.Conv2D(filters,
                      kernel_size,
                      strides=strides,
                      padding=padding,
                      use_bias=use_bias,
                      name=name)(x)
    if not use_bias:
        bn_axis = 1 if backend.image_data_format() == 'channels_first' else 3
        bn_name = None if name is None else name + '_bn'
        x = layers.BatchNormalization(axis=bn_axis,
                                      scale=False,
                                      name=bn_name)(x)
    if activation is not None:
        ac_name = None if name is None else name + '_ac'
        if activation_data_type is None:
            # Default path: identical to the current implementation
            x = layers.Activation(activation, name=ac_name)(x)
        else:
            # Opt-in path: force the activation to the requested dtype
            x = layers.Activation(activation, name=ac_name, dtype=activation_data_type)(x)
    return x
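A hypothetical usage, assuming the modified conv2d_bn above, would look like this (shapes and layer names are just for illustration):

import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 3))
# Default behaviour is unchanged: the activation dtype follows the global policy
x = conv2d_bn(inputs, 32, 3, name='block1')
# Opt in: this activation is computed in float32 even under mixed precision
x = conv2d_bn(x, 1, 1, activation='sigmoid',
              activation_data_type='float32', name='head')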

My main concern is not to change existing use cases; that's why I used an optional data type parameter, so the behaviour of each method only changes if the user chooses to set it.

If you accept these suggestions, I could help out by implementing them and opening a pull request. What do you think?


innat commented Dec 29, 2021

@qubvel would you please look into this? It's kinda important. Mixed precision training fails without this fix.


romitjain commented Jul 27, 2022

Do you think #536 would help here?
