Choosing activation layer data type #323

Open
phborba opened this issue Apr 18, 2020 · 2 comments

phborba commented Apr 18, 2020

Hi, first of all I would like to thank you for your work, @qubvel. The package is awesome and very easy to use.

I was trying to use TensorFlow mixed precision, but I could not get it working with segmentation_models. First, I tried just enabling

tf.config.optimizer.set_experimental_options(
    {"auto_mixed_precision": True}
)

and defining a LossScaleOptimizer

opt = tf.keras.optimizers.Adam(learning_rate=0.01)
opt = tf.keras.mixed_precision.experimental.LossScaleOptimizer(opt, "dynamic")

but the training was a mess. I tried with the same dataset I was using before and the same number of epochs and batch size, but it did not converge at all. So, after some reading, I discovered that the TensorFlow docs suggest the output activations should have dtype='float32' when using mixed precision.
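For reference, here is a minimal sketch of the pattern the TensorFlow mixed precision guide recommends, using the newer (non-experimental) tf.keras.mixed_precision API; the toy model below is just for illustration, it is not segmentation_models code:

from tensorflow import keras
from tensorflow.keras import layers, mixed_precision

# Compute in float16, keep variables in float32
mixed_precision.set_global_policy('mixed_float16')

inputs = keras.Input(shape=(128, 128, 3))
x = layers.Conv2D(16, 3, padding='same', activation='relu')(inputs)  # float16 compute
x = layers.Conv2D(1, 1, padding='same')(x)
# Keep the final activation in float32 for numerical stability
outputs = layers.Activation('sigmoid', dtype='float32')(x)
model = keras.Model(inputs, outputs)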

Is it possible to add an optional parameter to choose the activation data type when building models? The activation default could remain the same, and if the activation_data_type parameter were set, it would assign that dtype to the activation layer.

The code snippet below exemplifies my suggestion, applied to the conv2d_bn method:

from tensorflow.keras import backend, layers  # stand-ins for the submodules the library injects


def conv2d_bn(x,
              filters,
              kernel_size,
              strides=1,
              padding='same',
              activation='relu',
              activation_data_type=None,
              use_bias=False,
              name=None):
    """Utility function to apply conv + BN.
    # Arguments
        x: input tensor.
        filters: filters in `Conv2D`.
        kernel_size: kernel size as in `Conv2D`.
        strides: strides in `Conv2D`.
        padding: padding mode in `Conv2D`.
        activation: activation in `Conv2D`.
        activation_data_type: optional dtype for the activation layer;
            if `None`, the current behaviour is kept unchanged.
        use_bias: whether to use a bias in `Conv2D`.
        name: name of the ops; will become `name + '_ac'` for the activation
            and `name + '_bn'` for the batch norm layer.
    # Returns
        Output tensor after applying `Conv2D` and `BatchNormalization`.
    """
    x = layers.Conv2D(filters,
                      kernel_size,
                      strides=strides,
                      padding=padding,
                      use_bias=use_bias,
                      name=name)(x)
    if not use_bias:
        bn_axis = 1 if backend.image_data_format() == 'channels_first' else 3
        bn_name = None if name is None else name + '_bn'
        x = layers.BatchNormalization(axis=bn_axis,
                                      scale=False,
                                      name=bn_name)(x)
    if activation is not None:
        ac_name = None if name is None else name + '_ac'
        if activation_data_type is None:
            # Default path: identical to the current implementation
            x = layers.Activation(activation, name=ac_name)(x)
        else:
            # Opt-in path: force the activation to the requested dtype
            x = layers.Activation(activation, name=ac_name, dtype=activation_data_type)(x)
    return x
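A hypothetical usage, assuming the modified conv2d_bn above, would look like this (shapes and layer names are just for illustration):

import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 3))
# Default behaviour is unchanged: the activation dtype follows the global policy
x = conv2d_bn(inputs, 32, 3, name='block1')
# Opt in: this activation is computed in float32 even under mixed precision
x = conv2d_bn(x, 1, 1, activation='sigmoid',
              activation_data_type='float32', name='head')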

My main concern is not to change existing use cases; that's why I used an optional data type parameter, so the behaviour of each method only changes if the user chooses to set it.

If you accept these suggestions, I could help out by implementing them and opening a pull request. What do you think?


innat commented Dec 29, 2021

@qubvel would you please look into this? It's kinda important. Mixed precision training fails without this fix.


romitjain commented Jul 27, 2022

Do you think #536 would help here?
