
Feature Request: Add separable_conv2d_transpose operation #12001

Open
andreas-eberle opened this issue Aug 3, 2017 · 42 comments
Labels
type:feature Feature requests

Comments

@andreas-eberle
Contributor

Some recent papers (e.g.) have shown that transposed separable convolutions can be a great choice for decoders in encoder-decoder architectures.

Can you add a separable_conv2d_transpose operation comparable to the conv2d_transpose operation?

@poxvoculi added the stat:contribution welcome and type:feature labels Aug 3, 2017
@tjingrant
Contributor

tjingrant commented Aug 4, 2017

Hi, I will try to work on this one.

@andreas-eberle do you have specific examples in which a separable_conv2d_transpose operation is profitable? Do you have a link to your e.g.?

@yeshwanthv5

Please provide links to the reference papers.

@andreas-eberle
Contributor Author

Sorry, didn't notice that I forgot the link.

In The Devil is in the Decoder they compare several "deconvolution" strategies and show that separable transposed convolution has very good performance.

In section 2.1.1 they give a short explanation about separable transposed convolution.

Afaik, separable (not transposed) convolution was introduced in Xception: Deep Learning with Depthwise Separable Convolutions

@ezfn

ezfn commented Jan 2, 2018

+1 on that.
Besides the examples given by andreas-eberle, it is generally helpful to have operations that are not allowed to mix channels, so that channels from different layers can be trivially combined (i.e. summed without any learned parameters).

@titusnicolae-intel

Hi, is anyone still working on this?
I'd like to start work on it; if anyone can provide some supervision, that would be helpful.

@estathop

estathop commented Jul 4, 2018

This would be a useful feature to implement.

@dhaneshr

Any updates on this?

@notnot

notnot commented Aug 4, 2018

I'd like to try a GAN with separable_conv2d and separable_conv2d_transpose, and I was surprised to see that separable_conv2d_transpose isn't available yet. Some have stated they started working on an implementation; how is that work going?

@chris-boson

I would also be able to help with an implementation. @tjingrant, have you started on it?
It also seems like the authors of The Devil is in the Decoder must have access to one.

@HouseOfFinwe

HouseOfFinwe commented Oct 16, 2018

Any idea when there will be an implementation for separable_conv2d_transpose?

@brucechou1983

Any update on this feature?

@chris-boson

Easiest to just use tf.keras.layers.Conv2DTranspose followed by tf.keras.layers.DepthwiseConv2D.

@HouseOfFinwe

@chris-boson This is not equivalent. One of the major selling points of a DepthwiseConv2DTranspose (if it existed) is a reduction in parameters, which would not be achieved by a transpose followed by a depthwise conv.

@chris-boson

chris-boson commented Oct 22, 2018

@HouseOfFinwe It does in fact reduce the parameter count considerably, especially in the case of many output channels. Use filters of shape [stride, stride] instead of [1, 1] for the pointwise conv in separable convolution to avoid checkerboarding.
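For illustration, a minimal sketch of that combination (the layer arguments here are illustrative assumptions, not tested settings):

import tensorflow as tf

def separable_upsample(x, filters, stride=2):
    # Learned upsampling; kernel_size == stride means the output patches
    # don't overlap, which avoids checkerboard artifacts.
    x = tf.keras.layers.Conv2DTranspose(
        filters, kernel_size=stride, strides=stride, padding='same')(x)
    # The depthwise conv then filters each channel independently, adding
    # only k * k * channels parameters instead of k * k * in * out.
    x = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same')(x)
    return x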

@netanel-s

+1 on this, would be highly appreciated.

@veqtor

veqtor commented Oct 30, 2018

@chris-boson could you give a clearer example of using Conv2DTranspose followed by DepthwiseConv2D? Regardless, I still think a pure depthwise transpose would be even more efficient

@zhcm

zhcm commented Nov 29, 2018

Any update on this feature?

@shenyi0220

Any updates or progress on this?

@mbuckler

I would be interested in this

@ltrottier

That would be a nice addition indeed.

@ygoncharov

Seems like a feature that makes a lot of sense

@veqtor

veqtor commented Mar 9, 2019

Would this require a new op? Is it difficult because of a lack of hardware support?

@mjmjmtl-pony

Any update on this?

@voletiv

voletiv commented Oct 10, 2019

Any update on this?

@CoachRDeveloper

Would like to see this feature implemented

@edmondja

edmondja commented Jun 8, 2020

+1

@gurpreet-singh135

gurpreet-singh135 commented Jun 19, 2020

Hi, is anyone working on this currently?
I'd like to work on this feature. Also, can somebody please provide some resources to start from?

@edmondja

edmondja commented Jun 19, 2020

Is it very different from using upsampling + separable conv?

@gurpreet-singh135

@edmondja the problem with upsampling + separable conv is that it increases the number of computations compared to a sep_conv2d_transpose; see the sketch below.
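For comparison, a minimal sketch of the upsampling + separable-conv alternative (layer arguments are illustrative):

import tensorflow as tf

def upsample_separable(x, filters, stride=2):
    # Parameter-free upsampling first...
    x = tf.keras.layers.UpSampling2D(size=stride)(x)
    # ...then the separable conv runs over the enlarged feature map, i.e.
    # roughly stride**2 more positions than a native separable transposed
    # convolution would need.
    x = tf.keras.layers.SeparableConv2D(filters, kernel_size=3, padding='same')(x)
    return x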

@yaoshiang

Anyone find a workaround? I attempted one with a stack of Conv2DTranspose layers, each with filters=1, but it was not very efficient. No promise this works, but this was my attempt, FWIW.

import tensorflow as tf
from tensorflow.keras import layers

class DepthwiseConv2DTranspose(layers.Layer):
    def __init__(self, filters, **kwargs):
        super(DepthwiseConv2DTranspose, self).__init__(**kwargs)
        self._filters = filters
        # One single-filter Conv2DTranspose per input channel.
        self._t = []
        for _ in range(filters):
            self._t.append(layers.Conv2DTranspose(
                filters=1, kernel_size=5, strides=2, output_padding=1))

    def __call__(self, img):
        upsample = []
        for i in range(self._filters):
            # Slice out channel i and upsample it independently.
            t = self._t[i](img[:, :, :, i:i + 1])
            t = t[:, :, :, 0]
            upsample.append(t)
        return tf.stack(upsample, axis=-1)
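Hypothetical usage of the above (shapes are illustrative):

layer = DepthwiseConv2DTranspose(filters=3)
out = layer(tf.random.normal([1, 8, 8, 3]))
print(out.shape)  # (1, 20, 20, 3): (8 - 1) * 2 + 5 + 1 = 20 with 'valid' padding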

@junhyukso
Contributor

Any update?

@Orpheus23

Here are two sample implementations that do some of it. The first doesn't compile on TPU, but it does work on CPU:

import tensorflow as tf
from tensorflow.keras import layers

class Depthwise_Conv2D_Transpose(tf.keras.layers.Layer):
    def __init__(self, filters, kernel_size, strides, padding='same',
                 use_bias=False, kernel_initializer=None, name="", **kwargs):
        super(Depthwise_Conv2D_Transpose, self).__init__(**kwargs)
        self.kernel_size = kernel_size
        self.strides = strides[0]
        self.padding = padding
        self.use_bias = use_bias
        self.kernel_init = kernel_initializer
        self.filters = filters
        self.input_image_shape = 0
        self.nm = name
        # One single-filter Conv2DTranspose per input channel.
        self.lambdas = []
        for _ in range(filters):
            self.lambdas.append(layers.Conv2DTranspose(
                filters=1, kernel_size=kernel_size,
                strides=self.strides, padding=self.padding))

    def call(self, inputs):
        # Split into per-channel slices, transpose-convolve each slice with
        # its own layer, then concatenate back along the channel axis.
        inputs_channel_wise = tf.split(inputs, self.filters, -1)
        x_outputs = [c(x) for x, c in zip(inputs_channel_wise, self.lambdas)]
        return tf.concat(x_outputs, -1)

The second also compiles on TPU and works well on CPU, but doesn't train on TPU:

class Depthwise_Conv2D_Transpose(tf.keras.layers.Layer):
    def __init__(self, filters, kernel_size, strides, padding='same',
                 use_bias=False, kernel_initializer=None, name="", **kwargs):
        super(Depthwise_Conv2D_Transpose, self).__init__(**kwargs)
        self.kernel_size = kernel_size
        self.strides = strides[0]
        self.padding = padding
        self.use_bias = use_bias
        self.kernel_init = kernel_initializer
        self.lambdas = []
        self.filters = filters
        self.input_image_shape = 0
        self.nm = name

    def deconv_length(self, dim_size, stride_size, kernel_size, padding,
                      output_padding=None, dilation=1):
        # Spatial output length of a transposed convolution.
        assert padding in {'same', 'valid', 'full'}
        if dim_size is None:
            return None

        # Get the dilated kernel size.
        kernel_size = kernel_size + (kernel_size - 1) * (dilation - 1)

        # Infer the length if output_padding is None, else compute it exactly.
        if output_padding is None:
            if padding == 'valid':
                dim_size = dim_size * stride_size + max(kernel_size - stride_size, 0)
            elif padding == 'full':
                dim_size = dim_size * stride_size - (stride_size + kernel_size - 2)
            elif padding == 'same':
                dim_size = dim_size * stride_size
        else:
            if padding == 'same':
                pad = kernel_size // 2
            elif padding == 'valid':
                pad = 0
            elif padding == 'full':
                pad = kernel_size - 1
            dim_size = (dim_size - 1) * stride_size + kernel_size - 2 * pad + output_padding

        return dim_size

    def build(self, input_shape):
        # One [k, k, 1, 1] kernel per input channel.
        for i in range(input_shape[-1]):
            self.lambdas.append(self.add_weight(
                name=self.nm + "weights" + str(i),
                initializer=tf.keras.initializers.get(self.kernel_init),
                shape=(self.kernel_size, self.kernel_size, 1, 1),
                trainable=True))
        self.input_image_shape = input_shape[1]
        self.image_shape = input_shape[-1]
        super(Depthwise_Conv2D_Transpose, self).build(input_shape)

    @tf.function
    def call(self, inputs):
        # Split into per-channel slices and map a single-filter
        # conv2d_transpose over the (slice, kernel) pairs.
        inputs_channel_wise = tf.split(inputs, self.image_shape, -1)
        out_size = self.deconv_length(
            self.input_image_shape, self.strides, self.kernel_size, self.padding)
        channel_wise_conv = tf.map_fn(
            lambda x: tf.nn.conv2d_transpose(
                x[0], filters=x[1],
                output_shape=[tf.shape(inputs)[0], out_size, out_size, 1],
                strides=self.strides,
                padding=self.padding.upper()),
            (inputs_channel_wise, self.lambdas),
            fn_output_signature=tf.float32)
        # [channels, batch, h, w, 1] -> [batch, h, w, channels]
        channel_wise_conv = tf.transpose(
            tf.squeeze(channel_wise_conv, axis=-1), [1, 2, 3, 0])
        return channel_wise_conv

If any solutions are found, do update. Thanks!

@Ram-WD

Ram-WD commented Nov 14, 2021

any update ?

@glenn-jocher

@Orpheus23 is your TF Depthwise Conv2dTranspose implementation in #12001 (comment) still the best available today?

@Orpheus23

Idk about any recent changes, but as of early 2021 (when I was last checking) it was. If a better solution is needed, it is best to write the op in C++ and export it. The layers I had written didn't train properly on TPU (idk about GPU; if it worked for someone, do update) as they required a lot of memory, perhaps because of mapping conv2d_transpose over every channel and the space required by tf.transpose + tf.squeeze.

Either way, it would be best to define it in C++; the functions that TensorFlow provides in Python are far from ideal for defining a new type of layer. With C++ there could be a better alternative to mapping conv2d_transpose over every channel and then reshaping everything.

@glenn-jocher

@Orpheus23 hmm, yes that's what I was worried about. We've been running YOLOv5 experiments with these layers in PyTorch, and they seem to export well everywhere except TF. We build the TF models natively rather than go through ONNX, and currently there seems to be no efficient solution to build TF models with these layers. In PyTorch it's pretty simple as you can just set groups to equal the input/output channel counts to create depthwise conv2dtranspose layers.
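For reference, the PyTorch version is a one-liner (the channel count here is illustrative):

import torch.nn as nn

c = 64  # in_channels == out_channels == groups -> depthwise
dw_deconv = nn.ConvTranspose2d(c, c, kernel_size=4, stride=2, padding=1, groups=c)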

@Orpheus23

That's there; PyTorch works very well in these matters. The groups feature is not available here. Maybe with tf.gather and conv2d_transpose you could do it, but it would not be very clean. If you really want to go ahead with TF, then C++ is the best bet; integrating the C++ part with Python was not that much of a headache. Otherwise, the alternatives are to do bilinear interpolation followed by a depthwise conv2d (sketched below), or to hack it out in Python TF. In any case, best of luck with the implementation.
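A minimal sketch of the bilinear interpolation + depthwise conv alternative (arguments are illustrative):

import tensorflow as tf

def bilinear_depthwise(x, stride=2):
    # Parameter-free bilinear upsampling...
    shape = tf.shape(x)
    x = tf.image.resize(x, [shape[1] * stride, shape[2] * stride], method='bilinear')
    # ...followed by a learned per-channel filter.
    x = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same')(x)
    return x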

@glenn-jocher

@Orpheus23 got it, thanks! I'm linking @zldrobit here, who's been a lot of the brains behind the YOLOv5 TF models.

@glenn-jocher

@AyushExel @sergiossm this is the main issue I found regarding DW Conv2d Transpose layers in TF (C++ conversion proposed by @Orpheus23)

@saikrn112

Hey guys, any update on this feature?

@bayesian-mind

Any update on the feature request for depthwise conv2d transpose?

@github-actions

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions bot added the stale label Nov 14, 2023
@github-actions bot removed the stale and stat:contribution welcome labels Mar 13, 2024