Conv1D with dilation and "causal" padding isn't reducing the size of the sequence #8751

Closed
chausies opened this issue Dec 11, 2017 · 12 comments



chausies commented Dec 11, 2017

If you run

from keras.layers import Input, Conv1D

# seq_length, dim and num_filts are assumed to be defined elsewhere
x = Input(shape=(seq_length, dim))
y = Conv1D(num_filts, 2, dilation_rate=8, padding="causal")(x)
print(y.shape)

Then the printed shape will be (?, seq_length, num_filts), which means the sequence length wasn't changed at all. Isn't this incorrect? The dilation makes the effective kernel width of the convolution 9, so shouldn't the output sequence length be reduced by 8, since the causal kernel needs 9 values to perform its dot product and therefore can't produce an output for the first 8 samples in the sequence?
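
For comparison, here is the same layer with padding="valid" (a quick sketch reusing x, num_filts and Conv1D from the snippet above; the shape in the comment is what I would expect it to print):

# With padding="valid" the output does shrink: the effective kernel width is
# kernel_size + (kernel_size - 1) * (dilation_rate - 1) = 2 + 1 * 7 = 9,
# so 8 time steps are dropped.
y_valid = Conv1D(num_filts, 2, dilation_rate=8, padding="valid")(x)
print(y_valid.shape)  # (?, seq_length - 8, num_filts)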

To clarify, I'm running with an up-to-date TensorFlow backend.

@chausies changed the title from "Conv1D with dilation with "causal" padding isn't reducing the size of the sequence" to "Conv1D with dilation and "causal" padding isn't reducing the size of the sequence" on Dec 11, 2017
@gabrieldemarmiesse
Contributor

Thank you for the detailed issue. I'll look into it.

@saftacatalinmihai

Any news on this?

@gabrieldemarmiesse
Contributor

I'm sorry, I worked on other pull requests in the meantime and forgot about this issue. Thank you for pinging me. I'm looking into it now.

@gabrieldemarmiesse
Contributor

I'm not familiar with causal padding, so I took a look on Google and in the code.

For those interested, here is the code in Keras:

def conv_output_length(input_length, filter_size,
                       padding, stride, dilation=1):
    """Determines output length of a convolution given input length.
    """
    if input_length is None:
        return None
    assert padding in {'same', 'valid', 'full', 'causal'}
    dilated_filter_size = filter_size + (filter_size - 1) * (dilation - 1)
    if padding == 'same':
        output_length = input_length
    elif padding == 'valid':
        output_length = input_length - dilated_filter_size + 1
    elif padding == 'causal':
        output_length = input_length
    elif padding == 'full':
        output_length = input_length + dilated_filter_size - 1
    return (output_length + stride - 1) // stride
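
To make the formula concrete, here is a quick check with the function above, using the numbers from the original report (kernel size 2, dilation 8; an input length of 100 is just an illustrative assumption):

# Output lengths for each padding mode, input length 100, kernel 2, dilation 8.
for padding in ('same', 'valid', 'full', 'causal'):
    print(padding, conv_output_length(100, filter_size=2,
                                      padding=padding, stride=1, dilation=8))
# same   100
# valid   92   (dilated filter size is 2 + 1 * 7 = 9, so 100 - 9 + 1)
# full   108
# causal 100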

So it seems that only the stride can change the output length when using padding="causal".

From the documentation:
"causal" results in causal (dilated) convolutions, e.g. output[t] does not depend on input[t+1:]. Useful when modeling temporal data where the model should not violate the temporal order. See WaveNet: A Generative Model for Raw Audio, section 2.1.

Looking again at the Keras code for the 1D convolution:

if padding == 'causal':
    # causal (dilated) convolution:
    left_pad = dilation_rate * (kernel_shape[0] - 1)
    x = temporal_padding(x, (left_pad, 0))
    padding = 'valid'

So Keras ensures that outputs do not depend on future values by padding with zeros at the start, then acts as if the padding mode were 'valid'.

It makes sense, but it's true that the documentation didn't say that Keras implements this shift of the output values by using zero padding.

To get the behavior described in @chausies's message, you just need to use padding='valid' instead of padding='causal', but then it's your job to keep track of the shift, which is (in my opinion) more complicated than letting Keras pad your tensor.
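
To make the difference concrete, here is a plain NumPy sketch (not Keras code; the length, kernel and dilation values are just the ones discussed in this thread):

import numpy as np

def dilated_valid_conv(x, w, dilation):
    # 'valid' dilated convolution: slide the dilated kernel only over positions
    # where it fully fits, so the output is shorter than the input.
    k = len(w)
    span = dilation * (k - 1)
    return np.array([sum(w[j] * x[t + j * dilation] for j in range(k))
                     for t in range(len(x) - span)])

seq = np.random.randn(100)          # seq_length = 100 (assumed)
w = np.array([0.5, 0.5])            # kernel_size = 2

valid_out = dilated_valid_conv(seq, w, dilation=8)

# 'causal' behaviour: left-pad with dilation * (kernel_size - 1) zeros,
# then run the same 'valid' convolution -> the length is preserved.
causal_out = dilated_valid_conv(np.concatenate([np.zeros(8), seq]), w, dilation=8)

print(len(valid_out), len(causal_out))   # 92 100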

So in the end, the documentation isn't clear, and that caused the confusion. If someone wants to do a quick PR to mention the zero padding in the docs for 'causal', that'd be great. I'm on holiday, so I can't do it now.

If something isn't clear, feel free to ask.


emerrf commented Apr 1, 2018

I was looking into this today by chance and comparing it with the WaveNet implementation from:
https://github.com/munich-ai-labs/keras2-wavenet/blob/master/wavenet_utils.py#L29
I have the impression that the Munich AI Labs developers had the same issue and overrode compute_output_shape. Maybe @imdatsolak has more details. HTH

@gabrieldemarmiesse
Contributor

Their implementation is the same as the built-in causal Keras implementation, judging from the code. It seems they were just not aware that 'causal' already existed in Keras.

@abderrahim

I think causal should not be a value of padding, but a separate option that can be used with both padding='valid' and padding='same' (and padding='full' I guess; I don't know that option).

What Keras does now with padding='causal' is causal convolution with padding='same', and what @chausies is asking for is causal convolution with padding='valid'.

I propose adding a new causal option, and making padding='causal' a deprecated equivalent to causal=True, padding='same'. If there is no objection to this, I'll try to implement it.

@gabrieldemarmiesse
Contributor

Can you give a minimal example of Conv1D(causal=True, padding='valid') in the way you think about it? What would be the difference with Conv1D(causal=False, padding='valid')? Can you give an example with NumPy arrays so that it's explicit?

@abderrahim

@gabrieldemarmiesse thanks for making me think more deeply about this. You are indeed right. It is a documentation issue.

@gabrieldemarmiesse
Contributor

Would you have the time to do a PR on the docs to mention the zero padding?

BertrandDechoux added a commit to BertrandDechoux/keras that referenced this issue Aug 26, 2018
fchollet pushed a commit that referenced this issue Aug 28, 2018
* Update causal Conv1D doc : uses zero padding #8751

See issue #8751

* Fix docstring formatting.
@wt-huang

Closing as this is resolved


rjpg commented Apr 15, 2019

Hello,

I am trying to understand the temporal relation with 'causal' padding...

My goal is to implement it in 2D, i.e. for Conv2D, assuming the x-axis is the "time steps" and the y-axis holds several variables.

For example, assuming K.set_image_dim_ordering("th"), the input shape for Conv2D is (batch, filters, x, y),

so (None, 1, 128, 9) means we have 128 time steps and 9 variables in a multivariate time series problem.

How hard would it be to create or include (with a new, extended Conv2D) 'causal' padding in Conv2D?

We could also choose whether the "time dimension" is vertical (x) or horizontal (y)...

For the other, 'variables' dimension, we could choose 'valid' or 'same'... If someone is thinking of giving a simple example, assume 'valid' padding for the variables dimension, since it seems to be the one used inside 'causal' padding.
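
One possible way to get this, following the same trick Keras uses for causal Conv1D (left-pad the time axis, then convolve with padding='valid'), is sketched below. This is only an illustration, not an existing Keras option; the filter count, kernel sizes and dilation rate are arbitrary placeholder values.

# Sketch: causal padding along the time axis of a Conv2D, built the same way
# Keras builds causal Conv1D -- left-pad the time axis by dilation * (kernel - 1),
# then convolve with padding='valid'. Layout is channels-first, as in the
# comment above: (batch, channels, time, variables).
from keras.layers import Input, ZeroPadding2D, Conv2D
from keras.models import Model

time_steps, n_vars = 128, 9        # values from the comment above
kernel_t, kernel_v = 2, 9          # kernel sizes along time and variables (assumed)
dilation_t = 4                     # dilation along time (assumed)

x = Input(shape=(1, time_steps, n_vars))

# Pad only "before" on the time axis, nothing on the variables axis.
left_pad = dilation_t * (kernel_t - 1)
h = ZeroPadding2D(padding=((left_pad, 0), (0, 0)), data_format='channels_first')(x)

y = Conv2D(16, (kernel_t, kernel_v), dilation_rate=(dilation_t, 1),
           padding='valid', data_format='channels_first')(h)

model = Model(x, y)
model.summary()   # time axis keeps 128 steps; variables axis shrinks to 1 ('valid')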
