
Is this Reshape step redundant? #3

Closed

hamelsmu opened this issue Jun 16, 2017 · 9 comments

Comments
@hamelsmu

See this line of code: https://github.com/philipperemy/keras-attention-mechanism/blob/master/attention_lstm.py#L19

Isn't this redundant? The Permute layer right before it already reshapes the tensor.

Let me know if I'm missing something. I am trying to understand attention, and so far your write-up is helping.

@hamelsmu
Author

@philipperemy

Also you don't need this line of code:
https://github.com/philipperemy/keras-attention-mechanism/blob/master/attention_lstm.py#L25

you can pass name='...' directly to any layer.

@philipperemy
Owner

philipperemy commented Jun 17, 2017

@hamelsmu Yes, the Reshape layer is redundant and does not add any value to the model (everything is already done by the Permute layer).

It's more to enforce the correct shape: the output of the Permute layer is reported as (?, ?), and by adding this Reshape layer we make the real shapes explicit (they are static and known at compile time). I wanted to reflect this idea of static shapes (vs. dynamic shapes).
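
For reference, a minimal sketch (not the repo code, just the two layers in question with the thread's TIME_STEPS=20 and input_dim=2) showing that the Reshape only restates the static shape that Permute already produces:

    from keras import backend as K
    from keras.layers import Input, Permute, Reshape

    TIME_STEPS, input_dim = 20, 2
    inputs = Input(shape=(TIME_STEPS, input_dim))   # static shape (None, 20, 2)
    a = Permute((2, 1))(inputs)                     # swap the last two axes -> (None, 2, 20)
    print(K.int_shape(a))
    a = Reshape((input_dim, TIME_STEPS))(a)         # same (None, 2, 20); only restates the static shape
    print(K.int_shape(a))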

@philipperemy
Owner

Thanks for your feedback! Highly appreciated!

a = Dense(TIME_STEPS, activation='softmax', name='attention_vec')(a)
if SINGLE_ATTENTION_VECTOR:
    a = Lambda(lambda x: K.mean(x, axis=1), name='attention_vec')(a)  # this is the attention vector!
    a = RepeatVector(input_dim)(a)

Is this what you meant? Removing the else clause and adding name='attention_vec' before the if?

@hamelsmu
Author

Yeah, that's right.

@philipperemy
Owner

philipperemy commented Jun 18, 2017

It would not work here because we would end up defining two different layers with the same name:

RuntimeError: The name "attention_vec" is used 2 times in the model. All layer names should be unique.
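
A minimal sketch (hypothetical layers, not the repo code) of how that error is triggered when two layers share the same name:

    from keras.layers import Input, Dense
    from keras.models import Model

    x = Input(shape=(20,))
    h = Dense(10, name='attention_vec')(x)
    y = Dense(1, name='attention_vec')(h)  # second layer reuses the same name
    m = Model(inputs=x, outputs=y)          # raises the RuntimeError quoted above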

@hamelsmu
Author

hamelsmu commented Jun 19, 2017

@philipperemy Right. However, I suppose you can say that the attention layer is a_probs, because that is the layer that gets multiplied with the inputs. So you can refactor it to look like this:

    a = Dense(TIME_STEPS, activation='softmax')(a)
    if SINGLE_ATTENTION_VECTOR:
        a = Lambda(lambda x: K.mean(x, axis=1), name='dim_reduction')(a) 
        a = RepeatVector(input_dim, name='time_repeat')(a)

    a_probs = Permute((2, 1), name='attention_vec')(a)
    output_attention_mul = merge([inputs, a_probs], name='attention_mul', mode='mul')

@philipperemy changed the title from "Is this Reshpae step redundant?" to "Is this Reshape step redundant?" on Jun 19, 2017
@philipperemy
Owner

philipperemy commented Jun 19, 2017

Ok, seems good to me! The only thing is that attention_vec.shape will change from (1, 2, 20) to (1, 20, 2), where 20 is the number of time steps and 2 is the number of input dims. So we have to change the axis we aggregate on from 1 to 2, simply because we want to display the vector along the time axis.

attention_vector = np.mean(
    get_activations(
        m,
        testing_inputs_1,
        print_shape_only=True,
        layer_name='attention_vec')[0],
    axis=2).squeeze()

@philipperemy
Owner

philipperemy commented Jun 19, 2017

Let me know if it looks good to you:

PR: #4

@hamelsmu
Author

Thanks!
