
Conversation

Drajan commented on May 3, 2019

Multi-head attention weights are computed with two feed-forward (FF) layers, a ReLU activation in between, and a softmax to turn the scores into probabilities: softmax(FF(ReLU(FF(input)))). Tensor shapes (see the sketch below):
input = [batch_size, num_words, embed_size]
attention = [batch_size, num_words, num_heads]
input_attention_weighted = [batch_size, num_heads, embed_size]
output = [batch_size, num_heads * embed_size] ==> concatenation of the per-head representations
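
A minimal PyTorch sketch of the computation and shapes described above. The class name, `hidden_size`, and the example values are hypothetical and not taken from the PR; only the softmax(FF(ReLU(FF(input)))) structure and the tensor shapes follow the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadSelfAttention(nn.Module):
    """Sketch: attention weights via softmax(FF(ReLU(FF(input))))."""

    def __init__(self, embed_size, hidden_size, num_heads):
        super().__init__()
        self.ff1 = nn.Linear(embed_size, hidden_size)  # first FF layer
        self.ff2 = nn.Linear(hidden_size, num_heads)   # second FF layer

    def forward(self, x):
        # x: [batch_size, num_words, embed_size]
        scores = self.ff2(F.relu(self.ff1(x)))         # [batch_size, num_words, num_heads]
        # Softmax over the words, so each head is a distribution over the sequence.
        attention = F.softmax(scores, dim=1)           # [batch_size, num_words, num_heads]
        # Attention-weighted sum of the input for each head:
        # [batch, heads, words] @ [batch, words, embed] -> [batch, heads, embed]
        weighted = torch.bmm(attention.transpose(1, 2), x)
        # Concatenate the per-head representations.
        output = weighted.reshape(weighted.size(0), -1)  # [batch_size, num_heads * embed_size]
        return output, attention


# Example (hypothetical sizes):
x = torch.randn(8, 20, 128)                # [batch_size=8, num_words=20, embed_size=128]
attn = MultiHeadSelfAttention(embed_size=128, hidden_size=64, num_heads=4)
out, weights = attn(x)
print(out.shape, weights.shape)            # torch.Size([8, 512]) torch.Size([8, 20, 4])
```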

tkornuta-ibm merged commit f6c6d1e into IBM:develop on May 3, 2019
