2021 assignment 3 Q2 self attention section: the expected_self_attn_output provided is wrong

In `forward()` of the `MultiHeadAttention` class in `assignment3/cs231n/transformer_layers.py`, the provided `expected_self_attn_output` can only be reproduced if dropout is applied to the attention weights before they are multiplied by the value matrix, i.e. `attention weights -> dropout -> (attention weights after dropout) X value matrix`. However, the assignment instructions explicitly specify a different order, namely `attention weights -> attention weights X value matrix -> dropout`. Anyone who follows the order as instructed will get a `self_attn_output` that differs from the provided `expected_self_attn_output`, so the check in `Transformer_Captioning.ipynb` is inconsistent with the instructions.
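A minimal single-head NumPy sketch of why the two orders cannot produce the same numbers. This is not the assignment's code: the function names, shapes, and the hand-written keep masks are all hypothetical, and the real layer is multi-headed and draws its mask at random. Fixed masks are used here only to make the comparison deterministic.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def dropout(x, keep_mask, p):
    # Inverted dropout with an explicit keep mask (1 = keep, 0 = drop).
    # The mask is passed in (an assumption for this sketch) instead of sampled.
    return x * keep_mask / (1.0 - p)

def attn_dropout_on_weights(q, k, v, keep_mask, p):
    # Order: attention weights -> dropout -> multiply by V.
    w = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return dropout(w, keep_mask, p) @ v

def attn_dropout_on_output(q, k, v, keep_mask, p):
    # Order: attention weights -> multiply by V -> dropout.
    w = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return dropout(w @ v, keep_mask, p)

T, D, p = 3, 4, 0.1  # hypothetical sequence length, head dim, drop probability
rng = np.random.default_rng(0)
q = rng.standard_normal((T, D))
k = rng.standard_normal((T, D))
v = rng.random((T, D)) + 1.0  # strictly positive values keep the comparison unambiguous

# Drop a single element in each case: one attention weight vs. one output entry.
mask_w = np.ones((T, T))
mask_w[0, 0] = 0.0
mask_o = np.ones((T, D))
mask_o[0, 0] = 0.0

out_w = attn_dropout_on_weights(q, k, v, mask_w, p)
out_o = attn_dropout_on_output(q, k, v, mask_o, p)
orders_agree = np.allclose(out_w, out_o)

# With nothing dropped (all-ones masks, p = 0) the two orders coincide.
no_drop_agree = np.allclose(
    attn_dropout_on_weights(q, k, v, np.ones((T, T)), 0.0),
    attn_dropout_on_output(q, k, v, np.ones((T, D)), 0.0),
)
```

Dropping an attention weight removes one key's contribution from a whole output row, while dropping an output entry zeroes a single feature. The masks also have different shapes, `(T, T)` versus `(T, D)`, so even with the same random seed the two orders consume different random draws, and a reference output generated with one order will not match an implementation of the other.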

2021 assignment 3 Q2 self attention section: the expected_self_attn_output provided is wrong #278
