
2021 Assignment 3 Q2 self-attention section: the provided expected_self_attn_output is wrong #278

@manuka2

Description


In forward() of the MultiHeadAttention class in assignment3/cs231n/transformer_layers.py, the provided expected_self_attn_output can only be reproduced with this order of operations: compute the attention weights, apply dropout to the attention weights, then multiply the dropped-out weights by the value matrix. The assignment instructions, however, explicitly tell you to use a different order: compute the attention weights, multiply them by the value matrix, then apply dropout. Anyone who follows the instructed order gets a self_attn_output that differs from the provided expected_self_attn_output, so the check in Transformer_Captioning.ipynb is wrong.
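For concreteness, here is a minimal single-head sketch of the two orderings (the function name, shapes, and dropout probability are illustrative, not the assignment's exact signature). Only the first ordering reproduces the provided expected output; the two differ whenever dropout is active in training mode:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def self_attention(query, key, value, dropout):
    """Scaled dot-product attention; inputs are (N, T, D) for illustration."""
    d = query.shape[-1]
    scores = torch.matmul(query, key.transpose(-2, -1)) / (d ** 0.5)
    attn_weights = F.softmax(scores, dim=-1)

    # Ordering that reproduces expected_self_attn_output:
    # dropout the attention weights FIRST, then multiply by the value matrix.
    attn_weights = dropout(attn_weights)
    return torch.matmul(attn_weights, value)

    # Ordering the assignment instructions describe instead:
    # return dropout(torch.matmul(attn_weights, value))

torch.manual_seed(0)
x = torch.randn(2, 4, 8)        # (N, T, D)
dropout = nn.Dropout(p=0.1)
out = self_attention(x, x, x, dropout)
print(out.shape)                # torch.Size([2, 4, 8])
```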
