diff --git a/+transformer/+layer/attention.m b/+transformer/+layer/attention.m
index ea52804..fee28fe 100644
--- a/+transformer/+layer/attention.m
+++ b/+transformer/+layer/attention.m
@@ -28,7 +28,7 @@
 % hyper-parameter.
 %
 % Outputs:
-% Z - A (numFeatures*numHeads)-by-numInputSubwords-by-numObs
+% A - A (numFeatures*numHeads)-by-numInputSubwords-by-numObs
 % output array.
 % present - A numFeatures-by-numAllSubwords-by-numHeads-by-numObs-by-2
 % array. This contains the 'keys' and 'values' that