There seem to be some bugs in this repo. For instance, the parameter d_model of nn.TransformerEncoderLayer in line 30 of transformer.py is the number of expected features in the input, which must be divisible by num_heads. But in this repo, d_model is set to NUM_BOX_FEATURES (109) while num_heads is 4 (109 % 4 = 1, not 0).
As described in the paper, when num_features is not divisible by num_heads, a fully-connected layer is applied to project num_features into d_model. If you look at the hyperparameter setting for the commit you are referring to, you will find nhead is set to 1, so no fully-connected layer is required. The latest main branch shows an example of adding the fully-connected layer, so you can now vary the value of nhead.
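A minimal sketch of the fix being described: project the raw 109-dimensional box features to a width divisible by nhead before the encoder. The class name, the choice of d_model = 112, and the layer count here are hypothetical, not taken from the repo.

```python
import torch
import torch.nn as nn

NUM_BOX_FEATURES = 109  # raw feature size from the repo
D_MODEL = 112           # hypothetical width, chosen so that 112 % 4 == 0
NHEAD = 4

class BoxEncoder(nn.Module):
    def __init__(self, num_features=NUM_BOX_FEATURES, d_model=D_MODEL, nhead=NHEAD):
        super().__init__()
        # Fully-connected projection: maps num_features -> d_model so the
        # divisibility constraint of nn.TransformerEncoderLayer is satisfied.
        self.proj = nn.Linear(num_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        # x: (batch, num_boxes, num_features)
        return self.encoder(self.proj(x))

x = torch.randn(8, 16, NUM_BOX_FEATURES)
out = BoxEncoder()(x)
print(out.shape)  # torch.Size([8, 16, 112])
```

With nhead = 1 the projection is unnecessary, since any d_model is divisible by 1, which matches the hyperparameter setting in the older commit.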