Thanks for your implementation. In comparing your codebase to the authors' implementation, I discovered that while you have a single expansion factor in your configuration, the authors have separate values - one for tokens and one for channels.
Specifically, their channels expansion factor is 4, but their tokens expansion factor is 0.5, both relative to hidden_dim (the base projection size). Note that the authors actually specify a feature count for each MLP; I'm translating those counts into the expansion-factor mechanism you use in this codebase.
Thus, inside the MixerBlock, the tokens "expansion" is actually a bottleneck: with a factor of 0.5, the hidden layer of the token-mixing MLP is half the width of hidden_dim.
The parameters can also be verified in Table 1 ("Specifications of Mixer Architectures") at the top of page 4 in version 4 (the current version as of Feb 14, 2022) of their paper.
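To make the arithmetic concrete, here is a minimal sketch of how the two factors translate into MLP widths. The helper name `mlp_inner_dim` and the value `hidden_dim = 512` are illustrative assumptions, not taken from this codebase or quoted from the paper; only the factors 4 and 0.5 come from the comparison above.

```python
def mlp_inner_dim(hidden_dim: int, expansion_factor: float) -> int:
    """Width of the hidden layer inside a mixing MLP (hypothetical helper)."""
    return int(hidden_dim * expansion_factor)


# Illustrative base projection size (an assumption for this example).
hidden_dim = 512

# Factors as described above: 4 for channels, 0.5 for tokens.
channels_inner = mlp_inner_dim(hidden_dim, 4)
tokens_inner = mlp_inner_dim(hidden_dim, 0.5)

# The channel MLP widens the representation; the token MLP narrows it.
print(f"channels: {hidden_dim} -> {channels_inner} -> {hidden_dim}")
print(f"tokens:   {hidden_dim} -> {tokens_inner} -> {hidden_dim}")
```

With these numbers the channel-mixing MLP is four times wider than hidden_dim in the middle, while the token-mixing MLP is half as wide, which is why I call it a bottleneck.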
I'm not suggesting that anything necessarily needs to change in your implementation. However, if you wanted to align your codebase to be able to fully replicate the authors' work, you might consider allowing for two separate parameters - token_expansion_factor and channels_expansion_factor.
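One way this could look, purely as a sketch: the two new parameters are optional and fall back to the existing single factor, so current callers would see no change. The `MixerConfig` class, its field layout, and the default of 4.0 for `expansion_factor` are hypothetical and are not how this codebase is actually structured; only the two parameter names come from the suggestion above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MixerConfig:
    """Hypothetical config illustrating the suggestion; not the real API."""

    hidden_dim: int
    # Stand-in for the existing single setting (default value is an assumption).
    expansion_factor: float = 4.0
    # Proposed overrides; None means "use expansion_factor".
    token_expansion_factor: Optional[float] = None
    channels_expansion_factor: Optional[float] = None

    def resolved_factors(self) -> Tuple[float, float]:
        """Return (token_factor, channels_factor) after applying fallbacks."""
        token = self.token_expansion_factor
        channels = self.channels_expansion_factor
        if token is None:
            token = self.expansion_factor
        if channels is None:
            channels = self.expansion_factor
        return token, channels


# Existing behaviour: one factor drives both MLPs.
legacy = MixerConfig(hidden_dim=512)

# Settings matching the factors described above.
paper_like = MixerConfig(
    hidden_dim=512,
    token_expansion_factor=0.5,
    channels_expansion_factor=4.0,
)

print(legacy.resolved_factors())
print(paper_like.resolved_factors())
```

Keeping the single factor as the fallback would mean nobody's existing configuration breaks, while anyone trying to reproduce the paper could set the two values independently.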
Thank you again for this work, and for all your contributions generally. You are an incredible asset to the community.