Regarding the dimension of query and key #11

Closed
netw0rkf10w opened this issue Sep 11, 2020 · 4 comments

Comments

@netw0rkf10w

Hi,

I observed in the code that the query's and key's dimensions are half of the value's (out_planes // 2, group_planes // 2). Is there a specific reason for that (apart from making it faster)?

Thanks.

@csrhddlam
Owner

csrhddlam commented Sep 11, 2020

No, the "half" here is just an architectural hyper-parameter that controls the efficiency-accuracy trade-off. Empirically, we found "half" to be a good trade-off in our cases, but "one" or "quarter" might also work well.
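To make the trade-off concrete, here is a minimal sketch of how the query/key width could be exposed as a ratio. It is plain Python and not the repository's code: the names `qkv_split_sizes`, `qk_ratio`, and `qk_matmul_mult_adds`, along with the example sizes (16 channels per group, span 56), are assumptions made for illustration. Only `group_planes` and the `// 2` convention come from the question above.

```python
def qkv_split_sizes(group_planes, qk_ratio=0.5):
    """Per-group channel counts for (query, key, value).

    qk_ratio is a hypothetical knob, not a parameter of the repository:
    1.0 = "one", 0.5 = "half", 0.25 = "quarter".
    """
    qk_planes = int(group_planes * qk_ratio)
    if qk_planes < 1:
        raise ValueError("qk_ratio is too small for this group_planes")
    # Query and key share the reduced width; value keeps the full width.
    return [qk_planes, qk_planes, group_planes]


def qk_matmul_mult_adds(span, qk_planes):
    """Multiply-adds for the query-key similarity along one axis of
    length `span`, for a single group: one dot product of length
    qk_planes for each of the span * span position pairs."""
    return span * span * qk_planes


if __name__ == "__main__":
    # Illustrative sizes only: 16 channels per group, axis length 56.
    for name, ratio in [("one", 1.0), ("half", 0.5), ("quarter", 0.25)]:
        q, k, v = qkv_split_sizes(group_planes=16, qk_ratio=ratio)
        print(name, (q, k, v),
              "projection channels:", q + k + v,
              "qk mult-adds:", qk_matmul_mult_adds(56, q))
```

Under these assumptions, the similarity cost scales linearly with the query/key width, so "half" costs half as much as "one" in that step while the value path is unchanged.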

@netw0rkf10w
Author

@csrhddlam Thanks. I guess "one" is supposed to give the best accuracy? Do you have an estimate of how much better it is compared to "half"?

@csrhddlam
Owner

Yes, more channels usually lead to better accuracy, but we did not study this much. Personally, I wouldn't expect much improvement from switching from "half" to "one".

@netw0rkf10w
Author

Thanks for the answer.
