How Dense layers work? #219
Hi, how do you derive these strings (in fact, operations), for example 'aijb,qwei->qweja', if we have [[4, 5, 5], [5, 4, 5]] instead of [[2, 2, 5, 5], [2, 5, 2, 5]]? Thank you in advance.
Hi, do you mean how to read this notation, or how I came up with these particular formulas? If the former, check out a tutorial, e.g. https://rockt.github.io/2018/04/30/einsum. If the latter, check out the Tensorizing Neural Networks [1] paper for the definition of a TT layer, formula (5). You have an input vector x, which you reshape into a tensor X of shape e.g. [2, 2, 5, 5] (or [4, 5, 5]), and then you need to sum over j1, j2, j3, j4 (or j1, j2, j3 in the case of [4, 5, 5]), e.g. as in the first step of the pseudocode above. Note that in these formulas we multiply X @ TTW, while in the paper we do TTW.T @ X; sorry about the confusing notation.
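To make the summation over the input indices concrete, here is a minimal NumPy sketch of the X @ TTW product done one core at a time. It is not T3F's actual implementation: the helpers `tt_matvec`, `tt_to_dense` and `random_cores` are hypothetical names, the cores are random, and each core is assumed to have shape (r_prev, n_k, m_k, r_next) with n_k the input modes and m_k the output modes. The same loop handles both [[2, 2, 5, 5], [2, 5, 2, 5]] and [[4, 5, 5], [5, 4, 5]], since only the number of cores changes.

```python
import numpy as np


def tt_matvec(x, cores):
    """Compute x @ W, where W is stored as TT-matrix cores.

    Assumed layout: each core has shape (r_prev, n_k, m_k, r_next),
    with n_k the input modes and m_k the output modes.
    x has shape (batch, prod(n_k)).
    """
    batch = x.shape[0]
    # State layout: (batch * processed output modes, rank,
    #                current input mode, remaining input modes).
    t = x.reshape(batch, 1, cores[0].shape[1], -1)
    for k, core in enumerate(cores):
        # Sum over the incoming rank r and the current input mode n.
        out = np.einsum('brnt,rnms->bmst', t, core)
        if k + 1 < len(cores):
            n_next = cores[k + 1].shape[1]
            # Fold the new output mode into the leading axis and
            # expose the next input mode.
            t = out.reshape(out.shape[0] * out.shape[1], out.shape[2], n_next, -1)
    return out.reshape(batch, -1)


def tt_to_dense(cores):
    """Assemble the full (prod(n_k), prod(m_k)) matrix, for checking only."""
    full = cores[0]
    for core in cores[1:]:
        full = np.einsum('aijr,rnms->ainjms', full, core)
        a, i, n, j, m, s = full.shape
        full = full.reshape(a, i * n, j * m, s)
    return full[0, :, :, 0]


def random_cores(in_modes, out_modes, rank, rng):
    """Random cores with all internal ranks equal to `rank` (an assumption)."""
    ranks = [1] + [rank] * (len(in_modes) - 1) + [1]
    return [rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
            for k in range(len(in_modes))]


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 100))  # batch of 3 input vectors
for in_modes, out_modes in [([2, 2, 5, 5], [2, 5, 2, 5]), ([4, 5, 5], [5, 4, 5])]:
    cores = random_cores(in_modes, out_modes, 4, rng)
    y = tt_matvec(x, cores)
    # Compare against multiplying by the explicitly assembled 100 x 100 matrix.
    print(in_modes, out_modes, np.allclose(y, x @ tt_to_dense(cores)))
```

The dense matrix is only built here to check the result; the point of the TT layer is that `tt_matvec` never needs it.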
Hi,
For example, take a dense layer with shape (100, 100), factorized with shape [[2, 2, 5, 5], [2, 5, 2, 5]] and max_tt_rank=4.
Based on this example, we have these tt_cores:
(1, 2, 2, 4)
(4, 2, 5, 4)
(4, 5, 2, 4)
(4, 5, 5, 1)
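The core shapes listed above follow the pattern (r_prev, n_k, m_k, r_next). A minimal sketch of how they can be derived, assuming all internal TT-ranks equal max_tt_rank and the two boundary ranks are 1; `tt_core_shapes` is a hypothetical helper written for this illustration, not a T3F function:

```python
def tt_core_shapes(in_modes, out_modes, max_tt_rank):
    """Core shapes (r_prev, n_k, m_k, r_next) for a TT-matrix.

    Assumption: every internal rank equals max_tt_rank; boundary ranks are 1.
    """
    d = len(in_modes)
    ranks = [1] + [max_tt_rank] * (d - 1) + [1]
    return [(ranks[k], in_modes[k], out_modes[k], ranks[k + 1]) for k in range(d)]


shapes = tt_core_shapes([2, 2, 5, 5], [2, 5, 2, 5], 4)
for shape in shapes:
    print(shape)

# Total number of parameters stored in the cores,
# versus 100 * 100 for the dense matrix.
n_params = sum(a * b * c * d for a, b, c, d in shapes)
print(n_params, "vs", 100 * 100)
```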
I mean, for a dense layer of shape (100, 100) and rank=20, we get two dense layers of shapes (100, 20) and (20, 100).
I want to know how the T3F library works in this respect.
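For reference, the low-rank case described above can be sketched in NumPy as follows. The matrices `A` and `B` are random placeholders, not weights from any real layer. The sketch only shows that applying the two small layers in sequence equals applying their (100, 100) product, at a lower parameter count.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))   # first dense layer: 100 -> 20
B = rng.standard_normal((20, 100))   # second dense layer: 20 -> 100
x = rng.standard_normal((3, 100))    # batch of 3 input vectors

# Applying the two small layers one after the other ...
y_two_layers = (x @ A) @ B
# ... matches applying the single rank-20 matrix A @ B.
y_one_layer = x @ (A @ B)
print(np.allclose(y_two_layers, y_one_layer))

low_rank_params = A.size + B.size
dense_params = 100 * 100
print(low_rank_params, "vs", dense_params)
```

A TT layer follows the same idea of never forming the full matrix, but it chains several small 4-dimensional cores over the reshaped input instead of two 2-dimensional factors.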
Thank you in advance.
Best regards,
Miladona