Vector Quantization Layer Prototype #52
Great notebook, @FlorianPfisterer.
Well, that's because my experiments were based on the paper 😆. We're not going to train the conv layers, but it still adds a gradient flow that might be relevant for another VQ-layer further upstream.
Minor things, mostly questions. Really great experiments, I especially love the plots where one can see how the embeddings move!
    access_count = tf.reduce_sum(one_hot_access, axis=[0, 1], name='access_count')

    # closest embedding update loss (alpha-loss)
    nearest_loss = tf.reduce_mean(alpha * tf.norm(y - x, lookup_ord, axis=2), axis=[0, 1])
As you noted yourself, add the tf.stop_gradient() here for y.
Rather for x, I'd say. Can you triple-check, please?
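One way to triple-check is to compare the gradients directly. The following is only a minimal sketch, assuming `x` is the layer input and `y` is the nearest embedding selected through a one-hot lookup; the shapes, the values, and the helper names `alpha_loss` and `gradients` are made up for illustration and are not taken from the notebooks. Under these assumptions, wrapping `x` in `tf.stop_gradient()` leaves the embeddings as the only tensors that receive a gradient from the alpha-loss, while leaving it out also pushes a gradient back into `x`.

```python
import tensorflow as tf

alpha = 0.1      # hypothetical loss weight
lookup_ord = 2   # hypothetical norm order

# Hypothetical data: batch of 1, two input vectors of size 2, two embeddings.
x = tf.Variable([[[1.0, 0.0], [0.0, 3.0]]])          # layer inputs
embeddings = tf.Variable([[0.0, 0.0], [0.0, 1.0]])   # embedding space


def alpha_loss(x, embeddings, stop_on_inputs):
    # Distance from every input vector to every embedding: [batch, vecs, n_emb].
    dist = tf.norm(tf.expand_dims(x, 2) - embeddings, ord=lookup_ord, axis=3)
    # One-hot selection of the nearest embedding (argmin carries no gradient).
    one_hot_access = tf.one_hot(tf.argmin(dist, axis=2), depth=embeddings.shape[0])
    y = tf.tensordot(one_hot_access, embeddings, axes=1)
    # Optionally treat the inputs as constants for this loss term.
    target = tf.stop_gradient(x) if stop_on_inputs else x
    return tf.reduce_mean(alpha * tf.norm(y - target, lookup_ord, axis=2), axis=[0, 1])


def gradients(stop_on_inputs):
    with tf.GradientTape() as tape:
        loss = alpha_loss(x, embeddings, stop_on_inputs)
    # Report unconnected gradients as zeros instead of None.
    grad_x, grad_emb = tape.gradient(
        loss, [x, embeddings], unconnected_gradients=tf.UnconnectedGradients.ZERO
    )
    return grad_x.numpy(), grad_emb.numpy()


if __name__ == "__main__":
    for flag in (True, False):
        grad_x, grad_emb = gradients(flag)
        print("stop_gradient on x:", flag)
        print("  d loss / d x          =", grad_x.tolist())
        print("  d loss / d embeddings =", grad_emb.tolist())
```

If the alpha-loss is only meant to move the closest embedding towards the input, the variant with `stop_on_inputs=True` matches that intent in this sketch; whether that holds for the notebook code depends on what `x` and `y` actually refer to there.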
This PR contains a vector quantization (VQ) prototype that was developed and tested in four Jupyter notebooks.
The code is not production-ready and is only there for the sake of experimenting (it is located in /experiments). The well-tested and functionally extended version can now be developed based on this prototype (#51). Merging this PR resolves #25.
The work can be found here (please review the four files + README.md):