
Vector Quantization Layer #51

Closed · 6 of 7 tasks
Simsso opened this issue Sep 18, 2018 · 4 comments

Simsso (Owner) commented Sep 18, 2018

Development of a production-ready vector quantization (VQ) layer in TensorFlow, based on the prototype developed in #25 and merged with #52 (+prototype 2).
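
For orientation, here is a minimal sketch of the core lookup such a layer performs, assuming a `[batch, dim]` input and a straight-through gradient estimator; the names and signature are illustrative, not the actual code on the vq-layer branch:

```python
import tensorflow as tf  # TF 1.x, as used by the project in 2018

def vq_layer(x, n_embeddings, name='vq'):
    """Quantize each row of x ([batch, dim]) to its nearest embedding vector."""
    with tf.variable_scope(name):
        dim = x.get_shape().as_list()[-1]
        emb = tf.get_variable('embedding', [n_embeddings, dim],
                              initializer=tf.random_normal_initializer())
        # pairwise squared L2 distances between inputs and embedding vectors
        diff = tf.expand_dims(x, 1) - tf.expand_dims(emb, 0)  # [batch, n, dim]
        dist = tf.reduce_sum(tf.square(diff), axis=-1)        # [batch, n]
        ids = tf.argmin(dist, axis=1)                         # nearest embedding per input
        quantized = tf.gather(emb, ids)
        # straight-through estimator: quantized values in the forward pass,
        # identity gradient w.r.t. x in the backward pass
        y = x + tf.stop_gradient(quantized - x)
        return y, ids, dist, emb
```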

Development branch: vq-layer

Documentation

Sub-tasks

  • Test batch size effect on gradient (f390595)
  • Reassign the least-used embedding space vectors to the values from the input x that were furthest away from any embedding space vector (see the sketch after this list)
  • Consider developing a custom C++ op
  • Research current memory efficiency (which parts consume a lot of RAM, how to profile)
  • is_training parameter, restricting updates to training time (not needed anymore)
  • Add the scatter_update call to tf.GraphKeys.UPDATE_OPS
  • dotp norm order (VQ-Layer Cosine Distance #63)
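
A rough sketch of how the reassignment and the UPDATE_OPS registration could fit together, assuming the `emb`, `ids`, `dist`, and `x` tensors from the sketch above; `replace_least_used` is a hypothetical helper, not what the branch actually implements:

```python
import tensorflow as tf

def replace_least_used(emb, ids, dist, x, num_replace=1):
    """Overwrite the least-used embedding vectors with the inputs that are
    furthest from their nearest embedding (hypothetical helper)."""
    n = emb.get_shape().as_list()[0]
    # per-batch usage count of every embedding vector
    counts = tf.bincount(tf.cast(ids, tf.int32), minlength=n, maxlength=n)
    _, least_used = tf.nn.top_k(-counts, k=num_replace)   # rarest indices
    # inputs whose distance to their nearest embedding is largest
    nearest_dist = tf.reduce_min(dist, axis=1)            # [batch]
    _, furthest = tf.nn.top_k(nearest_dist, k=num_replace)
    update = tf.scatter_update(emb, least_used, tf.gather(x, furthest))
    # sixth sub-task: register the update so it runs with the other UPDATE_OPS
    tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update)
    return update
```

The collected ops would then run alongside the train op, e.g. by wrapping it in `tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS))`.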
Simsso added the code and research labels on Sep 18, 2018
ghost mentioned this issue on Sep 21, 2018
Simsso (Owner) commented Sep 22, 2018

Just discovered the tf.test.TestCase class, which we should consider using.
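
For illustration, a minimal test along those lines might look as follows (assuming the `vq_layer` sketch from above; not actual code from the repository):

```python
import numpy as np
import tensorflow as tf

class VQLayerTest(tf.test.TestCase):
    def test_output_equals_nearest_embedding(self):
        with self.test_session() as sess:
            x = tf.constant(np.random.randn(8, 4).astype(np.float32))
            y, ids, _, emb = vq_layer(x, n_embeddings=16)  # sketch from above
            sess.run(tf.global_variables_initializer())
            y_val, ids_val, emb_val = sess.run([y, ids, emb])
            # the forward pass should return exactly the selected embeddings
            self.assertAllClose(y_val, emb_val[ids_val])

if __name__ == '__main__':
    tf.test.main()
```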

Simsso (Owner) commented Sep 24, 2018

Since the L1, L2, and infinity norms seem to be problematic in high-dimensional space, it might be worth adding another norm order, namely dotp. It would

  1. normalize all input vectors to unit norm
  2. initialize the embedding space with unit norm vectors
  3. replace inputs with the vector from the embedding space to which the dot product is the greatest
  4. define the loss as the negative dot product

Issue for that: #63
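
In TF 1.x terms, the four steps could look roughly like this (a sketch that assumes re-normalizing the embeddings on every forward pass is an acceptable way to keep them unit-norm; names are illustrative):

```python
import tensorflow as tf

def vq_dotp(x, emb):
    """dotp matching: sketch for #63 (illustrative names)."""
    x_n = tf.nn.l2_normalize(x, axis=1)        # 1. unit-norm inputs
    emb_n = tf.nn.l2_normalize(emb, axis=1)    # 2. keep embeddings unit-norm
    dotp = tf.matmul(x_n, emb_n, transpose_b=True)   # [batch, n_embeddings]
    ids = tf.argmax(dotp, axis=1)              # 3. greatest dot product wins
    quantized = tf.gather(emb_n, ids)
    loss = -tf.reduce_mean(tf.reduce_max(dotp, axis=1))  # 4. negative dot product
    y = x_n + tf.stop_gradient(quantized - x_n)  # straight-through, as before
    return y, ids, loss
```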

Simsso (Owner) commented Oct 1, 2018

I've noticed that the VQ layer uses some of the embedding vectors very sparsely, despite the beta-loss (beta=0.5, alpha=1). Here, gamma=0:

[image: plot of embedding vector usage]

That's something worth contemplating. Maybe the idea of replacing less frequently used embeddings should be pursued.
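
One way to quantify that sparsity would be a usage histogram of the selected embedding indices, e.g. (assuming `ids` is the `[batch]` index tensor from the lookup; illustrative only):

```python
import tensorflow as tf

def embedding_usage_summary(ids, n_embeddings):
    # ids: [batch] indices chosen by the VQ lookup
    counts = tf.bincount(tf.cast(ids, tf.int32),
                         minlength=n_embeddings, maxlength=n_embeddings)
    return tf.summary.histogram('vq/embedding_usage', counts)
```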

Simsso added this to the 13. Working Group Meeting milestone on Oct 1, 2018
FlorianPfisterer (Collaborator) commented

My guess is that the L1/L2/infinity norms are just not good distance measures in high-dimensional space. Maybe we could try applying PCA or something similar to both the activations and the embeddings and then calculate the distance on that basis.
-> will be discussed in #56
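
To make the idea concrete, a NumPy sketch of what such a PCA-based distance could look like (the function name and `k` are assumptions, to be discussed in #56):

```python
import numpy as np

def pca_distances(activations, embeddings, k=8):
    """Pairwise L2 distances in the top-k principal subspace (sketch for #56)."""
    data = np.concatenate([activations, embeddings], axis=0)
    centered = data - data.mean(axis=0)
    # principal axes from the SVD of the centered, stacked data
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = vt[:k].T                               # [dim, k]
    a = activations @ proj
    e = embeddings @ proj
    # [n_activations, n_embeddings] distance matrix in the reduced space
    return np.linalg.norm(a[:, None, :] - e[None, :, :], axis=-1)
```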
