
RQ-VAE: How can I get a list of all learned codebook vectors (as indexed in the "indices")? #28

Closed
christophschuhmann opened this issue Oct 3, 2022 · 2 comments

Comments

@christophschuhmann
Hi Lucid,
I am working on quantizing CLIP image embeddings with your RQ-VAE, and it works pretty well.

Next I want to take all learned codebook vectors and add them to the vocab of a GPT (as frozen token embeddings).

The idea is to train a GPT with CLIP image embeddings in between texts (e.g. IMAGE-CAPTION, or TEXT-IMAGE-TEXT-IMAGE-..., Flamingo-style).

If this works, the GPT could maybe also learn to generate quantized CLIP image embeddings token by token, and then show images via a) retrieval or b) a DALL-E 2 decoder :)

... So my question is: once the RQ-VAE is trained and I can get the quantized reconstructions and indices, how can I get a list or tensor of the actual codebook (all possible vectors from the RQ vocab)? :)
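On the "frozen token embeddings" idea: a minimal sketch of wiring codebook vectors into a GPT vocabulary, assuming toy sizes and a random stand-in `codebooks` tensor in place of the trained RQ-VAE codebooks (all names here are hypothetical, not part of the library):

```python
import torch
import torch.nn as nn

# Hypothetical toy sizes; a real setup would use the GPT's actual
# vocab size and the trained quantizer's (num_layers, codebook_size, dim).
text_vocab_size = 1000
num_layers, codebook_size, dim = 4, 256, 64

# Stand-in for the learned RQ-VAE codebooks (in practice, gathered
# from the trained quantizer), shape (num_layers, codebook_size, dim)
codebooks = torch.randn(num_layers, codebook_size, dim)

# Existing (trainable) text-token embeddings
text_emb = nn.Embedding(text_vocab_size, dim)

# Frozen image-code embeddings built from the flattened codebooks
image_emb = nn.Embedding.from_pretrained(codebooks.reshape(-1, dim), freeze=True)

def embed(token_ids):
    # Ids >= text_vocab_size index into the frozen codebook block
    is_image = token_ids >= text_vocab_size
    return torch.where(
        is_image.unsqueeze(-1),
        image_emb(torch.clamp(token_ids - text_vocab_size, min=0)),
        text_emb(torch.clamp(token_ids, max=text_vocab_size - 1)),
    )
```

`freeze=True` sets `requires_grad=False` on the codebook embeddings, so the GPT trains its text embeddings while the image-code embeddings stay fixed to the RQ-VAE's vectors.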

@kradonneoh

kradonneoh commented Oct 7, 2022

+1 — I can reverse-engineer the forward function, but it'd be nice if there were an easy function call I'm missing.

Edit: ended up reverse-engineering it anyway :-) You can look up codes from indices with
quantizer.layers[i]._codebook.embed[0, token_ids[:, i]] for each layer i in the residual vector quantizer. As a bonus, you can reconstruct the input (image / audio / etc.) by summing the codes across layers:

decoded_vector = 0.0
for i, layer in enumerate(quantizer.layers):
    decoded_vector = decoded_vector + layer._codebook.embed[0, token_ids[:, i]]
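The same lookup-and-sum can be sketched self-contained in plain PyTorch, with random toy codebooks standing in for the trained quantizer's layers (shapes here are assumptions, not the library's internals):

```python
import torch

# Toy stand-ins: one codebook of shape (codebook_size, dim) per
# residual layer, and token ids of shape (batch, num_layers)
num_layers, codebook_size, dim, batch = 4, 16, 8, 2
codebooks = [torch.randn(codebook_size, dim) for _ in range(num_layers)]
token_ids = torch.randint(0, codebook_size, (batch, num_layers))

# Each layer quantizes the residual left over by the previous layers,
# so the decoded vector is simply the sum of the selected codes.
decoded = torch.zeros(batch, dim)
for i, codebook in enumerate(codebooks):
    decoded = decoded + codebook[token_ids[:, i]]
```

Indexing a codebook with a batch of ids (`codebook[token_ids[:, i]]`) gathers one code vector per batch element, which is why the accumulator ends up with shape (batch, dim).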

@lucidrains
Owner

@christophschuhmann @kradonneoh oh hey! Nice to hear that the library is working well for your use case.

I've added the feature to return all the codes across quantization layers here: ec24746
