
question about the visual autoencoder #55

Open
Junction4Nako opened this issue Dec 28, 2023 · 1 comment

Junction4Nako commented Dec 28, 2023

Thanks for the great work! I have some questions about the checkpoints:

  1. It seems that BAAI/Emu2 does not include the weights of the visual decoder (the diffusion UNet), but according to Section 2.2.3 of the paper, Emu2 should include the decoder trained in the autoencoding stage, right?
  2. Emu2-Gen does provide the weights of the visual decoder. Can the visual encoder and decoder in BAAI/Emu2-Gen work together as an autoencoder?

Looking forward to your reply~

ryanzhangfan (Collaborator) commented Dec 28, 2023

Thanks for your interest in our work!

  1. The visual decoder of Emu2.
    As stated in the paper, the visual encoder is frozen during the training of both Emu2-Gen and the visual decoder. Hence, Emu2 and Emu2-Gen share exactly the same visual decoder, and the visual decoder weights shipped with Emu2-Gen can be used directly with Emu2.

  2. The autoencoder paradigm
    Yes, the visual encoder and the visual decoder can work as an autoencoder. Our pipeline currently supports generating output in an autoencoding manner. You can find instructions in the HF version model or the native PyTorch version model (at the bottom of the example code); a rough sketch of the round trip is shown after this list.
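For illustration only, here is a minimal sketch of what that autoencoding round trip could look like when driven through a diffusers custom pipeline. The pipeline module name, call signature, and output attribute are assumptions rather than the repository's confirmed API; the example code linked above is the authoritative reference.

```python
# Hedged sketch of the autoencoding round trip:
# image -> frozen visual encoder -> image embeddings -> diffusion decoder -> reconstruction.
# The custom pipeline name, call signature, and output attribute are assumptions
# for illustration; follow the example code in the BAAI/Emu2-Gen model card
# (HF or native PyTorch version) for the supported interface.
import torch
from PIL import Image
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BAAI/Emu2-Gen",                      # ships both the visual encoder and decoder weights
    custom_pipeline="pipeline_emu2_gen",  # assumed name of the custom pipeline module
    torch_dtype=torch.bfloat16,
).to("cuda")

image = Image.open("input.jpg").convert("RGB")

# Passing only an image (no text prompt) exercises the autoencoding path:
# the frozen visual encoder embeds the image and the diffusion decoder
# reconstructs it from those embeddings.
output = pipe(image)                      # assumed: the pipeline accepts a PIL image directly
output.image.save("reconstruction.jpg")   # assumed output attribute
```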
