
The vae encoder of the first_stage_model #345

Open
forgetable233 opened this issue Apr 10, 2024 · 4 comments

@forgetable233

I'm using the sv3d_p model, and I noticed that the VAE encoder of the first_stage_model is not provided in the checkpoint.
Which VAE encoder is used for the first_stage_model during training?

@JiuTongBro

Same question.

@pengc02

pengc02 commented May 12, 2024

Hi guys, I'm also focused on this. It seems that SV3D uses the same encoder and decoder as SVD, and SVD's encoder is released on Hugging Face. You can refer to https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/tree/main/vae for the checkpoint, https://github.com/huggingface/diffusers/blob/v0.24.0-release/src/diffusers/models/autoencoder_kl_temporal_decoder.py for the model code, and https://github.com/huggingface/diffusers/blob/v0.24.0-release/src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py for how to use it.
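
For reference, here is a minimal sketch (an editorial addition, not from the thread) of loading that VAE with diffusers and encoding frames into first-stage latents. The model ID and subfolder follow the Hugging Face link above; the input shape and the use of scaling_factor mirror the SVD pipeline and should be verified against your own setup.

import torch
from diffusers import AutoencoderKLTemporalDecoder

# Load the SVD VAE (encoder + temporal decoder) from the checkpoint linked above.
vae = AutoencoderKLTemporalDecoder.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt", subfolder="vae"
)
vae.eval().requires_grad_(False)

# Dummy batch of frames in [-1, 1]; replace with real, normalized video frames.
frames = torch.randn(1, 3, 576, 576)

with torch.no_grad():
    # Encode to first-stage latents, scaled as in the SVD pipeline.
    latents = vae.encode(frames).latent_dist.sample()
    latents = latents * vae.config.scaling_factor
    # Round trip: the temporal decoder needs the frame count of the batch.
    recon = vae.decode(latents / vae.config.scaling_factor, num_frames=1).sample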

@chenshuo20

@pengc02 thx! It helps.

@chenshuo20

> It seems that SV3D uses the same encoder and decoder as SVD, and SVD's encoder is released on Hugging Face. […]

Also, I found that you can use the following config to load the VAE model:

vae_encoder_config:
  target: src.diffusers.models.autoencoders.autoencoder_kl_temporal_decoder.AutoencoderKLTemporalDecoder
  params:
    block_out_channels: [128, 256, 512, 512]
    layers_per_block: 2
    in_channels: 3
    out_channels: 3
    down_block_types: ["DownEncoderBlock2D", "DownEncoderBlock2D", "DownEncoderBlock2D", "DownEncoderBlock2D"]
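
A hypothetical sketch of instantiating the encoder from a config like this, using the instantiate_from_config helper from this repo's sgm.util. The YAML filename is made up, and the target path assumes diffusers is vendored under src/; with a pip-installed diffusers you would point target at the corresponding diffusers.models module instead.

from omegaconf import OmegaConf
from sgm.util import instantiate_from_config

# Hypothetical YAML file containing the vae_encoder_config block above.
conf = OmegaConf.load("vae_encoder.yaml")
vae_encoder = instantiate_from_config(conf.vae_encoder_config)
# Weights still need to be loaded separately, e.g. from the Hugging Face VAE
# checkpoint linked earlier in this thread.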
