
Question about hyperparameters #287

@pseudo-usama

Description


I have quite a lot of questions about this model. I've successfully trained latent diffusion on the AFHQ dataset, but I'm having a hard time understanding many of the hyperparameters in the YAML files.

In the autoencoder YAML:

  • embed_dim: Why are we using embeddings in an autoencoder?
  • n_embed: What is this?
  • double_z: What is the purpose of this? I've noticed that it's True for the KL autoencoder and False for the VQ autoencoder. Why?
  • ch: I know it means channels, but how does this change the model architecture?
  • ch_mult: How does this work?
  • lossconfig.target: This is set to taming.modules.losses.vqperceptual.VQLPIPSWithDiscriminator. Is it using a discriminator (as in a GAN)? Why does an autoencoder need a discriminator?
  • lossconfig.params.disc_weight: Is that related to the discriminator in VQLPIPSWithDiscriminator, and how does it influence it?
  • lossconfig.params.codebook_weight: What is a codebook weight in a VQ autoencoder?
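
For context, here is roughly where these keys sit in a VQ autoencoder config. This is an illustrative sketch, not copied from any specific file in the repo; the target path and all values are assumptions:

```yaml
model:
  target: ldm.models.autoencoder.VQModel   # assumed target class
  params:
    embed_dim: 3            # dimensionality of each latent/codebook vector
    n_embed: 8192           # number of entries in the VQ codebook
    lossconfig:
      target: taming.modules.losses.vqperceptual.VQLPIPSWithDiscriminator
      params:
        disc_start: 50001       # step at which the discriminator term starts
        disc_weight: 0.8        # weight of the adversarial (discriminator) term
        codebook_weight: 1.0    # weight of the VQ codebook/commitment loss
    ddconfig:
      double_z: false     # false for VQ; KL autoencoders use true (mean + logvar)
      z_channels: 3
      resolution: 256
      in_channels: 3
      out_ch: 3
      ch: 128             # base channel count of the first conv block
      ch_mult: [1, 2, 4]  # per-downsampling-stage multipliers applied to ch
      num_res_blocks: 2
      attn_resolutions: []
      dropout: 0.0
```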

In the latent diffusion YAML:

  • first_stage_key: What is this? In every YAML file it's set to image.
  • num_timesteps_cond: What does this do? In every file it's set to 1.
  • log_every_t: How does this work?
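
For reference, these keys appear together under model.params in the diffusion config. A minimal sketch, with illustrative values and comments that reflect my assumptions rather than the repo's documentation:

```yaml
model:
  target: ldm.models.diffusion.ddpm.LatentDiffusion   # assumed target class
  params:
    first_stage_key: image    # which entry of the data batch the first stage encodes
    num_timesteps_cond: 1     # number of conditioning timesteps
    log_every_t: 200          # log intermediate samples every this many diffusion steps
    timesteps: 1000
    image_size: 64
    channels: 3
```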

I would be grateful for any form of assistance. Thank you!
