latent scaling #86

Open
guochengqian opened this issue Jun 11, 2024 · 1 comment

guochengqian commented Jun 11, 2024

Dear authors, thanks for releasing zero123++.
May I ask why you perform latent and image unscaling, and how you decided on the scaling ratios?
https://huggingface.co/sudo-ai/zero123plus-pipeline/blob/main/pipeline.py#L396

latents = unscale_latents(latents)
image = unscale_image(...
def unscale_latents(latents):
    latents = latents / 0.75 + 0.22
    return latents

def unscale_image(image):
    image = image / 0.5 * 0.8
    return image

Thank you very much!
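
For readers following along, here is a minimal sketch of what the two quoted functions undo. The `unscale_*` bodies are copied from the snippet above; the `scale_*` counterparts are derived here by inverting that arithmetic and are an assumption, not a quote from the pipeline. Plain floats stand in for tensors.

```python
import math

def unscale_latents(latents):
    # As quoted in the question: divide by 0.75, then shift by +0.22.
    return latents / 0.75 + 0.22

def unscale_image(image):
    # As quoted in the question: a net multiplication by 0.8 / 0.5 = 1.6.
    return image / 0.5 * 0.8

def scale_latents(latents):
    # Assumed forward transform: the algebraic inverse of unscale_latents.
    return (latents - 0.22) * 0.75

def scale_image(image):
    # Assumed forward transform: the algebraic inverse of unscale_image.
    return image * 0.5 / 0.8

# Round trip: scaling followed by unscaling should return the input.
for x in (-1.0, 0.0, 0.37, 1.3):
    assert math.isclose(unscale_latents(scale_latents(x)), x, abs_tol=1e-9)
    assert math.isclose(unscale_image(scale_image(x)), x, abs_tol=1e-9)
```

Read this way, the latents are shifted by 0.22 and shrunk by 0.75 on the way in, and the images are shrunk by a factor of 1.6. The unscale calls at inference simply reverse both steps.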

@eliphatfs
Collaborator

We collected a set of natural images and data renderings and compared their latents. The latent scaling normalizes the rendering latents so that they look more like the latents of natural images, which is mostly what SD2 was trained on.
For the image, we found empirically that this scaling lets the model converge faster. It also helps reduce the contrast of the rendering latents to the normal level of natural images. This way the model can learn better 'global timesteps' (the same reason we swap the noise schedule and choose a v-prediction base model).
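
The reply does not say how the constants 0.22 and 0.75 were computed from that comparison. As an illustration only, one hypothetical way to get a shift and scale of the same form is to match the mean and standard deviation of rendering latents to those of natural-image latents. The function name `fit_shift_scale` and the toy numbers below are made up for this sketch and are not from the Zero123++ code.

```python
import math
from statistics import fmean, pstdev

def fit_shift_scale(source, target):
    """Return (shift, scale) so that (x - shift) * scale maps the source
    values onto the target's mean and standard deviation.

    Hypothetical helper: moment matching is an assumption of this sketch,
    not the authors' stated procedure.
    """
    scale = pstdev(target) / pstdev(source)
    shift = fmean(source) - fmean(target) / scale
    return shift, scale

# Toy stand-ins for flattened latent values (not real data).
rendering_latents = [1.0, 2.0, 3.0, 4.0]
natural_latents = [0.0, 0.5, 1.0, 1.5]

shift, scale = fit_shift_scale(rendering_latents, natural_latents)
normalized = [(x - shift) * scale for x in rendering_latents]

# After normalization the two sets share the same first two moments.
assert math.isclose(fmean(normalized), fmean(natural_latents))
assert math.isclose(pstdev(normalized), pstdev(natural_latents))
```

With real data the two sets would be VAE latents of renderings and of natural images, and the fitted pair would play the role that 0.22 and 0.75 play in the pipeline.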
