What camera intrinsics were used for fine-tuning the gray-bg Zero123Plus? #5
@cwchenwang Hi, we aimed to strictly follow the camera settings of Zero123++ v1.2 (fov=30) during fine-tuning. We asked the authors of Zero123++ about the object normalization and camera distance in this issue. The original answer was that the object should be normalized into a unit cube (it has since been corrected to a unit sphere), which was an unintentional mistake resulting in larger objects in the rendered images. This will not influence the reconstruction results in most cases. However, if the shape of the object is close to a cube, it will occupy a very large region in the generated image and make the reconstruction result worse.
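For context on how the FOV, normalization radius, and camera distance interact, here is a minimal sketch of the geometry. The `fill_fraction` parameter and the example numbers are illustrative assumptions, not the values actually used by Zero123++:

```python
import math

def camera_distance(fov_deg: float, sphere_radius: float, fill_fraction: float) -> float:
    """Distance at which a sphere of `sphere_radius` spans `fill_fraction`
    of the image half-extent, for a pinhole camera with the given FOV.

    The silhouette half-angle of a sphere of radius r seen from distance d
    is asin(r / d); we solve asin(r / d) = fill_fraction * (fov / 2) for d.
    """
    half_fov = math.radians(fov_deg) / 2.0
    return sphere_radius / math.sin(fill_fraction * half_fov)

# Example: a unit sphere (r = 1) filling ~90% of the frame at fov = 30 deg
print(camera_distance(30.0, 1.0, 0.9))  # ≈ 4.28
```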
@bluestyle97 thanks a lot for open-sourcing the codebase for fine-tuning the Zero123++ models. I am using this setup for validation views in Blender:
Any suggestions would be super helpful.
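For reference, a minimal Blender (`bpy`) setup for a fov=30 validation camera might look like the sketch below; the distance, azimuth, and elevation values are illustrative placeholders, not the parameters used in this thread:

```python
import math
import bpy

def add_validation_camera(distance=4.5, azimuth_deg=30.0, elevation_deg=20.0):
    """Add a camera with a 30-degree FOV aimed at the world origin.

    `distance`, `azimuth_deg`, and `elevation_deg` are placeholders;
    the object is assumed to be normalized and centered at the origin.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Spherical-to-Cartesian camera position around the origin.
    location = (
        distance * math.cos(el) * math.cos(az),
        distance * math.cos(el) * math.sin(az),
        distance * math.sin(el),
    )
    bpy.ops.object.camera_add(location=location)
    cam = bpy.context.object
    cam.data.lens_unit = 'FOV'
    cam.data.angle = math.radians(30.0)  # Zero123++ v1.2 uses fov = 30

    # Aim the camera at the origin via a track-to constraint on an empty.
    target = bpy.data.objects.new('cam_target', None)
    bpy.context.collection.objects.link(target)
    track = cam.constraints.new(type='TRACK_TO')
    track.target = target
    track.track_axis = 'TRACK_NEGATIVE_Z'
    track.up_axis = 'UP_Y'
    return cam
```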
Hi, have you found a proper scale to reproduce the results shown in InstantMesh? Thanks!
@mengxuyiGit Not sure if these params were used by the InstantMesh authors for training. cc: @bluestyle97
Thanks for the amazing work. I noticed that after fine-tuning to a white background, the object in the output image appears at a larger scale:
[Image: output-whitebg — output after white-background fine-tuning]
[Image: output-ori — output from the original model]
Do you use different intrinsics when fine-tuning? Would a larger-scale output be better for the reconstruction stage?
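For intuition on the scale question: with a pinhole camera, the fraction of the frame a normalized object occupies depends only on its bounding radius, the camera distance, and the FOV. A rough sketch (the distance 4.5 and the normalization conventions are assumptions for illustration):

```python
import math

def frame_fill(radius: float, distance: float, fov_deg: float) -> float:
    """Approximate fraction of the image half-extent occupied by a sphere
    of `radius` at `distance`, under a field of view of `fov_deg`."""
    return math.asin(radius / distance) / (math.radians(fov_deg) / 2.0)

# A cube-normalized object ([-1, 1]^3) can have a bounding radius of
# sqrt(3), versus 1 for a unit-sphere-normalized one.
print(frame_fill(1.0, 4.5, 30.0))           # ≈ 0.86 (sphere normalization)
print(frame_fill(math.sqrt(3), 4.5, 30.0))  # ≈ 1.51 (cube normalization)
```

A value above 1 means the silhouette would extend past the frame, which matches the earlier remark that cube-like objects end up occupying a very large region of the generated image.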