Upscaled RGB and CCM, Tile-Based generation #10

Open
mr-lab opened this issue Mar 25, 2024 · 8 comments

@mr-lab

mr-lab commented Mar 25, 2024

Hi,
I wanted to ask: if I take the 256x6 MV images and generate a higher-resolution MV sheet from them, can CRM generate a better 3D model with more detail?

My test ideas are:
- Regular upscale (this probably won't work well for the CCM, so for those I would just change the resolution without upscaling; I still can't figure out whether the CCMs are used for texturing, for generating the 3D mesh, or both)
or
- Run a tile-based algorithm:
First do a regular CRM image generation (RGB and CCM at 256x6), then upscale them as follows:
The algorithm splits the input image into multiple tiles, generates RGB and CCM for each tile, then blends them all together into one high-resolution set of MV RGB and CCM images.
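The split-and-blend part of the tile idea above can be sketched as follows. This is only a minimal illustration, not the tile code mentioned in this thread: the function names, the tile size, the overlap, and the linear feathering weights are all assumptions, and the per-tile generation step is left out.

```python
import numpy as np

def split_into_tiles(image, tile, overlap):
    """Return (y, x, patch) triples covering `image` with overlapping square tiles."""
    h, w = image.shape[:2]
    stride = tile - overlap
    ys = list(range(0, max(h - tile, 0) + 1, stride))
    xs = list(range(0, max(w - tile, 0) + 1, stride))
    # Make sure the last row/column of tiles reaches the image border.
    if ys[-1] + tile < h:
        ys.append(h - tile)
    if xs[-1] + tile < w:
        xs.append(w - tile)
    return [(y, x, image[y:y + tile, x:x + tile]) for y in ys for x in xs]

def feather_weights(tile):
    """2D weight mask that falls off linearly toward the tile edges."""
    ramp = np.minimum(np.arange(1, tile + 1), np.arange(tile, 0, -1)).astype(np.float64)
    return np.outer(ramp, ramp)

def blend_tiles(tiles, out_shape, tile):
    """Weighted average of overlapping (H, W, C) tiles into one image."""
    acc = np.zeros(out_shape, dtype=np.float64)
    weight_sum = np.zeros(out_shape[:2], dtype=np.float64)
    weights = feather_weights(tile)
    for y, x, patch in tiles:
        acc[y:y + tile, x:x + tile] += patch * weights[..., None]
        weight_sum[y:y + tile, x:x + tile] += weights
    return acc / weight_sum[..., None]
```

In a real run each patch would be replaced by its processed version before blending. If the processing also enlarges the patch, the tile offsets and the output shape would have to be scaled by the same factor, which this sketch does not handle.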

The tile code is ready and only needs some modifications; it showed great results with depth map blending.
I modified the code and the model config files and changed the size of the input tensors (image arrays), but the RGB and CCM generated by the regular workflow at high resolutions are just garbage, so I can't really tell.
What I need to know:

1- Will the decoder work with resolutions higher than 256x6, for example 512x3,072? Or is the model only trained on 256x6 and won't work otherwise?
2- I read the paper multiple times but can't understand CCM. Can we skip generating those and just use RGB? Are CCMs essential for mesh generation, or used just for texturing?
3- Let's say we have extremely detailed depth maps, like ultra-sharp 4K maps where even skin pores are visible. Can we in any way introduce those depth maps into the CRM workflow? (This one is very important.)

Do let me know, and many thanks in advance. Much love and respect for your work, cheers.

@thuwzy
Collaborator

thuwzy commented Mar 25, 2024

Thank you for your interest in our CRM paper!

  1. The decoder cannot directly work at a resolution of 512x3,072. However, I will upload a model trained at 512x3,072 if you can get the image upsampling to work.
  2. The CCM is for better geometry and cannot be skipped. I conducted an ablation study in Figure 10 of my paper: CRM without CCM input has worse geometry.
  3. Actually, a depth map can be equivalently transformed to a CCM in my framework, so I think it is highly likely to work.
    By the way, I think the resolution of the CCM is not very important. A good pipeline may be to generate the 256x1536 image and CCM, then use a neural network to upsample the image and simply resize the CCM to 512x3,072.
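To make point 3 concrete, here is a rough sketch of how a depth map could be turned into a coordinate map. It is not taken from the CRM code. It assumes an orthographic camera, a depth value that is the signed z coordinate in camera space, a hypothetical 4x4 `cam_to_canonical` matrix, and a color encoding that maps coordinates in `[-half_extent, half_extent]` to `[0, 1]`. Background masking is not handled.

```python
import numpy as np

def depth_to_ccm(depth, cam_to_canonical, half_extent=1.0):
    """Turn an orthographic depth map (H, W) into a coordinate map (H, W, 3) in [0, 1]."""
    h, w = depth.shape
    # Pixel centers mapped to [-1, 1] on the image plane.
    xs = (np.arange(w) + 0.5) / w * 2.0 - 1.0
    ys = 1.0 - (np.arange(h) + 0.5) / h * 2.0  # image rows grow downward
    x_cam, y_cam = np.meshgrid(xs * half_extent, ys * half_extent)
    # Orthographic rays are parallel, so depth is used directly as the z coordinate.
    points_cam = np.stack([x_cam, y_cam, depth], axis=-1)
    # Rotate and translate the camera-space points into the canonical frame.
    points = points_cam @ cam_to_canonical[:3, :3].T + cam_to_canonical[:3, 3]
    # Encode coordinates as colors (assumed convention, see lead-in).
    return np.clip(points / (2.0 * half_extent) + 0.5, 0.0, 1.0)
```

With an identity `cam_to_canonical` and a depth map of zeros, every pixel gets a third channel of 0.5, and the first two channels vary linearly across the image.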

@mr-lab
Author

mr-lab commented Mar 25, 2024

Thank you very much, I will be waiting for that model.
I will explore point 3 further.

@mr-lab
Author

mr-lab commented Mar 29, 2024

Original model render:
image
A couple of re-renders:
image
image
Your work is a blessing to us. Those are re-renders of the RGB to retexture the mesh. More consistency is needed.
I will move to depth maps after that; good depth comes from good RGB.
Cheers.

@mosvlad

mosvlad commented Apr 1, 2024

> original model render: image couple of Re-renders image image your work is a blessing to us, those are Re-renders of the RGB to retexture the mesh . more consistency is needed . will move to depth map after that , good depth comes from good RGB. cheers.

Awesome!!! Can you share your result along with the code?

@zz7379

zz7379 commented Apr 3, 2024

> original model render: image couple of Re-renders image image your work is a blessing to us, those are Re-renders of the RGB to retexture the mesh . more consistency is needed . will move to depth map after that , good depth comes from good RGB. cheers.

Is this an upscaled RGB or a rendered mesh?

@mosvlad

mosvlad commented Apr 3, 2024

I'm trying to upscale the RGB from stage 1.

This:

CRM/run.py

Line 152 in 3e677cb

stage1_images = rt_dict["stage1_images"]

and this:
stage1_images = self.stage1_sample(pixel_img, prompt, scale=scale, step=step)

For upscaling I used BSRGAN:
https://github.com/cszn/BSRGAN

  1. Generate images (256x1536) with stage 1
  2. Upscale them with BSRGAN (x2 or x4)
  3. Resize the images back to 256x1536
  4. Use the upscaled and resized images for generate3d

image

These steps do not give a quality improvement like the one in @mr-lab's comments.

Another approach I tried was to upscale every image generated in step 1 individually:

  1. Generate an image (256x256)
  2. Upscale it with BSRGAN (x2 or x4)
  3. Resize it back to the original size (256x256)
  4. Use the upscaled and resized images for stage 2
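The per-view steps above could look roughly like this. It is a sketch only: `upscale_nearest` is a simple stand-in for BSRGAN (whose API is not shown here), the box-filter resize is one possible choice for step 3, and it is assumed that the stage 1 sheet is six 256x256 views laid out side by side along the width.

```python
import numpy as np

def split_views(sheet, n_views=6):
    """Split an (H, n_views * H, C) sheet into a list of square views."""
    return np.split(sheet, n_views, axis=1)

def upscale_nearest(img, factor):
    """Placeholder upscaler: repeats each pixel `factor` times along both axes."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def downscale_box(img, factor):
    """Average each factor x factor block back into a single pixel."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upscale_and_restore(sheet, upscaler, factor, n_views=6):
    """Steps 2 and 3 applied per view, then the views are rejoined into one sheet."""
    views = split_views(sheet, n_views)
    restored = [downscale_box(upscaler(v, factor), factor) for v in views]
    return np.concatenate(restored, axis=1)
```

With the placeholder upscaler the round trip returns the input sheet unchanged, since averaging a block of repeated pixels gives back the original pixel. With a learned upscaler, whatever detail it adds still has to survive the resize back to 256x256 in step 3 before it reaches the next stage.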

image
The results are not very good either.

Maybe @mr-lab can share some more information about his research...

@mr-lab
Author

mr-lab commented Apr 3, 2024

@mosvlad
We need a decoder that can process higher resolutions (512x3,072).
@thuwzy is probably working on that.
For now we are working on an alternative: transfer the CRM results to a 3D blob representing the shape of the subject, then remodel that blob into a model by moving vertex positions until they match the target. There is still a long way to go before we see any good results.
CRM is the only true 3D generator; time and time again it has proven to provide consistent multi-view shots, which no other project can do. We will continue to prepare for a larger decoder.

@snowflakewang

@thuwzy Hello, I am interested in upscaling the resolution of RGBs to get high-resolution textured meshes. You mentioned that you are working on 512-level decoders. I am curious about the maximum resolution that GPUs (maybe A100/A800) can handle. Is 1024 an acceptable resolution?
