Is it possible to enhance the straight-line conditioning? #50

Closed
xarthurx opened this issue May 2, 2023 · 12 comments

@xarthurx

xarthurx commented May 2, 2023

Original Title: Is it possible to enhance the straight-line conditioning?

Hello, thank you for the great work. CN + SD really changed the design field a lot.

I have a background in both architecture and computer science, and I'm currently investigating how far we can go in this direction for the conceptual design phase.

There's one issue that we've been trying to fix for a while without success:

SD w/o CN
image

SD with CN
image

If you look at the image above, the mullions and window frames are not straight; the lines are wobbly.
We used a screenshot of a 3D model for the conditioning, but regardless of the preprocessor used, the generated images always show this issue to some degree.

We think the cause might be one of the following:

  1. The preprocessed image has a resolution of only 512, which makes the processed lines wobbly already (some lines are very faint after processing).
  2. This is a shortcoming of SD itself.
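
A rough way to sanity-check hypothesis 1 is to estimate how wide a line remains once the conditioning image is resized to 512. The sketch below uses hypothetical numbers (a 2048 px wide screenshot with mullions drawn 3 px wide); neither value comes from this thread, and the function name is made up for illustration.

```python
def line_width_after_resize(src_width_px: int, line_px: float,
                            target_width_px: int = 512) -> float:
    """Width of a line, in pixels, after the image is resized to target_width_px."""
    return line_px * target_width_px / src_width_px


# Hypothetical values: 2048 px wide screenshot, mullions drawn 3 px wide.
width = line_width_after_resize(2048, 3)
print(f"{width:.2f} px")  # 0.75 px
```

A line narrower than one pixel can only be rendered as a faint, partially covered row of pixels, which would be consistent with the "very faint" lines described in point 1.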

We also tried using a screenshot of the volume without the mullions, but the results are similar:

SD with CN
image

Question:

At this point, we'd like to ask the developers for advice on how this issue can be improved:

  1. Should we train a diffusion model (DreamBooth or LoRA approach) that is more architecture-specific? (We've tried a few from Civitai, but the improvements are limited.)
  2. Should we train our own CN (for instance, on pairs of "non-perfect canny-style" images and perfect architectural renderings, so that the CN learns that those facades need straight mullions)?
  3. Or is there something else we should do at this point?
@lllyasviel
Owner

lllyasviel commented May 10, 2023

Just Use Automatic 1111

The results below all use default parameters and the same simple prompts shown in my screenshot. A1111 is just magic.
image
image
image

@lllyasviel lllyasviel added the documentation label May 10, 2023
@lllyasviel lllyasviel changed the title Is it possible to enhance the straight-line conditioning? [Everyone Should Read] Why my ControlNet results are not amazing as those YouTube or Twitters? How can I improve the performance? May 10, 2023
@lllyasviel lllyasviel pinned this issue May 10, 2023
@lllyasviel lllyasviel changed the title [Everyone Should Read] Why my ControlNet results are not amazing as those YouTube or Twitters? How can I improve the performance? [Everyone Should Read] Why my ControlNet results are not as amazing as those YouTube or Twitters? How can I improve the performance? May 10, 2023
@lllyasviel lllyasviel changed the title [Everyone Should Read] Why my ControlNet results are not as amazing as those YouTube or Twitters? How can I improve the performance? [Everyone Should Read] How to get results as amazing as the others on Twitter and YouTube? How to improve model performance? May 10, 2023
@lllyasviel
Owner

lllyasviel commented May 10, 2023

Edit: Frequently asked questions are edited and pinned to help more people.
Edit2: Closed since solution found. Edited title restored.

@xarthurx
Author

@lllyasviel
First, thank you very much for taking the time to look into this.

For the image you generated, I'd like to provide an architectural perspective:

As professionals, we judge the quality of the architecture itself (geometry, spatial quality, etc.), not the "general feeling" or the "style" of the image.

So if you look at the facade in the image, you'll see that the mullions and windows have strange shapes. We've run into this effect a lot and cannot overcome it completely by training DreamBooth or LoRA. That's why we're here: we'd like your advice on whether ControlNet can help.

image

@lllyasviel
Owner

lllyasviel commented May 10, 2023

You can solve these to some extent with the ControlNet 1.1 tile model (v11f1e), but this is yet another A1111-only feature and requires learning some A1111 know-how.
image
(And you can try m**j*****y and compare which solution is better.)
(And if you want to burn your GPU, you can run this image through tile again. Tile can be repeated almost indefinitely for images with buildings like this, but it will really burn the GPU.)

@xarthurx
Author

> You can solve these to some extent with the ControlNet 1.1 tile model (v11f1e), but this is yet another A1111-only feature and requires learning some A1111 know-how. (And you can try m**j*****y and compare which solution is better.) (And if you want to burn your GPU, you can run this image through tile again. Tile can be repeated almost indefinitely for images with buildings like this, but it will really burn the GPU.)

Really helpful input!

  1. We turned from MJ to SD + ControlNet because we need to control the geometry more strictly in the later parts of the design process, so MJ is not an option for non-conceptual design.
  2. These results help to some extent (yes, we are indeed using A1111) but do not fully resolve the problem, and it may burn the GPU very hard. It seems you don't think my naive proposal of training a CNet is a good idea. Theoretically, do you think there is a way to resolve the issue, even if it is not a quick or end-user solution?

@lllyasviel
Owner

If we just consider these examples, the best solution seems to be to use scripts to progressively upscale the image with tile until each window in those buildings has a 512x512 resolution. I estimated that the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; we can always use "beautiful city with buildings, 4k, 8k, balabalabala".
Generating a perfect image may take many hours on a 4090.
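
An estimate of this kind can be roughly reproduced with a few lines of arithmetic. The sketch below assumes a 1024x768 source image in which a window is about 10 px wide, and assumes each upscale pass doubles the size; these numbers are guesses chosen because they land close to the 52,428*39,322 figure, not values stated in the thread, and the helper names are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def required_size(width: int, height: int, feature_px: int,
                  target_px: int = 512) -> tuple[int, int]:
    """Image size at which a feature of feature_px pixels spans target_px pixels."""
    return (ceil_div(width * target_px, feature_px),
            ceil_div(height * target_px, feature_px))


def doubling_passes(feature_px: int, target_px: int = 512) -> int:
    """Number of 2x upscale passes until the feature is at least target_px wide."""
    passes = 0
    size = feature_px
    while size < target_px:
        size *= 2
        passes += 1
    return passes


def tile_count(width: int, height: int, tile: int = 512) -> int:
    """Number of tile x tile patches needed to cover the image."""
    return ceil_div(width, tile) * ceil_div(height, tile)


# Assumed inputs (not from the thread): 1024x768 image, windows about 10 px wide.
w, h = required_size(1024, 768, feature_px=10)
print(w, h)                 # 52429 39322
print(doubling_passes(10))  # 6
print(tile_count(w, h))     # 7931
```

Under these assumptions the final image would need close to 8,000 tiles of 512x512, which gives a feel for why "many hours on a 4090" is plausible. Six 2x passes overshoot slightly (64x rather than 51.2x).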

@lllyasviel
Owner

Unfortunately, it seems that at that resolution the webui's Gradio HTML crashes before ControlNet fails. The good news is that ControlNet still works at that scale. The bad news is that your browser does not support it. Perhaps try Firefox.

@xarthurx
Author

> If we just consider these examples, the best solution seems to be to use scripts to progressively upscale the image with tile until each window in those buildings has a 512x512 resolution. I estimated that the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; we can always use "beautiful city with buildings, 4k, 8k, balabalabala". Generating a perfect image may take many hours on a 4090.

> Unfortunately, it seems that at that resolution the webui's Gradio HTML crashes before ControlNet fails. The good news is that ControlNet still works at that scale. The bad news is that your browser does not support it. Perhaps try Firefox.

This is definitely a “theoretical” solution (though different from what I expected), but I think I now understand how "tile" works, unexpectedly. 🤣

I guess for practical use (we need ~2k resolution in < 5 min), this is still an "unresolved" problem...
As I originally, and incorrectly, assumed this could be fixed by a special type of CNet, it seems I need to wait for a more "vector-based" plugin to control such things...

But anyway, thank you for your time and input. Really appreciate it.

@xarthurx
Author

> If we just consider these examples, the best solution seems to be to use scripts to progressively upscale the image with tile until each window in those buildings has a 512x512 resolution. I estimated that the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; we can always use "beautiful city with buildings, 4k, 8k, balabalabala". Generating a perfect image may take many hours on a 4090.

It just occurred to me after posting the above: could we use a region-based script to "upscale and then downscale" just the area of the facade?

  • select the area that needs to be fixed
  • upscale with "tile" until the result is satisfactory
  • downscale to the initial size

This would save GPU time and could probably save the browser, too?

@lllyasviel
Owner

lllyasviel commented May 10, 2023

An LDM learns specific patterns at specific conv layer levels. If you want the learned pattern to draw something like a window on a wall, you need to give that thing a 512x512 space to occupy, so that the patterns learned in the corresponding conv layer can be triggered. So you cannot downscale it, unfortunately.
But perhaps you can try slicing only the tiles that lie along MLSD lines to save computation.
But since we have already begun to burn the GPU, perhaps just burn it without unnecessary mercy.
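
The idea of processing only the tiles that lie along detected lines could look roughly like the sketch below. It assumes the line segments are already available as (x1, y1, x2, y2) pixel coordinates (for example, from a line detector such as MLSD; the exact output format is an assumption here). It uses a standard Liang-Barsky clipping test to decide whether a segment crosses a tile. The function names are hypothetical, and nothing here calls ControlNet or A1111.

```python
Segment = tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels
Tile = tuple[int, int]                       # (column, row) index of a tile


def segment_hits_box(seg: Segment, left: float, top: float,
                     right: float, bottom: float) -> bool:
    """Liang-Barsky test: does the segment touch the axis-aligned box?"""
    x1, y1, x2, y2 = seg
    dx, dy = x2 - x1, y2 - y1
    t0, t1 = 0.0, 1.0
    for p, q in ((-dx, x1 - left), (dx, right - x1),
                 (-dy, y1 - top), (dy, bottom - y1)):
        if p == 0:
            if q < 0:
                return False  # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0:
            t0 = max(t0, t)   # segment enters the box here
        else:
            t1 = min(t1, t)   # segment leaves the box here
        if t0 > t1:
            return False
    return True


def tiles_along_lines(segments: list[Segment], width: int, height: int,
                      tile: int = 512) -> set[Tile]:
    """Return the (column, row) tiles that at least one segment passes through."""
    cols = -(-width // tile)   # ceiling division
    rows = -(-height // tile)
    hit: set[Tile] = set()
    for seg in segments:
        x1, y1, x2, y2 = seg
        # Only test tiles inside the segment's bounding box.
        c0 = max(0, int(min(x1, x2) // tile))
        c1 = min(cols - 1, int(max(x1, x2) // tile))
        r0 = max(0, int(min(y1, y2) // tile))
        r1 = min(rows - 1, int(max(y1, y2) // tile))
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                if segment_hits_box(seg, c * tile, r * tile,
                                    (c + 1) * tile, (r + 1) * tile):
                    hit.add((c, r))
    return hit


# Made-up example: one vertical and one horizontal line in a 2048x1536 image.
lines = [(100, 0, 100, 1535), (0, 600, 2047, 600)]
selected = tiles_along_lines(lines, 2048, 1536)
print(len(selected), "of", 4 * 3, "tiles")  # 6 of 12 tiles
```

In this toy case only half of the tiles would need to be processed; how much is saved on a real facade depends on how densely the detected lines cover the image.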

@lllyasviel lllyasviel unpinned this issue May 11, 2023
@lllyasviel lllyasviel removed the documentation label May 11, 2023
@lllyasviel lllyasviel changed the title [Everyone Should Read] How to get results as amazing as the others on Twitter and YouTube? How to improve model performance? Is it possible to enhance the straight-line conditioning? May 12, 2023
@xiaohaipeng

xiaohaipeng commented May 15, 2023

@lllyasviel

Oh god, this picture is perfect and has great detail. With the ControlNet tile model, how exactly do you set the parameters?

@daizhuo

daizhuo commented Jun 16, 2023

> You can solve these to some extent with the ControlNet 1.1 tile model (v11f1e), but this is yet another A1111-only feature and requires learning some A1111 know-how. (And you can try m**j*****y and compare which solution is better.) (And if you want to burn your GPU, you can run this image through tile again. Tile can be repeated almost indefinitely for images with buildings like this, but it will really burn the GPU.)

How did you make this?
Could you describe the process in detail?
This process is very important for architectural design.
I tried with no luck!
Thank you so much!
