Is it possible to enhance the straight-line conditioning? #50

xarthurx · 2023-05-02T07:57:21Z

Original Title: Is it possible to enhance the straight-line conditioning?

Hello, thank you for the great work. CN + SD really changed the design field a lot.

I have a background in both architecture and computer science, and am currently investigating how far we can go in this direction for the conceptual design phase.

There's one issue that we've tried to improve for a while but haven't been able to solve:

SD w/o CN

SD with CN

If you look at the image above, the mullions and window frames are not straight, the lines are wobbly.
We used a screenshot of a 3D model for the conditioning, but regardless of the preprocessor used, the generated images always show this issue to some degree.

We think the cause might be one of the following:

The preprocessed image has only 512 resolution, which makes the processed lines already wobbly (some lines are very light after processing).
This is a shortcoming of SD itself.

We also tried to use volume screenshot without the mullions, but the results are similar:

SD with CN

Question:

At this point, we'd like to seek advice from the developers on how this issue can be improved:

Should we train a more architecture-specific diffusion model (DreamBooth or LoRA approach)? We've tried a few from Civitai, but the improvements are limited.
Should we train our own CN (for instance, on a series of "non-perfect canny-style" images paired with perfect architecture renderings, so that the CN learns that facades need straight mullions)?
Or what else should we do at this point?

lllyasviel · 2023-05-10T00:31:46Z

Just Use Automatic 1111

The results below all use default parameters and the same simple prompt shown in my screenshot. A1111 is just magic.

lllyasviel · 2023-05-10T03:26:28Z

Edit: Frequently asked questions are edited and pinned to help more people.
Edit2: Closed since solution found. Edited title restored.

xarthurx · 2023-05-10T08:29:57Z

@lllyasviel
First, thank you very much for your time on this topic.

For the image you generated, I'd like to provide an architectural perspective:

As we're professionals, we evaluate the quality of the specific architecture seriously (geometry, space quality, etc.), and not based on the "general feeling" or the "style" of the image.

So if you look at the facade in the image, you'll see that the mullions and windows have strange shapes. We've run into this effect a lot and cannot overcome it completely by training DreamBooth or LoRA. That's why we're here, and we'd like to seek your advice on whether ControlNet can help.

lllyasviel · 2023-05-10T10:48:12Z

u can somewhat solve these, to some extent, using cnet 1.1 tile (v11f1e), but this is again another a1111-only feature and requires learning some a1111 knowledge

(and you can try m**j*****y and compare which solution is better)
(and if you want to burn ur gpu, u can try running this image in tile again. tile is almost infinite for images with buildings like this. but this will really burn the gpu)

xarthurx · 2023-05-10T12:01:06Z

> u can somewhat solve these, to some extent, using cnet 1.1 tile (v11f1e), but this is again another a1111-only feature and requires learning some a1111 knowledge (and you can try mj***y and compare which solution is better) (and if you want to burn ur gpu, u can try running this image in tile again. tile is almost infinite for images with buildings like this. but this will really burn the gpu)

Really helpful input!

We turned to SD+ControlNet from MJ because we need to control the geometry more strictly in the later part of the design process, so MJ is not an option for non-conceptual design.
The results help to some extent (YES, we're indeed using a1111), but they do not fully resolve the problem (and it may be burning the GPU very hard). It seems my naive proposal of training a cnet did not look like a good idea to you. Theoretically, do you think there's a possibility of resolving the issue, even if it isn't a quick or end-user solution?

lllyasviel · 2023-05-10T20:42:10Z

it seems if we just consider these examples, the best solution is to use scripts to progressively upscale it with tile, until each window in those buildings has a 512x512 resolution. I estimated it, and the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; can always use "beautiful city with buildings, 4k, 8k, balabalabala".
Generating a perfect image may take many hours on a 4090
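The estimate above can be reproduced with simple arithmetic. The sketch below is only an illustration: the 512x384 source size and the 5 px window width are assumptions that are not stated in the thread, chosen because they land close to the quoted figure. `required_resolution` is a hypothetical helper, not part of A1111 or ControlNet.

```python
import math

def required_resolution(width, height, feature_px, target_px=512):
    """Return (scale, new_width, new_height, doublings) such that a feature
    currently `feature_px` pixels wide ends up `target_px` pixels wide."""
    scale = target_px / feature_px
    new_w = round(width * scale)
    new_h = round(height * scale)
    # Number of successive 2x upscale passes needed to reach at least `scale`.
    doublings = max(0, math.ceil(math.log2(scale)))
    return scale, new_w, new_h, doublings

# Assumed inputs (not from the thread): a 512x384 image whose windows
# are roughly 5 px wide.
scale, new_w, new_h, doublings = required_resolution(512, 384, feature_px=5)
print(f"scale x{scale:.1f} -> {new_w} x {new_h}, about {doublings} passes of 2x")
```

Under these assumptions the required scale factor is a little over 100x, which is why the target resolution ends up in the tens of thousands of pixels per side.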

lllyasviel · 2023-05-10T20:47:30Z

unfortunately, it seems at that resolution, webui's gradio HTML crashes before controlnet fails. Good news is that controlnet is still working at that scale. bad news is that your browser does not support it. perhaps try firefox

xarthurx · 2023-05-10T20:50:56Z

> it seems if we just consider these examples, the best solution is to use scripts to progressively upscale it with tile, until each window in those buildings has a 512x512 resolution. I estimated it, and the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; can always use "beautiful city with buildings, 4k, 8k, balabalabala". Generating a perfect image may take many hours on a 4090
>
> unfortunately, it seems at that resolution, webui's gradio HTML crashes before controlnet fails. Good news is that controlnet is still working at that scale. bad news is that your browser does not support it. perhaps try firefox

This is definitely a “theoretical” solution (though different from what I expected), but I kind of understand how the "tile" works unexpectedly. 🤣

I guess then for practical use (we need ~2k resolution in < 5 min), this is still an "unresolved" problem...
As I originally and incorrectly assumed this could be fixed by a special type of cnet, it seems I need to wait for a more "vector-based" style plugin to control such things...

But anyway, thank you for your time and input. Really appreciate it.

xarthurx · 2023-05-10T20:53:33Z

> it seems if we just consider these examples, the best solution is to use scripts to progressively upscale it with tile, until each window in those buildings has a 512x512 resolution. I estimated it, and the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; can always use "beautiful city with buildings, 4k, 8k, balabalabala". Generating a perfect image may take many hours on a 4090

It just came to my mind after posting the above that maybe we could use a region-based script to "upscale and then downscale" only the area of the facade?

select the area that needs to be fixed
upscale with "tile" until the result is satisfactory
downscale to the initial size

This saves GPU time and can probably save the browser, too?
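To get a feel for how much compute the region-based idea could save, here is a minimal sketch that compares the number of pixels processed when upscaling the whole image against upscaling only a selected box. The image size, box, and scale factor are hypothetical values, and `region_savings` is an illustrative helper rather than an existing script. It only estimates the cost of the upscale step and says nothing about whether the quality survives the final downscale.

```python
def upscaled_pixels(width, height, scale):
    """Pixel count of a width x height area after upscaling by `scale`."""
    return int(width * scale) * int(height * scale)

def region_savings(image_size, box, scale):
    """Compare full-image upscaling with upscaling only `box`.

    image_size: (width, height) of the source image.
    box: (left, top, right, bottom) of the area to fix, in source pixels.
    Returns (full_pixels, region_pixels, region_pixels / full_pixels).
    """
    img_w, img_h = image_size
    left, top, right, bottom = box
    full = upscaled_pixels(img_w, img_h, scale)
    region = upscaled_pixels(right - left, bottom - top, scale)
    return full, region, region / full

# Hypothetical example: the facade occupies the central quarter of a 512x384 image.
full, region, ratio = region_savings((512, 384), (128, 96, 384, 288), scale=8)
print(f"full: {full} px, region only: {region} px, ratio: {ratio:.2f}")
```

In this made-up case the region covers a quarter of the image area, so the upscale step touches a quarter of the pixels at any scale factor.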

lllyasviel · 2023-05-10T21:00:15Z

LDM learns specific patterns at specific conv layer levels - if you want the learned pattern to draw something like a window on a wall, you need to give that thing a 512x512 space to occupy, so that the patterns learned in the corresponding conv layer can be triggered. so you cannot downscale it, unfortunately
But perhaps you can try slicing only the tiles along the mlsd lines to save computation power.
But we have already begun to burn the gpu, so perhaps just burn it without unnecessary mercy
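One way to read the suggestion of slicing tiles only along the MLSD lines is to keep just the tiles that a detected line segment passes through. The sketch below is an interpretation, not the author's method: the segments are hard-coded stand-ins for what a line detector might return, and `tiles_on_lines` is a hypothetical helper. It samples points along each segment, so a tile that a line only clips at a corner can be missed.

```python
def tiles_on_lines(lines, tile=512, step=None):
    """Return the set of (col, row) tile indices touched by any segment.

    lines: iterable of (x0, y0, x1, y1) in pixel coordinates (non-negative).
    tile:  tile edge length in pixels.
    step:  sampling distance along the longer axis (defaults to tile / 4).
    """
    step = step or tile / 4
    selected = set()
    for x0, y0, x1, y1 in lines:
        length = max(abs(x1 - x0), abs(y1 - y0))
        n = max(1, int(length / step) + 1)  # number of sampling intervals
        for i in range(n + 1):
            t = i / n
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            selected.add((int(x // tile), int(y // tile)))
    return selected

# Hypothetical 4096x4096 canvas (8x8 tiles) with one horizontal and one
# vertical line, standing in for detected mullions.
segments = [(0, 1000, 4095, 1000), (2000, 0, 2000, 4095)]
selected = tiles_on_lines(segments, tile=512)
print(f"{len(selected)} of {8 * 8} tiles would be processed")
```

Only the tiles in the returned set would then be sent through the tile upscaling pass, which trades some coverage for a large reduction in work when the lines are sparse.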

xiaohaipeng · 2023-05-15T12:06:01Z

@lllyasviel

Oh god, this picture is perfect and has great detail. With the ControlNet tile model, how did you set the parameters in detail?

daizhuo · 2023-06-16T03:33:08Z

> u can somewhat solve these, to some extent, using cnet 1.1 tile (v11f1e), but this is again another a1111-only feature and requires learning some a1111 knowledge (and you can try mj***y and compare which solution is better) (and if you want to burn ur gpu, u can try running this image in tile again. tile is almost infinite for images with buildings like this. but this will really burn the gpu)

How did you make this?
Could you describe the process in detail?
This process is very important for architectural design.
I tried with no luck!
Thank you so much!

lllyasviel added the documentation label May 10, 2023

lllyasviel changed the title ~~Is it possible to enhance the straight-line conditioning?~~ [Everyone Should Read] Why my ControlNet results are not amazing as those YouTube or Twitters? How can I improve the performance? May 10, 2023

lllyasviel pinned this issue May 10, 2023

lllyasviel unpinned this issue May 11, 2023

lllyasviel closed this as completed May 11, 2023

lllyasviel removed the documentation label May 11, 2023

lllyasviel changed the title ~~[Everyone Should Read] How to get results as amazing as the others on Twitter and YouTube? How to improve model performance?~~ Is it possible to enhance the straight-line conditioning? May 12, 2023
