
Training a ControlNet to generate furnished room -> empty room (and vice versa). Improvement plateau... #659

Open
whydna opened this issue Mar 15, 2024 · 4 comments

Comments


whydna commented Mar 15, 2024

I'm working on a project to take images of furnished rooms and remove all the furniture. I've got a large dataset of image pairs. I'm not applying any preprocessing to the images, so the model can preserve details of the original image (wall color, floor material, etc.).

After training on a 4090 for about 5 days, I'm no longer seeing any improvement (see examples below).
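For context, this is roughly the kind of launch command I mean, assuming the diffusers `train_controlnet.py` example script is used; the model name, dataset path, column names, and hyperparameters below are placeholders, not my exact settings:

```shell
# Hypothetical launch command for the diffusers ControlNet example script;
# paths, column names, and hyperparameters are placeholders to adapt.
accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --output_dir="controlnet-room-emptier" \
  --train_data_dir="data/room_pairs" \
  --image_column="empty_room" \
  --conditioning_image_column="furnished_room" \
  --caption_column="caption" \
  --resolution=512 \
  --learning_rate=1e-5 \
  --train_batch_size=4 \
  --gradient_accumulation_steps=4 \
  --mixed_precision="fp16" \
  --max_train_steps=50000
```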

I'm looking for tips on where to go from here.

  • Does it just need to be trained longer?
  • Do I need to adjust the learning rate?
  • Should I spend more time cleaning the dataset? (A small % of the pairs are probably bad; as you can see in one of the examples below, the target image is dark.)
  • Should I preprocess the images to simplify the task (e.g., MLSD)? It would lose the details of the original, but might at least produce better final output.
  • Perhaps ControlNet isn't the right architecture for this, and I should use pix2pix instead?

Thanks for the help!
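On the dataset-cleaning point, here's a minimal sketch of how I could flag pairs with dark targets before training; the brightness threshold is a guess I'd tune against my own data:

```python
# Sketch: drop pairs whose target image is abnormally dark.
# Assumes images load as RGB uint8 arrays; threshold=40 is a placeholder.
import numpy as np

def is_too_dark(image: np.ndarray, threshold: float = 40.0) -> bool:
    """Return True if the image's mean brightness is below threshold."""
    return float(image.mean()) < threshold

def filter_pairs(pairs):
    """Keep only (source, target) pairs whose target isn't too dark."""
    return [(src, tgt) for src, tgt in pairs if not is_too_dark(tgt)]
```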

Example 1

Source:
[Screenshot 2024-03-15 at 10 18 12 AM]

Target:
[Screenshot 2024-03-15 at 10 21 32 AM]

Model Result:
[Screenshot 2024-03-15 at 10 17 59 AM]

Example 2

Source:
[Screenshot 2024-03-15 at 10 20 30 AM]

Target:
[Screenshot 2024-03-15 at 10 21 54 AM]

Model Result:
[Screenshot 2024-03-15 at 10 20 41 AM]

First Training Run

[Screenshot 2024-03-15 at 10 29 44 AM]

Second Training Run

[Screenshot 2024-03-15 at 10 30 41 AM]
@dereksun105

How large was your dataset?

@innat-asj

ControlNet is not the right architecture for this; instead, play around with inpainting methods.


whydna commented Jul 14, 2024

@innat-asj Can you elaborate a bit? Ty!


innat-asj commented Jul 14, 2024

This is only my understanding of the architecture. A ControlNet isn't required for removal operations, because there is nothing to control. Instead, there are a few options we could try for removing the objects.

I tried MI-GAN out of the box with the provided checkpoint, and it's promising; if it could be trained for this specific task, the results would be even better. I also tried LaMa and MAT, but I found MI-GAN better in terms of simplicity and performance.

Lastly, reversing this process (empty room to furnished room) won't work with inpainting alone; it requires additional conditioning, and in that case a ControlNet would be needed.
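Since these inpainting models need a mask of the region to remove, and the dataset already has furnished/empty pairs, one rough option is to derive masks from the per-pixel difference. This is just a sketch; the threshold is arbitrary, and in practice you'd likely want to dilate or smooth the mask:

```python
# Sketch: derive a furniture mask from a furnished/empty image pair.
# Assumes both images are aligned RGB uint8 arrays of the same shape;
# threshold=30 is a placeholder to tune.
import numpy as np

def furniture_mask(furnished: np.ndarray, empty: np.ndarray,
                   threshold: int = 30) -> np.ndarray:
    """Return a uint8 mask: 255 where the pair differs beyond threshold."""
    diff = np.abs(furnished.astype(np.int16) - empty.astype(np.int16)).max(axis=-1)
    return np.where(diff > threshold, 255, 0).astype(np.uint8)
```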
