Auto mask uniform background based on PR #589 mask loss #1114

Closed
wants to merge 16 commits into from

gesen2egee
Contributor

@gesen2egee gesen2egee commented Feb 11, 2024

Based on PR #589. Thanks, recris.

(1) Change the mechanism for caching_latent_to_disk by storing the mask directly in the NPZ file. Previously, because of trim_and_resize, the mask could not be matched to the required size, which made cache_latent_to_disk unusable.

(2) Modify color augmentation so that the original alpha channel is left unchanged, which should allow more accurate handling of the mask.

(3) Add --mask_simple_background. This enables automatic masking of the latent loss based on the dominant edge color, applied when that color occupies more than 30% of the image edges. It helps focus the model on the main content by ignoring simple or uniform background colors such as solid white or black.
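
For illustration only, here is a minimal sketch of the idea behind --mask_simple_background, not the code from this PR. It assumes the image is a row-major list of rows of RGB tuples, that the background is detected by exact color match, and that every pixel of that color is masked wherever it appears. The names `edge_pixels` and `simple_background_mask` are hypothetical; only the 30% threshold comes from the description above.

```python
from collections import Counter

# Share of border pixels the dominant color must exceed (from the PR text).
EDGE_THRESHOLD = 0.30


def edge_pixels(image):
    """Return the pixels on the outer border of a row-major image."""
    height, width = len(image), len(image[0])
    if height == 1:
        return list(image[0])
    # Top and bottom rows, corners included.
    border = list(image[0]) + list(image[-1])
    # Left and right columns, skipping the corners already counted.
    for row in image[1:-1]:
        border.append(row[0])
        if width > 1:
            border.append(row[-1])
    return border


def simple_background_mask(image, threshold=EDGE_THRESHOLD):
    """Build a 0/1 loss mask: 0 for background-colored pixels, 1 elsewhere.

    Returns an all-ones mask (nothing masked) when no single color
    covers more than `threshold` of the border.
    """
    border = edge_pixels(image)
    color, count = Counter(border).most_common(1)[0]
    if count / len(border) <= threshold:
        return [[1] * len(row) for row in image]
    # Assumption: exact color match, applied to the whole image.
    return [[0 if pixel == color else 1 for pixel in row] for row in image]


if __name__ == "__main__":
    WHITE, RED = (255, 255, 255), (255, 0, 0)
    sample = [
        [WHITE, WHITE, WHITE, WHITE],
        [WHITE, RED, RED, WHITE],
        [WHITE, RED, RED, WHITE],
        [WHITE, WHITE, WHITE, WHITE],
    ]
    for mask_row in simple_background_mask(sample):
        print(mask_row)
```

A real implementation would probably work on image arrays, allow some color tolerance, and resize the mask to the latent resolution before multiplying it into the loss. None of that is shown here.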

I think this will help improve the quality of datasets with a high proportion of white backgrounds, transparent backgrounds, and simple backgrounds in anime images.

This is a fairly simple implementation. It might be possible to add automatic masking for faces (for clothing training), characters, and so on. However, automatically generated masks are harder to inspect and would introduce a significant number of additional requirements, so it may be better to use the script I wrote or the webui's rembg to pre-generate masks for manual inspection.

@kohya-ss
Owner

Thank you for this PR.

As I wrote before in #589, I recently implemented ControlNetDataset, which carries conditioning data. I believe ControlNetDataset can hold not only canny or pose control images but also mask images.

One option would be to add ControlNetDataset support to each training script. However, that will take some time to implement.

So, please let me carefully consider how to handle this PR. I hope you will understand.

I also think --mask_simple_background is a simple but interesting idea. It seems promising.

@gesen2egee gesen2egee closed this Feb 16, 2024