[Feature Request]: Please add my extension to the Extensions Index :) #5037
Comments
SUPER useful!
It would be great to decouple the background/foreground separator and extractor so that it can be run as a standalone thing, if that's not already possible.
Wow. Very cool. All this time I have been keying out blue/green backgrounds. I do wonder though, is this the same as the new depth2img in 2.0?
No, but it could be implemented. The depth-to-image feature seems to be using a depth map as a mask for img2img, so replacing the depth with transparency probably does the trick. What I do here is cut the background out of the image by using the depth mask. It could be seen as some sort of outpainting, I guess.
Wow! What a feature! By cropping and pasting the depth map, it could move a single subject slightly to the right, am I correct? It has the potential to fully edit, resize, invert, shift, etc.? Future tech: I could imagine that, when paired with a language model, you could say:
Thanks! Yes indeed! So far my script is dividing the different foregrounds along the width of the background, using the center as a reference. Then yeah, for sure, I could do more simple operations on these foreground subjects. I was also thinking about creating a mask generator using the depth maps for img2img masks, and we would get the same depth-aware img2img feature as 2.0. It should actually be pretty easy as long as the mask feature takes the transparency into account (I haven't used it much though).
I would rather use sliders for that and a few Pillow functions :D
Well, take your time, the features you have now are already awesome!
The last image seems more like "actual paradise obtainable".
Hey, can someone tell me where the mask image hides in p during img2img? Somehow p.mask gives me NoneType if I do a print(type(p.mask)), even while having a mask set.
So basically, if you run this using the "custom code" custom script, you can already get depth-aware img2img inpainting:
You need my extension to be installed already, because it refers to the simple_depthmap script, just using the same function as my extension.
Note: the depth precision depends on which MiDaS model is used. Right now I'm using the small and fast one, which is mostly good at guessing the front/background. The same level of precision as SD 2.0 can be reached with a few more tweaks.
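Since the snippet itself didn't survive the copy, here is a hedged sketch of the depth-to-mask step only. The `depth_to_mask` helper and its threshold are my own illustration, not the extension's actual code; the `p.image_mask` attribute mentioned in the comment is where recent webui versions store the img2img mask, which would also explain `p.mask` printing None in the question above (attribute names vary between versions):

```python
from PIL import Image

def depth_to_mask(depth_map, threshold=128):
    """Binarize a grayscale depth map. In the MiDaS convention brighter
    pixels are closer to the camera, so anything at or above `threshold`
    becomes white (keep) and the rest black (cut)."""
    gray = depth_map.convert("L")
    return gray.point(lambda v: 255 if v >= threshold else 0)

# In an AUTOMATIC1111 custom script you would then hand the mask to the
# processing object; the attribute is typically `p.image_mask`, not
# `p.mask` (hedged: this depends on the webui version):
# p.image_mask = depth_to_mask(depth, threshold=100)
```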
I've played with 4 different MiDaS models and so far the best was almost always DPT-Hybrid. Try it if you haven't already. It works very well with the depthmap script over here: https://github.com/thygate/stable-diffusion-webui-depthmap-script
I took the code for the depth analysis from there after asking thygate :) I only implemented the small model because it suited the needs of my idea, but adding the others is indeed the next part of the plan. I don't remember how that model compares to the big one.
Even though DPT-Hybrid gave me the most usable results 90% of the time, there was still 1 case out of 10 that was better served by one of the 3 alternatives, so it's clearly better to give the user the ability to select any of those 4 models. Another project you should look at if you are interested in this depthmap generation procedure is this one, which takes it a step further: it extracts a 3D model from the depthmap (with color info recorded as vertex colors) and then creates simple videos to demonstrate the 3D effect: https://github.com/donlinglok/3d-photo-inpainting/commits/master I helped donlinglok debug the branch he adapted to make this work locally on Windows, and it now works perfectly. I used to have to use a Colab to get this kind of result, and I could barely believe my eyes when I finally got it to work on my machine (8 GB VRAM).
I'm interested in testing a Colab with that! I initially got inspired by this repository, which I had had in my bookmarks for something like 2 years and only tested a few months ago when Stable Diffusion came out! My first idea was to try to integrate that into SD or find some similar way to use MiDaS! I will definitely take a look into it! So far I've made pixels move based on their depth, which is way below the level of anything that I've seen yet.
Indeed. I can't add such a feature without giving that option. Edit: oh lol, that's the same repository!
While the experts are in here, is it possible to get an alpha channel in the PNG with any of this?
What do you want to do exactly? In most cases, it's better to have the alpha channel in a separate image.
Can't you do that with the extension proposed in this thread? (I haven't tested it yet, so I may be wrong, but it already does so much more than that; it's nothing less than an image compositing extension!)

One way I've been doing it manually is by using a depthmap and then turning that into a mask by adjusting the levels, making the area I want to keep white and the rest black. But you need to do that in another application, something like Photoshop, GIMP, or Krita.

Since a depthmap starts with white for objects very close to the camera and then gradually darkens to black as objects are positioned further and further away, you can key out the background by making everything darker than a given shade black, and making all pixels brighter than this threshold full white. This is your depth-based alpha channel. You can then import it as a mask for inpainting, or use it to composite images together in some other app.
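That manual levels adjustment could be scripted with Pillow. This is just an illustrative sketch; `levels_to_mask` and its default black/white points are made up for the example, not taken from any of the tools discussed:

```python
from PIL import Image

def levels_to_mask(depth_map, black_point=80, white_point=160):
    """Reproduce a 'levels' adjustment on a depth map: everything darker
    than black_point goes to black, everything brighter than white_point
    to white, with a linear ramp in between (soft edge for the mask)."""
    gray = depth_map.convert("L")
    span = max(white_point - black_point, 1)

    def remap(v):
        if v <= black_point:
            return 0
        if v >= white_point:
            return 255
        return int((v - black_point) * 255 / span)

    return gray.point(remap)
```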
Thanks. I was using it for video. Maybe it can be done in those ways, but just having an image sequence with alpha seems a lot easier.
I'm having a few issues with the hybrid model so far. I made this to be able to test it fully from the custom code feature, so I don't need to reload. So with this:
I get this:
Which seems to happen around here:
Other than that, with the other models I get results, but the depthmaps are not super "deep". Also, it's not about the transparency for the mask but simply the grayscale.
I see, and I understand what you mean. It should be possible to automate the process I was describing of turning a depthmap into a mask, and then assemble that channel with the RGB channels to create a new PNG.

The big limit caused by an alpha channel stored alongside the RGB channels in a single PNG file is that the color information can be truncated in the parts of your image that are fully transparent. Sometimes that's what you want: it's a compression scheme to remove pixels that would not be seen, and it does make the file size smaller. But if, for example, you have an airplane in the foreground and clouds in the background, you won't be able to keep the cloud information of your image if your alpha channel is made to keep only the airplane foreground visible. It will be kept for semi-transparent pixels, but for completely transparent pixels the image is just replaced by (usually) pure white. At least in apps like Photoshop; there might be ways around this limit that I'm not aware of.

If you have your alpha channel in a separate image, then you can keep both the plane and its cloudy background together in the same image, and you also gain the ability to adjust the mask without having to re-encode your RGB image. This is a limit of those editing workflows rather than of PNG itself, and other formats that support alpha channels, such as TIFF, also let you keep the full RGB information in the parts of your image that would be indexed as transparent.

So yes, for image sequences, it's more convenient to have everything assembled together (RGB+A) in a single PNG, but if you do that, make sure you keep your original sequence somewhere in case you need to rework your alpha channel; otherwise you might be stuck with pixels that have been erased from your PNG sequence because you thought they were going to be transparent.
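For reference, assembling a separate matte into an RGBA file can be automated with Pillow, which writes straight (non-premultiplied) alpha, so the RGB data under fully transparent pixels does survive in the saved PNG, unlike the editor behavior described above. `attach_alpha` is a hypothetical helper for illustration, not part of any of the extensions discussed:

```python
from PIL import Image

def attach_alpha(rgb, alpha):
    """Assemble an RGB image and a separate grayscale matte into one RGBA
    image. Pillow keeps straight (non-premultiplied) alpha, so RGB values
    under fully transparent pixels are preserved when saved as PNG."""
    rgba = rgb.convert("RGB")
    rgba.putalpha(alpha.convert("L").resize(rgba.size))
    return rgba
```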
What could help, though, is an option to add frame numbers to filenames when rendering sequences, and to let the user define the starting frame number, which would allow us to re-render only part of an animation, or to append to it later, without having to re-edit sequences further down the production pipeline.
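A minimal sketch of what such frame numbering could look like; `frame_filename` is hypothetical, not an existing webui option:

```python
def frame_filename(basename, frame, start=1, digits=5):
    """Build a zero-padded, user-offset frame filename, so a partial
    re-render (e.g. frames 120-180 only) can drop into an existing
    sequence without renaming anything downstream."""
    return f"{basename}_{start + frame:0{digits}d}.png"
```

For example, re-rendering a shot starting at frame 120 would produce `shot_00120.png`, `shot_00121.png`, and so on, matching the numbering of the original sequence.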
Can this extension do something similar to the Select Subject feature in Photoshop?
I am sorry, I do not know what you mean, as I do not use Photoshop.
@AugmentedRealityCat, would you know why I'm getting that error with the hybrid model?
I do not. Just to make sure: do you have this problem even at 512x512 resolution?
Yes I do. But in the end I don't think this is a big issue. I noticed that the big model gives quite high-contrast depth maps (so low depth differences), while the small one has more "depth" to its depthmaps. I also added a function to stretch out the depthmaps if really needed. Currently creating a new repository. And I removed the hybrid.
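The stretching function mentioned could look something like this: an illustrative sketch with NumPy and Pillow, not the extension's actual code:

```python
import numpy as np
from PIL import Image

def stretch_depth(depth_map):
    """Linearly remap a low-contrast depth map to the full 0-255 range,
    so small depth differences become usable for masking."""
    a = np.asarray(depth_map.convert("L"), dtype=np.float32)
    lo, hi = float(a.min()), float(a.max())
    if hi <= lo:  # flat image: nothing to stretch
        return depth_map.convert("L")
    out = (a - lo) / (hi - lo) * 255.0
    return Image.fromarray(out.astype(np.uint8), mode="L")
```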
Ok, it's done.
I see that it was added a while ago, so I'm closing this request :)
My other extension does exactly that.
@Extraltodeus
@Extraltodeus Here is the Colab of the latest 3d-photo-inpainting with LeReS. Thanks to @AugmentedRealityCat for the help! And here is how I replaced MiDaS with LeReS.
Is there an existing issue for this?
What would your feature do?
I made this extension and wish to have it added to the index so people can install it automatically :)
It is a depth-aware extension that can help create multiple complex subjects in a single image.
It generates a background, then multiple foreground subjects, cuts their backgrounds out after a depth analysis, pastes them onto the background, and finally does an img2img pass for a clean finish.
You can use it to make more complex images or simply full-blown waifu harems. Your choice.
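The compositing step of the pipeline described above could be sketched roughly like this; `composite_subjects` is a hypothetical illustration, and the extension's real code also handles the depth cutting and the final img2img pass:

```python
from PIL import Image

def composite_subjects(background, subjects, positions):
    """Paste depth-cut foreground subjects (RGBA, background already
    removed) onto a generated background at the given (x, y) anchors.
    A final img2img pass would blend the seams afterwards."""
    canvas = background.convert("RGBA")
    for subject, pos in zip(subjects, positions):
        canvas.alpha_composite(subject.convert("RGBA"), dest=pos)
    return canvas
```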
Examples:
Distracted boyfriend meme :
KITTENS deprived of oxygen :
Multiple characters with various colors or features :
It does not add a new tab. So I guess just the "script" tag is necessary.