Get depth programmatically #65
Conversation
Get the depth of the model by counting layers to allow for models with more depth. This allows for deeper models to be created. Example: https://huggingface.co/ptx0/pixart-reality-mix which is a 900M model.
…yer with a 'final' layer giving me an out by one error
Looks good. We'll want to add an auto checkpoint select node as well that auto-detects/generates the correct config; that way we can support any size of model. In theory, a node like that isn't hard, but one issue I ran into was the pe_interpolation value. I think it should be possible to get rid of that value completely by dynamically generating it from the image size, similar to how it's done for HunYuanDiT. I gave it a quick test and it seems to work. Should I try to make those changes I just mentioned on the base repo, so we can use this PR for the auto config node as well, or should I merge this as-is?
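A rough sketch of what that auto-detection could look like (a sketch only: the key names follow the original-repo checkpoint layout, and the config fields are assumptions, not the actual node's logic):

```python
import re

def guess_unet_config(state_dict):
    # Derive basic PixArt config values from the checkpoint itself.
    # Assumes original-repo key naming (blocks.{i}.*, x_embedder.*);
    # diffusers checkpoints name things differently and would need
    # their own patterns.
    block_ids = {
        int(m.group(1))
        for key in state_dict
        if (m := re.match(r"blocks\.(\d+)\.", key)) is not None
    }
    depth = max(block_ids) + 1 if block_ids else 0
    # The hidden size can be read off a known tensor's shape.
    hidden_size = state_dict["x_embedder.proj.weight"].shape[0]
    return {"depth": depth, "hidden_size": hidden_size}
```

Counting distinct block indices (rather than every matching key) keeps the depth estimate robust to similarly named entries elsewhere in the model.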
Actually, with that last commit it does seem to fail with the diffusers weights for me since …
Revert "…sed a layer with a 'final' layer giving me an out by one error". This reverts commit e43ecbe.
Yes, I accidentally made something break, so I reverted it. I was trying to fix the "missing UNET" message, but that doesn't matter as long as the correct layers exist.
I think merge it as-is, then we can work on a detection node. I've been trying to figure out pe_interpolation, as that should allow inference at any size. I had it working on square images! I could gen 2048x2048 from the 1024 model, but as soon as you selected an aspect ratio it went off the wall.
I've closed it as there are issues I just ran into. I'll reopen it once I've made sure they're fixed.
Fair lol, take your time. I'll check on the PE factor stuff and see how hard it is to guess. I assume just taking the average of width and height and then taking the ratio against the base (512?) didn't help?
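For concreteness, a minimal sketch of the guess being described (assuming a base resolution of 512; reportedly it didn't work):

```python
def guess_pe_interpolation(width, height, base=512):
    # Naive guess: average the two dimensions, then take the ratio
    # against the base resolution the model was trained at.
    return ((width + height) / 2) / base
```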
Nope, didn't help at all. But give it a go and you'll see.
Doing that seems like it shouldn't work, unless you were doing it the other way around. DiT is notoriously bad at resolutions it wasn't trained on. Also, I'm able to guess the factor with the formula. I'm thinking maybe a soft-rounding for values that are close to whole integers could work, then leave it up to luck for values outside that range lol. Not like the model works outside those anyway.
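A minimal sketch of that soft-rounding idea (the tolerance value is an arbitrary assumption):

```python
def soft_round(value, tolerance=0.05):
    # Snap values that are nearly whole to the nearest integer;
    # leave everything else alone and hope for the best.
    nearest = round(value)
    return float(nearest) if abs(value - nearest) <= tolerance else value
```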
Pushed an auto checkpoint loader, but it needs better logic to get the right config for diffusers weights, which are missing a bunch of keys that the default config has. I can go into more detail if this is something you'd like to look into. de52d3a
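One plausible way to handle the missing keys (a sketch only, not the actual logic in de52d3a; the function and argument names are made up):

```python
def complete_config(detected_config, default_config):
    # Start from the known-good default config and overlay whatever
    # the diffusers checkpoint actually provided, so any keys it is
    # missing keep their default values.
    merged = dict(default_config)
    merged.update(detected_config)
    return merged
```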
PixArt/loader.py (outdated)
@@ -76,6 +76,8 @@ def load_pixart(model_path, model_conf):
         device=model_management.get_torch_device()
     )
+
+    model_conf.unet_config['depth'] = sum(key.endswith('.scale_shift_table') for key in state_dict.keys())
Overriding the blocks here is a bad idea, and using .scale_shift_table will always be off by one, because the final layer also has a scale shift table entry. I think just leave the loader as-is in this PR.
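If depth detection were kept, a minimal fix for the off-by-one (assuming the final layer's entry is named final_layer.scale_shift_table, as in the original-repo layout) might be:

```python
depth = sum(
    key.endswith(".scale_shift_table")
    and not key.startswith("final_layer")
    for key in state_dict
)
```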
The problem is that if you don't override it, it messes up generation for the larger models, as it expects 28 layers, not 42.
Fixed by using the simple loader and an additional config in the standard loader.
I've made more changes. I'm not sure the autodetect node works with 900M models; I'll do some more investigation. I kept getting problems with gens when using the autoconfig plus the new layer code that allows for more depth; the only combo I found that works is forcing the depth in loader.py. I will keep digging, since there is renewed interest in PixArt after SD3's launch.
In further testing the simple loader is working fine; it was something to do with my setup. I've removed the model_conf override and added a config to the standard loader now, so this should be good to merge?