
Get depth programmatically #65

Merged — 9 commits into city96:main, Jun 23, 2024

Conversation

GavChap (Contributor) commented Jun 18, 2024

Get the depth of the model by counting its layers, so that models with more depth can be loaded. Example: https://huggingface.co/ptx0/pixart-reality-mix, which is a 900M model.
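A minimal sketch of the counting idea (not this PR's exact code): infer the depth from the largest block index present in the state dict. The `blocks.{i}.*` key layout is an assumption about the PixArt naming, not something taken from this repo.

```python
import re

def guess_depth(state_dict):
    """Infer the number of transformer blocks from a PixArt-style state dict."""
    indices = {
        int(m.group(1))
        for key in state_dict
        if (m := re.match(r"blocks\.(\d+)\.", key)) is not None
    }
    return max(indices) + 1 if indices else 0

# Toy check with 42 blocks, the depth of the 900M model mentioned above.
fake_sd = {f"blocks.{i}.scale_shift_table": None for i in range(42)}
print(guess_depth(fake_sd))  # -> 42
```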

city96 (Owner) commented Jun 18, 2024

Looks good. We'll want to add an auto checkpoint select node as well that auto detects/generates the correct config.

That way we can support any size model. In theory, a node like that isn't hard, but one issue I ran into was the pe_interpolation factor, which is not stored in the diffusers state dict.

I think it should be possible to completely get rid of that value by dynamically generating it from the image size similar to how it's done for HunYuanDiT.

I gave it a quick test and it seems to work. Should I try to make those changes I just mentioned on the base repo so we can use this PR for the auto config node as well or should I merge this as-is?

city96 (Owner) commented Jun 18, 2024

Actually, with that last commit it does seem to fail with the diffusers weights for me, since cross_attn.proj.weight is the comfy name and not the diffusers name.

…sed a layer with a 'final' layer giving me an out by one error"

This reverts commit e43ecbe.
GavChap (Contributor, PR author) commented Jun 18, 2024

> Actually, with that last commit it does seem to fail with the diffusers weights for me, since cross_attn.proj.weight is the comfy name and not the diffusers name.

Yes, I accidentally broke something, so I reverted it. I was trying to fix the "missing UNET" message, but that doesn't matter as long as the correct layers exist.

GavChap (Contributor, PR author) commented Jun 18, 2024

> Looks good. We'll want to add an auto checkpoint select node as well that auto detects/generates the correct config.
>
> That way we can support any size model. In theory, a node like that isn't hard, but one issue I ran into was the pe_interpolation factor, which is not stored in the diffusers state dict.
>
> I think it should be possible to completely get rid of that value by dynamically generating it from the image size similar to how it's done for HunYuanDiT.
>
> I gave it a quick test and it seems to work. Should I try to make those changes I just mentioned on the base repo so we can use this PR for the auto config node as well or should I merge this as-is?

I think merge it as-is; then we can work on a detection node. I've been trying to figure out pe_interpolation, since that should allow inference at any size. I had it working on square images: I could generate 2048x2048 from the 1024 model, but as soon as you selected a non-square aspect ratio it went off the wall.

GavChap closed this on Jun 18, 2024
GavChap (Contributor, PR author) commented Jun 18, 2024

I've closed it because of issues I just ran into. I'll reopen it once I've fixed them.

city96 (Owner) commented Jun 18, 2024

Fair lol, take your time. I'll check on the PE factor stuff and see how hard it is to guess. I assume just averaging width and height and then taking the ratio against the base (512?) didn't help?

GavChap (Contributor, PR author) commented Jun 18, 2024

> Fair lol, take your time. I'll check on the PE factor stuff and see how hard it is to guess. I assume just averaging width and height and then taking the ratio against the base (512?) didn't help?

Nope, didn't help at all. But give it a go and you'll see.

GavChap reopened this on Jun 18, 2024
city96 (Owner) commented Jun 18, 2024

> I could generate 2048x2048 from the 1024 model, but as soon as you selected a non-square aspect ratio it went off the wall.

Doing that seems like it shouldn't work, unless you were doing it the other way around. DiT is notoriously bad at resolutions it wasn't trained on.

Also, I'm able to guess the factor with the formula `(x.shape[-1] + x.shape[-2]) / 2.0 / (512 / 8.0)` (PE scale computed: 2.0625 [vs: 2]).

I'm thinking maybe a soft-rounding for values that are close to whole integers could work, then leave it up to luck for values outside that lol. Not like the model works outside those anyway.
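A quick, purely illustrative sketch of that formula plus the soft-rounding idea; the `guess_pe_interpolation` name and the 0.1 tolerance are assumptions, not values from the repo.

```python
import torch

def guess_pe_interpolation(latent, base_size=512, vae_scale=8, tol=0.1):
    """Estimate the PE interpolation factor from the latent's spatial size."""
    # Average the latent height/width and divide by the base latent size (512/8 = 64).
    raw = (latent.shape[-1] + latent.shape[-2]) / 2.0 / (base_size / vae_scale)
    nearest = round(raw)
    # Soft rounding: snap to the nearest integer when the estimate is close enough.
    return float(nearest) if abs(raw - nearest) <= tol else raw

latent = torch.zeros(1, 4, 132, 132)  # reproduces the "2.0625 [vs: 2]" case above
print(guess_pe_interpolation(latent))  # -> 2.0
```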


city96 (Owner) commented Jun 19, 2024

Pushed an auto checkpoint loader, but it needs better logic to get the right config for diffusers, which is missing a bunch of keys that the default one has. I can go into more detail if this is something you'd like to look into. (de52d3a)
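Purely as an illustration of the "default config with detected overrides" idea: start from a known-good default and only replace the values that can be read out of the state dict. The key names and defaults below follow the public PixArt-XL/2 architecture and are assumptions, not this repo's exact config format.

```python
# Assumed defaults roughly matching the base PixArt-XL/2 model.
DEFAULT_PIXART_CONFIG = {
    "depth": 28,
    "hidden_size": 1152,
    "patch_size": 2,
    "num_heads": 16,
}

def build_config(detected):
    config = dict(DEFAULT_PIXART_CONFIG)  # keys the diffusers state dict doesn't carry
    config.update(detected)               # e.g. depth detected from the weights
    return config

print(build_config({"depth": 42}))  # config for a 42-layer checkpoint
```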


PixArt/loader.py (Outdated)
@@ -76,6 +76,8 @@ def load_pixart(model_path, model_conf):
device=model_management.get_torch_device()
)

model_conf.unet_config['depth'] = sum(key.endswith('.scale_shift_table') for key in state_dict.keys())
city96 (Owner):

Overriding the blocks here is a bad idea, and using .scale_shift_table will always be off by one, because the final layer also has a scale_shift_table entry. I think we should just leave the loader as-is in this PR.
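To illustrate the off-by-one (a toy sketch; key names are assumptions based on this thread, with the output head assumed to live under `final_layer`):

```python
def count_scale_shift_tables(state_dict):
    # Naive count: includes the final layer's table, so it overshoots the depth by one.
    return sum(key.endswith(".scale_shift_table") for key in state_dict)

def count_block_scale_shift_tables(state_dict):
    # Only count tables that belong to numbered transformer blocks.
    return sum(
        key.startswith("blocks.") and key.endswith(".scale_shift_table")
        for key in state_dict
    )

fake_sd = {f"blocks.{i}.scale_shift_table": None for i in range(28)}
fake_sd["final_layer.scale_shift_table"] = None  # assumed name for the output head

print(count_scale_shift_tables(fake_sd))        # 29 -- off by one
print(count_block_scale_shift_tables(fake_sd))  # 28 -- the actual depth
```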

GavChap (Contributor, PR author):

The problem is that if you don't override it, it messes up generation for the larger models, since it expects 28 layers rather than 42.

GavChap (Contributor, PR author):

Fixed by using the simple loader and adding an additional config to the standard loader.

Resolved review threads: PixArt/diffusers_convert.py (outdated), PixArt/models/PixArtMS.py
GavChap (Contributor, PR author) commented Jun 19, 2024

I've made more changes. I'm not sure the autodetect node works with 900M models, so I'll do some more investigation. I kept getting problems with generations when using the autoconfig plus the new layer code for extra depth; the only combination I found that works is forcing the depth in loader.py. I will keep digging, since there is renewed interest in PixArt after SD3's launch.

GavChap (Contributor, PR author) commented Jun 23, 2024

In further testing, the simple loader is working fine; it was something to do with my setup. I've removed the model_conf override and added a config to the standard loader, so this should be good to merge?

city96 merged commit faf6979 into city96:main on Jun 23, 2024
city96 (Owner) commented Jun 23, 2024

Well, I can confirm that it "works", though it looks like it definitely needs more training lol. Still, good job on this! Thanks!

