similar to LaVi-Bridge, any connection? #13

Open
dandincyf opened this issue Mar 14, 2024 · 14 comments

Comments

@dandincyf

I just found a recent paper, https://arxiv.org/abs/2403.07860, on a similar topic. Is there any difference or connection?

@budui
Collaborator

budui commented Mar 14, 2024

LaVi-Bridge and ELLA are independent works completed at the same time, and there was no communication between us.

ELLA focuses on exploring connectors between the LLM and the UNet, while LaVi-Bridge has tried many LLMs and visual models. The two papers can learn from each other. LaVi-Bridge showed that LLaMA+LoRA performs better than T5, and we are considering following their ideas. However, LLaMA+LoRA means that gradients must be backpropagated through the LLM, which consumes more GPU memory during training.

Both ELLA and LaVi-Bridge report T2I-Benchmark scores, allowing for some simple comparisons. We will open source ELLA (SD1.5) as soon as possible for the community to conduct some in-depth analysis and comparison.

@scarbain

@budui Aren't you going to open-source ELLA for SDXL too? I thought you also had a working version for SDXL.

@Bionagato

Bionagato commented Mar 14, 2024

Please release ELLA for SDXL. At this point, nobody uses SD1.5, so the community will likely not show interest if it's based on it. We already have SD 2.x, SDXL, Stable Cascade, and soon SD3.

@Manni1000

I think 1.5 is also cool, but SDXL too.

@melohux
Collaborator

melohux commented Mar 15, 2024

We greatly appreciate your interest in ELLA_sdxl. However, the process of open-sourcing ELLA_sdxl requires an extensive review by our senior leadership. This procedure can be considerably time-consuming. Conversely, ELLA_sdv1.5, which is more research-oriented, can be released promptly. We would appreciate your patience and understanding about this.

@victorca25

victorca25 commented Mar 17, 2024

At this point, nobody uses SD1.5

This is a false statement. Most people still use SD1.5 because it has the largest amount of LoRAs available and requires the lowest amount of resources for inference.

@Bionagato

@victorca25 1.5 is old; people still using SD1.5 in 2024 do so because they lack the resources to run XL. The only exception to this was anime, but now there are Animagine and Pony. If someone doesn't have the resources for SDXL, they probably won't for SD1.5 + LLM.

@victorca25

@victorca25 1.5 is old

New 1.5 fine tunes come out every day, so this is another false statement.

people still using SD1.5 in 2024 do so because they lack the resources to run XL.

Or they do not see the point of needing more VRAM and waiting longer for marginally better results.

If someone doesn't have the resources for SDXL, they probably won't for SD1.5 + LLM.

You're just assuming; you don't know. But ELLA + SD1.5 models have the potential to generate better results than SDXL, which will now be superseded by 3.0, so by your own logic, why waste time with SDXL? :D

@scarbain

I can confirm SD1.5 is still used in professional prod environments, of course. That doesn't reduce the need to open-source ELLA for SDXL; that would still be awesome :)

@moesie

moesie commented Mar 18, 2024

In the end it is still the privilege of the Tencent management to decide if it suits their company goals better to open source whatever code their employees are working on or to keep it proprietary.
To discuss this here is kind of pointless and OT.
This thread is about the similarities and differences between LaVi-Bridge and ELLA.

@zethfoxster

the largest amount of LoRAs available and requires the lowest amount of resources for inference.

I'm on my 4090 running SD1.5...

@rundiffusion

We are RunDiffusion. We build Juggernaut XL, the world's most downloaded SDXL model. We would be very interested in getting an aligned version of our model. Have you thought about how that would work (licensing, an alignment service, etc.)? Would you allow a model to be aligned and released publicly? If not, we would still be interested in a version for our business use and clients. We are in communication with Nvidia about their DRaFT+ approach and are experimenting in that area.

We would love to talk about your SDXL plans.

Please let us know!

@budui
Collaborator

budui commented May 5, 2024

Thank you for your interest in our work. Juggernaut XL is a fantastic model. Without any adjustments, Juggernaut XL+ELLA works very well. In fact, beyond Juggernaut XL, ELLA can be easily integrated with CLIP-fixed SD derivative models (that is, models whose text encoder was kept fixed during fine-tuning), including even video generation models like AnimateDiff. Unfortunately, we have no plan to open source ELLA-SDXL. However, we are very happy to keep in touch with you, and we can share the experience we have gained in training ELLA-SDXL.

@rundiffusion

@budui We're so glad you like it! Remember that Juggernaut XL does not have a license that permits it to be distributed. We would very much like to be involved in that process if you are already integrating ELLA with Juggernaut XL in your research. The team would LOVE to see some results from Juggernaut+ELLA! That would be so exciting!

We understand you're not going to open source the SDXL alignment. That probably includes releasing a model that is open sourced. We think that is fine. We too are finding ways to build IP and keep it to build a sustainable business. (like our SFW Juggernaut XL model that is proprietary)

I'm sure we can make something lucrative for your stakeholders with some sort of licensing partnership. Juggernaut has been downloaded over half a million times worldwide and would be an excellent brand to attach to ELLA to license out to inference providers or private companies.

We definitely should have a talk with our teams to find something that makes sense. Please email darin@rundiffusion.com so we can chat privately.
