added doc for Kandinsky3.0 #5937

charchit7 · 2023-11-27T00:58:56Z

What does this PR do?

Fixes #5936

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@patrickvonplaten @yiyixuxu

a-r-r-o-w · 2023-11-27T06:02:43Z

docs/source/en/api/pipelines/kandinsky3.md

+
+The description from it's Github page: 
+
+*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, Kandinsky 3.0 incorporates more data and specifically related to Russian culture, which allows to generate pictures related to Russin culture. Furthermore, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*


Suggested change

*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, Kandinsky 3.0 incorporates more data and specifically related to Russian culture, which allows to generate pictures related to Russin culture. Furthermore, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*

*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, Kandinsky 3.0 incorporates more data and specifically related to Russian culture, which allows to generate pictures related to Russian culture. Furthermore, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*

patrickvonplaten · 2023-11-27T08:19:48Z

cc @yiyixuxu

yiyixuxu

left one comment. Thanks for adding this!

yiyixuxu · 2023-11-27T21:25:46Z

docs/source/en/api/pipelines/kandinsky3.md

@@ -9,7 +9,25 @@ specific language governing permissions and limitations under the License.

 # Kandinsky 3

-TODO
+Kandinsky 3 is created by [Arkhipkin Vladimir](https://github.com/oriBetelgeuse), [Igor Pavlov](https://github.com/boomb0om), [Andrei Filatov](https://github.com/anvilarth), [Zein Shaheen](https://github.com/zeinsh).


Let's add the entire list of authors here https://github.com/ai-forever/Kandinsky-3#authors

charchit7 · 2023-11-28

Ahh, I missed it. @yiyixuxu, thank you for pointing it out :)

yiyixuxu · 2023-11-27T21:35:07Z

docs/source/en/api/pipelines/kandinsky3.md

+
+The description from it's Github page: 
+
+*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, Kandinsky 3.0 incorporates more data and specifically related to Russian culture, which allows to generate pictures related to Russin culture. Furthermore, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*


Let's instead talk a little bit more about the architecture here. I think mostly:

a very powerful text encoder: FLAN-UL2

special U-Net blocks that are twice as deep while keeping the same parameter count

MoVQ decoder (same as in Kandinsky 2)

yiyixuxu · 2023-11-27T21:39:06Z

docs/source/en/api/pipelines/kandinsky3.md

+
+The description from it's Github page: 
+
+*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, Kandinsky 3.0 incorporates more data and specifically related to Russian culture, which allows to generate pictures related to Russin culture. Furthermore, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*


Suggested change

*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, Kandinsky 3.0 incorporates more data and specifically related to Russian culture, which allows to generate pictures related to Russin culture. Furthermore, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*

*Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*

HuggingFaceDocBuilderDev · 2023-11-28T04:25:10Z

The documentation is not available anymore as the PR was closed or merged.

patrickvonplaten · 2023-11-29T14:31:56Z

Great job @charchit7

charchit7 · 2023-11-29T14:35:31Z

Thanks @patrickvonplaten! First PR merged to diffusers! More to go.

* added en doc for Kandinsky3.0
* required changes
* Update docs/source/en/api/pipelines/kandinsky3.md
* Update docs/source/en/api/pipelines/kandinsky3.md
* Update docs/source/en/api/pipelines/kandinsky3.md

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

added en doc for Kandinsky3.0

9408773

charchit7 mentioned this pull request Nov 27, 2023

[doc] add a doc page for kandinsky 3.0! #5936

Closed

a-r-r-o-w reviewed Nov 27, 2023

a-r-r-o-w approved these changes Nov 27, 2023

yiyixuxu reviewed Nov 27, 2023

required changes

dd85f86

charchit7 requested a review from yiyixuxu November 28, 2023 16:33

yiyixuxu mentioned this pull request Nov 29, 2023

[Kandinsky 3.0] Follow-up TODOs #5944

Merged

6 tasks

yiyixuxu reviewed Nov 29, 2023

yiyixuxu added 2 commits November 29, 2023 01:51

Update docs/source/en/api/pipelines/kandinsky3.md

581f45d

Update docs/source/en/api/pipelines/kandinsky3.md

9c20779

patrickvonplaten reviewed Nov 29, 2023

Update docs/source/en/api/pipelines/kandinsky3.md

a77e09c

patrickvonplaten merged commit 6031ecb into huggingface:main Nov 29, 2023

charchit7 deleted the kd3_doc branch November 30, 2023 20:40
