Doc/qwen image #133
Summary of Changes
Hello @Glaceon-Hyy, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces comprehensive documentation for integrating and utilizing the Qwen-Image model within the DiffSynth-Engine framework. It provides clear instructions and code examples for image generation, enhancing the tutorial for both English and Chinese users by detailing model loading, parameter usage, and best practices.
Highlights
- New Qwen-Image Documentation: A new section dedicated to the Qwen-Image model has been added, providing comprehensive instructions and code examples for its integration and use in image generation.
- Code Examples for Qwen-Image: The documentation now includes detailed Python code snippets demonstrating how to load the `QwenImagePipeline` and generate images, complete with example prompts and parameter configurations.
- Detailed Parameter Explanations: Key parameters for the `QwenImagePipeline`, such as `prompt`, `negative_prompt`, `cfg_scale`, `height`, `width`, `num_inference_steps`, and `seed`, are now thoroughly explained, offering fine-grained control over image generation.
- Usage Guidelines and Optimizations: Information regarding recommended image resolutions, `cfg_scale` suggestions, default behaviors for `negative_prompt`, and current multi-GPU parallelism support has been added.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Summary of Changes
I've updated the documentation to include instructions and examples for integrating and utilizing the Qwen-Image generation model within the DiffSynth-Engine framework. This enhancement provides users with clear guidance on how to leverage this specific model for image generation, complete with code snippets and parameter explanations, available in both English and Chinese tutorials.
Highlights
- New Qwen-Image Generation Documentation (English): I've added a comprehensive new section to the English tutorial (`docs/tutorial.md`) that details how to perform image generation using the `QwenImagePipeline`. This includes a runnable Python code example and a thorough breakdown of all relevant parameters.
- New Qwen-Image Generation Documentation (Chinese): I've also mirrored the new Qwen-Image generation documentation in the Chinese tutorial (`docs/tutorial_zh.md`), ensuring that users who prefer Chinese can access the same detailed instructions and code examples for using the `QwenImagePipeline`.
Code Review
This pull request adds documentation for the Qwen-Image model to both the English and Chinese tutorials. The changes are good, but I've found a few minor typos and formatting inconsistencies in the markdown files that should be addressed for better clarity and consistency.
docs/tutorial.md
Outdated
### Image Generation(Qwen-Image)

The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, cfg_scale, with a suggested cfg_scale of 4. If no negative_prompt is provided, it defaults to a single space character (not an empty string). For multi-GPU parallelism, currently only cfg parallelism is supported (parallelism=2), with other optimization efforts underway.
There appears to be a stray `cfg_scale,` in this sentence. Removing it would improve clarity.
Suggested change:
The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, with a suggested cfg_scale of 4. If no negative_prompt is provided, it defaults to a single space character (not an empty string). For multi-GPU parallelism, currently only cfg parallelism is supported (parallelism=2), with other optimization efforts underway.
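To make the guidance in that sentence concrete, here is a minimal sketch of how a caller might apply it before invoking the pipeline. It is not part of DiffSynth-Engine: `closest_recommended_resolution` and `build_generation_kwargs` are hypothetical helper names, and treating each listed pair as width×height is an assumption (the list is symmetric, so the set of valid pairs is the same either way). The single-space `negative_prompt` default and the `cfg_scale` of 4 come from the documented text.

```python
import math
from typing import Dict, Optional, Tuple

# Recommended resolutions quoted in the tutorial text.
# Assumption: each pair is read as (width, height).
RECOMMENDED_RESOLUTIONS = [
    (928, 1664),
    (1104, 1472),
    (1328, 1328),
    (1472, 1104),
    (1664, 928),
]


def closest_recommended_resolution(width: int, height: int) -> Tuple[int, int]:
    """Return the recommended resolution whose aspect ratio is nearest to width/height."""
    target = math.log(width / height)
    return min(
        RECOMMENDED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


def build_generation_kwargs(
    prompt: str,
    width: int,
    height: int,
    negative_prompt: Optional[str] = None,
    cfg_scale: float = 4.0,  # suggested value from the tutorial text
) -> Dict[str, object]:
    """Hypothetical helper: snap to a recommended size and fill the documented defaults."""
    if (width, height) not in RECOMMENDED_RESOLUTIONS:
        width, height = closest_recommended_resolution(width, height)
    return {
        "prompt": prompt,
        # Documented default: a single space, not an empty string.
        "negative_prompt": " " if negative_prompt is None else negative_prompt,
        "cfg_scale": cfg_scale,
        "width": width,
        "height": height,
    }
```

Comparing aspect ratios on a log scale treats portrait and landscape deviations symmetrically; whether snapping to a recommended size is preferable to rejecting the request is a design choice left to the caller.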
docs/tutorial.md
Outdated
* `negative_prompt`:The negative prompt, used to describe content you do not want in the image, it defaults to a single space character (not an empty string), e.g., "ugly".
* `cfg_scale`:The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
* `height`: Image height.
* `width`: Image width.
* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
* `seed`: The random seed. A fixed seed ensures reproducible results.
There are a couple of formatting issues in this list:
- Lines 130 and 131 use a full-width colon (`:`) instead of a standard colon (`:`).
- The list item indentation is inconsistent. It's best to use a single space after the asterisk for all items for consistency.
Suggested change:
* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, it defaults to a single space character (not an empty string), e.g., "ugly".
* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
* `height`: Image height.
* `width`: Image width.
* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
* `seed`: The random seed. A fixed seed ensures reproducible results.
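Both formatting issues raised in this comment can be fixed mechanically. The following is a rough sketch, not a tool used in this repository: `normalize_list_item` is a hypothetical helper that rewrites a parameter list item so it has a single space after the asterisk and a standard colon followed by one space after the backticked name.

```python
import re

# Matches "* `name`" followed by a full-width colon (U+FF1A), allowing any
# amount of whitespace after the asterisk and after the colon.
_FULL_WIDTH_ITEM = re.compile("^(\\s*)\\*\\s+(`[^`]+`)\uFF1A\\s*")


def normalize_list_item(line: str) -> str:
    """Rewrite '*  `name`:text' as '* `name`: text'; other lines pass through unchanged."""
    return _FULL_WIDTH_ITEM.sub(
        lambda m: f"{m.group(1)}* {m.group(2)}: ",
        line,
    )
```

Lines that already use a standard colon do not match the pattern, so running the helper over a whole file should be idempotent.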
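docs/tutorial_zh.md
Outdated
* `negative_prompt`:The negative prompt, used to describe content that should not appear in the image, e.g., "ugly"; it defaults to a single space rather than an empty string, " ".
* `cfg_scale`:The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger guidance scale usually gives stronger text-image correlation but reduces the diversity of the generated content; the recommended value is 4.
* `height`:Image height.
* `width`:Image width.
* `num_inference_steps`:The number of inference steps; more steps usually mean longer computation time and higher image quality.
* `seed`:The random seed; a fixed random seed makes the generated content reproducible.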
The list items here use a full-width colon (`:`) instead of a standard colon (`:`). For consistency and correct rendering, it's better to use standard colons.
Suggested change:
* `negative_prompt`: The negative prompt, used to describe content that should not appear in the image, e.g., "ugly"; it defaults to a single space rather than an empty string, " ".
* `cfg_scale`: The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger guidance scale usually gives stronger text-image correlation but reduces the diversity of the generated content; the recommended value is 4.
* `height`: Image height.
* `width`: Image width.
* `num_inference_steps`: The number of inference steps; more steps usually mean longer computation time and higher image quality.
* `seed`: The random seed; a fixed random seed makes the generated content reproducible.
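For readers unfamiliar with what `cfg_scale` controls, the standard classifier-free guidance combination can be sketched as below. This illustrates the general technique from the linked paper, not code taken from `QwenImagePipeline`; `apply_cfg` is a hypothetical name, and plain Python lists stand in for the model's tensor outputs. In many implementations the negative prompt conditions the "unconditional" branch, and cfg parallelism with parallelism=2 presumably evaluates the two branches on separate GPUs, though neither detail is confirmed by this pull request.

```python
from typing import List


def apply_cfg(cond: List[float], uncond: List[float], cfg_scale: float) -> List[float]:
    """Combine conditional and unconditional predictions element-wise.

    guided = uncond + cfg_scale * (cond - uncond)
    A cfg_scale of 1.0 returns the conditional prediction unchanged; larger
    values push the result further in the direction of the prompt.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
```

Because each denoising step needs both predictions, guidance roughly doubles the model evaluations per step, which is why splitting the two branches across devices is a natural first form of parallelism.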
Warning Gemini encountered an error creating the review. You can try again by commenting |
No description provided.