support edit_image_auto_resize #829
Summary of Changes
Hello @mi804, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces an automatic image resizing feature for the Qwen-Image editing pipeline. It adds a new parameter, edit_image_auto_resize, which, when enabled, automatically resizes the input image for editing to a standardized area while maintaining its original aspect ratio. This enhancement aims to improve consistency and potentially performance in image editing tasks by standardizing input dimensions, and its usage is reflected in updated examples and the training configuration.
Highlights
- New Parameter for Image Resizing: A new boolean parameter edit_image_auto_resize has been added to the QwenImagePipeline's __call__ method, allowing users to control whether the input image for editing is automatically resized.
- Automatic Image Resizing Logic: The QwenImageUnit_EditImageEmbedder now includes logic to automatically resize the input edit_image to a target area (1024x1024 pixels) while preserving its original aspect ratio, based on the edit_image_auto_resize flag.
- Updated Examples and Documentation: Example scripts (Qwen-Image-Edit.py and Qwen-Image-Edit-Lowres-Fix.py) have been updated to demonstrate the usage and effect of the new edit_image_auto_resize parameter, showing both enabled and disabled scenarios.
- Training Pipeline Integration: The training pipeline in train.py has been configured to utilize the edit_image_auto_resize feature by default during preprocessing.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces an edit_image_auto_resize feature to the QwenImagePipeline. This allows automatically resizing the edit image to a target area while preserving its aspect ratio. The changes include adding the new parameter to the pipeline, implementing the resizing logic in QwenImageUnit_EditImageEmbedder, and updating example files to demonstrate its usage.
My review focuses on improving code style and maintainability. I've suggested moving an import statement to the top of the file, aligning default parameter values for consistency, and improving variable naming for better readability. These changes will make the code cleaner and easier to understand.
    def calculate_dimensions(self, target_area, ratio):
        import math
        width = math.sqrt(target_area * ratio)
        height = width / ratio
        width = round(width / 32) * 32
        height = round(height / 32) * 32
        return width, height
The import math statement should be at the top of the file to adhere to PEP 8 style guidelines. This improves readability and avoids repeated imports. Please move it to the top of the file with other imports.
Suggested change:

    - def calculate_dimensions(self, target_area, ratio):
    -     import math
    -     width = math.sqrt(target_area * ratio)
    -     height = width / ratio
    -     width = round(width / 32) * 32
    -     height = round(height / 32) * 32
    -     return width, height
    + def calculate_dimensions(self, target_area, ratio):
    +     width = math.sqrt(target_area * ratio)
    +     height = width / ratio
    +     width = round(width / 32) * 32
    +     height = round(height / 32) * 32
    +     return width, height
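As a rough illustration of what this helper computes, here is a standalone sketch with the import at module level, as the review suggests. It is written as a plain function rather than a method, and the input values are illustrative assumptions, not taken from the PR. Here ratio is width divided by height, and because each side is snapped to a multiple of 32, the resulting area and aspect ratio are only approximately the requested ones.

```python
import math  # module-level import, per the PEP 8 suggestion above


def calculate_dimensions(target_area, ratio):
    """Return (width, height) with area near target_area and width/height near ratio.

    Each side is rounded to the nearest multiple of 32, so the result is
    approximate rather than exact.
    """
    width = math.sqrt(target_area * ratio)
    height = width / ratio
    width = round(width / 32) * 32
    height = round(height / 32) * 32
    return width, height


# Illustrative inputs (assumptions, not from the PR): a square and a 16:9 image,
# both targeting an area of 1024 * 1024 pixels.
square = calculate_dimensions(1024 * 1024, 1.0)
widescreen = calculate_dimensions(1024 * 1024, 16 / 9)
print(square, widescreen)
```

For the 16:9 case the unrounded width is about 1365.3, which snaps up to 1376, so the output ratio is slightly wider than 16:9.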
        return edit_image.resize((calculated_width, calculated_height))

    def process(self, pipe: QwenImagePipeline, edit_image, tiled, tile_size, tile_stride, edit_image_auto_resize=False):
The default value for edit_image_auto_resize is False here, but it's True in the public __call__ method (line 283). For consistency and to avoid confusion, it's better to make the default value here True as well.
Suggested change:

    - def process(self, pipe: QwenImagePipeline, edit_image, tiled, tile_size, tile_stride, edit_image_auto_resize=False):
    + def process(self, pipe: QwenImagePipeline, edit_image, tiled, tile_size, tile_stride, edit_image_auto_resize=True):
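To make the concern about mismatched defaults concrete, the following minimal sketch uses two hypothetical stand-in classes (Pipeline and EditImageEmbedder are not the real classes from the PR). It assumes the public entry point forwards the flag explicitly, which the excerpt above does not confirm. Under that assumption the inner default only matters when the unit is called directly, which is exactly where the two defaults would disagree.

```python
class EditImageEmbedder:
    """Hypothetical stand-in for an internal pipeline unit."""

    def process(self, edit_image, edit_image_auto_resize=False):
        # Only reports which branch would run; no real image work happens here.
        return "resized" if edit_image_auto_resize else "original size"


class Pipeline:
    """Hypothetical stand-in for the public pipeline entry point."""

    def __init__(self):
        self.unit = EditImageEmbedder()

    def __call__(self, edit_image, edit_image_auto_resize=True):
        # The flag is forwarded explicitly, so the public default applies here.
        return self.unit.process(edit_image, edit_image_auto_resize=edit_image_auto_resize)


pipe = Pipeline()
via_pipeline = pipe("image")              # public default (True) is used
direct_call = pipe.unit.process("image")  # unit's own default (False) is used
print(via_pipeline, direct_call)
```

Aligning both defaults removes this difference between the two call paths.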
        edit_image = pipe.preprocess_image(resized_edit_image).to(device=pipe.device, dtype=pipe.torch_dtype)
        edit_latents = pipe.vae.encode(edit_image, tiled=tiled, tile_size=tile_size, tile_stride=tile_stride)
The variable edit_image is being reused to store the preprocessed tensor. This can be confusing as it originally held a PIL Image. Using a new variable name for the tensor would improve code clarity.
Suggested change:

    - edit_image = pipe.preprocess_image(resized_edit_image).to(device=pipe.device, dtype=pipe.torch_dtype)
    - edit_latents = pipe.vae.encode(edit_image, tiled=tiled, tile_size=tile_size, tile_stride=tile_stride)
    + processed_image = pipe.preprocess_image(resized_edit_image).to(device=pipe.device, dtype=pipe.torch_dtype)
    + edit_latents = pipe.vae.encode(processed_image, tiled=tiled, tile_size=tile_size, tile_stride=tile_stride)
support edit_image_auto_resize
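@Artiprocher Hello, I have a question: what was the purpose of introducing the edit_image_auto_resize parameter? Is it because the original model kept the edit image at a uniform 1024 during training?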
No description provided.