Skip to content

Conversation

@Jintao-Huang
Copy link
Collaborator

No description provided.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Jintao-Huang, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the reliability of multimodal data processing by implementing a retry mechanism for image URL fetching and standardizing the timeout configuration through a new SWIFT_TIMEOUT environment variable. These changes aim to enhance system stability and provide clearer, more consistent control over network operations, with all relevant documentation updated accordingly.

Highlights

  • Enhanced Image URL Fetching: The system now incorporates robust retry logic for fetching image URLs, utilizing requests.Session with HTTPAdapter and urllib3.util.retry.Retry. This ensures that transient network issues are handled gracefully, with up to 3 retries for GET requests.
  • Standardized Timeout Configuration: The environment variable for controlling timeouts has been standardized from TIMEOUT to SWIFT_TIMEOUT. This new variable, defaulting to 20 seconds for image fetching, provides a consistent mechanism for configuring request timeouts across the system.
  • Documentation Updates: Both the Chinese and English documentation have been updated to reflect the introduction of the SWIFT_TIMEOUT environment variable, detailing its purpose and default value for multimodal image URL fetching, and updating references in the FAQ section.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

本次 PR 主要在 vision_utils.py 中为从 URL 加载文件增加了重试机制,并使用了 requests.Session 来管理 HTTP 请求,提高了代码的健壮性。同时,将超时控制的环境变量从 TIMEOUT 更改为 SWIFT_TIMEOUT,并将默认值从 300 秒调整为 20 秒。相关的中英文文档也已同步更新。代码实现良好,我只在 vision_utils.py 中提出一个关于使用 with 语句来简化会话管理的小建议,以使代码更加简洁和 Pythonic。

Comment on lines 114 to 127
session = requests.Session()
session.mount("http://", HTTPAdapter(max_retries=retries))
session.mount("https://", HTTPAdapter(max_retries=retries))

timeout = float(os.getenv('SWIFT_TIMEOUT', '20'))
request_kwargs = {'timeout': timeout} if timeout > 0 else {}

try:
response = session.get(path, **request_kwargs)
response.raise_for_status()
content = response.content
res = BytesIO(content)
finally:
session.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

为了使代码更简洁且符合 Python 风格,建议使用 with 语句来管理 requests.Sessionwith 语句可以自动处理会话的关闭,从而替代 try...finally 结构,使代码更具可读性。

            with requests.Session() as session:
                session.mount("http://", HTTPAdapter(max_retries=retries))
                session.mount("https://", HTTPAdapter(max_retries=retries))

                timeout = float(os.getenv('SWIFT_TIMEOUT', '20'))
                request_kwargs = {'timeout': timeout} if timeout > 0 else {}

                response = session.get(path, **request_kwargs)
                response.raise_for_status()
                content = response.content
                res = BytesIO(content)

@Jintao-Huang
Copy link
Collaborator Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a retry mechanism for downloading images from URLs, making the process more robust against transient network failures. The implementation uses requests.Session with HTTPAdapter and urllib3.util.retry.Retry. The timeout handling has also been updated to use a new environment variable SWIFT_TIMEOUT with a default of 20 seconds. The documentation has been updated accordingly in both Chinese and English to reflect these changes.

My review includes suggestions to improve the efficiency of HTTP requests by reusing a requests.Session object, making the retry logic more configurable, and enhancing the clarity of the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For clarity, it would be helpful to mention the default timeout value here, similar to how it was documented for the old TIMEOUT variable. This helps users understand the default behavior.

Suggested change
设置环境变量`SWIFT_TIMEOUT`。或者`InferClient`中可以传参数。
设置环境变量`SWIFT_TIMEOUT`(默认为20秒)。或者`InferClient`中可以传参数。


### Q12: Can we set request timeout time for Swift3.0 deployment inference? What happens if the image URL is invalid?
You can set the `TIMEOUT` environment variable, which defaults to 300 seconds. Alternatively, you can pass parameters in `InferClient`.
You can set the `SWIFT_TIMEOUT` environment variable. Alternatively, you can pass parameters in `InferClient`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For clarity, it would be helpful to mention the default timeout value here, similar to how it was documented for the old TIMEOUT variable. This helps users understand the default behavior.

Suggested change
You can set the `SWIFT_TIMEOUT` environment variable. Alternatively, you can pass parameters in `InferClient`.
You can set the `SWIFT_TIMEOUT` environment variable, which defaults to 20 seconds. Alternatively, you can pass parameters in `InferClient`.

request_kwargs['timeout'] = timeout
content = requests.get(path, **request_kwargs).content
res = BytesIO(content)
retries = Retry(total=3, backoff_factor=1, allowed_methods=['GET'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The retry parameters (total and backoff_factor) are hardcoded. It would be more flexible to allow configuring them via environment variables, similar to how SWIFT_TIMEOUT is handled. This would allow users to adjust the retry behavior based on their network conditions without changing the code.

Suggested change
retries = Retry(total=3, backoff_factor=1, allowed_methods=['GET'])
retries = Retry(total=int(os.getenv('SWIFT_RETRY_TOTAL', 3)), backoff_factor=float(os.getenv('SWIFT_RETRY_BACKOFF_FACTOR', 1)), allowed_methods=['GET'])

Comment on lines +111 to +113
with requests.Session() as session:
session.mount('http://', HTTPAdapter(max_retries=retries))
session.mount('https://', HTTPAdapter(max_retries=retries))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Creating a new requests.Session for each file download is inefficient, especially when downloading multiple images in a batch. A session object should be reused across requests to leverage connection pooling, which can significantly improve performance.

It's recommended to create a single session object at the module level and reuse it for all HTTP requests.

For example:

# At module level
_session = requests.Session()
retries = Retry(total=3, backoff_factor=1, allowed_methods=['GET'])
adapter = HTTPAdapter(max_retries=retries)
_session.mount('http://', adapter)
_session.mount('https://', adapter)

def load_file(...):
    # ...
    if path.startswith('http'):
        # ...
        response = _session.get(path, **request_kwargs)
        # ...

While this change is outside the current diff, it's a highly recommended practice for performance.

@Jintao-Huang Jintao-Huang merged commit 35a1c61 into modelscope:main Oct 16, 2025
1 of 2 checks passed
Jintao-Huang added a commit that referenced this pull request Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants