Skip to content

feat: unify image loader#407

Merged
e06084 merged 7 commits into
MigoXLab:devfrom
e06084:dev
May 28, 2026
Merged

feat: unify image loader#407
e06084 merged 7 commits into
MigoXLab:devfrom
e06084:dev

Conversation

@e06084
Copy link
Copy Markdown
Collaborator

@e06084 e06084 commented May 28, 2026

image image

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a unified ImageLoader utility to handle image loading and API encoding across all evaluators, refactoring several models and rules to use this new helper. It also updates example scripts, test data schemas, and adds a comprehensive documentation plan for Agentic Search evaluation. The review feedback highlights critical improvements for the new ImageLoader and refactored code, including removing redundant streaming to prevent connection leaks, forcing eager loading of PIL images to avoid closed stream errors, defaulting to PNG encoding to prevent RGBA compatibility issues, and ensuring safe filename generation when handling PIL Image objects.

Comment on lines +26 to +31
def _download_url(url: str, timeout: int = 30) -> bytes:
import requests

resp = requests.get(url, timeout=timeout, stream=True)
resp.raise_for_status()
return resp.content
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using stream=True without reading the response in chunks is redundant when accessing resp.content directly. Additionally, if the response is not closed or used within a context manager, it can lead to connection leaks. Removing stream=True is safer and simpler.

Suggested change
def _download_url(url: str, timeout: int = 30) -> bytes:
import requests
resp = requests.get(url, timeout=timeout, stream=True)
resp.raise_for_status()
return resp.content
def _download_url(url: str, timeout: int = 30) -> bytes:
import requests
resp = requests.get(url, timeout=timeout)
resp.raise_for_status()
return resp.content

Comment on lines +62 to +79
if source.startswith("data:"):
header, data = source.split(",", 1)
image_bytes = base64.b64decode(data)
return Image.open(io.BytesIO(image_bytes))

if source.startswith(("http://", "https://")):
image_bytes = _download_url(source)
return Image.open(io.BytesIO(image_bytes))

# Local file path
if not os.path.isfile(source):
raise FileNotFoundError(
f"Image file not found: '{source}'\n"
f"Current working directory: {os.getcwd()}\n"
f"Absolute path would be: {os.path.abspath(source)}\n"
f"Ensure the path is correct relative to your working directory."
)
return Image.open(source)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

PIL.Image.open performs lazy loading and keeps the underlying file/stream open. If the BytesIO object or file stream is garbage collected or closed before the image data is actually read, subsequent operations on the image will fail with ValueError: I/O operation on closed file. Calling img.load() immediately after opening forces PIL to read and decode the image data into memory, safely releasing the stream.

        if source.startswith(\"data:\"):
            header, data = source.split(\",\", 1)
            image_bytes = base64.b64decode(data)
            img = Image.open(io.BytesIO(image_bytes))
            img.load()
            return img

        if source.startswith((\"http://\", \"https://\")):
            image_bytes = _download_url(source)
            img = Image.open(io.BytesIO(image_bytes))
            img.load()
            return img

        # Local file path
        if not os.path.isfile(source):
            raise FileNotFoundError(
                f\"Image file not found: '{source}'\\n\"
                f\"Current working directory: {os.getcwd()}\\n\"
                f\"Absolute path would be: {os.path.abspath(source)}\\n\"
                f\"Ensure the path is correct relative to your working directory.\"
            )
        img = Image.open(source)
        img.load()
        return img

Comment on lines +93 to +99
if isinstance(source, Image.Image):
buf = io.BytesIO()
fmt = source.format or "PNG"
mime = _MIME_MAP.get(f".{fmt.lower()}", "image/png")
source.save(buf, format=fmt)
b64 = base64.b64encode(buf.getvalue()).decode("utf-8")
return f"data:{mime};base64,{b64}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When encoding a PIL Image for the API, using the original format (e.g., JPEG) can raise an OSError: cannot write mode RGBA as JPEG if the image has an alpha channel. Defaulting to PNG is much safer, avoids format compatibility issues, and is fully supported by OpenAI-compatible vision APIs.

        if isinstance(source, Image.Image):
            buf = io.BytesIO()
            # Default to PNG to avoid JPEG RGBA compatibility issues and ensure API compatibility
            source.save(buf, format=\"PNG\")
            b64 = base64.b64encode(buf.getvalue()).decode(\"utf-8\")
            return f\"data:image/png;base64,{b64}\"

Comment on lines +700 to 702
img_basename = Path(str(image_source)).name
vis_filename = f"visual_{img_basename}"
vis_path = str(output_dir / vis_filename)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If image_source is a PIL Image object, calling str(image_source) returns a string containing < and > characters (e.g., <PIL.PngImagePlugin.PngImageFile...>), which are invalid in filenames on Windows and highly problematic on other filesystems. Generating a clean filename based on the type of image_source is much safer. Note that the same issue exists on lines 709-711.

                img_basename = Path(image_source.split(\"?\")[0]).name if isinstance(image_source, str) else f\"image_{id(image_source)}.png\"
                vis_filename = f\"visual_{img_basename}\"
                vis_path = str(output_dir / vis_filename)

@e06084 e06084 merged commit a5f4dcb into MigoXLab:dev May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants