feat: unify image loader#407
Conversation
e06084
commented
May 28, 2026
There was a problem hiding this comment.
Code Review
This pull request introduces a unified ImageLoader utility to handle image loading and API encoding across all evaluators, refactoring several models and rules to use this new helper. It also updates example scripts, test data schemas, and adds a comprehensive documentation plan for Agentic Search evaluation. The review feedback highlights critical improvements for the new ImageLoader and refactored code, including removing redundant streaming to prevent connection leaks, forcing eager loading of PIL images to avoid closed stream errors, defaulting to PNG encoding to prevent RGBA compatibility issues, and ensuring safe filename generation when handling PIL Image objects.
| def _download_url(url: str, timeout: int = 30) -> bytes: | ||
| import requests | ||
|
|
||
| resp = requests.get(url, timeout=timeout, stream=True) | ||
| resp.raise_for_status() | ||
| return resp.content |
There was a problem hiding this comment.
Using stream=True without reading the response in chunks is redundant when accessing resp.content directly. Additionally, if the response is not closed or used within a context manager, it can lead to connection leaks. Removing stream=True is safer and simpler.
| def _download_url(url: str, timeout: int = 30) -> bytes: | |
| import requests | |
| resp = requests.get(url, timeout=timeout, stream=True) | |
| resp.raise_for_status() | |
| return resp.content | |
| def _download_url(url: str, timeout: int = 30) -> bytes: | |
| import requests | |
| resp = requests.get(url, timeout=timeout) | |
| resp.raise_for_status() | |
| return resp.content |
| if source.startswith("data:"): | ||
| header, data = source.split(",", 1) | ||
| image_bytes = base64.b64decode(data) | ||
| return Image.open(io.BytesIO(image_bytes)) | ||
|
|
||
| if source.startswith(("http://", "https://")): | ||
| image_bytes = _download_url(source) | ||
| return Image.open(io.BytesIO(image_bytes)) | ||
|
|
||
| # Local file path | ||
| if not os.path.isfile(source): | ||
| raise FileNotFoundError( | ||
| f"Image file not found: '{source}'\n" | ||
| f"Current working directory: {os.getcwd()}\n" | ||
| f"Absolute path would be: {os.path.abspath(source)}\n" | ||
| f"Ensure the path is correct relative to your working directory." | ||
| ) | ||
| return Image.open(source) |
There was a problem hiding this comment.
PIL.Image.open performs lazy loading and keeps the underlying file/stream open. If the BytesIO object or file stream is garbage collected or closed before the image data is actually read, subsequent operations on the image will fail with ValueError: I/O operation on closed file. Calling img.load() immediately after opening forces PIL to read and decode the image data into memory, safely releasing the stream.
if source.startswith(\"data:\"):
header, data = source.split(\",\", 1)
image_bytes = base64.b64decode(data)
img = Image.open(io.BytesIO(image_bytes))
img.load()
return img
if source.startswith((\"http://\", \"https://\")):
image_bytes = _download_url(source)
img = Image.open(io.BytesIO(image_bytes))
img.load()
return img
# Local file path
if not os.path.isfile(source):
raise FileNotFoundError(
f\"Image file not found: '{source}'\\n\"
f\"Current working directory: {os.getcwd()}\\n\"
f\"Absolute path would be: {os.path.abspath(source)}\\n\"
f\"Ensure the path is correct relative to your working directory.\"
)
img = Image.open(source)
img.load()
return img| if isinstance(source, Image.Image): | ||
| buf = io.BytesIO() | ||
| fmt = source.format or "PNG" | ||
| mime = _MIME_MAP.get(f".{fmt.lower()}", "image/png") | ||
| source.save(buf, format=fmt) | ||
| b64 = base64.b64encode(buf.getvalue()).decode("utf-8") | ||
| return f"data:{mime};base64,{b64}" |
There was a problem hiding this comment.
When encoding a PIL Image for the API, using the original format (e.g., JPEG) can raise an OSError: cannot write mode RGBA as JPEG if the image has an alpha channel. Defaulting to PNG is much safer, avoids format compatibility issues, and is fully supported by OpenAI-compatible vision APIs.
if isinstance(source, Image.Image):
buf = io.BytesIO()
# Default to PNG to avoid JPEG RGBA compatibility issues and ensure API compatibility
source.save(buf, format=\"PNG\")
b64 = base64.b64encode(buf.getvalue()).decode(\"utf-8\")
return f\"data:image/png;base64,{b64}\"| img_basename = Path(str(image_source)).name | ||
| vis_filename = f"visual_{img_basename}" | ||
| vis_path = str(output_dir / vis_filename) |
There was a problem hiding this comment.
If image_source is a PIL Image object, calling str(image_source) returns a string containing < and > characters (e.g., <PIL.PngImagePlugin.PngImageFile...>), which are invalid in filenames on Windows and highly problematic on other filesystems. Generating a clean filename based on the type of image_source is much safer. Note that the same issue exists on lines 709-711.
img_basename = Path(image_source.split(\"?\")[0]).name if isinstance(image_source, str) else f\"image_{id(image_source)}.png\"
vis_filename = f\"visual_{img_basename}\"
vis_path = str(output_dir / vis_filename)