[Data] - Replace read_images test with testing invalid bytes instead testing with empty file by goutamvenkat-anyscale · Pull Request #62647 · ray-project/ray

goutamvenkat-anyscale · 2026-04-15T23:54:21Z

Description

ValueError should be raised only when the contents of the file are invalid, not when the file is empty.



gemini-code-assist

Code Review

This pull request refactors the image datasource tests to use the tmp_path fixture and updates the test for unidentified image errors to use a file with invalid bytes. Feedback indicates that the implementation still fails to address the primary objective of handling empty files without raising a ValueError. Additionally, a suggestion was made to use the existing path variable in the test for better clarity and correctness.

gemini-code-assist · 2026-04-15T23:55:36Z

-        with tempfile.NamedTemporaryFile(suffix=".png") as file:
-            with pytest.raises(ValueError):
-                ray.data.read_images(paths=file.name).materialize()
+    def test_unidentified_image_error(ray_start_regular_shared, tmp_path):


The PR description states that ValueError should not be raised for empty files. However, this change only updates the error test to use non-empty invalid bytes. The actual implementation in ImageDatasource._read_stream (in python/ray/data/_internal/datasource/image_datasource.py) still raises ValueError for empty files because PIL.Image.open fails on empty buffers.

To fully address the PR's objective, the implementation should be updated to handle empty files (e.g., by returning an empty block), and a test case should be added to verify that reading an empty file does not raise an error and results in an empty dataset.
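
As an illustration of the distinction drawn above, here is a minimal sketch that uses only the standard library to separate an empty payload from a non-empty payload with invalid contents. Both classify_png_bytes and read_or_skip are hypothetical helpers, not part of Ray or Pillow, and the check only inspects the 8-byte PNG signature instead of decoding the image. It is a sketch of the suggested behavior, not the actual ImageDatasource code.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte header that starts every PNG file


def classify_png_bytes(data: bytes) -> str:
    """Hypothetical helper: label a payload as 'empty', 'invalid', or 'png'."""
    if len(data) == 0:
        return "empty"
    if not data.startswith(PNG_SIGNATURE):
        return "invalid"
    return "png"


def read_or_skip(data: bytes) -> list:
    """Hypothetical reader: yield no rows for an empty payload and raise
    ValueError only when non-empty contents are not a PNG."""
    kind = classify_png_bytes(data)
    if kind == "empty":
        return []  # stands in for "return an empty block"
    if kind == "invalid":
        raise ValueError("payload is not a valid PNG")
    return [data]
```

Under these assumptions, read_or_skip(b"") returns an empty list, while read_or_skip(b"spam") raises ValueError, which mirrors the two test cases the comment asks for.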

gemini-code-assist · 2026-04-15T23:55:36Z

+            file.write(b"spam")  # Invalid bytes for a PNG file
+
+        with pytest.raises(ValueError):
+            ray.data.read_images(paths=file.name).materialize()


The variable path is already defined as a string on line 172. It is cleaner to use path directly instead of file.name, especially since the file object is closed at this point.

Suggested change

-            ray.data.read_images(paths=file.name).materialize()
+            ray.data.read_images(paths=path).materialize()
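
The point about the closed handle can be seen in a small standard-library sketch. Line 172 of the test is not shown in this thread, so the way path is built here (os.path.join with a hypothetical file name) is an assumption, and write_invalid_png is an illustrative helper that is not part of the PR.

```python
import os
import tempfile


def write_invalid_png(directory: str):
    """Write non-PNG bytes to a .png file; return the path and the closed handle."""
    path = os.path.join(directory, "invalid.png")  # hypothetical file name
    with open(path, "wb") as file:
        file.write(b"spam")  # Invalid bytes for a PNG file
    # The with-block has exited, so `file` is closed from here on.
    return path, file


with tempfile.TemporaryDirectory() as tmp_dir:
    path, file = write_invalid_png(tmp_dir)
    handle_closed = file.closed       # the handle is no longer usable for I/O
    names_match = file.name == path   # .name still holds the same string
    with open(path, "rb") as f:
        contents = f.read()
```

Reading file.name after the block still works because the attribute is just the string passed to open, so both spellings reach the same file. Using path avoids referring to a closed object at all.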

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit eba3b7a.

cursor · 2026-04-15T23:56:19Z

+            file.write(b"spam")  # Invalid bytes for a PNG file
+
+        with pytest.raises(ValueError):
+            ray.data.read_images(paths=file.name).materialize()


Uses closed file handle instead of existing path variable

Low Severity

file.name is used on line 177 after the with open(...) block has exited, even though the local variable path (defined on line 172) already holds the identical value. Referencing an attribute on a closed file handle outside its context manager is needlessly confusing when a clearer alternative is already in scope. paths=path would be more straightforward.


[Data] - Replace read_images test with testing invalid bytes instead …

eba3b7a

…of empty file Signed-off-by: Goutam <goutam@anyscale.com>

goutamvenkat-anyscale requested a review from a team as a code owner April 15, 2026 23:54

goutamvenkat-anyscale added the data (Ray Data-related issues) and go (add ONLY when ready to merge, run all tests) labels Apr 15, 2026

gemini-code-assist bot reviewed Apr 15, 2026


cursor bot reviewed Apr 15, 2026


ayushk7102 approved these changes Apr 16, 2026


Merge branch 'master' into goutam/test_image_fix

c53aed8

goutamvenkat-anyscale enabled auto-merge (squash) April 16, 2026 15:36

github-actions bot disabled auto-merge April 16, 2026 15:37

goutamvenkat-anyscale merged commit 808478d into ray-project:master Apr 16, 2026
7 checks passed

goutamvenkat-anyscale merged 2 commits into ray-project:master from goutamvenkat-anyscale:goutam/test_image_fix
