-
Notifications
You must be signed in to change notification settings - Fork 217
Skipping test_graphics_apis.py when running on WSL
#1166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Skipping test_graphics_apis.py when running on WSL
#1166
Conversation
4597400 to
ae3f200
Compare
cudaErrorOperatingSystem to the error tuple in the test_graphics_apis.pytest_graphics_apis.py when running on WSL
|
/ok to test 9da73e3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR addresses graphics API test failures on WSL by implementing platform-specific test skipping. A new helpers.py module is introduced to centralize test utilities, providing the IS_WSL flag used by test_graphics_apis.py to skip both test_graphics_api_smoketest() and test_cuda_register_image_invalid() when running on WSL platforms. The solution recognizes that WSL lacks OpenGL passthrough capabilities, making these graphics interop tests non-functional in that environment.
PR Description Notes:
- The description should clarify that tests are being skipped on WSL, not that errors are being added to acceptable tuples (as the screenshot might suggest).
Potential Issues
helpers.py import structure adds unnecessary complexity:
Lines 28-31 implement a package-vs-module detection workaround that suggests cuda_python_test_helpers has an inconsistent installation structure. This pattern may cause confusion for future maintainers:
if hasattr(cuda_python_test_helpers, "__path__"):
import cuda_python_test_helpers.cuda_python_test_helpers as cuda_python_test_helpersConsider standardizing how cuda_python_test_helpers is structured across installation methods (editable vs wheel) to eliminate this conditional logic.
sys.path modification is not cleaned up on success:
Line 25 inserts a path into sys.path but never removes it, even after successful import. While this may be acceptable in a test context, it permanently modifies the import search order for the entire test session. Consider using try/finally to ensure cleanup, or document why the persistent modification is intentional.
Confidence: 4/5
The change correctly solves the stated problem (WSL test failures) with a clean, maintainable approach. The confidence score is not 5 because: (1) the import fallback logic in helpers.py suggests underlying structural issues with cuda_python_test_helpers packaging that may surface later, and (2) the permanent sys.path modification could theoretically cause import shadowing issues if other packages have conflicting names (though this is unlikely in practice).
2 files reviewed, no comments
|
|
/ok to test 6927cfe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this test /always/ fails on WSL. WSL is my daily driver, I wrote this test, and it "works for me" ™️. So I think there is probably something more specific/different about the WSL setup that is failing there. Maybe there's no $DISPLAY, or maybe the drivers aren't installed? Are there any clues in the way it's failing? I think we should update this to something more narrow than just "always skip on WSL".
I also have some feedback on the cuda_python_test_helpers stuff, but I think that's out-of-scope for here so it can be dealt with separately.
|
I agree with Mike that we shouldn't just skip. Lots of things, if not all, are supposed to just work on WSL, and I raised this offline to Rob and all in the team channel that incomplete knowledge of WSL failures are now biting us. |
|
I can dig deeper into why this happening... I figured since it was throwing a clear operating system error code that WSL wasn't support. But I'll dig deeper. |
| # Clean up sys.path modification | ||
| if test_helpers_path in sys.path: | ||
| sys.path.remove(test_helpers_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you trying to achieve here by removing this from sys.path after adding it?
The import system will keep cuda_python_test_helpers in sys.modules, so any subsequent imports of it will just pull from that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was its preventing polluting global state because the import path changes is alive for the duration of the lifetime of the interpreter process.
| def test_graphics_api_smoketest(): | ||
| # Due to lazy importing in pyglet, pytest.importorskip doesn't work | ||
| try: | ||
| import pyglet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should be restructured to avoid mixing up the reason for unexpected failures, expected failures, or missing dependencies.
Right now, it's just pytest.skip-ing and then returning (I realize this is pre-existing) which is making the whole thing unnecessarily complicated.
# skips if `pyglet.image` isn't importable for whatever reason, you can address
# the laziness by importing something deeper like `pyglet.image`
pyglet_image = pytest.importorskip("pyglet.image")
# if the next line raises an `AttributeError`, that should be specified in a
# `pytest.mark.xfail(raises=AttributeError)` decorator using whatever condition
# to indicate why the failure happened
tex = pyglet_image.Texture.create(512, 512)Then you can probably remove the code that's mixing these reasons together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also chain xfail decorators so that you can specify different conditions and types. For example, let's say pyglet is installed but we're running without a display. That will raises a different error than AttributeError. So you can do the following:
@pytest.mark.xfail(condition=..., raises=AttributeError)
@pytest.mark.xfail(condition=os.environ.get("DISPLAY") is None, raises=pyglet.TheRightException)
def test_graphics_api_smoketest():
...You can also combine these if they're not mutually exclusive (i.e., an AttributeError is raised when DISPLAY is None.
@pytest.mark.xfail(condition=... or os.environ.get("DISPLAY") is None, raises=(AttributeError, pyglet.TheRightException))
def test_graphics_api_smoketest():
...|
Looking at the screenshot again, I think maybe just adding |
|
But why didn't you get the |
|
I don't know if it is obvious to see relations between issues/PRs, but I think this might be relevant (#1165). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. The developer has responded to previous feedback by refactoring the WSL detection logic into cuda_python_test_helpers and cleaning up the test helper imports in both cuda_core/tests/helpers.py and cuda_bindings/tests/helpers.py. Specifically, cuda_python_test_helpers now exports IS_WSL and supports_ipc_mempool, with the latter incorporating WSL detection and lazy-importing handle_return to avoid hard dependencies. The test helper modules in cuda_core and cuda_bindings now use try-finally blocks to clean up temporary sys.path modifications after importing the helper package.
However, the changes introduce a critical logic error in both cuda_core/tests/helpers.py and cuda_bindings/tests/helpers.py: the sys.path.remove() call executes in the finally block immediately after the import statement (lines 26-32 in both files), which removes the path from sys.path before the module attributes IS_WSL and supports_ipc_mempool are accessed on lines 35-36. While Python's import system will cache the module in sys.modules, this premature cleanup is confusing and suggests a misunderstanding of when the cleanup should occur. The cleanup should happen after all usage of the imported module, not inside the try-finally block that wraps the import itself.
Additionally, in cuda_bindings/tests/helpers.py, the refactor removed the package-vs-module handling code (old lines 28-30) that checked whether cuda_python_test_helpers was imported as a package or module. This removal may cause issues if the import context changes.
Potential Issues
High Severity:
-
cuda_core/tests/helpers.py and cuda_bindings/tests/helpers.py (lines 26-32, 35-36): The
sys.pathcleanup in thefinallyblock occurs immediately after the import, before the module attributes are accessed. While this will work due tosys.modulescaching, it creates confusing logic. The reviewer previously asked "What are you trying to achieve here by removing this fromsys.pathafter adding it?" and this implementation doesn't clearly address that question. If the intent is cleanup, it should happen after line 36, not between lines 28-32. -
cuda_bindings/tests/helpers.py (removed lines 28-30): The deletion of the package-vs-module handling code may break imports if
cuda_python_test_helpersis ever imported as a package instead of a module. The old code checkedhasattr(cuda_python_test_helpers, 'IS_WSL')and had fallback logic. This safeguard is now gone.
Medium Severity:
- cuda_python_test_helpers/init.py (line 41): The lazy import of
handle_returninsidesupports_ipc_mempoolis a good pattern, but the comment says it's to "avoid hard dependency on cuda.core". Ifcuda.coreis not installed, the import will still fail when the function is called. Consider wrapping this in a try-except to return False gracefully if cuda.core is unavailable.
Confidence Score: 1 / 5
The logic errors in the sys.path cleanup make this PR unsafe to merge without correction. The timing of the cleanup suggests a misunderstanding of Python's import mechanics, and the removal of defensive code in cuda_bindings increases fragility.
3 files reviewed, 3 comments
| try: | ||
| sys.path.insert(0, test_helpers_path) | ||
| import cuda_python_test_helpers | ||
| finally: | ||
| # Clean up sys.path modification | ||
| if test_helpers_path in sys.path: | ||
| sys.path.remove(test_helpers_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The finally block removes test_helpers_path from sys.path immediately after import on line 28. This breaks the import cuda_python_test_helpers statement since Python needs the path to remain in sys.path during the import process. The cleanup should happen after all uses of the imported module (after line 36), not inside the try/finally that wraps the import.
| try: | ||
| sys.path.insert(0, test_helpers_path) | ||
| import cuda_python_test_helpers | ||
| finally: | ||
| # Clean up sys.path modification | ||
| if test_helpers_path in sys.path: | ||
| sys.path.remove(test_helpers_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The sys.path cleanup in the finally block runs before the import statement (line 28) completes, which means cuda_python_test_helpers may not be accessible after the try-finally block exits. The finally block executes immediately after the try block, removing the path before the module can be used at line 35-36. Move the cleanup to after all usage of the imported module (after line 36).
| test_helpers_path = str(pathlib.Path(__file__).resolve().parents[2] / "cuda_python_test_helpers") | ||
| try: | ||
| sys.path.insert(0, test_helpers_path) | ||
| import cuda_python_test_helpers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: If the import on line 28 fails, the path remains in sys.path since the finally block's removal is conditional. Consider moving the import outside the try-finally and only wrapping the sys.path manipulation, or ensure the path is always removed regardless of import success.
Yeah, I think @rparolin can just compare our WSL installs until we understand what might be the difference, then we can gate on that. But generally speaking, if CUDA tells us the OS isn't suitable, that's a "reasonable" result in terms of a test that runs everywhere. |
OK. Here's the CUDA's dark corner (which we learned through debugging #248 (comment)). In the graphics interop context |
|
(I canceled all CI runs because they are hanging and blocking all other teams from getting their tests run, see the nv-gha-runners thread.) |
|
I'm seeing the error in the PR description also here (ctk-next testing). Note the This is with Windows 11 24H2 headless. See also #1165 that @leofang pointed out already. |
|
This seems to work for me (same as in #1165 (comment)): diff --git a/cuda_bindings/tests/test_graphics_apis.py b/cuda_bindings/tests/test_graphics_apis.py
old mode 100644
new mode 100755
index 67f674b..d8bb2ee
--- a/cuda_bindings/tests/test_graphics_apis.py
+++ b/cuda_bindings/tests/test_graphics_apis.py
@@ -1,6 +1,8 @@
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: LicenseRef-NVIDIA-SOFTWARE-LICENSE
+import os
+
import pytest
from cuda.bindings import runtime as cudart
@@ -22,6 +24,8 @@ def test_graphics_api_smoketest():
error_name = cudart.cudaGetErrorName(err)[1].decode()
if error_name == "cudaSuccess":
assert int(gfx_resource) != 0
+ elif os.environ.get("DISPLAY") is None:
+ assert error_name == "cudaErrorOperatingSystem"
else:
assert error_name in ("cudaErrorInvalidValue", "cudaErrorUnknown") |
That was actually my first commit but I allowed myself to be convinced by Greptile that doing a platform skip was more clearly showing intent. 🤦 |
This change is addressing a test failure when running the graphics api tests on WSL. We simply skip the tests when running on WSL due to the driver lacking pass through capabilities on WSL.