
Conversation


@tomerqodo commented Nov 20, 2025

User description

Splits each torch library registration in the 2.10 folder into its own file. I had a script that parsed kernel.cpp to do this, but I felt like forcing this responsibility on the user might be less error-prone.

Compiles each file targeting 2.9 and asserts that compilation fails. (There are two 2.9 kernels we use as negative tests where compilation is expected to succeed; see the sketch below.)
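
Conceptually, each check boils down to compiling a single source file with the target-version macro pinned to 2.9 and asserting a non-zero exit code. A minimal sketch of that idea, where the g++ flags, include handling, and helper name are illustrative assumptions rather than the exact test code:

import subprocess
from pathlib import Path

# Sketch only: compile one source with TORCH_TARGET_VERSION pinned to 2.9
# and report whether compilation succeeded. Flags and include paths are
# assumptions, not the exact configuration used by the test suite.
def compiles_against_2_9(source: Path, include_dirs: list[str]) -> bool:
    cmd = [
        "g++", "-c", "-std=c++17",
        "-DTORCH_TARGET_VERSION=0x0209000000000000",  # 2.9 packed into 64 bits
        *[f"-I{d}" for d in include_dirs],
        str(source), "-o", "/dev/null",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return result.returncode == 0

# A 2.10-only kernel should fail this check; the two negative-test kernels
# should pass it.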

Stack from ghstack (oldest at bottom):


PR Type

Tests, Enhancement


Description

  • Refactored monolithic kernel.cpp into individual function files

  • Created comprehensive test suite validating PyTorch 2.10+ version guards

  • Added negative tests to verify backward compatibility with 2.9.0

  • Implemented dynamic test generation for all C++ and CUDA source files (see the sketch below)
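
A minimal sketch of that dynamic generation, assuming a per-file helper like the _test_function_file shown in the excerpts below; the glob patterns and test-name scheme are assumptions, not the exact implementation:

from pathlib import Path

# Sketch: attach one test method per source file so each file gets its own
# pass/fail entry in the test report. (The real suite presumably excludes
# the negative-test files from this loop.)
def _make_test(source_file: Path):
    def test(self):
        self._test_function_file(source_file)
    return test

csrc_dir = Path(__file__).parent / "libtorch_agnostic_2_10" / "csrc"
for src in sorted(csrc_dir.glob("*.cpp")) + sorted(csrc_dir.glob("*.cu")):
    test_name = f"test_{src.stem}_fails_with_2_9"
    setattr(FunctionVersionCompatibilityTest, test_name, _make_test(src))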


Diagram Walkthrough

flowchart LR
  A["kernel.cpp<br/>monolithic file"] -->|split| B["Individual .cpp files<br/>one per function"]
  B -->|tested by| C["test_version_compatibility.py"]
  C -->|compiles with<br/>TORCH_TARGET_VERSION=2.9| D["Expected: Compilation fails<br/>Actual: Validates 2.10+ guards"]
  C -->|negative tests| E["mv_tensor_accessor_cpu.cpp<br/>mv_tensor_accessor_cuda.cu<br/>Expected: Compilation succeeds"]

File Walkthrough

Relevant files

Tests (3 files)
  • test_version_compatibility.py: New test suite for version compatibility validation (+300/-0)
  • mv_tensor_accessor_cpu.cpp: Added negative test file for 2.9 compatibility (+40/-0)
  • mv_tensor_accessor_cuda.cu: Added negative test CUDA file for 2.9 compatibility (+47/-0)

Refactoring (16 files)
  • kernel.cpp: Deleted monolithic kernel file (+0/-205)
  • my__foreach_mul.cpp: Extracted foreach_mul function into separate file (+20/-0)
  • my__foreach_mul_.cpp: Extracted foreach_mul_ function into separate file (+19/-0)
  • make_tensor_clones_and_call_foreach.cpp: Extracted make_tensor_clones_and_call_foreach function (+41/-0)
  • my_empty.cpp: Extracted my_empty function into separate file (+25/-0)
  • my_reshape.cpp: Extracted my_reshape function into separate file (+17/-0)
  • my_view.cpp: Extracted my_view function into separate file (+20/-0)
  • test_tensor_device.cpp: Extracted test_tensor_device function into separate file (+17/-0)
  • test_device_constructor.cpp: Extracted test_device_constructor function into separate file (+37/-0)
  • test_device_equality.cpp: Extracted test_device_equality function into separate file (+14/-0)
  • test_device_set_index.cpp: Extracted test_device_set_index function into separate file (+17/-0)
  • test_device_index.cpp: Extracted test_device_index function into separate file (+14/-0)
  • test_device_is_cuda.cpp: Extracted test_device_is_cuda function into separate file (+14/-0)
  • test_device_is_cpu.cpp: Extracted test_device_is_cpu function into separate file (+14/-0)
  • test_get_num_threads.cpp: Extracted test_get_num_threads function into separate file (+14/-0)
  • test_parallel_for.cpp: Extracted test_parallel_for function into separate file (+49/-0)

Enhancement (1 file)
  • tensor_accessor_kernel.h: Added shared tensor accessor kernel header (+28/-0)

@qodo-merge-pro

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢 No security concerns identified: no security vulnerabilities were detected by AI analysis. Human verification is advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined.

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed


Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed


Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: No auditing. The new test and helper methods perform compilations and environment checks without emitting any audit logs of critical actions, though this may be acceptable for test code rather than production paths.

Referred Code
import os
import subprocess
import tempfile
from pathlib import Path

from torch.testing._internal.common_utils import run_tests, TestCase
from torch.utils.cpp_extension import CUDA_HOME, include_paths as torch_include_paths


class FunctionVersionCompatibilityTest(TestCase):
    """Test that all function files require PyTorch 2.10+."""

    @classmethod
    def setUpClass(cls):
        """Set up test environment once for all tests."""
        cls.csrc_dir = Path(__file__).parent / "libtorch_agnostic_2_10" / "csrc"
        cls.build_dir = Path(tempfile.mkdtemp(prefix="version_check_"))

        cls.pytorch_includes = [
            f"-I{path}" for path in torch_include_paths(device_type="cpu")
        ]


 ... (clipped 122 lines)


Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Subprocess errors. Subprocess compilation errors are captured but only parsed for select strings and surfaced via assertions; this may be sufficient for tests, but it lacks comprehensive handling or retries for timeouts and CUDA absence.

Referred Code
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)

    if result.returncode == 0:
        return True, ""
    else:
        return False, result.stderr

def _compile_cu_file(
    self, source_file: Path, output_file: Path
) -> tuple[bool, str]:
    """
    Compile a CUDA file with TORCH_TARGET_VERSION=2.9.0.
    Returns (success, error_message).
    """
    if not CUDA_HOME:
        return False, "CUDA_HOME not set"

    torch_version_2_9 = "0x0209000000000000"

    cmd = [
        os.path.join(CUDA_HOME, "bin", "nvcc"),


 ... (clipped 17 lines)
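
As an aside, the 0x0209000000000000 literal used above and throughout the tests appears to pack the major and minor version numbers into the top two bytes of a 64-bit value (0x02 for major 2, 0x09 for minor 9). A small sketch of that reading, which is an inference from the constant rather than a documented encoding:

# Assumption: TORCH_TARGET_VERSION packs (major, minor) into the top two
# bytes of a 64-bit literal, so 2.9 becomes 0x0209000000000000.
def encode_target_version(major: int, minor: int) -> str:
    return f"0x{major:02x}{minor:02x}" + "0" * 12

assert encode_target_version(2, 9) == "0x0209000000000000"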

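If the timeout gap flagged above were worth closing even in test code, a hedged sketch of one option (the helper name and 30-second policy are illustrative, not part of the PR):

import subprocess

# Illustrative sketch: surface a compiler timeout as an ordinary failure
# instead of letting TimeoutExpired propagate out of the test helper.
def run_compile(cmd: list[str]) -> tuple[bool, str]:
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        return False, "compiler timed out after 30 seconds"
    return result.returncode == 0, result.stderr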

Compliance status legend:
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-merge-pro

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: High-level
Consider a more robust compilation approach

Replace direct g++ and nvcc subprocess calls with the more robust
torch.utils.cpp_extension.load or build_extension API. This change will improve
the compilation test's portability by abstracting away compiler specifics and
environment dependencies.

Examples:

test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [66-96]
    def _compile_cpp_file(
        self, source_file: Path, output_file: Path
    ) -> tuple[bool, str]:
        """
        Compile a C++ file with TORCH_TARGET_VERSION=2.9.0.
        Returns (success, error_message).
        """
        torch_version_2_9 = "0x0209000000000000"

        cmd = [

 ... (clipped 21 lines)
test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [97-127]
    def _compile_cu_file(
        self, source_file: Path, output_file: Path
    ) -> tuple[bool, str]:
        """
        Compile a CUDA file with TORCH_TARGET_VERSION=2.9.0.
        Returns (success, error_message).
        """
        if not CUDA_HOME:
            return False, "CUDA_HOME not set"


 ... (clipped 21 lines)

Solution Walkthrough:

Before:

# test_version_compatibility.py

class FunctionVersionCompatibilityTest(TestCase):
    def _compile_cpp_file(self, source_file, output_file):
        # Manually construct g++ command with hardcoded paths
        cmd = [
            "g++",
            "-c",
            f"-DTORCH_TARGET_VERSION=...",
            *self.pytorch_includes,
            str(source_file),
        ]
        result = subprocess.run(cmd, ...)
        return result.returncode == 0, result.stderr

    def _test_function_file(self, source_file):
        success, error_msg = self._compile_cpp_file(...)
        self.assertFalse(success, "Compilation should fail")

After:

# test_version_compatibility.py
from torch.utils.cpp_extension import load

class FunctionVersionCompatibilityTest(TestCase):
    def _test_function_file(self, source_file):
        extra_cflags = [f"-DTORCH_TARGET_VERSION=..."]
        
        try:
            load(
                name="version_check_module",
                sources=[str(source_file)],
                extra_cflags=extra_cflags,
                extra_cuda_cflags=extra_cflags,
                verbose=False
            )
            self.fail("Compilation should fail but succeeded")
        except Exception:
            # Expected compilation failure
            pass
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies that hardcoding compiler commands via subprocess is brittle and proposes using the torch.utils.cpp_extension API, which is a significant improvement for the test's robustness and portability.

Impact: Medium

Category: Possible issue
Avoid race conditions in tests

In _test_function_file, change the object file naming from f"{func_name}.o" to
f"{source_file.name}.o" to prevent race conditions between files with the same
stem but different extensions.

test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [128-141]

 def _test_function_file(self, source_file: Path):
     """Test that a function file fails to compile with TORCH_TARGET_VERSION=2.9.0."""
     func_name = source_file.stem
-    obj_file = self.build_dir / f"{func_name}.o"
+    obj_file = self.build_dir / f"{source_file.name}.o"
 
     # Choose the appropriate compiler based on file extension
     if source_file.suffix == ".cu":
         if not self.cuda_available:
             self.skipTest(f"CUDA not available, skipping {source_file.name}")
         success, error_msg = self._compile_cu_file(source_file, obj_file)
     else:
         success, error_msg = self._compile_cpp_file(source_file, obj_file)
 
     obj_file.unlink(missing_ok=True)
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies a potential race condition where tests for files with the same stem (e.g., .cpp and .cu) would overwrite the same object file, making the test suite more robust.

Impact: Medium
Use unique object file names

In test_mv_tensor_accessor_cuda_works_with_2_9, change the hardcoded object file
name cuda_kernel.o to a unique name like mv_tensor_accessor_cuda.o to avoid
potential race conditions.

test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [195-217]

 def test_mv_tensor_accessor_cuda_works_with_2_9(self):
     """Test that mv_tensor_accessor_cuda.cu compiles successfully with 2.9.0.
 
     This is a negative test - it ensures that a .cu file we expect to work with 2.9.0
     actually does compile. This validates that our test infrastructure correctly
     compiles CUDA files and distinguishes between files that require 2.10+ and those
     that don't.
     """
     if not self.cuda_available:
         self.skipTest(
             "CUDA not available, skipping mv_tensor_accessor_cuda.cu test"
         )
 
     cu_file = self.csrc_dir / "mv_tensor_accessor_cuda.cu"
 
     if not cu_file.exists():
         self.skipTest(f"{cu_file} not found - this is a test file only")
 
-    obj_file = self.build_dir / "cuda_kernel.o"
+    obj_file = self.build_dir / "mv_tensor_accessor_cuda.o"
     success, error_msg = self._compile_cu_file(cu_file, obj_file)
 
     # Clean up
     obj_file.unlink(missing_ok=True)
Suggestion importance[1-10]: 7


Why: The suggestion correctly points out that a hardcoded object file name can cause race conditions, and proposes using a unique name derived from the source file, which improves test reliability.

Impact: Medium
Remove unnecessary CUDA include paths

In _compile_cpp_file, remove the logic that adds CUDA include paths to the g++
command, as these should only be used by nvcc for .cu files.

test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [66-95]

 def _compile_cpp_file(
     self, source_file: Path, output_file: Path
 ) -> tuple[bool, str]:
     """
     Compile a C++ file with TORCH_TARGET_VERSION=2.9.0.
     Returns (success, error_message).
     """
     torch_version_2_9 = "0x0209000000000000"
 
     cmd = [
         "g++",
         "-c",
         "-std=c++17",
         f"-DTORCH_TARGET_VERSION={torch_version_2_9}",
         f"-I{source_file.parent}",  # For includes in same directory
         *self.pytorch_includes,
     ]
 
-    # Add CUDA flags if available
-    if self.cuda_available:
-        cmd.extend(self.cuda_includes)
-
     cmd.extend([str(source_file), "-o", str(output_file)])
 
     result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
 
     if result.returncode == 0:
         return True, ""
     else:
         return False, result.stderr
Suggestion importance[1-10]: 6


Why: The suggestion correctly identifies that adding CUDA include paths to the g++ compilation command for .cpp files is incorrect and can lead to build failures, improving the test's robustness.

Impact: Low
