Fix check_kernel_launches.py for macros and provide extended context (#…

…49365) Summary: `check_kernel_launches.py` currently gives a false positive in instances such as: ``` 735: <<<smallIndexGrid, smallIndexBlock, 0, stream>>>( \ 736: outInfo, selfInfo, indicesInfo, \ 737: outSelectDim, selfSelectDim, static_cast<TYPE>(sliceSize), \ 738: selfSelectDimSize); \ 739: C10_CUDA_KERNEL_LAUNCH_CHECK(); ``` because the newlines after the last `\` are not consumed by the regex. This fixes that. In addition, the regex is modified to provide greater context for the start of the kernel launch. This changes the context from: ``` 157: ( 158: size, X_strides, Y_dims, X, Y); ``` to ``` 157: <<<M, CAFFE_CUDA_NUM_THREADS, 0, context->cuda_stream()>>>( 158: size, X_strides, Y_dims, X, Y); ``` Pull Request resolved: #49365 Test Plan: ``` buck test //caffe2/test:kernel_launch_checks -- --print-passing-details ``` Reviewed By: aakshintala Differential Revision: D25545402 Pulled By: r-barnes fbshipit-source-id: 76feac6a002187239853752b892f4517722a77bf
pytorch · Dec 15, 2020 · 39a10fb · 39a10fb
1 parent 25bc906
commit 39a10fb
Show file tree

Hide file tree

Showing 2 changed files with 11 additions and 3 deletions.
diff --git a/test/test_kernel_launch_checks.py b/test/test_kernel_launch_checks.py
@@ -26,9 +26,16 @@ def test_check_code(self):
         """))
 
         # Does it work for macros?
-        self.assertEqual(0, check_code_for_cuda_kernel_launches("""
-#define SOME_MACRO(x) some_function_call<<<1,2>>> ( x ) ;  \\
+        self.assertEqual(0, check_code_for_cuda_kernel_launches(r"""
+#define SOME_MACRO(x) some_function_call<<<1,2>>> ( x ) ;  \
     C10_CUDA_KERNEL_LAUNCH_CHECK();
+
+#define SMALL_INDEX(TENSOR_TYPE, INDICES_TYPE, TYPE, SELF_DIM, SOURCE_DIM, IDX_DIM)  \
+  indexAddSmallIndex<TENSOR_TYPE, INDICES_TYPE, TYPE, SELF_DIM, SOURCE_DIM, IDX_DIM> \
+    <<<smallIndexGrid, smallIndexBlock, 0, stream>>>(                                \
+      selfInfo, sourceInfo, indexInfo,                                               \
+      selfAddDim, sourceAddDim, sliceSize, selfAddDimSize);                          \
+  C10_CUDA_KERNEL_LAUNCH_CHECK();
         """))
 
     def test_check_cuda_launches(self):

diff --git a/torch/testing/check_kernel_launches.py b/torch/testing/check_kernel_launches.py
@@ -18,12 +18,13 @@
 # But this should be sufficient to detect and fix most problem
 # instances and can be refined before the test is made binding
 kernel_launch_regex = re.compile(r"""
-    >>>           # Identifies kernel launch
+    ^.*>>>        # Identifies kernel launch
     \s*           # Maybe some whitespace (includes newlines)
     \([^;]+\);    # And then arguments in parens and semi-colon
     (?!           # Negative lookahead: we trigger if we don't find the launch guard
         \s*                                  # Maybe some whitespace (includes newlines)
         \\?                                  # 0 or 1 backslashes (for launches in preprocessor macros)
+        \s*                                  # Maybe some whitespace (includes newlines)
         (?:[0-9]+: )?                        # Detects and ignores a line numbering, if present
         \s*                                  # Maybe some whitespace (includes newlines)
         C10_CUDA_KERNEL_LAUNCH_CHECK\(\);  # Kernel launch guard!