[AUTOGENERATED] develop_IFU_20260211#2969
Merged
Implements ONNX export for `torch.ops.higher_order.invoke_subgraph`, which is created by `torch.compiler.nested_compile_region`.
Actual function preservation needs update in onnxscript optimizer and version converter to prevent inlining.
## Example
```python
import torch

class Model(torch.nn.Module):
    def forward(self, x, y):
        def inner_fn(a, b):
            return torch.mul(a, b) + a
        # Function preserved as a separate entity in the ONNX graph, not inlined (once onnxscript is updated)
        return torch.compiler.nested_compile_region(inner_fn)(x, y)

x, y = torch.randn(2, 3), torch.randn(2, 3)  # example inputs; shapes are arbitrary
onnx_program = torch.onnx.export(Model(), (x, y), dynamo=True)
```
Replaces pytorch#172715
Fixes pytorch#172459
Pull Request resolved: pytorch#174283
Approved by: https://github.com/titaiwangms
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
On riscv64, installing lintrunner 0.12.7 from sdist fails because its build dependency maturin<0.13 cannot be installed. `pip install "maturin>=0.12,<0.13"` fails with:
BackendUnavailable: Cannot import 'setuptools.build_meta'
This can also be reproduced on x86. Upgrading to maturin >= 1.0 (as done in lintrunner 0.12.11) resolves the issue.
Pull Request resolved: pytorch#173658
Approved by: https://github.com/malfet
…173558)
**Context**
Previously, list / dict comprehensions were treated as a function call and would add a new frame to the Python stack. As a result, if there was a graph break in the comprehension, Dynamo would only skip tracing the comprehension code. In Python 3.12, comprehensions are inlined into their surrounding function, so when we graph break, the entire function is skipped. This PR handles list / dict comprehensions in Dynamo by only skipping tracing for the bytecode related to the comprehension.
References
- PEP 709: https://peps.python.org/pep-0709/
**Solution**
1. Ops BUILD_LIST and BUILD_MAP are always at the beginning of list or dict comprehensions, respectively. When processing these ops in Dynamo, we check if the preceding instructions indicate a comprehension. This is done by `_is_comprehension_start`. `_is_comprehension_start` dynamically retrieves the bytecode prefix for a comprehension using `get_comprehension_bytecode_prefix`. `get_comprehension_bytecode_prefix` builds a dummy list comprehension and gets the associated instruction opnames.
2. If we identify that we are in a comprehension, we check if we can speculate and if we are not in a nested comprehension. If these checks pass and speculation has not failed, we set a checkpoint via speculation.
3. If a graph break is triggered, we handle it in the normal way by restarting tracing. Once we reach the checkpoint set in BUILD_LIST / BUILD_MAP, we handle the graph break in `_handle_comprehension_graph_break`.
4. At a high level, this function compiles the graph up to the comprehension, adds the comprehension bytecode to be run eagerly, generates code to load any locals created in the comprehension, and creates a resume function for code after the comprehension.
5. Handling the comprehension graph break involves analysis of the bytecode to determine the instruction that ends the comprehension bytecode, the result variable (if there is one), whether the result should stay on the stack, what happens to the result, iterator variables that need to be restored, other locals produced / modified in the comprehension, and vars read from the outer scope. To help with this, we create the dataclass `ComprehensionAnalysis`, which is returned by `_analyze_comprehension`. This function also dynamically retrieves example bytecode sequences during analysis, ensuring that it is resilient to bytecode changes across Python versions.
6. Finally, we resume tracing as usual.
**Edge Cases Handled**
1. Multiple comprehensions with a graph break in only one
2. Multiple comprehensions with graph breaks in all
3. Comprehension that calls a function that produces a graph break
4. Nested comprehensions with graph breaks
5. Comprehensions with multiple iterators
6. Comprehensions discarded without usage (as opposed to being assigned to a variable)
7. Comprehensions that are used in an expression before being stored to a variable
8. Comprehensions that are directly returned
9. One or more walrus operators (creating side effects) in a comprehension
10. Side effects nested in comprehensions
11. Comprehensions that mutate or read outer variables
12. Comprehensions that mutate or read global variables
13. Comprehensions that modify closure variables
14. List and dict comprehensions together
**Edge Cases Unimplemented**
1. Comprehension graph break in resume function with captured variables (e.g. test_torch.py::TestTorchDeviceTypeCPU::test_cauchy_kstest_cpu_bfloat16)
2. Comprehension with captured tensor not in local slot (e.g. test_autograd.py::TestAutograd::test_pickle)
**Test Cases**
New test cases are added in test_comprehensions.py. These cases test for the production of the correct number of graphs and the correct number of specific operators in each graph.
**Misc Notes**
1. One extension of this system is to skip tracing for arbitrary sequences of bytecode such as in loops, try blocks, generic context managers, etc. This code is currently highly specific to comprehensions and would need significant refactoring for this purpose.
**Next Steps**
1. Add support for torch._dynamo.config.nested_graph_breaks=True. In the current implementation, we fall back to skipping the entire frame when nested_graph_breaks=True. As a follow-up, we would like to have this functionality supported.
2. Add support for set comprehensions. We currently only support list and dict comprehensions.
Fixes pytorch#171822
Pull Request resolved: pytorch#173558
Approved by: https://github.com/williamwen42
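The inlining behavior referenced in the comprehension PR (PEP 709) can be observed with the standard `dis` module. The sketch below is not Dynamo's `get_comprehension_bytecode_prefix`; it only illustrates the same idea of building a dummy comprehension and reading its opnames. The helper names `dummy_comprehension_opnames` and `comprehension_is_inlined` are hypothetical.

```python
import dis
import sys


def dummy_comprehension_opnames():
    """Opnames of a function whose body is a single list comprehension."""

    def dummy(items):
        return [item for item in items]

    return [inst.opname for inst in dis.get_instructions(dummy)]


def comprehension_is_inlined(opnames):
    # With PEP 709 (CPython 3.12+), the comprehension is inlined, so BUILD_LIST
    # shows up in the enclosing function's own bytecode. On older versions the
    # comprehension lives in a separate code object and BUILD_LIST is absent here.
    return "BUILD_LIST" in opnames


opnames = dummy_comprehension_opnames()
print(sys.version_info[:2], "inlined:", comprehension_is_inlined(opnames))
```

Reading the opnames from a live dummy function, rather than hard-coding them, is what lets this kind of check follow bytecode changes between Python versions.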
Also updated test logic, as OpSchema was replaced by OpSignature for onnx functions. Required for pytorch#165083 Pull Request resolved: pytorch#173828 Approved by: https://github.com/titaiwangms, https://github.com/malfet
Pull Request resolved: pytorch#174213 Approved by: https://github.com/liangel-02
Pull Request resolved: pytorch#174214 Approved by: https://github.com/liangel-02 ghstack dependencies: pytorch#174213
Pull Request resolved: pytorch#174215 Approved by: https://github.com/liangel-02 ghstack dependencies: pytorch#174213, pytorch#174214
Pull Request resolved: pytorch#174216 Approved by: https://github.com/liangel-02 ghstack dependencies: pytorch#174213, pytorch#174214, pytorch#174215
Pull Request resolved: pytorch#174217 Approved by: https://github.com/bdhirsh, https://github.com/atalman ghstack dependencies: pytorch#174213, pytorch#174214, pytorch#174215, pytorch#174216
…els (pytorch#174316) Pull Request resolved: pytorch#174316 Approved by: https://github.com/albanD
…t a security issues (pytorch#174318) Pull Request resolved: pytorch#174318 Approved by: https://github.com/albanD ghstack dependencies: pytorch#174316
Add a test for pytorch#158029 Pull Request resolved: pytorch#174225 Approved by: https://github.com/eqy, https://github.com/galv, https://github.com/eellison, https://github.com/BoyuanFeng
…172160) The `same_meta` function was missing checks for `is_conj()` and `is_neg()` tensor flags. This caused `remove_noop_ops` to incorrectly remove `clone` operations that were resolving conjugation (from `resolve_conj()`). When complex convolution is compiled, the C++ implementation calls `resolve_conj()` before `view_as_real()`. The `resolve_conj()` traces to a `clone` operation. Without the conjugate bit check, this clone was being removed as a "no-op", causing `view_as_real` to be called on a still-conjugated tensor, which fails with: "view_as_real doesn't work on unresolved conjugated tensors" Added regression tests: - test_complex_real_imag_conj: tests real/imag extraction from conj tensors - test_complex_conv2d_conj: tests complex convolution with conj inputs Fixes pytorch#171665 Pull Request resolved: pytorch#172160 Approved by: https://github.com/eellison
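Why the missing flag checks matter for the `same_meta` fix can be sketched without PyTorch. The `TensorMeta` class and both comparison functions below are hypothetical simplifications, not Inductor's actual `same_meta`; they only model shape, dtype, and the two flags named in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorMeta:
    # Hypothetical stand-in for the tensor metadata that `same_meta` compares.
    shape: tuple
    dtype: str
    is_conj: bool = False
    is_neg: bool = False


def same_meta_without_flags(a, b):
    # Simplified pre-fix behavior: the conj/neg bits are ignored.
    return a.shape == b.shape and a.dtype == b.dtype


def same_meta_with_flags(a, b):
    # Simplified post-fix behavior: the conj and neg bits must match as well.
    return (
        same_meta_without_flags(a, b)
        and a.is_conj == b.is_conj
        and a.is_neg == b.is_neg
    )


conj_input = TensorMeta(shape=(2, 3), dtype="complex64", is_conj=True)
resolved = TensorMeta(shape=(2, 3), dtype="complex64", is_conj=False)  # after the clone

print(same_meta_without_flags(conj_input, resolved))  # True: the clone looks like a no-op
print(same_meta_with_flags(conj_input, resolved))     # False: the clone is kept
```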
Fixes pytorch#134173
NOTE: Uncommenting the following
https://github.com/pytorch/pytorch/blob/d8039170f00cf084e4af91f1db84497bfccdf149/test/inductor/test_compiled_autograd.py#L5215
https://github.com/pytorch/pytorch/blob/d8039170f00cf084e4af91f1db84497bfccdf149/test/inductor/test_compiled_autograd.py#L5313
and running `python test/inductor/test_compiled_autograd.py TestAutogradWithCompiledAutograd.test_graph_save_on_cpu` fails for a different reason
```
torch._dynamo.exc.Unsupported: Attempted to call function marked as skipped
  Explanation: Dynamo developers have intentionally marked that the function `save_on_cpu.__init__.<locals>.unpack_from_cpu` in file `/opt/pytorch/pytorch/torch/autograd/graph.py` should not be traced.
  Hint: Avoid calling the function `save_on_cpu.__init__.<locals>.unpack_from_cpu`.
  Hint: Apply `@torch._dynamo.dont_skip_tracing` to the function `save_on_cpu.__init__.<locals>.unpack_from_cpu` to force tracing into the function. More graph breaks may occur as a result of attempting to trace into the function.
  Hint: Please file an issue to PyTorch.
```
Pull Request resolved: pytorch#172578
Approved by: https://github.com/ezyang
Changes:
- Add `launch_pdl: True` to combo kernel triton_meta when PDL is enabled
- Fix missing `shape=()` parameter in `_handle_pdl_after_load()`
- Add tests for PDL + combo kernel integration
See example kernel: https://gist.github.com/eellison/50fea54d1096b0ece3c97f6e8ee02d5b
Written with Claude.
Pull Request resolved: pytorch#174232
Approved by: https://github.com/karthickai, https://github.com/v0i0
# Motivation
Move EmptyTensor to PyTorch for better maintenance.
# Additional Context
The pin commit intel/torch-xpu-ops@83c9813 is from a viable strict [branch](https://github.com/intel/torch-xpu-ops/commits/viable/strict/). The flow is to first land this PR, then land intel/torch-xpu-ops#2836, and finally update the pin commit from the main branch.
Pull Request resolved: pytorch#174194
Approved by: https://github.com/EikanWang
Fixes pytorch#173995 Pull Request resolved: pytorch#174009 Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/albanD
lintrunner now provides official riscv64 wheels from 0.13.0, so it can be safely enabled on riscv64 Pull Request resolved: pytorch#173993 Approved by: https://github.com/Skylion007, https://github.com/cyyever
… for external template buffers (pytorch#174148) Design doc: pytorch/helion#1346. Add two extension points for external template buffers (e.g. Helion kernel): - `codegen_template_override()` in SIMDKernel - allows custom template code generation - `emit_kernel_override()` in Kernel - allows custom kernel emission to wrapper These hooks enable external template buffers to integrate with Inductor's template fusion without modifying core Inductor code. After this PR, we will add Helion dynamo variable and HOP handling in pytorch/helion#1351. Pull Request resolved: pytorch#174148 Approved by: https://github.com/jansel
…#174077) Addresses the TODO in `test_local_tensor.py` by adding view ops testing for LocalTensor Pull Request resolved: pytorch#174077 Approved by: https://github.com/dzmitry-huba
Fixes pytorch#166387
As PyTorch moved to the [new API](https://docs.pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices) for TF32, many libraries such as transformers started using it. Inductor still uses the old `allow_tf32` flag, so when the flag is set through the new API and then read through the old API, we see an error like:
ERROR: PyTorch is checking whether allow_tf32_new is enabled for cuBlas matmul,Current status indicate that you have used mix of the legacy and new APIs to set the TF32 status for cublas matmul. We suggest only using the new API to set the TF32 flag. See also: https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices
This PR addresses this by using the new API in Inductor. So far I have only made the changes needed for the PyTorch issue linked above and for transformers; the traceback is in a [comment](huggingface/transformers#42371 (comment)) on huggingface/transformers#42371. Can I start changing more `allow_tf32` uses in Inductor?
Pull Request resolved: pytorch#173731
Approved by: https://github.com/jansel, https://github.com/isuruf
Co-authored-by: Isuru Fernando <isuruf@gmail.com>
This fixes pytorch#173879 by using the proposed formula and indicating the shape of the result.
Examples showing that the shape indication is correct:
```python
>>> import torch
>>> a = torch.randn(2, 3, 4, 5, 6)
>>> b = torch.randn(5, 6, 7)
>>> torch.tensordot(a, b, dims=2).shape
torch.Size([2, 3, 4, 7])
>>> a.shape[:-2]
torch.Size([2, 3, 4])
>>> b.shape[2:]
torch.Size([7])
>>> a = torch.randn(2, 3, 4)
>>> b = torch.randn(2, 3, 4, 5, 6)
>>> torch.tensordot(a, b, dims=3).shape
torch.Size([5, 6])
>>> a.shape[:-3]
torch.Size([])
>>> b.shape[3:]
torch.Size([5, 6])
```
Pull Request resolved: pytorch#173893
Approved by: https://github.com/mikaylagawarecki
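The shape rule demonstrated in the tensordot examples (the leading dimensions of `a` followed by the trailing dimensions of `b`) can be written as a small pure-Python helper. `tensordot_result_shape` is a hypothetical name and covers only the integer `dims` case; it is a sketch of the rule, not part of PyTorch.

```python
def tensordot_result_shape(a_shape, b_shape, dims):
    """Result shape of tensordot(a, b, dims=d) for an integer d."""
    a_shape, b_shape = tuple(a_shape), tuple(b_shape)
    if dims < 0 or dims > min(len(a_shape), len(b_shape)):
        raise ValueError("dims must be between 0 and the smaller tensor rank")
    split = len(a_shape) - dims  # avoids the a_shape[:-0] pitfall when dims == 0
    if a_shape[split:] != b_shape[:dims]:
        raise ValueError("contracted dimensions do not match")
    # Equivalent to a.shape[:-dims] + b.shape[dims:] for dims > 0.
    return a_shape[:split] + b_shape[dims:]


print(tensordot_result_shape((2, 3, 4, 5, 6), (5, 6, 7), 2))  # (2, 3, 4, 7)
print(tensordot_result_shape((2, 3, 4), (2, 3, 4, 5, 6), 3))  # (5, 6)
```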
As the title Pull Request resolved: pytorch#173199 Approved by: https://github.com/atalman, https://github.com/malfet
Use torch._check instead of direct comparison in squareCheckInputs to defer validation to runtime for unbacked symbolic dimensions. Also use sym_min/sym_max in linalg_lu_factor_ex_meta and make_contiguous_strides_for to handle symbolic dimensions properly.
This enables the following 18 ops to work with unbacked symbolic dimensions:
- cholesky_inverse
- linalg.cholesky, linalg.cholesky_ex
- linalg.det, linalg.slogdet
- linalg.eig, linalg.eigh, linalg.eigvals, linalg.eigvalsh
- linalg.inv, linalg.inv_ex
- linalg.ldl_factor, linalg.ldl_factor_ex
- linalg.lu_factor, linalg.lu_factor_ex
- lu, triangular_solve
- matrix_exp
Pull Request resolved: pytorch#173399
Approved by: https://github.com/aorenste
…ests (pytorch#171625) Otherwise some paddings seem to fail pattern-match Pull Request resolved: pytorch#171625 Approved by: https://github.com/ngimel, https://github.com/eellison
…k_size (pytorch#174285)
# Motivation
Fix pytorch#174268 introduced by pytorch#171671, which breaks XPU CI.
Pull Request resolved: pytorch#174285
Approved by: https://github.com/desertfire, https://github.com/jansel
… inputs (pytorch#174334) When caching an AOTAutograd entry for a model where an output is a view of an input with dynamic shapes, pickle fails because view_meta_sequence contains SymInt references that create a chain to unpicklable objects (WeakValueDictionary). The fix clears view_meta_sequence in make_runtime_safe() when it has symbolic inputs. This is safe because gen_alias_from_base() already skips view replay for symbolic inputs and falls back to as_strided(). This PR was authored with Claude. Fixes: pytorch#174299 Pull Request resolved: pytorch#174334 Approved by: https://github.com/aorenste
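The pickling failure and the clearing fix from the AOTAutograd caching PR can be reproduced in miniature with the standard library. The dict-based cache entry and this `make_runtime_safe` function are hypothetical stand-ins for illustration; only the field name `view_meta_sequence` comes from the description.

```python
import pickle
import weakref


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def make_runtime_safe(entry, has_symbolic_inputs):
    # Hypothetical simplification: drop the view replay metadata when it could
    # hold symbolic references, since replay is skipped for symbolic inputs anyway.
    safe = dict(entry)
    if has_symbolic_inputs:
        safe["view_meta_sequence"] = None
    return safe


entry = {
    "output_is_view_of_input": 0,
    # Stand-in for SymInt references that eventually reach a WeakValueDictionary.
    "view_meta_sequence": [weakref.WeakValueDictionary()],
}

print(is_picklable(entry))                           # False
print(is_picklable(make_runtime_safe(entry, True)))  # True
```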
Reduced Dynamo compile time from 14.71 seconds to 13.896 seconds.
1) Cache only on the Source object, which makes lookup faster.
2) Extend the variable tracker cache to lazy variable trackers. Earlier, we were creating duplicate copies of the VT and unnecessarily calling the `__call__` method of LazyVT to construct the variable tracker many times. Now, the cache just returns the cached lazy VT, and if it's realized, we just use the realized VT.
Pull Request resolved: pytorch#174242
Approved by: https://github.com/Lucaskabela, https://github.com/williamwen42
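The two caching ideas in the compile-time PR (key only on the source, and reuse the lazy tracker instead of rebuilding it) can be sketched in plain Python. `LazyTracker` and `TrackerCache` are hypothetical stand-ins, not Dynamo classes, and sources are modeled as plain strings.

```python
class LazyTracker:
    """Hypothetical stand-in for a lazy variable tracker."""

    constructions = 0  # counts how often the expensive realization runs

    def __init__(self, source):
        self.source = source
        self._realized = None

    def realize(self):
        if self._realized is None:
            LazyTracker.constructions += 1
            self._realized = f"tracker({self.source})"
        return self._realized


class TrackerCache:
    def __init__(self):
        self._by_source = {}

    def lookup_or_create(self, source):
        # Key only on the source, and hand back the same lazy tracker each time.
        tracker = self._by_source.get(source)
        if tracker is None:
            tracker = LazyTracker(source)
            self._by_source[source] = tracker
        return tracker


cache = TrackerCache()
first = cache.lookup_or_create("L['x']")
second = cache.lookup_or_create("L['x']")
first.realize()
second.realize()
print(first is second, LazyTracker.constructions)  # True 1
```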
Fixes pytorch#174296. Pull Request resolved: pytorch#174300 Approved by: https://github.com/atalman, https://github.com/malfet
Differential Revision: D92291036 Pull Request resolved: pytorch#174302 Approved by: https://github.com/zhxchen17
Purely claude-coded using metal-kernel writing skill.
Performance comparison collected using `python test/bench_mps_ops.py grid_sampler_2d`

| Benchmark | MPSGraph (us) | Metal Shader (us) |
|---|---|---|
| grid_sample-bilinear-64x64 (torch.float16) | 114.7 | 120.0 |
| grid_sample-bilinear-128x128 (torch.float16) | 180.4 | 151.1 |
| grid_sample-bilinear-256x256 (torch.float16) | 423.5 | 364.9 |
| grid_sample-bilinear-512x512 (torch.float16) | 2393.1 | 1145.3 |
| grid_sample-nearest-64x64 (torch.float16) | 107.7 | 112.3 |
| grid_sample-nearest-128x128 (torch.float16) | 131.6 | 124.3 |
| grid_sample-nearest-256x256 (torch.float16) | 215.1 | 204.2 |
| grid_sample-nearest-512x512 (torch.float16) | 1089.2 | 565.0 |
| grid_sample-bilinear-64x64 (torch.float32) | 117.4 | 139.5 |
| grid_sample-bilinear-128x128 (torch.float32) | 165.4 | 188.9 |
| grid_sample-bilinear-256x256 (torch.float32) | 462.0 | 398.8 |
| grid_sample-bilinear-512x512 (torch.float32) | 4311.3 | 1483.5 |
| grid_sample-nearest-64x64 (torch.float32) | 113.6 | 100.3 |
| grid_sample-nearest-128x128 (torch.float32) | 134.6 | 122.1 |
| grid_sample-nearest-256x256 (torch.float32) | 263.4 | 208.6 |
| grid_sample-nearest-512x512 (torch.float32) | 2289.0 | 896.6 |
| grid_sample-bilinear-64x64 (torch.bfloat16) | 114.3 | 132.9 |
| grid_sample-bilinear-128x128 (torch.bfloat16) | 152.4 | 182.5 |
| grid_sample-bilinear-256x256 (torch.bfloat16) | 343.4 | 369.3 |
| grid_sample-bilinear-512x512 (torch.bfloat16) | 2333.9 | 1155.2 |
| grid_sample-nearest-64x64 (torch.bfloat16) | 107.5 | 106.1 |
| grid_sample-nearest-128x128 (torch.bfloat16) | 130.4 | 114.0 |
| grid_sample-nearest-256x256 (torch.bfloat16) | 211.9 | 190.3 |
| grid_sample-nearest-512x512 (torch.bfloat16) | 795.9 | 540.7 |

TODOs:
- Code sharing for interpolation mode between upsample and grid-sampler

Fixes pytorch#174339 and pytorch#125098
Pull Request resolved: pytorch#174343
Approved by: https://github.com/manuelcandales
ghstack dependencies: pytorch#174676, pytorch#174677, pytorch#174678
Pull Request resolved: pytorch#174606 Approved by: https://github.com/jansel
As the title. Pull Request resolved: pytorch#174059 Approved by: https://github.com/EikanWang, https://github.com/albanD
Pull Request resolved: pytorch#173481 Approved by: https://github.com/soulitzer, https://github.com/fegin
Add XPU_DRIVER activity to the profiler so it reports XPU L0 driver activities. It is counterpart to CUDA_DRIVER activity. Updates the third_party/kineto submodule. Add test. Pull Request resolved: pytorch#172940 Approved by: https://github.com/guangyey, https://github.com/sraikund16
As the title suggests, for better documentation. Pull Request resolved: pytorch#174453 Approved by: https://github.com/EikanWang
…ch#174705)
This is to fix the CI: https://github.com/pytorch/pytorch/actions/runs/21844168344/job/63036440860?pr=174628
Pull Request resolved: pytorch#174705
Approved by: https://github.com/oulgen
fix pytorch#168329 Pull Request resolved: pytorch#174675 Approved by: https://github.com/Skylion007
More optimizations will follow! This one is simple: if we are evaluating a+b+c+... > 0 and all terms are symbols/constants with a value range > 0, then return true before calling into the expensive static evaluator.
***Results***
Export time: 5m4.868s -> 3m4.165s (two minutes saved)
Pull Request resolved: pytorch#174615
Approved by: https://github.com/Lucaskabela
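The positive-sum fast path described above can be sketched as follows. This is a minimal model that assumes terms are either numeric constants or symbol names with a known lower bound; `sum_is_positive_fast_path`, `evaluate_sum_positive`, and the `slow_evaluator` callback are hypothetical names, not the symbolic shapes API.

```python
def sum_is_positive_fast_path(terms, lower_bounds):
    """Return True if every term is provably > 0, else None (unknown)."""
    if not terms:
        return None
    for term in terms:
        if isinstance(term, (int, float)):
            lower = term
        else:
            lower = lower_bounds.get(term)  # None if the symbol has no known range
        if lower is None or lower <= 0:
            return None
    return True


def evaluate_sum_positive(terms, lower_bounds, slow_evaluator):
    # Try the cheap check first; fall back to the expensive evaluator otherwise.
    if sum_is_positive_fast_path(terms, lower_bounds):
        return True
    return slow_evaluator(terms)


slow_calls = []


def slow_evaluator(terms):
    # Placeholder for an expensive static evaluation; it just records the call.
    slow_calls.append(tuple(terms))
    return False


bounds = {"s0": 2, "s1": 1, "u0": 0}
print(evaluate_sum_positive(["s0", "s1", 3], bounds, slow_evaluator))  # True, no slow call
print(evaluate_sum_positive(["s0", "u0"], bounds, slow_evaluator))     # falls back
print(len(slow_calls))                                                 # 1
```

The fast path only ever answers "true"; any term it cannot bound sends the whole expression to the slow evaluator, so it cannot change results, only skip work.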
…4610)
There is an interesting use case to call out here: the pytree registration of FlexAttention's BlockMask contains an arbitrary user-defined mask_mod function. This becomes a problem when exporting via dynamo_graph_capture_for_export, because we re-run the model code multiple times and the output bytecode contains logic to reconstruct the user-defined mask_mod. This doesn't work with aot_export's pytree thunkify logic, which receives a spec that has a different id for the mask_mod (because it was reconstructed multiple times). This was not a problem for torch.compile because there we always just re-run the inner graph module without input/output processing. I think this is a result of our independent APIs each working correctly while the integration point between them (torch IR API + aot_autograd) is a little awkward.
The fix wraps the user-defined function in a _MaskMod wrapper that does value-based checking instead of identity checking, so that two different reconstructions of mask_mod still compare equal. I had to special-case _MaskMod for the old export path since torch.export.export is still on _dynamo_graph_capture_for_export.
Pull Request resolved: pytorch#174610
Approved by: https://github.com/zhxchen17, https://github.com/drisspg
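Value-based equality for a wrapped callable, as described for the mask_mod wrapper, might look like the sketch below. The PR does not say how `_MaskMod` compares values; comparing code objects and closure contents is an assumption made here, and `ValueEqualCallable` is a hypothetical name.

```python
class ValueEqualCallable:
    """Hypothetical wrapper that compares wrapped functions by value, not identity."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args, **kwargs):
        return self.fn(*args, **kwargs)

    def _key(self):
        # Assumption: two functions are "the same" if they share bytecode
        # and captured closure values.
        closure = self.fn.__closure__ or ()
        return (self.fn.__code__, tuple(cell.cell_contents for cell in closure))

    def __eq__(self, other):
        if not isinstance(other, ValueEqualCallable):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self.fn.__code__)


def make_causal_mask_mod():
    # Each call builds a new function object, mimicking repeated reconstruction.
    def causal(b, h, q_idx, kv_idx):
        return q_idx >= kv_idx

    return causal


first, second = make_causal_mask_mod(), make_causal_mask_mod()
print(first is second)                                          # False
print(ValueEqualCallable(first) == ValueEqualCallable(second))  # True
```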
fix hipify import Differential Revision: D92366141 Pull Request resolved: pytorch#174706 Approved by: https://github.com/drisspg, https://github.com/shunting314, https://github.com/mlazos
Update the torch-xpu-ops commit to [intel/torch-xpu-ops@077a6c](intel/torch-xpu-ops@077a6ce), which includes:
- Adjust layer_norm_backward_kernel interface to match that of PyTorch
- Fix incorrect Tensor Size for NestedTensor QKV Transform
- Support calling oneCCL AllToAll API directly
- Add NaN input checks to prevent false singular matrix errors in oneMKL linear algebra operations
Pull Request resolved: pytorch#174591
Approved by: https://github.com/EikanWang
Pull Request resolved: pytorch#174466 Approved by: https://github.com/Skylion007, https://github.com/zpcore, https://github.com/wconstab
The recursion limit has to be reset before exiting `subTest`, or it may fail inside pytest due to the low limit set in the test.
Pull Request resolved: pytorch#174693
Approved by: https://github.com/Lucaskabela, https://github.com/Skylion007
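The pattern from the recursion limit fix can be sketched with `unittest` alone: lower the limit inside `subTest`, and restore it in a `finally` block before the `subTest` context exits. The test class, the helper names, and the headroom of 100 frames are all illustrative choices, not taken from the PyTorch test.

```python
import sys
import unittest


def current_depth():
    """Count Python frames on the current stack."""
    depth = 0
    frame = sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def recurse_forever(n=0):
    return recurse_forever(n + 1)


class RecursionLimitTest(unittest.TestCase):
    def test_low_limit(self):
        original = sys.getrecursionlimit()
        with self.subTest("low recursion limit"):
            # Leave some headroom above the current depth (100 is arbitrary).
            sys.setrecursionlimit(current_depth() + 100)
            try:
                with self.assertRaises(RecursionError):
                    recurse_forever()
            finally:
                # Restore the limit while still inside subTest, so the
                # bookkeeping on exit does not run under the low limit.
                sys.setrecursionlimit(original)
        self.assertEqual(sys.getrecursionlimit(), original)


limit_before = sys.getrecursionlimit()
result = unittest.TestResult()
RecursionLimitTest("test_low_limit").run(result)
print("passed:", result.wasSuccessful())
```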
copies the step in _sharding_prop.py to properly cache? https://github.com/pytorch/pytorch/blob/ed0b1fec7e3b9e3b8d767506696506059b1ad2b0/torch/distributed/tensor/_sharding_prop.py#L500 Pull Request resolved: pytorch#174616 Approved by: https://github.com/wconstab
…74447) Pull Request resolved: pytorch#174447 Approved by: https://github.com/jansel ghstack dependencies: pytorch#173685
Pull Request resolved: pytorch#174155 Approved by: https://github.com/jansel ghstack dependencies: pytorch#174154
This PR fixes `RuntimeError: CUDA driver error: invalid argument` when combo kernels have large ynumels that exceed grid.y limit. Added y/z grid overflow handling similar to `Grid2DWithYZOverflow` This issue happens when `combo_kernel_per_subkernel_blocks = False` (which is False by default). After the flatten dispatch PR pytorch#172527 is added, `combo_kernel_per_subkernel_blocks = True` will make this issue obsolete. Pull Request resolved: pytorch#174354 Approved by: https://github.com/mlazos
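The y/z overflow idea in the combo kernel fix can be sketched on the host side in plain Python. CUDA limits `gridDim.y` and `gridDim.z` to 65535; the splitting scheme and the names `split_y_grid` and `linear_y_index` are assumptions for illustration and are not copied from Inductor's `Grid2DWithYZOverflow`.

```python
MAX_GRID_YZ = 65535  # CUDA limit for gridDim.y and gridDim.z


def ceildiv(a, b):
    return -(-a // b)


def split_y_grid(y_blocks, max_y=MAX_GRID_YZ):
    """Split a count of y blocks into (grid_y, grid_z) with grid_y <= max_y."""
    grid_z = max(1, ceildiv(y_blocks, max_y))
    grid_y = max(1, ceildiv(y_blocks, grid_z))
    return grid_y, grid_z


def linear_y_index(block_y, block_z, grid_y):
    # Kernel-side reconstruction of the original y block index. Indices that
    # are >= y_blocks must be masked out, since grid_y * grid_z may overshoot.
    return block_z * grid_y + block_y


for y_blocks in (1, 65535, 65536, 100_000):
    grid_y, grid_z = split_y_grid(y_blocks)
    print(y_blocks, "->", (grid_y, grid_z))
```

This sketch keeps `grid_z` within the limit only while `y_blocks` is at most 65535 squared, which is assumed to be enough for illustration.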
…170575) Pull Request resolved: pytorch#170575 Approved by: https://github.com/eellison
…#174533) Differential Revision: D92629416 Support tlparse's fx_graph_runnable with nested user defined triton kernels and constexprs. Also fixes some edge cases with user defined triton kernels. Pull Request resolved: pytorch#174533 Approved by: https://github.com/eellison
This converts NanCheck into an op so it can be used from outside of ProcessGroupNCCL. This can be used from torchcomms.
Misc changes:
* add CPU implementation
* use CUDA_KERNEL_ASSERT macro so it logs a more helpful message when nancheck fires
Test plan:
CI
```
$ python -c "import torch; torch.ops.c10d.check_for_nan(torch.tensor(float('nan'), device='cuda')); torch.cuda.synchronize()" (pytorch-3.12)
/home/tristanr/pytorch/torch/csrc/distributed/c10d/NanCheck.cu:217: checkForNaN: block: [0,0,0], thread: [0,0,0] Assertion `!isnan(tailPtr[threadIdx.x])` failed.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/tristanr/pytorch/torch/cuda/__init__.py", line 1165, in synchronize
return torch._C._cuda_synchronize()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: CUDA error: device-side assert triggered
Search for `cudaErrorAssert' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
Pull Request resolved: pytorch#174736
Approved by: https://github.com/kwen2501, https://github.com/allenwang28
# Conflicts:
#	.ci/docker/requirements-ci.txt
#	requirements-build.txt
#	torch/utils/hipify/cuda_to_hip_mappings.py
Jenkins build for 241aa87f0fde758bc85bd988fb3812d02a1f43a2 commit finished as FAILURE
Jenkins build for 3ee04a9830bea722779f6591ffb9a2386afcfc14 commit finished as FAILURE
rocm_base: fe101ec