Use weakref in storing tensors as keys (follow-up to #111470) #112076

pearu · 2023-10-25T20:32:43Z

This PR addresses the discussion items in #111470 (comment), that is,

- use weakref when storing tensors as keys,
- add `storage_offset` to the key data,
- and revise the description of the `TensorAsKey` utility.

pytorch-bot · 2023-10-25T20:32:48Z

✅ No Failures

As of commit 4902f95 with merge base f5088d2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

albanD · 2023-10-26T19:47:28Z

torch/sparse/_triton_ops.py

        a = non_zero_row_indices * (Ms * N)
        r_offsets = (a + b).view(-1)
-        c_indices = crow_indices
+        # crow_indices consitutes a part of a key in lru_cache as well


I'm confused about what this means tbh.

pearu · 2023-10-26

The `_bsr_scatter_mm_indices_data` function is defined as:

    @lru_cache(maxsize=128)
    def _bsr_scatter_mm_indices_data(indices_format, M, K, N, Ms, Ks, nbatches, SPLIT_N, crow_indices_as_key, col_indices_as_key):
        ...

so the LRU cache key consists of all arguments to the function:

    lru_cache_key = (indices_format, M, K, N, Ms, Ks, nbatches, SPLIT_N, crow_indices_as_key, col_indices_as_key)

and the cached value is the result of the function. Note that the argument `crow_indices_as_key` is `TensorAsKey(crow_indices)`, while the result of the function is the tuple `(indices_format, c_indices, r_offsets, q_offsets)`, where `c_indices` was originally `crow_indices`.
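To illustrate the point about keys and values, here is a minimal sketch using only `functools.lru_cache`. The function `indices_data` and its string arguments are hypothetical stand-ins, not the PyTorch code: every argument contributes to the cache key, and the return value is what the cache stores.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def indices_data(indices_format, M, K, N, crow_key, col_key):
    # Hypothetical stand-in for an expensive computation.
    # The returned tuple is the value stored in the cache.
    return (indices_format, M * K * N)


indices_data("scatter_mm", 4, 4, 4, "crow-a", "col-a")  # miss: new key
indices_data("scatter_mm", 4, 4, 4, "crow-a", "col-a")  # hit: identical arguments
indices_data("scatter_mm", 4, 4, 4, "crow-b", "col-a")  # miss: one argument differs

info = indices_data.cache_info()
print(info.hits, info.misses, info.currsize)  # prints: 1 2 2
```

Changing any single argument produces a new cache entry, which is why whatever stands in for `crow_indices` in the argument list has to be hashable and comparable.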

So the `crow_indices` tensor is stored both in the key and in the value of the LRU cache. When the `crow_indices` variable goes out of scope, the tensor ought to be garbage collected (assuming the key holds it only through a weakref instance). However, it cannot be collected while the cached value still holds a strong reference to it. Therefore, `crow_indices` must be cloned before it is stored in the value: the original tensor can then be collected as soon as the variable goes out of scope, and the clone is released later, when the LRU cache overfills and evicts the entry.
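The lifetime argument above can be reproduced without PyTorch. The following is a rough sketch under stated assumptions: `FakeTensor` and `WeakKey` are hypothetical stand-ins (the key here hashes by `id`, which is only a placeholder for real tensor metadata), and the two cached functions differ only in whether the cached value holds a strong reference to the original object.

```python
import gc
import weakref
from functools import lru_cache


class FakeTensor:
    """Hypothetical stand-in for a tensor; instances support weak references."""

    def __init__(self, name):
        self.name = name


class WeakKey:
    """Hashable wrapper that references its target only weakly."""

    def __init__(self, obj):
        self._ref = weakref.ref(obj)
        self._hash = id(obj)  # placeholder for real key data

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, WeakKey) and self._hash == other._hash

    @property
    def obj(self):
        return self._ref()


@lru_cache(maxsize=128)
def leaky(key):
    # The cached value IS the original object: a strong reference.
    return key.obj


@lru_cache(maxsize=128)
def non_leaky(key):
    # The cached value is derived data that does not reference the object.
    return key.obj.name.upper()


def still_alive_after_del(cached_fn):
    t = FakeTensor("crow_indices")
    probe = weakref.ref(t)
    cached_fn(WeakKey(t))
    del t
    gc.collect()
    return probe() is not None


leak = still_alive_after_del(leaky)
no_leak = still_alive_after_del(non_leaky)
print(leak, no_leak)  # prints: True False
```

In the `leaky` case the weak key makes no difference, because the cache value keeps the object alive until the entry is evicted.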

pearu · 2023-10-26

Hmm, after thinking about it, I think the `crow_indices` tensor in the value part of the LRU cache ought to be held through a weakref as well. Then we don't have to wait for the cache to overfill before the tensor is garbage collected. I'll fix it unless I'm missing something.

pearu · 2023-10-29

This code comment has now been removed. It turns out that using `crow_indices` and `col_indices` as keys is not sufficient, because they have a shorter lifetime than the original sparse compressed tensor. The original tensor is now wrapped with `TensorAsKey` so that the cached item exists for the lifetime of the original tensor.
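As a rough illustration of the ideas listed in this PR (a weak reference to the tensor plus `storage_offset` in the key data), a key wrapper could be sketched as below. This is not the actual `TensorAsKey` implementation: `WeakTensorKey` and `FakeTensor` are hypothetical names, the stand-in only mimics the `data_ptr()`, `storage_offset()`, `shape`, `stride()` and `dtype` accessors, and the rule that a dead key equals only itself is an assumption made for this sketch.

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in exposing the accessors used to build the key."""

    def __init__(self, data_ptr, storage_offset, shape, stride, dtype="int64"):
        self._data_ptr = data_ptr
        self._storage_offset = storage_offset
        self.shape = shape
        self._stride = stride
        self.dtype = dtype

    def data_ptr(self):
        return self._data_ptr

    def storage_offset(self):
        return self._storage_offset

    def stride(self):
        return self._stride


class WeakTensorKey:
    """Hashable key that identifies a tensor without keeping it alive."""

    def __init__(self, obj):
        self._obj_ref = weakref.ref(obj)
        # Key data is computed once, while the object is still alive.
        self.key = (obj.data_ptr(), obj.storage_offset(),
                    tuple(obj.shape), tuple(obj.stride()), obj.dtype)
        self._hash = hash(self.key)

    @property
    def obj(self):
        # Returns None once the referenced object has been collected.
        return self._obj_ref()

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, WeakTensorKey):
            return False
        if self.obj is None or other.obj is None:
            # Assumption of this sketch: a dead key matches only itself.
            return self is other
        return self.key == other.key


# Field values are arbitrary; in this stand-in they are independent of each other.
a = FakeTensor(0x1000, 0, (4,), (1,))
same_view = FakeTensor(0x1000, 0, (4,), (1,))
shifted = FakeTensor(0x1000, 2, (4,), (1,))

ka, ks, kshift = WeakTensorKey(a), WeakTensorKey(same_view), WeakTensorKey(shifted)
same_equal = ka == ks        # identical key data, both objects alive
offset_equal = ka == kshift  # differs only in storage_offset

del same_view
gc.collect()
dead_equal = ka == ks        # the object behind ks is gone
print(same_equal, offset_equal, dead_equal)  # prints: True False False
```

Precomputing the hash keeps the key usable inside a dictionary or `lru_cache` even after the referenced object has been collected.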

Use weakref in storing tensors as keys (follow-up to #111470)

8bfe69e

pytorch-bot bot added release notes: sparse release notes category labels Oct 25, 2023

pytorchbot added the open source label Oct 25, 2023

pearu added the topic: not user facing topic category label Oct 25, 2023

pearu mentioned this pull request Oct 25, 2023

Use lru_cache to cache indices data for bsr_scatter_mm. #111470

Closed

cpuhrsch requested a review from albanD October 25, 2023 21:01

pearu added a commit that referenced this pull request Oct 26, 2023

Use weakref in storing tensors as keys (follow-up to #111470)

5c1cc34

ghstack-source-id: 387ab65 Pull Request resolved: #112076

pearu mentioned this pull request Oct 26, 2023

Eliminate try-catch block around triton::_triton_bsr_dense_mm_out call. #112154

Closed

albanD reviewed Oct 26, 2023

pearu added a commit that referenced this pull request Oct 27, 2023

Use weakref in storing tensors as keys (follow-up to #111470)

38a19f2

ghstack-source-id: 8045429 Pull Request resolved: #112076

pearu added a commit that referenced this pull request Oct 27, 2023

Use weakref in storing tensors as keys (follow-up to #111470)

1414ef8

ghstack-source-id: 0e4f72b Pull Request resolved: #112076

pearu added a commit that referenced this pull request Oct 29, 2023

Use weakref in storing tensors as keys (follow-up to #111470)

c2fe01e

ghstack-source-id: 112bf9e Pull Request resolved: #112076

pearu mentioned this pull request Oct 29, 2023

Fix scatter_mm kernel failure on non-contiguous tensor arguments #112337

Closed

pearu requested a review from albanD October 29, 2023 09:53

pearu mentioned this pull request Oct 30, 2023

Memory leak from bsr_scatter_mm_indices_data argument cache #112301

Closed

cpuhrsch approved these changes Oct 30, 2023

pytorchmergebot added the Merged label Oct 30, 2023

pytorchmergebot closed this in cf6041e Oct 30, 2023

facebook-github-bot deleted the gh/pearu/126/head branch November 3, 2023 14:27
