Conversation

amjames (Collaborator) commented Aug 9, 2022

Stack from ghstack (oldest at bottom):

Applies the algorithm for re-batching compressed indices to avoid n-batch kernel launches. This is an optimization for `dim() >= 3` inputs and does not change behavior in any way.
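The description is terse, so here is a rough illustration of the re-batching idea, not the actual ATen implementation: compute compressed row indices once for the batch-flattened matrix with a single kernel launch, then recover each batch's `crow_indices` by slicing and subtracting a per-batch `nnz` offset. The helper name and list-based representation here are hypothetical stand-ins for the real tensor code.

```python
def rebatch_crow_indices(crow_flat, n_batch, n_rows, nnz):
    """Sketch of re-batching compressed row indices (hypothetical helper).

    `crow_flat` holds the CSR crow_indices of the batch-flattened
    (n_batch * n_rows, n_cols) matrix, produced by ONE conversion kernel
    launch instead of one launch per batch. Because every batch contains
    exactly `nnz` specified elements, batch b's crow_indices are the slice
    crow_flat[b*n_rows : b*n_rows + n_rows + 1] shifted down by b * nnz.
    """
    return [
        [crow_flat[b * n_rows + i] - b * nnz for i in range(n_rows + 1)]
        for b in range(n_batch)
    ]

# Two 2x3 batches with nnz = 3 each, flattened to a 4x3 matrix whose
# crow_indices are [0, 2, 3, 4, 6]:
batched = rebatch_crow_indices([0, 2, 3, 4, 6], n_batch=2, n_rows=2, nnz=3)
# batched == [[0, 2, 3], [0, 1, 3]]
```

The offset subtraction only works because all batches share the same `nnz`, which is exactly the invariant discussed further down in this thread.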

cc @alexsamardzic @nikitaved @pearu @cpuhrsch @bhosmer

facebook-github-bot (Contributor) commented Aug 9, 2022

❌ 1 New Failure

As of commit e1cfaa6 (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️‍♀️ 1 failure not recognized by patterns:

The following CI failure may be due to changes from this PR:

Job: CircleCI Checks build — Step: Unknown

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@amjames amjames requested review from bhosmer and nikitaved August 9, 2022 17:17
@amjames amjames added the module: sparse Related to torch.sparse label Aug 9, 2022
Applies the algorithm for re-batching compressed indices to avoid n-batch kernel launches. This is an optimization for `dim()>= 3` inputs and does not change behavior in any way.

[ghstack-poisoned]
bhosmer left a comment
Not just a perf improvement, but (going by the deleted comments) this also relaxes the same-sparsity-pattern restriction, right? If so, that's as big a deal as the perf gain, I'd say.

// This requirement is not included in Pearu's blog post on BSR invariants.
// He specifically states that different batches may have different sparsity
// patterns as long as the number of specified elements is the same for all
// If the input is n-d we require that nnz is the same for all
amjames (Collaborator, Author) replied Aug 10, 2022

@bhosmer If this is the comment you are referring to: the sparsity-pattern restriction was already relaxed at the time batch-dim support was added. The first three lines are from work by @cpuhrsch that added support for one batch dim (3-D inputs). I relaxed the requirement to match the BSR invariants document after I took over that work to generalize it to an arbitrary number of batch dims, but left the original comment plus my note so I would remember to point it out in review.

So this change does not introduce the relaxation of the restriction on the sparsity pattern; it only removes the confusing comment talking about it.
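To make the relaxed invariant concrete, here is a hedged pure-Python illustration (not PyTorch API code): two batches with different sparsity patterns still form a valid batched CSR tensor as long as each batch holds the same number of specified elements, because every batch's crow_indices must end at the same nnz. The `crow_indices` helper below is a hypothetical stand-in for what the conversion computes per batch.

```python
def crow_indices(dense):
    # Compressed row indices of one dense 2-D batch: crow[i+1] - crow[i]
    # is the number of specified (nonzero) elements in row i, so the last
    # entry equals that batch's nnz.
    crow = [0]
    for row in dense:
        crow.append(crow[-1] + sum(v != 0 for v in row))
    return crow

batch0 = [[1, 0, 2], [0, 0, 3]]  # pattern A
batch1 = [[0, 4, 0], [5, 6, 0]]  # pattern B (different pattern, same nnz)

c0, c1 = crow_indices(batch0), crow_indices(batch1)
# c0 == [0, 2, 3] and c1 == [0, 1, 3]: the patterns differ, but
# c0[-1] == c1[-1] == 3, so both batches can share one batched
# crow_indices tensor.
```

If the per-batch nnz values differed, the batched crow_indices could not be stored as one rectangular tensor, which is why the equal-nnz requirement survives even after the same-pattern restriction was dropped.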

nikitaved (Collaborator) left a comment

Awesome! Faster and less code - double win!

amjames (Collaborator, Author) commented Sep 1, 2022

@pytorchbot merge

pytorchmergebot (Collaborator) commented
Merge failed

Reason: Command `git -C /home/runner/work/pytorch/pytorch cherry-pick -x ef0f5a1d95580baf5ea4d0f2eaf283fa2f20c429` returned non-zero exit code 1

Auto-merging aten/src/ATen/native/TensorConversions.cpp
Auto-merging test/test_sparse_csr.py
CONFLICT (content): Merge conflict in test/test_sparse_csr.py
error: could not apply ef0f5a1d95... Dense -> CSR support batch dimensions
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".

If you believe this is an error, you can use the old behavior with `@pytorchbot merge -g` (optionally with `ciflow/trunk` to get land checks) or use `@pytorchbot merge -f "some reason here"`. For more information, see the bot wiki.

Please reach out to the PyTorch DevX Team with feedback or questions!

Raised by workflow job (details for Dev Infra team).

pytorchmergebot pushed a commit that referenced this pull request Sep 1, 2022
Applies the algorithm for re-batching compressed indices to avoid n-batch kernel launches. This is an optimization for `dim()>= 3` inputs and does not change behavior in any way.
Pull Request resolved: #83085
Approved by: https://github.com/bhosmer, https://github.com/nikitaved
@facebook-github-bot facebook-github-bot deleted the gh/amjames/14/head branch September 6, 2022 14:20
facebook-github-bot pushed a commit that referenced this pull request Sep 6, 2022
Summary:
Applies the algorithm for re-batching compressed indices to avoid n-batch kernel launches. This is an optimization for `dim()>= 3` inputs and does not change behavior in any way.

Pull Request resolved: #83085
Approved by: https://github.com/bhosmer, https://github.com/nikitaved

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/a5a01e443ce1dd8e31ef7d0b3fd6a2359881a922

Reviewed By: mehtanirav, izaitsevfb

Differential Revision: D39277549

fbshipit-source-id: d3df886ac6ba688afd2fc8539a57ad6878fbc23e
@kit1980 kit1980 added the Merged label Mar 24, 2023