[not ready for review yet] torch.compile support for parseSemiStructuredTensor #104974
cc @jcaip, this is an E2E test of compiling a small model with a `SparseSemiStructuredTensor` subclass tensor used as one of the parameters. The generated inductor code looks like this (P788647425): you can see that the subclass desugars the matmul into a sparse_mm() + contiguous() call, and inductor is able to fuse the contiguous() call into the relu() that follows it.
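For intuition, here is a hedged, dense stand-in for the pattern described above (the name `forward_desugared` and the dense `torch.mm` in place of the actual sparse_mm() are illustrative, not the subclass's real dispatch): a matmul producing an intermediate, a `contiguous()` copy, and a trailing `relu()` that inductor can fuse the copy into.

```python
import torch

def forward_desugared(weight: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Stand-in for what the subclass's matmul desugars to: a (sparse) matmul
    # whose output may be non-contiguous, followed by contiguous().
    y = torch.mm(x, weight.t())
    # Inductor can fuse this contiguous() copy into the elementwise relu()
    # that follows, so only one kernel touches the intermediate.
    return torch.relu(y.contiguous())
```

This is only the dense-equivalent shape of the computation; the real code path goes through the tensor subclass and a sparse matmul kernel.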
A few things to note:
(1) The test actually... fails. I haven't figured out why, but the results don't match the non-sparse version. FWIW, the code that inductor outputs looks reasonable, so it's not immediately clear whether it's a compile() bug, or sparsity giving less accurate results (or both).
(2) Inference mode is still a bit broken with torch.compile: in the test, I needed to make sure that the model was instantiated, compiled, and run all inside of inference_mode.
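As a hedged sketch of that constraint (this is not the PR's actual test; the model sizes and 2:4 mask are illustrative, and semi-structured sparsity needs a recent CUDA GPU, so the function skips on CPU), everything happens under a single `torch.inference_mode()` context:

```python
import torch

def sparse_compile_inference_check() -> str:
    # Semi-structured (2:4) sparsity requires a capable CUDA GPU; skip otherwise.
    if not torch.cuda.is_available():
        return "skipped"
    from torch.sparse import to_sparse_semi_structured
    # Instantiate, compile, AND run all inside inference_mode: mixing
    # inference-mode and regular tensors with torch.compile is still fragile.
    with torch.inference_mode():
        model = torch.nn.Sequential(
            torch.nn.Linear(128, 128), torch.nn.ReLU()
        ).half().cuda().eval()
        # Zero out 2 of every 4 weight entries so the tensor is convertible
        # to the semi-structured format.
        mask = torch.tensor([1, 1, 0, 0], device="cuda").tile((128, 32)).bool()
        model[0].weight = torch.nn.Parameter(
            to_sparse_semi_structured(model[0].weight.masked_fill(~mask, 0)),
            requires_grad=False,
        )
        compiled = torch.compile(model)
        x = torch.randn(64, 128, dtype=torch.half, device="cuda")
        out = compiled(x)
        return "ok" if out.shape == (64, 128) else "mismatch"
```

The point is the scoping: moving the instantiation or the compile() call outside the `inference_mode()` block is what trips the failure mode described above.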
Stack from ghstack (oldest at bottom):