Fix a few typos in SPIRV translation comments. #4
This was referenced Dec 16, 2020
erman-gurses pushed a commit to erman-gurses/iree that referenced this pull request on May 6, 2022
Add tuned Apple M1 config for SPIR-V kernels
Co-authored-by: nodlabs <nodlabs@mini0.local>
[misconfigured testing infra] |
qedawkins pushed a commit to qedawkins/iree that referenced this pull request on Feb 10, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 15, 2023

Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

Missing bits:
- More comments
- Add tests
- Fix the subview sizes for non-unary loads (although it doesn't break anything, this is technically incorrect)
- LLVM's reassociate pass undoes some of the things we improve here. Need to file a bug for that, investigate, and fix.

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
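To make the motivation for phases 3-5 concrete, here is a minimal sketch (plain Python, not IREE or MLIR code) of what decomposing an affine address expression and then hoisting the loop-invariant part buys. The function names, strides, and loop bounds are hypothetical and chosen only for illustration; the real passes operate on `affine.apply` ops inside `scf.for` loops.

```python
# Hypothetical illustration only; this does not use any IREE or MLIR API.

def addresses_naive(n_i, n_j, stride_i, stride_j, base):
    """Recompute the whole affine expression in the innermost loop."""
    out = []
    multiplies = 0
    for i in range(n_i):
        for j in range(n_j):
            out.append(base + i * stride_i + j * stride_j)
            multiplies += 2
    return out, multiplies


def addresses_hoisted(n_i, n_j, stride_i, stride_j, base):
    """Split the expression so the i-only part is computed once per outer iteration."""
    out = []
    multiplies = 0
    for i in range(n_i):
        row = base + i * stride_i  # depends only on i, so it moves out of the j loop
        multiplies += 1
        for j in range(n_j):
            out.append(row + j * stride_j)
            multiplies += 1
    return out, multiplies


naive, naive_mults = addresses_naive(4, 8, 8, 1, 100)
hoisted, hoisted_mults = addresses_hoisted(4, 8, 8, 1, 100)
assert naive == hoisted  # same addresses either way
print(naive_mults, hoisted_mults)  # prints "64 36"
```

The single expression cannot be hoisted as a whole because it depends on both loop indices; only once it is broken into subexpressions can the part that depends on the outer index alone move outward.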
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 17, 2023

Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

Missing bits:
- More comments
- Add tests
- Fix the subview sizes for non-unary loads (although it doesn't break anything, this is technically incorrect)
- LLVM's reassociate pass undoes some of the things we improve here. Need to file a bug for that, investigate, and fix.

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 21, 2023

Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

TODO:
- Add support for memref.store, vector.transfer_xxx

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 24, 2023

Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

TODO:
- Add support for memref.store, vector.transfer_xxx

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 24, 2023

Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

TODO:
- Add support for vector.transfer_xxx

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
ScottTodd added a commit that referenced this pull request on Aug 22, 2023

Caught by ASan:
```
370: =================================================================
370: ==3911909==ERROR: LeakSanitizer: detected memory leaks
370:
370: Direct leak of 376 byte(s) in 1 object(s) allocated from:
370:     #0 0x6a9b022 in calloc (iree-build/tools/iree-run-mlir+0x6a9b022)
370:     #1 0x6ad5d47 in iree_allocator_system_alloc iree/runtime/src/iree/base/allocator.c:104:17
370:     #2 0x6ad5d47 in iree_allocator_system_ctl iree/runtime/src/iree/base/allocator.c:144:14
370:     #3 0x6ad56ad in iree_allocator_issue_alloc iree/runtime/src/iree/base/allocator.c:27:10
370:     #4 0x6ad56ad in iree_allocator_malloc iree/runtime/src/iree/base/allocator.c:32:10
370:     #5 0x1acf2486 in iree_vm_bytecode_module_create iree/runtime/src/iree/vm/bytecode/module.c:836:3
370:     #6 0x6afdf31 in iree_tooling_create_run_context iree/runtime/src/iree/tooling/run_module.c:107:9
370:     #7 0x6afdf31 in iree_tooling_run_module_with_data iree/runtime/src/iree/tooling/run_module.c:340:3
370:     #8 0x6ad2a24 in iree::(anonymous namespace)::CompileAndRunFile(iree_compiler_session_t*, char const*) iree/tools/iree-run-mlir-main.cc:359:3
370:     #9 0x6ad2a24 in main iree/tools/iree-run-mlir-main.cc:520:20
370:     #10 0x7fce3bc456c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```
stellaraccident pushed a commit that referenced this pull request on Sep 24, 2023
* Presently scheduled to run 7 hours after IREE's nightly release is cut (which should be ample time to build).
No description provided.