forked from intel/intel-graphics-compiler
[pull] master from intel:master #21
Merged
22f9c81 to 9bbc7c9
89e978f to 809f6c4
a031151 to 0265002
b2bc28e to 6bcd3c8
659983d to f179289
cc7d017 to 66b520c
067fb01 to cc30341
7ab2f49 to bd5532c
e5bc891 to 49fed10
e166627 to e30ad65
af0d1e1 to fb42ffb
Create a new CSE to remove redundant WaveBallot for performance.
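To illustrate the general idea behind such a pass, here is a minimal sketch of common-subexpression elimination over a single straight-line block. It is not the IGC implementation: the `Inst` struct and `findRedundant` function are hypothetical, and the sketch assumes every operation is pure and that no control flow or change in the set of active lanes occurs between two identical WaveBallot calls.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction: one opcode, one input value, one result value.
struct Inst {
    std::string op;  // e.g. "WaveBallot"
    int operand;     // id of the input value
    int result;      // id of the value this instruction defines
};

// Returns a map from each redundant result id to the id of the earlier,
// equivalent result that can replace it.
std::map<int, int> findRedundant(const std::vector<Inst>& block) {
    std::map<std::pair<std::string, int>, int> seen;  // (op, operand) -> first result
    std::map<int, int> replacement;
    for (const Inst& inst : block) {
        // Canonicalize the operand so chains of redundant values are also caught.
        int operand = inst.operand;
        auto r = replacement.find(operand);
        if (r != replacement.end()) operand = r->second;

        auto key = std::make_pair(inst.op, operand);
        auto it = seen.find(key);
        if (it == seen.end()) {
            seen.emplace(key, inst.result);        // first occurrence: keep it
        } else {
            replacement[inst.result] = it->second; // repeat: reuse earlier result
        }
    }
    return replacement;
}
```

Given two `WaveBallot` instructions with the same operand, the second result is mapped to the first, and any later instruction that only differs by using the second result is folded as well.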
Add EarlyCSE to pass pipeline without generating weird IR patterns that degrade performance
…dify cr0 on debug SIP exit Only modify cr0 on debug SIP exit
The flag value was being overridden in code, so it was unusable.
Enable MAXNUM by default in IGCVectorizer
…blem in split barrier
Fixed a problem in the split barrier when it is used together with a regular barrier.
Case:
splitbarrier.signal()
regularbarrier()
splitbarrier.wait()
was causing a hang because the same barrier ID was assigned to both the regular barrier and the split barrier.
Now the split barrier takes a different ID from the regular one.
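The fix described above can be pictured with a small sketch in which each barrier kind draws its ID from a shared allocator, so the two can never collide. The names `BarrierIdAllocator`, `KernelBarriers`, and `assignBarrierIds` are hypothetical and only illustrate the idea; they are not taken from the IGC sources.

```cpp
#include <cassert>
#include <set>

// Hypothetical allocator that hands out distinct barrier IDs.
class BarrierIdAllocator {
public:
    // Returns the smallest ID that has not been handed out yet.
    int allocate() {
        int id = 0;
        while (used_.count(id)) ++id;
        used_.insert(id);
        return id;
    }

private:
    std::set<int> used_;
};

// IDs used by one kernel that mixes both barrier kinds.
struct KernelBarriers {
    int regularId;
    int splitId;
};

// Assigns IDs so that a regular barrier placed between splitbarrier.signal()
// and splitbarrier.wait() never shares the split barrier's ID.
inline KernelBarriers assignBarrierIds() {
    BarrierIdAllocator alloc;
    KernelBarriers ids;
    ids.regularId = alloc.allocate();
    ids.splitId = alloc.allocate();
    return ids;
}
```

With separate IDs, the regular barrier cannot consume or complete the pending split-barrier signal, which is the kind of interference the commit message describes.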
When the destination type is byte (UB or B), the destination subregnum can be aligned to 2 or 3 of the (DWORD) execution channel.
Enable abort on spills to SIMD16 for more platforms.
Add missing lit for GenSpecificPattern, also align clang fmt.
For a subroutine, there is no need to add a live-out dependence of the call BB
…at datatype Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Group dpas instructions that have no dependence on each other and can be placed in the same macro block during instruction scheduling
…rands alignment issues for SIMD2 instructions with 64b or float datatype Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Fix an issue where align=1 cannot be parsed correctly
Changes:
* UseNewInlineRaytracing is now a mask that lets the user selectively enable new inline raytracing for particular shader types
* New regkey AddDummySlotsForNewInlineRaytracing forces an increased number of slots required for rayqueries, to test whether the UMD allocated the necessary HW stacks
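As a rough illustration of how a per-shader-type mask can be tested, consider the sketch below. The bit assignments in `ShaderTypeBit` and the helper `useNewInlineRaytracing` are assumptions made for this example; the actual bit layout of the UseNewInlineRaytracing regkey is not stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignment, one bit per shader type (not the real IGC layout).
enum ShaderTypeBit : std::uint32_t {
    kCompute = 1u << 0,
    kPixel   = 1u << 1,
    kVertex  = 1u << 2,
};

// Returns true when the mask enables the new inline raytracing path
// for the given shader type.
constexpr bool useNewInlineRaytracing(std::uint32_t mask, ShaderTypeBit type) {
    return (mask & type) != 0;
}
```

A mask of `kCompute | kPixel` would then enable the new path for compute and pixel shaders while leaving vertex shaders on the old one, and a mask of 0 disables it everywhere.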
…the SWSB compilation time when there is subroutine For a subroutine, there is no need to add a live-out dependence of the call BB
Fix non-determinism in metadata
…a new CSE to remove redundant WaveBallot Create a new CSE to remove redundant WaveBallot for performance.
For a subroutine, there is no need to add a live-out dependence of the call BB
…failing When adding Opaque Pointers support to JointMatrix, I found that 4 tests were failing due to this assert:
info: error, assertion failed: bits == elementSize
file: Source\IGC\Compiler\Optimizer\OpenCLPasses\PrivateMemory\PrivateMemoryResolution.cpp
function: TransposeHelperPrivateMem::handleLoadInst
line: 665
Failed Tests (4):
SYCL :: Matrix/SG32/joint_matrix_bf16_fill_k_cache_unroll.cpp
SYCL :: Matrix/SG32/joint_matrix_bf16_fill_k_cache_unroll_init.cpp
SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_unroll.cpp
SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_unroll_init.cpp
My investigation showed that the resolution path alloca -> gep -> load used an invalid vector element count, which caused this assert to fail. To my understanding, the reason was that we used the elementSize saved in the "TransposeHelperPrivateMem" instance, but it was not updated while we went through the instructions (alloca -> gep -> load), so there was a mismatch.
…the SWSB compilation time when there is subroutine For a subroutine, there is no need to add a live-out dependence of the call BB
Adding internal options: `-cl-intel-disable-sendwarwa, -ze-opt-disable-sendwarwa` to turn off PVCSendWARWA
Cleaned up dead code that's related to patch token binary format deprecation. Removed unused code, adjusted some comments. Most of these changes are related to previous commits that deprecated the format in VC and OCL. Some parts are still to be refactored, this doesn't cover all patch token code.
Create a new CSE to remove redundant WaveBallot for performance.
Upgrade IGC C++ standard from 17 to 20
For a subroutine, there is no need to add a live-out dependence of the call BB
…at datatype Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
…SIMD16 drop for more platforms Enable abort on spills to SIMD16 for more platforms.
See Commits and Changes for more details.
Created by
pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )