Conversation

@pull pull bot commented Nov 17, 2022

See Commits and Changes for more details.


Created by pull[bot]

@trafico-bot trafico-bot bot added the 🔍 Ready for Review Pull Request is not reviewed yet label Nov 17, 2022
@pull pull bot added ⤵️ pull and removed 🔍 Ready for Review Pull Request is not reviewed yet labels Nov 17, 2022
@trafico-bot trafico-bot bot added the 🔍 Ready for Review Pull Request is not reviewed yet label Nov 17, 2022
@VPG-SWE-Github VPG-SWE-Github force-pushed the master branch 2 times, most recently from 89e978f to 809f6c4 Compare December 7, 2022 13:19
@VPG-SWE-Github VPG-SWE-Github force-pushed the master branch 2 times, most recently from a031151 to 0265002 Compare December 21, 2022 19:06
@VPG-SWE-Github VPG-SWE-Github force-pushed the master branch 2 times, most recently from 659983d to f179289 Compare January 20, 2023 22:46
@VPG-SWE-Github VPG-SWE-Github force-pushed the master branch 4 times, most recently from 067fb01 to cc30341 Compare May 24, 2023 12:10
@VPG-SWE-Github VPG-SWE-Github force-pushed the master branch 3 times, most recently from 7ab2f49 to bd5532c Compare August 30, 2023 12:06
@VPG-SWE-Github VPG-SWE-Github force-pushed the master branch 4 times, most recently from e5bc891 to 49fed10 Compare October 6, 2023 19:09
ichenkai and others added 29 commits July 9, 2025 19:36
Create a new CSE to remove redundant WaveBallot for performance.
Add EarlyCSE to pass pipeline without generating weird IR patterns that
degrade performance
…dify cr0 on debug SIP exit

Only modify cr0 on debug SIP exit
Previously, the flag value was overridden in code, which made it unusable.
Enable MAXNUM by default in IGCVectorizer
…blem in split barrier

Fixed a problem in split barrier when it is used together with a regular barrier.
    Case:
    splitbarrier.signal()
    regularbarrier()
    splitbarrier.wait()

    This caused a hang because the regular barrier and the split barrier were assigned the same barrier ID.
    Now the split barrier takes a different ID than the regular one.
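
The ID separation described in this commit can be sketched roughly as follows. This is a minimal illustration, not IGC's actual code: `BarrierIdAllocator`, `BarrierIds`, and `assignBarrierIds` are hypothetical names, and the assumption is simply that each live barrier draws its ID from a shared pool so that a regular barrier issued between `signal()` and `wait()` can never alias the split barrier.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>

// Hypothetical model of barrier ID allocation; not IGC's real implementation.
class BarrierIdAllocator {
public:
    explicit BarrierIdAllocator(unsigned maxIds) : maxIds_(maxIds) {}

    // Returns the lowest ID that no live barrier currently holds.
    unsigned acquire() {
        for (unsigned id = 0; id < maxIds_; ++id) {
            // std::set::insert reports whether the ID was newly inserted.
            if (inUse_.insert(id).second) {
                return id;
            }
        }
        throw std::runtime_error("no free barrier ID");
    }

    // Makes the ID available again once the barrier is complete.
    void release(unsigned id) { inUse_.erase(id); }

private:
    unsigned maxIds_;
    std::set<unsigned> inUse_;
};

struct BarrierIds {
    unsigned regular;
    unsigned split;
};

// Both barriers draw from the same pool, so their IDs cannot collide
// while both are live.
inline BarrierIds assignBarrierIds(BarrierIdAllocator& alloc) {
    BarrierIds ids;
    ids.regular = alloc.acquire();
    ids.split = alloc.acquire();
    return ids;
}
```

With a pool like this, the hang scenario above cannot arise from ID aliasing, because the second `acquire()` never returns an ID that is still held.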
When the destination type is byte (UB or B), destination subregnum can
be aligned to 2 or 3 of the (DWORD) execution channel.
Enable abort on spills to SIMD16 for more platforms.
Add missing lit test for GenSpecificPattern; also align with clang-format.
For subroutines, there is no need to add a live-out dependence for the call BB
…at datatype

Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Group dpas instructions that have no dependence on each other
and can be in the same macro block during instruction scheduling
…rands alignment issues for SIMD2 instructions with 64b or float datatype

Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Fix an issue where align=1 cannot be parsed correctly
Changes:
* UseNewInlineRaytracing is now a mask that lets the user selectively enable new inline raytracing for particular shader types
* New regkey AddDummySlotsForNewInlineRaytracing forces an increased number of slots required for rayqueries, to test whether the UMD allocated the necessary HW stacks
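
As a rough illustration of how a mask-style regkey can gate a feature per shader type, consider the sketch below. The bit assignments and the names `ShaderTypeBit` and `useNewInlineRaytracing` are assumptions made for this example; the commit message does not state which bit maps to which shader type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit layout; the real assignment is not given in the commit message.
enum ShaderTypeBit : std::uint32_t {
    kComputeBit = 1u << 0,
    kPixelBit = 1u << 1,
    kVertexBit = 1u << 2,
};

// Returns true when the regkey mask enables the feature for this shader type.
inline bool useNewInlineRaytracing(std::uint32_t mask, ShaderTypeBit shader) {
    return (mask & shader) != 0;
}
```

Under this assumed layout, a mask of `kComputeBit | kVertexBit` would enable the new path for compute and vertex shaders while leaving pixel shaders on the old one.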
…the SWSB compilation time when there is subroutine

For subroutines, there is no need to add a live-out dependence for the call BB
Fix non-determinism in metadata
…a new CSE to remove redundant WaveBallot

Create a new CSE to remove redundant WaveBallot for performance.
For subroutines, there is no need to add a live-out dependence for the call BB
…failing

When adding Opaque Pointers support to JointMatrix, I found that 4 tests were failing due to this assert:

	info: error, assertion failed: bits == elementSize
	file: Source\IGC\Compiler\Optimizer\OpenCLPasses\PrivateMemory\PrivateMemoryResolution.cpp
	function: TransposeHelperPrivateMem::handleLoadInst
	line: 665

	Failed Tests (4):
	  SYCL :: Matrix/SG32/joint_matrix_bf16_fill_k_cache_unroll.cpp
	  SYCL :: Matrix/SG32/joint_matrix_bf16_fill_k_cache_unroll_init.cpp
	  SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_unroll.cpp
	  SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_unroll_init.cpp

My investigation showed that the following resolution path:

alloca -> gep -> load

used an invalid vector element count, which caused this assert to fail.
To my understanding, the reason was that we used the elementSize saved in the "TransposeHelperPrivateMem" instance,
but that value was not updated as we went through the instructions (alloca->gep->load), so there was a mismatch.
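
The mismatch described above can be modeled in a few lines. This is only a sketch of one plausible reading of the commit message, not the actual change in `PrivateMemoryResolution.cpp`: `ElemType`, `TransposeHelperModel`, and `visitGep` are hypothetical, and the assumption is that the element size cached at the alloca must be refreshed when a later instruction accesses the memory with a different element type.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR element type; not an LLVM or IGC class.
struct ElemType {
    unsigned bits;
};

// Hypothetical model of a helper that caches an element size at the alloca.
class TransposeHelperModel {
public:
    explicit TransposeHelperModel(ElemType allocaElem)
        : elementSize_(allocaElem.bits) {}

    // Assumed fix: refresh the cached size when a GEP on the path
    // uses a different element type than the alloca did.
    void visitGep(ElemType gepElem) { elementSize_ = gepElem.bits; }

    // Mirrors the failing check "bits == elementSize" for a load.
    bool loadMatches(ElemType loadElem) const {
        return loadElem.bits == elementSize_;
    }

private:
    unsigned elementSize_;
};
```

In this model, a helper created for a 32-bit element fails the check for a 16-bit load until `visitGep` has updated the cached size, which is the kind of staleness the commit message describes.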
…the SWSB compilation time when there is subroutine

For subroutines, there is no need to add a live-out dependence for the call BB
Adding internal options `-cl-intel-disable-sendwarwa` and `-ze-opt-disable-sendwarwa`
to turn off PVCSendWARWA
Cleaned up dead code that's related to patch token binary format deprecation. Removed unused code, adjusted some comments.
Most of these changes are related to previous commits that deprecated the format in VC and OCL.

Some parts are still to be refactored; this doesn't cover all patch token code.
Create a new CSE to remove redundant WaveBallot for performance.
Upgrade IGC C++ standard from 17 to 20
For subroutines, there is no need to add a live-out dependence for the call BB
…at datatype

Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
…SIMD16 drop for more platforms

Enable abort on spills to SIMD16 for more platforms.
@pull pull bot merged commit 398538e into ConnectionMaster:master Jul 15, 2025
1 check passed
Labels

⤵️ pull 🔍 Ready for Review Pull Request is not reviewed yet
