Commit
Add preparatory support for row export in mesh shaders
A future generation allows row export for mesh shaders, so we no longer have to set the thread group size to max(vertexCount, primitiveCount) just to launch enough threads for vertex/primitive export. Consider this mesh shader test: the mesh shader exports 256 points while its workgroup size is (8, 2, 4), i.e. 64 threads. In wave32 mode, only 2 waves can be launched, while 256 points occupy 8 rows of 32. With row export, wave0 exports rows 0, 2, 4, 6 and wave1 exports rows 1, 3, 5, 7.

In this change, we add a loop structure to handle row export. It is something like this:

  loopIndex = 0
  primOrVertexIndex = threadIdInSubgroup

  while (primOrVertexIndex < primOrVertexCount) {
    Export primitive/vertex

    loopIndex += numWaves
    primOrVertexIndex = threadIdInSubgroup + loopIndex * waveSize
  }

Row exports are distributed uniformly across the existing waves. The loop degenerates when row export is disabled, and the final optimized CFG is equivalent to the previous mesh shader implementation, without unnecessary control flow instructions.
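To sanity-check the row distribution described above, here is a minimal host-side sketch in Python that simulates the loop for one representative thread per wave. It is not the compiler's implementation: the function names are hypothetical, and it assumes that threadIdInSubgroup equals waveId * waveSize + laneId and that the index for each iteration is threadIdInSubgroup + loopIndex * waveSize, which is the form that reproduces the wave0/wave1 row pattern from the example.

```python
import math


def rows_exported_by_wave(wave_id, num_waves, wave_size, export_count):
    """Return the row indices handled by one wave (hypothetical helper).

    Lane 0 stands in for the whole wave; a row holds wave_size consecutive
    primitives/vertices.
    """
    rows = []
    loop_index = 0
    # Assumption: threadIdInSubgroup = waveId * waveSize + laneId, with laneId = 0.
    thread_id_in_subgroup = wave_id * wave_size
    prim_or_vertex_index = thread_id_in_subgroup

    while prim_or_vertex_index < export_count:
        # "Export primitive/vertex": record which row this wave is exporting.
        rows.append(prim_or_vertex_index // wave_size)

        loop_index += num_waves
        prim_or_vertex_index = thread_id_in_subgroup + loop_index * wave_size

    return rows


def simulate_row_export(workgroup_size, wave_size, export_count):
    """Map each launched wave to the rows it exports (hypothetical helper)."""
    num_threads = workgroup_size[0] * workgroup_size[1] * workgroup_size[2]
    num_waves = math.ceil(num_threads / wave_size)
    return {
        wave_id: rows_exported_by_wave(wave_id, num_waves, wave_size, export_count)
        for wave_id in range(num_waves)
    }


if __name__ == "__main__":
    # The test case from the commit message: 256 points, workgroup (8, 2, 4), wave32.
    for wave_id, rows in simulate_row_export((8, 2, 4), 32, 256).items():
        print(f"wave{wave_id} exports rows {rows}")
```

When the launched waves already cover every primitive/vertex (for example, 8 waves for 256 points), each wave runs the loop body at most once, which mirrors the degenerate case where the loop collapses to straight-line code.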
Showing 1 changed file with 136 additions and 17 deletions.