Unaligned copies on 8/16 bpp formats on DX11 #2318
fkaa commented Aug 12, 2018
The copy shaders at `shaders\copy.hlsl` work in 32-bit strides and load 1 uint per thread. The addressing logic is also quite simple, and doesn't work correctly on formats smaller than 32 bpp when the x offset and/or width is not aligned to 4.

Currently, the dispatch calls for these smaller formats are scaled down so that they fit the "1 load per thread" model, reading and storing 4 (8 bpp) or 2 (16 bpp) texels at a time. A potential solution could be to not scale at all, "waste" up to 24 bits of bandwidth per load, and do 1 load per texel, like the other format copies.
Note that raw buffers support unaligned 32-bit loads at byte granularity.
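
To make the difference between the two approaches concrete, here is a rough CPU-side sketch in C for an 8 bpp format. It is not the code in `copy.hlsl`: the function names, the little-endian texel packing, and the round-down behaviour of the scaled path are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not from copy.hlsl): read the 8-bit texel at index
   `texel` from a buffer addressed in 32-bit words. Texel 0 is assumed to sit
   in the low byte of word 0. */
static uint8_t load_texel8(const uint32_t *words, uint32_t texel)
{
    uint32_t word  = words[texel / 4];   /* one full 32-bit load per texel */
    uint32_t shift = (texel % 4) * 8;    /* bit position of the texel in that word */
    return (uint8_t)((word >> shift) & 0xFFu);
}

/* Per-texel copy: one "thread" (loop iteration) per texel, so `x` and
   `width` need no alignment. Returns the number of texels written. */
static uint32_t copy_per_texel(const uint32_t *src, uint8_t *dst,
                               uint32_t x, uint32_t width)
{
    for (uint32_t i = 0; i < width; ++i)
        dst[i] = load_texel8(src, x + i);
    return width;
}

/* Scaled copy (assumed model): one "thread" per 32-bit word, with the offset
   and width divided by 4 and rounded down. Returns the number of texels
   written, which only matches `width` when width is a multiple of 4; the
   start position is also rounded down to a word boundary. */
static uint32_t copy_scaled(const uint32_t *src, uint8_t *dst,
                            uint32_t x, uint32_t width)
{
    uint32_t first_word = x / 4;
    uint32_t num_words  = width / 4;
    for (uint32_t w = 0; w < num_words; ++w)
        for (uint32_t b = 0; b < 4; ++b)
            dst[w * 4 + b] = load_texel8(src, (first_word + w) * 4 + b);
    return num_words * 4;
}
```

With `x = 1` and `width = 3`, the per-texel path copies exactly texels 1 to 3, while the scaled path in this model copies nothing. With `x = 1` and `width = 4` the scaled path copies texels 0 to 3 instead of 1 to 4. The cost of the per-texel path is that each iteration loads 32 bits and keeps only 8 of them.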