Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native compressed custom texture support #5279

Merged
merged 12 commits into from
Apr 29, 2017
Merged

Native compressed custom texture support #5279

merged 12 commits into from
Apr 29, 2017

Conversation

stenzek
Copy link
Contributor

@stenzek stenzek commented Apr 16, 2017

This PR allows custom textures that use block compression (DXT1/3/5) to be stored and uploaded to the GPU in their compressed form, saving both RAM and VRAM, and significantly reducing prefetch/loading time. Stutter when uploading large textures may be slightly reduced as well.

Assuming all textures in a custom pack are compressed, this should decrease the memory usage by 8 times for DXT1, or 4 times for DXT5, or somewhere in between for a mix.

Supports all backends where the driver/graphics card exposes support for DXT formats. D3D11 and 12 both check for support, rather than assuming it, in case we ever have a Windows on ARM port.

There is only one issue with this implementation. If a custom texture uses different formats for different mipmap levels in a single game texture, it will now refuse to load that texture. It could be worked around, but it's better if texture pack authors fix their packs instead.

  • Handle DX10 fourcc and extension header (how many packs use this?)
  • Support loading full mipmap chains from a single DDS file (https://bugs.dolphin-emu.org/issues/9254)
  • Support loading full mipmap chains of non-compressed DDS textures (again, any packs that use this?)

Should be functional enough for testing and review, however.

@@ -262,6 +262,21 @@ std::vector<DXGI_SAMPLE_DESC> EnumAAModes(ID3D12Device* device)
return aa_modes;
}

bool SupportsS3TCTextures(ID3D12Device* device)
{
auto CheckForFormat = [](ID3D12Device* device, DXGI_FORMAT format) -> bool {

This comment was marked as off-topic.

if (upload_pitch != source_pitch)
{
VkDeviceSize copy_pitch = std::min(source_pitch, upload_pitch);
for (unsigned int row = 0; row < height; row++)

This comment was marked as off-topic.

@@ -29,20 +30,25 @@ class HiresTexture

~HiresTexture();

HostTextureFormat GetFormat() const { return m_levels.at(0).format; }

This comment was marked as off-topic.

bool HiresTexture::LoadDDSTexture(Level& level, const std::vector<u8>& buffer)
{
u32 magic;
std::memcpy(&magic, &buffer[0], sizeof(magic));

This comment was marked as off-topic.

// Refer to the license.txt file included.

#include "VideoCommon/HiresTextures.h"
#include <cstdint>

This comment was marked as off-topic.


// Copy to the final storage location. The deallocator here is simple, nothing extra is
// needed, compared to the SOIL-based loader.
level.data = SOILPointer(new u8[level.data_size], [](u8* data) { delete[] data; });

This comment was marked as off-topic.

This comment was marked as off-topic.

@@ -338,8 +340,28 @@ TextureCache::TCacheEntry::~TCacheEntry()
g_command_buffer_mgr->DeferFramebufferDestruction(m_framebuffer);
}

void TextureCache::TCacheEntry::Load(const u8* buffer, unsigned int width, unsigned int height,
unsigned int expanded_width, unsigned int level)
inline void CopyImageToStagingBuffer(u32 height, const u8* buffer, u8* upload_buffer,

This comment was marked as off-topic.

This comment was marked as off-topic.

@stenzek
Copy link
Contributor Author

stenzek commented Apr 16, 2017

Added support for loading a full mipmap chain from a single .dds file. If anyone has any texture packs handy and feels like testing, it would be appreciated.

@ghost
Copy link

ghost commented Apr 16, 2017

Many thanks!

@BigheadSMZ
Copy link

This seems to work well so far, but I did find some issues. Lower internal mipmap levels are loading properly, tested with Ether Mine textures that have varying images on lower levels. For the first 3 minutes, along with GPU decoding, performance was better than I ever remembered it. I didn't try using prefetch, I will try it out later and if anything is different, I'll report back.

First issue I came across was a bunch of wall/floor textures not loading in Xenoblade's Tephra Cave. Most textures were working outside (near bionis leg), but entering the cave everything showed up as black. So I unchecked GPU Texture Encoding, then started to be spammed with messages like this: "size of level 0 must be 256x128, but 128x64 requested". These now persist even after rebooting without it checked.

I was thinking maybe there are issues with the textures, but I can't seem to find any. The only comparison I have to test them is Ishiiruka, where everything is working properly. Here is the latest Xenoblade pack in DDS format with internal mipmaps if anyone else wants to test it out.
http://www.mediafire.com/file/v7xsqv9xyirgkns/SX4-Internal-mipmaps.7z

@stenzek
Copy link
Contributor Author

stenzek commented Apr 18, 2017

@BigheadSMZ Thanks for the feedback! Feel free to try the latest build, it should fix the size of level 0 error.

Also adds some further validation to the DDS loader, which may help with the black texture issue, as it looks like some of those cave textures have one mip level too many.

@BigheadSMZ
Copy link

Tested again for about 20 minutes and found no issues, at least with the Xenoblade pack. I do not know much about the DX10+ DDS formats, I'll have to look into them more. When I get some time I'll try other packs, although most of them on the Dolphin forums using DDS are probably similar to the Xenoblade pack since a lot of people use my script to create DDS textures. Which leads me to the somewhat off-topic question about mipmaps. I apologize in advance if this is not the place for this.

How low should mipmaps go? It's been years and I still don't know the answer to this question. I've seen many mixed answers, with several saying they should end at 1x1, even if the dimensions are uneven (like an image at 32x16). This is where I settled, but conflicting answers can say that these types should end at 2x1. But I also seen answers saying that it is a waste to have levels lower than 8x8 or 4x4. I'm not sure if there's even a correct answer to this question anymore, and it's more of a "preference" than anything.

@phire
Copy link
Member

phire commented Apr 19, 2017

How low should mipmaps go? ...... But I also seen answers saying that it is a waste to have levels lower than 8x8 or 4x4

My understand of mipmaping and sampling theory is that stopping at 8x8 or 4x4 will produce incorrect results.

Imagine a texture that was solid black on the left hand side, and solid white on the right hand side with a sharp transition in the middle. The 1x1 mipmap will obviously be gray (the average of the whole texture), while the 8x8 mipmap will still be half black and half white (depending on the exact location of the transition the 8x8 mipmap may or may not have gray in the middle).

If you are far enough away that the 1x1 mipmap is needed (object has a screensize of 1px or less), then that pixel should show as gray. If you are instead using the 8x8 mipmaps, the gpu only does one sample and that sample will result in either black, white or even gray depending on where that sample lands (sub-pixel precision). Small movements of that object (even sub-pixel movements) will result on that pixel flashing between black, white and gray.

Now, if you can guarantee that a texture will not be visible at such small minification levels, then you can skimp on the mips. Or perhaps if the 8x8 mipmap just so happens to be so blurred that there is little difference between a random sample on the 8x8 and the 1x1 mipmap.

I'm not sure about the 1x2 vs 1x1 issue (my gut says 1x1), but 8x8/4x4 is definitely wrong.

@phire
Copy link
Member

phire commented Apr 19, 2017

Actually thinking about it some more, you do really need 1x1 mips instead of stopping at 1x2.

And it's not just small 1px objects where this might show up, tiling a texture across a surface will show up issues (moire patterns) when you get far enough away.

@stenzek
Copy link
Contributor Author

stenzek commented Apr 19, 2017

From the quick look I had at that texture pack there were some textures that had two 1x1 mip levels, which would cause a GL error. From the spec:

GL_INVALID_OPERATION is generated if target is GL_TEXTURE_2D_ARRAY, GL_PROXY_TEXTURE_2D_ARRAY, GL_TEXURE_CUBE_ARRAY, or GL_PROXY_TEXTURE_CUBE_MAP_ARRAY and levels is greater than log2(max(width, height)) + 1.

@BigheadSMZ
Copy link

@phire Thank you that was very insightful. I've always had a feeling flooring to 1x1 was the way to go, worst case scenario maybe some mipmaps would not be used.

I've tested DDS packs for Xenoblade, Skies of Arcadia, Paper Mario TTYD, Zelda: Collectors Edition, Wind Waker, Luigi's Mansion, Zelda TP 4k Pack, Super Mario Sunshine, and Star Wars: Rogue Leader. I didn't notice any issues, using a mix of all the backends, with/without prefetch and gpu decoding.

Also wanted to point out this PR also gives Dolphin the ability to load DDS textures that would previously only work with Ishiiruka (e.g. DDS textures created with Ishiiruka Tool). http://i.imgur.com/TXig3tf.png

@stenzek
Copy link
Contributor Author

stenzek commented Apr 21, 2017

Now supports loading internal mipmap chains from uncompressed DDS textures as well. The preloading-on-GPU ideas are out of scope for this PR, so if the testers are happy with it I'd consider it finished now.

@BigheadSMZ
Copy link

No known packs use DX10 textures, but I figured I'd test it anyway. I used GPUOpen-Tools/Compressonator v2.5 (its on Github) to create a DDS with the DX10 header. I might be creating it with the wrong options or something, I tried BC4, BC5, and BC7 which does not load at all. This is what it looks like: http://i.imgur.com/OJsgD5h.jpg

@stenzek
Copy link
Contributor Author

stenzek commented Apr 21, 2017

@BigheadSMZ I used texconv from DirectXTex (https://github.com/Microsoft/DirectXTex) and forced it to emit a DX10 header and it worked fine. DXT1 -> BC1, DXT3 -> BC2, DXT5 -> BC3 should all be supported.

BC4/5/7 aren't supported, so I'd guess what's happening is the DDS loader is ignoring it, and SOIL is mangling the textures as they come in.

If you think it's worth it, it's simple enough to add support for. BC4/5 wouldn't be much use, but IIRC BC7 can offer higher quality over BC1/2/3, but likely outside the scope of this PR (and given no texture packs are currently using it?)

@BigheadSMZ
Copy link

Okay so that's where I was going wrong. The only way to force a DX10 header with Compressonator is to use BC4+. I tried texconv.exe and confirmed DXT1/3/5 work with this header. It's probably not worth adding support for the other block compression types unless they offered a significant quality boost over DXT5 (especially for smaller images with transparency). Without a way to actually view these textures, I'm not sure how to compare them visually (BC3 vs BC7).

There are probably several reasons no packs use the DX10 header. Hobbyist texture artists for fan-based projects don't really need to know much about the specifics. And there is a lack of exposure/support, as there doesn't seem to be much intelligent discussion on the internet for the DDS format in general. For most cases, DXT1/3/5 just works, and most programs will always use the older header by default when choosing BC1-3.

My guess is this: if the option to load DDS textures with BC7 exists, and it offers higher quality over BC3, then someone is bound to make use of it eventually. I don't know enough to decide that whether or not its worth the effort.

mip_width = std::max(mip_width / 2, 1u);
mip_height = std::max(mip_height / 2, 1u);
mip_count++;
for (u32 x = 0; x < level->row_length; x++)

This comment was marked as off-topic.

This comment was marked as off-topic.

{
mip_width = std::max(mip_width / 2, 1u);
mip_height = std::max(mip_height / 2, 1u);
mip_count++;

This comment was marked as off-topic.

// If the .dds file does not contain a full mip chain, we'll fall back to the old path.
u32 mip_width = std::max(info.width / 2, 1u);
u32 mip_height = std::max(info.height / 2, 1u);
for (u32 i = 1; i < info.mip_count; i++)

This comment was marked as off-topic.


for (u32 row = 0; row < level->height; row++)
{
for (u32 x = 0; x < level->row_length; x++)

This comment was marked as off-topic.

@BigheadSMZ
Copy link

I've tested many packs, as well as a few hand picked texture cases that I thought might be problematic and have found no issues. So this LGTM (from a user's perspective).

@@ -239,6 +239,19 @@ D3D_FEATURE_LEVEL GetFeatureLevel(IDXGIAdapter* adapter)
return feat_level;
}

bool SupportsS3TCTextures(ID3D11Device* device)

This comment was marked as off-topic.

@@ -29,6 +30,42 @@ static std::unique_ptr<PSTextureEncoder> g_encoder;
const size_t MAX_COPY_BUFFERS = 32;
ID3D11Buffer* efbcopycbuf[MAX_COPY_BUFFERS] = {0};

static u32 GetLevelPitch(HostTextureFormat format, u32 row_length)

This comment was marked as off-topic.

@@ -87,6 +88,53 @@ bool SaveTexture(const std::string& filename, u32 textarget, u32 tex, int virtua
return TextureToPng(data.data(), width * 4, filename, width, height, true);
}

static bool IsCompressedTextureFormat(HostTextureFormat format)
{
return format >= HostTextureFormat::DXT1 && format <= HostTextureFormat::DXT5;

This comment was marked as off-topic.

@@ -176,6 +177,10 @@ bool VideoBackend::FillBackendInfo()
return false;
}

// S3TC for native DXT1/3/5 custom texture support.
g_Config.backend_info.bSupportsST3CTextures =
GLExtensions::Supports("GL_EXT_texture_compression_s3tc");

This comment was marked as off-topic.

{
break;
current_mip_width = std::max(current_mip_width / 2, 1u);

This comment was marked as off-topic.

}

// It is invalid to have more than a single 1x1 mipmap.
u32 max_mip_levels = CalculateMipCount(first_mip.width, first_mip.height);

This comment was marked as off-topic.

};

#pragma pack(pop)

This comment was marked as off-topic.


tex->m_levels.push_back(std::move(level));
mip_width = std::max(mip_width / 2, 1u);
mip_height = std::max(mip_height / 2, 1u);

This comment was marked as off-topic.

This leaves DDS textures using DXT1/3/5 compressed in-memory, which can
be passed directly to the backend.
This also removes an extra copy of the image for custom textures.
{
UINT bc1_support, bc2_support, bc3_support;
if (FAILED(device->CheckFormatSupport(DXGI_FORMAT_BC1_UNORM, &bc1_support)) ||
FAILED(device->CheckFormatSupport(DXGI_FORMAT_BC3_UNORM, &bc2_support)) ||

This comment was marked as off-topic.

This removes the need for multiple texture files to store the mipmap
chain for a texture. As many mipmaps will be loaded as are present in
the DDS file, and any remaining mipmaps will fall back to the old
behavior.
This way that the mip count check occurs on .png and uncombined DDS
textures as well.
Will load files with formats RGBA8, RGBX8, BGRA8, BGRX8, RGB8.
The appropriate place for these would be AbstractTexture, once it is
finished.
@stenzek stenzek merged commit a2cba6d into dolphin-emu:master Apr 29, 2017
@stenzek stenzek deleted the compressed-custom-textures branch June 13, 2017 15:01
@ghost ghost mentioned this pull request Aug 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants