Allow multiple texture cache entries for textures at the same address
This is the same trick that is used for Metroid Prime's fonts/text, but applied to all textures. If two different textures at the same address are loaded during the same frame, create a second entry instead of overwriting the existing one. If the entry were overwritten in this case, there would effectively be no caching, which results in a big performance drop.

The restriction to textures that are loaded during the same frame prevents creating lots of cache entries when textures are used in the regular way. This restriction is new. Overwriting textures instead of creating new ones is faster if the old ones are unlikely to be used again.
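
The heuristic described above can be sketched in isolation roughly as follows. This is a simplified illustration and not the code from this commit: `Entry`, `Cache`, `Lookup` and `last_used_frame` are hypothetical names, the cache stores entries by value, and the cap on entries per address (`MAX_TEXTURES_PER_ADDRESS` in the diff) is left out.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical, simplified cache entry (not Dolphin's TCacheEntryBase).
struct Entry
{
    std::uint64_t hash;   // hash of the texture contents
    int last_used_frame;  // frame in which the entry was last loaded
};

// Several entries may share one address, hence a multimap.
using Cache = std::multimap<std::uint32_t, Entry>;

// Returns the entry to use for the texture (address, hash) in the given frame.
Entry& Lookup(Cache& cache, std::uint32_t address, std::uint64_t hash, int frame)
{
    auto range = cache.equal_range(address);

    // 1. If the current texture is already cached, reuse that entry.
    for (auto it = range.first; it != range.second; ++it)
    {
        if (it->second.hash == hash)
        {
            it->second.last_used_frame = frame;
            return it->second;
        }
    }

    // 2. Exactly one entry that was not used during this frame: treat this as a
    //    regular texture update and overwrite it instead of growing the cache.
    if (range.first != range.second && std::next(range.first) == range.second &&
        range.first->second.last_used_frame != frame)
    {
        range.first->second = Entry{hash, frame};
        return range.first->second;
    }

    // 3. Otherwise the address is used for several textures: add another entry.
    return cache.emplace(address, Entry{hash, frame})->second;
}
```

With this sketch, loading a different texture at the same address in a later frame keeps the cache at one entry, while loading a different texture at the same address within the same frame adds a second one.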

Since this would break efb copies, don't do it for efb copies.
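
As a rough illustration of the efb copy case: when a copy is created at an address, every existing entry at that address is removed first, so the copy ends up as the only entry there. The sketch below uses hypothetical names (`Entry`, `Cache`, `ReplaceWithEfbCopy`) and stores entries by value; the real code in the diff additionally calls `FreeTexture` on each removed entry.

```cpp
#include <cstdint>
#include <map>

// Hypothetical, simplified cache entry (not Dolphin's TCacheEntryBase).
struct Entry
{
    bool is_efb_copy;
};

using Cache = std::multimap<std::uint32_t, Entry>;

// Removes all entries stored at dst_addr, then inserts a single efb copy entry.
void ReplaceWithEfbCopy(Cache& cache, std::uint32_t dst_addr)
{
    auto range = cache.equal_range(dst_addr);

    // multimap::erase(iterator) returns the iterator following the erased
    // element and leaves all other iterators (including range.second) valid.
    for (auto it = range.first; it != range.second;)
        it = cache.erase(it);

    cache.emplace(dst_addr, Entry{true});
}
```

A loop is used here instead of the range overload of `erase` because it leaves room for per-entry cleanup before each element is removed.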

Castlevania 3 goes from 80 fps to 115 fps for me.

There might be games that need a higher texture cache accuracy with this, but those games should also see a performance boost from this PR.

Some games that use paletted textures which are not efb copies might be faster now, and might no longer require a higher texture cache accuracy (similar situation to PR dolphin-emu#1916).
mimimi085181 committed Feb 8, 2015
1 parent b790151 commit be4742e
Showing 2 changed files with 84 additions and 22 deletions.
104 changes: 83 additions & 21 deletions Source/Core/VideoCommon/TextureCacheBase.cpp
@@ -24,6 +24,7 @@ static const u64 TEXHASH_INVALID = 0;
static const int TEXTURE_KILL_THRESHOLD = 200;
static const int TEXTURE_POOL_KILL_THRESHOLD = 3;
static const u64 FRAMECOUNT_INVALID = 0;
static const int MAX_TEXTURES_PER_ADDRESS = 20;

TextureCache *g_texture_cache;

@@ -329,31 +330,81 @@ TextureCache::TCacheEntryBase* TextureCache::Load(const u32 stage)
palette_size = TexDecoder_GetPaletteSize(texformat);
u64 tlut_hash = GetHash64(&texMem[tlutaddr], palette_size, g_ActiveConfig.iSafeTextureCache_ColorSamples);

// Mix the tlut hash into the texture hash. So we only have to compare it one.
// Mix the tlut hash into the texture hash. So we only have to compare it once.
tex_hash ^= tlut_hash;

// NOTE: For non-paletted textures, texID is equal to the texture address.
// A paletted texture, however, may have multiple texIDs assigned though depending on the currently used tlut.
// This (changing texID depending on the tlut_hash) is a trick to get around
// an issue with Metroid Prime's fonts (it has multiple sets of fonts on each other
// stored in a single texture and uses the palette to make different characters
// visible or invisible. Thus, unless we want to recreate the textures for every drawn character,
// we must make sure that a paletted texture gets assigned multiple IDs for each tlut used.
//
// EFB copys however didn't know anything about the tlut, so don't change the texID if there
// already is an efb copy at this source. This makes those textures less broken when using efb to texture.
// Examples are the mini map in Twilight Princess and objects on the targetting computer in Rogue Squadron 2(RS2).
// TODO: Convert those textures using the right palette, so they display correctly
auto iter = textures.find(texID);
if (iter == textures.end() || !iter->second->IsEfbCopy())
texID ^= ((u32)tlut_hash) ^(u32)(tlut_hash >> 32);
// TODO: Convert paletted textures, which are efb copies, using the right palette, so they display correctly
}

// GPUs don't like when the specified mipmap count would require more than one 1x1-sized LOD in the mipmap chain
// e.g. 64x64 with 7 LODs would have the mipmap chain 64x64,32x32,16x16,8x8,4x4,2x2,1x1,0x0, so we limit the mipmap count to 6 there
tex_levels = std::min<u32>(IntLog2(std::max(width, height)) + 1, tex_levels);

TCacheEntryBase*& entry = textures[texID];
TCacheEntryBase* entry = nullptr;

// Find all texture cache entries for the current texture address, and decide whether to use one of
// them, or to create a new one
//
// In most cases, the fastest way is to use only one texture cache entry for the same address. Usually,
// when a texture changes, the old version of the texture is unlikely to be used again. If there were
// new cache entries created for normal texture updates, there would be a slowdown due to a huge amount
// of unused cache entries. Also thanks to texture pooling, overwriting an existing cache entry is
// faster than creating a new one from scratch.
//
// Some games use the same address for different textures though. If the same cache entry was used in
// this case, it would be constantly overwritten, and effectively there wouldn't be any caching for
// those textures. Examples for this are Metroid Prime and Castlevania 3. Metroid Prime has multiple
// sets of fonts on each other stored in a single texture and uses the palette to make different
// characters visible or invisible. In Castlevania 3 some textures are used for 2 different things or
// at least in 2 different ways(size 1024x1024 vs 1024x256).
//
// To determine whether to use multiple cache entries or a single entry, use the following heuristic:
// If the same texture address is used more than once during one frame, assume that the address is used
// for different purposes and use multiple cache entries. Once there's more than one entry for one
// address, keep using several entries. If the current texture is found in the cache, use that entry.
//
// For efb copies, the entry created in CopyRenderTargetToTexture always has to be used, or else it was
// done in vain.
std::pair <TexCache::iterator, TexCache::iterator> iter_range = textures.equal_range(texID);
TexCache::iterator iter = iter_range.first;
if (iter != iter_range.second)
{
if (iter->second->IsEfbCopy() || (iter->second->frameCount != FRAMECOUNT_INVALID && std::next(iter, 1) == iter_range.second))
{
entry = iter->second;
}
else
{
u32 counter = 0;
TexCache::iterator oldest_entry = iter;
int temp_frameCount = 0x7fffffff;

while (!entry && counter < MAX_TEXTURES_PER_ADDRESS && iter != iter_range.second)
{
if (tex_hash == iter->second->hash)
{
entry = iter->second;
}
else
{
if (iter->second->frameCount != FRAMECOUNT_INVALID && iter->second->frameCount < temp_frameCount)
{
temp_frameCount = iter->second->frameCount;
oldest_entry = iter;
}
++counter;
++iter;
}
}

if (!entry && counter == MAX_TEXTURES_PER_ADDRESS)
{
iter = oldest_entry;
entry = iter->second;
}
}
}

if (entry)
{
// 1. Calculate reference hash:
@@ -379,6 +430,7 @@ TextureCache::TCacheEntryBase* TextureCache::Load(const u32 stage)

// pool this texture and make a new one later
FreeTexture(entry);
textures.erase(iter);
}

std::unique_ptr<HiresTexture> hires_tex;
@@ -431,9 +483,12 @@ TextureCache::TCacheEntryBase* TextureCache::Load(const u32 stage)
config.width = width;
config.height = height;
config.levels = texLevels;

entry = AllocateTexture(config);
GFX_DEBUGGER_PAUSE_AT(NEXT_NEW_TEXTURE, true);

textures.insert(TexCache::value_type(texID, entry));

entry->SetGeneralParameters(address, texture_size, full_format);
entry->SetDimensions(nativeW, nativeH, tex_levels);
entry->hash = tex_hash;
@@ -791,9 +846,14 @@ void TextureCache::CopyRenderTargetToTexture(u32 dstAddr, unsigned int dstFormat
unsigned int scaled_tex_w = g_ActiveConfig.bCopyEFBScaled ? Renderer::EFBToScaledX(tex_w) : tex_w;
unsigned int scaled_tex_h = g_ActiveConfig.bCopyEFBScaled ? Renderer::EFBToScaledY(tex_h) : tex_h;

TCacheEntryBase*& entry = textures[dstAddr];
if (entry)
FreeTexture(entry);
// remove all texture cache entries at dstAddr
std::pair <TexCache::iterator, TexCache::iterator> iter_range = textures.equal_range(dstAddr);
TexCache::iterator iter = iter_range.first;
while (iter != iter_range.second)
{
FreeTexture(iter->second);
iter = textures.erase(iter);
}

// create the texture
TCacheEntryConfig config;
@@ -802,7 +862,7 @@ void TextureCache::CopyRenderTargetToTexture(u32 dstAddr, unsigned int dstFormat
config.height = scaled_tex_h;
config.layers = FramebufferManagerBase::GetEFBLayers();

entry = AllocateTexture(config);
TCacheEntryBase* entry = AllocateTexture(config);

// TODO: Using the wrong dstFormat, dumb...
entry->SetGeneralParameters(dstAddr, 0, dstFormat);
@@ -812,6 +872,8 @@ void TextureCache::CopyRenderTargetToTexture(u32 dstAddr, unsigned int dstFormat
entry->frameCount = FRAMECOUNT_INVALID;

entry->FromRenderTarget(dstAddr, dstFormat, srcFormat, srcRect, isIntensity, scaleByHalf, cbufid, colmat);

textures.insert(TexCache::value_type(dstAddr, entry));
}

TextureCache::TCacheEntryBase* TextureCache::AllocateTexture(const TCacheEntryConfig& config)
2 changes: 1 addition & 1 deletion Source/Core/VideoCommon/TextureCacheBase.h
@@ -135,7 +135,7 @@ class TextureCache
static TCacheEntryBase* AllocateTexture(const TCacheEntryConfig& config);
static void FreeTexture(TCacheEntryBase* entry);

typedef std::map<u32, TCacheEntryBase*> TexCache;
typedef std::multimap<u32, TCacheEntryBase*> TexCache;
typedef std::unordered_multimap<TCacheEntryConfig, TCacheEntryBase*, TCacheEntryConfig::Hasher> TexPool;

static TexCache textures;