GS: Add hash based texture cache #5545

Merged
merged 8 commits into from
Feb 21, 2022


@stenzek (Contributor) commented Feb 19, 2022

Description of Changes

This PR adds a hash-based cache to the texture cache for sources, in addition to the existing TEX0-key-based cache.

As the hash cache entries are based on the texture data and not the location in VRAM, it eliminates uploads in games that stream textures to different locations in VRAM across frames (e.g. GTA: SA). Some games like GTA: LCS are an extreme case of this, where texture uploads previously exceeded 1,000 per frame.
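
To illustrate why keying on texture contents avoids repeated uploads, here is a minimal sketch, not the PR's implementation. The `Texture` struct, the `HashTextureCache` class, and the choice of FNV-1a are all assumptions made for the example; the thread does not say which hash function the PR uses. A real cache would also need to key on dimensions and format and to handle collisions.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a GPU texture handle; not a PCSX2 type.
struct Texture
{
    int id;
};

// 64-bit FNV-1a over the texel bytes. Chosen only because it is short;
// the hash function used by the PR is not stated in this thread.
inline std::uint64_t HashBytes(const std::vector<std::uint8_t>& data)
{
    std::uint64_t h = 14695981039346656037ull; // FNV offset basis
    for (std::uint8_t b : data)
    {
        h ^= b;
        h *= 1099511628211ull; // FNV prime
    }
    return h;
}

// Hypothetical cache keyed by texture contents instead of VRAM location.
class HashTextureCache
{
public:
    // Returns the cached texture for these texels, "uploading" only on a miss.
    // The VRAM address is deliberately not part of the key.
    const Texture& Lookup(const std::vector<std::uint8_t>& texels)
    {
        const std::uint64_t key = HashBytes(texels);
        auto it = m_entries.find(key);
        if (it == m_entries.end())
        {
            m_uploads++; // a real cache would create and fill a GPU texture here
            it = m_entries.emplace(key, Texture{m_next_id++}).first;
        }
        return it->second;
    }

    std::size_t Uploads() const { return m_uploads; }

private:
    std::unordered_map<std::uint64_t, Texture> m_entries;
    std::size_t m_uploads = 0;
    int m_next_id = 0;
};
```

In this sketch, looking up the same texel data twice counts a single upload, regardless of where the game placed that data in VRAM on each frame.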

Since it's based on the texture data, this has another nice property: the hashes can be used for texture replacements. I have already done a proof-of-concept and it works quite well.

Rationale behind Changes

Making upload-heavy GS games faster.

Suggested Testing Steps

There should be no regressions in rendering or performance with the default config. Partial preloading is equivalent to the current preloading setting; full preloading uses the hash cache. Test games with heavy uploads (e.g. SA, LCS) and examine performance.

@Blackbird88 (Contributor) commented Feb 19, 2022

I tested LCS with the Hash Cache option, and the lag spikes when loading "new" parts of the city are completely gone for me. This is at default EE clock speed and 4x Native.

@JordanTheToaster (Member) left a comment

All looks good: a nice speedup in some games and lower GS % overall.

@AaronBPaden

I get a segfault past the initial loading screen in Need for Speed - Hot Pursuit 2 on Linux, with both OpenGL and Vulkan and with both Partial and Full preloading. I've got a Radeon RX 570 on Mesa 21.3.6.

Backtrace, if it helps:

(gdb) bt
#0  0x00007f725743534c in __pthread_kill_implementation () at /usr/lib/libc.so.6
#1  0x00007f72573e84b8 in raise () at /usr/lib/libc.so.6
#2  0x000056174ca584c2 in pxTrap() () at /home/aaron/Documents/progs/pcsx2/common/Exceptions.cpp:84
#3  SysPageFaultSignalFilter(int, siginfo_t*, void*) (signal=<optimized out>, siginfo=0x7f7229719560) at /home/aaron/Documents/progs/pcsx2/common/Linux/LnxHostSys.cpp:75
#4  0x00007f72573e8560 in <signal handler called> () at /usr/lib/libc.so.6
#5  0x000056174c8359ed in _mm256_store_si256(long long __vector(4)*, long long __vector(4)) (__A=..., __P=0x7f71ae6ff050) at /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/include/avxintrin.h:923
#6  GSVector8i::store<true>(void*, GSVector8i const&) (v=<optimized out>, p=0x7f71ae6ff050) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSVector8i.h:1235
#7  GSBlock::ReadBlock4P(unsigned char const*, unsigned char*, int) (dstpitch=16, dst=0x7f71ae6ff040 "", src=<optimized out>) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSBlock.h:749
#8  operator() (__closure=<optimized out>, __closure=<optimized out>, src=<optimized out>, read_dst=0x7f71ae6ff040 "") at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSLocalMemory.cpp:1716
#9  foreachBlock<GSLocalMemory::ReadTexture4P(const GSOffset&, const GSVector4i&, u8*, int, const GIFRegTEXA&)::<lambda(u8*, const u8*)> >
    (bpp=8, fn=<optimized out>, dstpitch=16, dst=0x7f71ae6ff040 "", r=<optimized out>, mem=0x7f721cd2f760, off=<optimized out>) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSLocalMemory.cpp:40
#10 GSLocalMemory::ReadTexture4P(GSOffset const&, GSVector4i const&, unsigned char*, int, GIFRegTEXA const&) (this=0x7f721cd2f760, off=<optimized out>, r=<optimized out>, dst=<optimized out>, dstpitch=16, TEXA=<optimized out>)
    at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSLocalMemory.cpp:1714
#11 0x000056174c88b30c in HashTextureLevel(GSRenderer*, GIFRegTEX0 const&, GIFRegTEXA const&, BlockHashState&, u8*) (renderer=0x7f721cd2da60, TEX0=<optimized out>, TEXA=<optimized out>, hash_st=..., temp=0x7f71ae6ff040 "")
    at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/Renderers/HW/GSTextureCache.cpp:2630
#12 0x000056174c87eb24 in GSTextureCache::HashTexture(GSRenderer*, GIFRegTEX0 const&, GIFRegTEXA const&) (TEXA=..., TEX0=..., renderer=0x7f721cd2da60) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/Renderers/HW/GSTextureCache.cpp:2667
#13 GSTextureCache::Source::PreloadLevel(int) (level=0, this=0x7f721d3f2920) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/Renderers/HW/GSTextureCache.cpp:1979
#14 GSTextureCache::Source::Update(GSVector4i const&, int) (this=0x7f721d3f2920, rect=<optimized out>, level=0) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/Renderers/HW/GSTextureCache.cpp:1771
#15 0x000056174c880f9a in GSTextureCache::LookupSource(GIFRegTEX0 const&, GIFRegTEXA const&, GSVector4i const&, GSVector2T<int> const*) (this=0x7f721d1b0b30, TEX0=<optimized out>, TEXA=<optimized out>, r=..., lod=<optimized out>)
    at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/Renderers/HW/GSTextureCache.cpp:431
#16 0x000056174c885db1 in GSRendererHW::Draw() (this=0x7f721cd2da60) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/Renderers/HW/GSRendererHW.cpp:1425
#17 0x000056174c841603 in GSState::FlushPrim() (this=0x7f721cd2da60) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSState.cpp:1475
#18 0x000056174c8437e9 in GSState::Flush() (this=0x7f721cd2da60) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSState.cpp:1375
#19 GSState::GIFRegHandlerTRXDIR(GIFReg const*) (this=0x7f721cd2da60, r=0x7f7251c2ef10) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSState.cpp:1338
#20 0x000056174c830201 in GSState::Transfer<3>(unsigned char const*, unsigned int) (size=65, mem=0x7f7251c2ef10 "", this=0x7f721cd2da60) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GSState.cpp:1897
#21 GSgifTransfer(unsigned char const*, unsigned int) (mem=0x7f7251c2eed0 "\004", size=<optimized out>) at /home/aaron/Documents/progs/pcsx2/pcsx2/GS/GS.cpp:428
#22 0x000056174c6552af in SysMtgsThread::ExecuteTaskInThread() (this=0x56174d1a5450 <mtgsThread>) at /home/aaron/Documents/progs/pcsx2/pcsx2/MTGS.cpp:395
#23 0x000056174cba03ca in Threading::pxThread::_try_virtual_invoke(void (Threading::pxThread::*)()) [clone .constprop.0] (this=0x56174d1a5450 <mtgsThread>, method=<optimized out>)
    at /home/aaron/Documents/progs/pcsx2/common/ThreadTools.cpp:571
#24 0x000056174ca4b017 in Threading::pxThread::_internal_execute() (this=0x56174d1a5450 <mtgsThread>) at /home/aaron/Documents/progs/pcsx2/common/ThreadTools.cpp:670
#25 Threading::pxThread::internal_callback_helper(void*) (itsme=0x56174d1a5450 <mtgsThread>) at /home/aaron/Documents/progs/pcsx2/common/ThreadTools.cpp:721
#26 Threading::pxThread::_internal_callback(void*) (itsme=0x56174d1a5450 <mtgsThread>) at /home/aaron/Documents/progs/pcsx2/common/ThreadTools.cpp:709
#27 0x00007f72574335c2 in start_thread () at /usr/lib/libc.so.6
#28 0x00007f72574b8584 in clone () at /usr/lib/libc.so.6

PCSX2 is also reporting that it is disabling the cache in Baldur's Gate: Dark Alliance for both full and partial because it is using about a gig of VRAM.

I've got 4 GB, so only about 3.5 GB is available for PCSX2, so maybe I'm just running out. Does this check how much VRAM is available, or does it automatically disable itself past a certain threshold?

@TellowKrinkle (Member)

> PCSX2 is also reporting that it is disabling the cache in Baldur's Gate: Dark Alliance for both full and partial because it is using about a gig of VRAM.

I think 1 GB is just a cutoff that separates games that are compatible with the system from games that do things that make them incompatible with it. Allowing the system to use more VRAM would just leave more useless images in VRAM and wouldn't make things go faster.

I can say for sure that Baldur's Gate is incompatible with the system: it suballocates textures and updates small parts of them at a time, so hashing whole textures would just result in them constantly changing and needing to be reuploaded anyway.
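
A fixed cutoff of this kind could be sketched as follows. This is only an illustration of the idea described above: the `HashCacheBudget` class, its method names, and the "disable once and stay disabled" policy are assumptions, not code from the PR.

```cpp
#include <cstddef>

// Hypothetical tracker for the memory held by hash cache entries.
class HashCacheBudget
{
public:
    explicit HashCacheBudget(std::size_t max_bytes)
        : m_max_bytes(max_bytes)
    {
    }

    // Records a newly cached texture of the given size.
    // Returns false once the budget has been exceeded; from then on the
    // cache stays disabled and further additions are not counted.
    bool OnTextureAdded(std::size_t bytes)
    {
        if (!m_enabled)
            return false;

        m_used_bytes += bytes;
        if (m_used_bytes > m_max_bytes)
            m_enabled = false; // assumed policy: treat the game as incompatible

        return m_enabled;
    }

    bool Enabled() const { return m_enabled; }
    std::size_t UsedBytes() const { return m_used_bytes; }

private:
    std::size_t m_max_bytes;
    std::size_t m_used_bytes = 0;
    bool m_enabled = true;
};
```

The point of a fixed limit, as opposed to one scaled to available VRAM, is that a game whose textures keep changing would fill any budget with stale entries, so a larger limit only delays the same outcome.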

@stenzek (Contributor, Author) commented Feb 20, 2022

> I get a segfault past the initial loading screen on Need for Speed - Hot Pursuit 2 on Linux with both OpenGL and Vulkan and both Partial and Full preloading. I've got a Radeon RX 570 on Mesa 21.3.6.

Let me guess, AVX2 build? Should be fixed with the last push.
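
The thread does not say what the fix was. One observation that fits the AVX2 guess: frame #5 of the backtrace is `_mm256_store_si256`, which requires a 32-byte-aligned destination, and the pointer passed to the store in frame #6 is `0x7f71ae6ff050`. The sketch below only checks that address's alignment; `IsAligned` and `kStoreAddress` are names made up for the example, and reading this as the root cause is an assumption.

```cpp
#include <cstdint>

// True if address p is a multiple of `alignment`, which must be a power of two.
constexpr bool IsAligned(std::uint64_t p, std::uint64_t alignment)
{
    return (p & (alignment - 1)) == 0;
}

// Destination pointer from frame #6 of the backtrace (p=0x7f71ae6ff050).
constexpr std::uint64_t kStoreAddress = 0x7f71ae6ff050ull;

// The address is fine for a 16-byte (SSE) aligned store...
static_assert(IsAligned(kStoreAddress, 16), "expected 16-byte alignment");
// ...but not for a 32-byte (AVX) aligned store.
static_assert(!IsAligned(kStoreAddress, 32), "expected no 32-byte alignment");
```

The temporary buffer itself (`0x7f71ae6ff040`) is 32-byte aligned; with `dstpitch=16`, the second row lands 16 bytes in, which would explain why only a build using 256-bit aligned stores hit the fault.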

@AaronBPaden

Yup, you got it! :)

@arronh4599

I just wanna say this is amazing. Gave me a huge boost with Gran Turismo 4.

@lightningterror merged commit 9d51c64 into PCSX2:master on Feb 21, 2022
@stenzek deleted the gs-hash-cache branch on April 15, 2022 11:30
8 participants